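# Social Media Auto Publisher - Project Proposal

## 1. Project Overview

### 1.1 Project Goals

Build a unified platform for automated social media publishing with the following features:

- **Login management**: account login and cookie management for Xiaohongshu and Douyin
- **Content publishing**: automated publishing of Xiaohongshu image notes and videos, and of Douyin videos
- **Anti-detection**: imitate human behavior to reduce the risk of being flagged as automation
- **Batch operations**: batch publishing across multiple accounts and multiple content items
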
### 1.2 Supported Platforms and Features

| Platform | Login | Image notes | Video publishing | Batch publishing |
|----------|-------|-------------|------------------|------------------|
| Xiaohongshu | ✅ | ✅ | ✅ | ✅ |
| Douyin | ✅ | ❌ | ✅ | ✅ |

## 2. Architecture Design

### 2.1 Overall Architecture

```
social_media_auto_publisher/
├── core/                       # Core engine
│   ├── publisher.py            # Unified publish manager
│   ├── session_manager.py      # Session management
│   └── task_scheduler.py       # Task scheduler
├── platforms/                  # Platform adapter layer
│   ├── xiaohongshu/            # Xiaohongshu adapter
│   └── douyin/                 # Douyin adapter
├── auth/                       # Authentication module
│   ├── base_auth.py            # Base authentication class
│   ├── xiaohongshu_auth.py     # Xiaohongshu authentication
│   └── douyin_auth.py          # Douyin authentication
├── utils/                      # Utilities
│   ├── browser.py              # Browser management
│   ├── human_behavior.py       # Human behavior simulation
│   ├── media_handler.py        # Media file handling
│   └── logger.py               # Log management
├── config/                     # Configuration
│   ├── settings.py             # System settings
│   └── platform_config.py      # Platform settings
├── tests/                      # Test cases
├── examples/                   # Usage examples
├── docs/                       # Documentation
└── media/                      # Media file storage
    ├── images/
    └── videos/
```

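### 2.2 Core Design Patterns

#### 2.2.1 Adapter Pattern

```python
# Unified platform interface (signatures only; bodies are left to each adapter).
# Methods are async and login() takes an account, matching the adapters in section 3.2.
class PlatformAdapter:
    async def login(self, account: AccountInfo) -> bool: ...
    async def publish_content(self, content: Content) -> bool: ...
    async def get_upload_status(self) -> UploadStatus: ...

# Per-platform adapters
class XiaoHongShuAdapter(PlatformAdapter): ...
class DouyinAdapter(PlatformAdapter): ...
```
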
#### 2.2.2 Factory Pattern

```python
class PlatformAdapterFactory:
    @staticmethod
    def create_adapter(platform: PlatformType) -> PlatformAdapter: ...
```

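One way the factory could be realized is a registry that maps each platform to its adapter class. The sketch below is illustrative only: the `PlatformType` enum values, the `_REGISTRY` dictionary and the stub adapter classes are assumptions, since the proposal gives only the `create_adapter` signature.

```python
from enum import Enum


class PlatformType(Enum):
    # Assumed enum values; the proposal does not define them.
    XIAOHONGSHU = "xiaohongshu"
    DOUYIN = "douyin"


class PlatformAdapter:
    """Stand-in for the unified interface from section 2.2.1."""


class XiaoHongShuAdapter(PlatformAdapter):
    pass


class DouyinAdapter(PlatformAdapter):
    pass


class PlatformAdapterFactory:
    # Hypothetical registry: platform -> adapter class.
    _REGISTRY = {
        PlatformType.XIAOHONGSHU: XiaoHongShuAdapter,
        PlatformType.DOUYIN: DouyinAdapter,
    }

    @staticmethod
    def create_adapter(platform: PlatformType) -> PlatformAdapter:
        try:
            adapter_cls = PlatformAdapterFactory._REGISTRY[platform]
        except KeyError:
            raise ValueError(f"Unsupported platform: {platform!r}") from None
        return adapter_cls()
```

With a registry like this, supporting another platform would only require a new enum member and one more dictionary entry.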
#### 2.2.3 Strategy Pattern

```python
class HumanBehaviorStrategy:
    def typing_behavior(self, text: str) -> None: ...
    def clicking_behavior(self, element) -> None: ...
    def waiting_behavior(self, duration: float) -> None: ...
```

### 2.3 Data Flow Design

```
User input → Content parsing → Platform selection → Auth check → Publishing → Status feedback
    ↓               ↓                  ↓                ↓            ↓               ↓
   API        ContentModel          Factory           Auth       Publisher       Logger
```

## 3. Detailed Module Design

### 3.1 Core Engine (core/)

#### 3.1.1 Unified Publish Manager

```python
class Publisher:
    """Unified publish manager."""

    def __init__(self, headless: bool = False):
        # `headless` is accepted because sections 5.1 and 9.3 call Publisher(headless=False)
        self.headless = headless
        self.adapters = {}
        self.session_manager = SessionManager()

    async def publish(self,
                      platform: PlatformType,
                      content: Content,
                      account: AccountInfo) -> PublishResult: ...

    async def batch_publish(self,
                            tasks: List[PublishTask]) -> List[PublishResult]: ...
```

#### 3.1.2 Session Manager

```python
class SessionManager:
    """Browser session management."""

    def __init__(self):
        self.active_sessions = {}
        self.cookie_store = CookieStore()

    async def get_session(self, platform: PlatformType, account: str) -> Browser: ...
    async def save_session(self, platform: PlatformType, account: str) -> None: ...
```

#### 3.1.3 Task Scheduler

```python
class TaskScheduler:
    """Asynchronous task scheduling."""

    def __init__(self, max_concurrent: int = 3):
        self.semaphore = asyncio.Semaphore(max_concurrent)

    async def schedule_tasks(self, tasks: List[Task]) -> List[TaskResult]: ...
```

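The semaphore in the scheduler caps how many tasks run at once. A minimal sketch of that idea follows, assuming each task is passed as a zero-argument callable that returns a coroutine; the `Task` and `TaskResult` types of the proposal are not defined there, so generic types are used instead.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


class TaskScheduler:
    """Runs coroutine factories with a cap on concurrency."""

    def __init__(self, max_concurrent: int = 3):
        self.max_concurrent = max_concurrent

    async def schedule_tasks(self, tasks: List[Callable[[], Awaitable[T]]]) -> List[T]:
        # Created inside the coroutine so the semaphore belongs to the
        # running event loop (this matters on Python versions before 3.10).
        semaphore = asyncio.Semaphore(self.max_concurrent)

        async def run_one(task: Callable[[], Awaitable[T]]) -> T:
            async with semaphore:
                return await task()

        # gather() returns results in the same order as the input tasks.
        return await asyncio.gather(*(run_one(t) for t in tasks))
```

Because `gather()` preserves input order, each result can be matched back to its task by position even though tasks finish at different times.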
### 3.2 Platform Adapter Layer (platforms/)

#### 3.2.1 Xiaohongshu Adapter

```python
class XiaoHongShuAdapter(PlatformAdapter):
    """Xiaohongshu platform adapter."""

    async def login(self, account: AccountInfo) -> bool:
        """Log in by scanning a QR code."""

    async def publish_image_note(self, note: ImageNote) -> bool:
        """Publish an image note."""

    async def publish_video_note(self, note: VideoNote) -> bool:
        """Publish a video note."""
```

#### 3.2.2 Douyin Adapter

```python
class DouyinAdapter(PlatformAdapter):
    """Douyin platform adapter."""

    async def login(self, account: AccountInfo) -> bool:
        """Log in by scanning a QR code."""

    async def publish_video(self, video: VideoContent) -> bool:
        """Publish video content."""
```

### 3.3 Authentication Module (auth/)

#### 3.3.1 Base Authentication Class

```python
class BaseAuth:
    """Base authentication class."""

    def __init__(self, platform: PlatformType):
        self.platform = platform
        self.cookie_path = f"auth/cookies/{platform.value}/"

    async def check_login_status(self, page: Page) -> bool:
        """Check whether the session is logged in."""

    async def save_cookies(self, page: Page, account: str) -> None:
        """Save cookies."""

    async def load_cookies(self, page: Page, account: str) -> bool:
        """Load cookies."""
```

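Saving and loading cookies could be sketched as a JSON round trip. The example below assumes the browser context exposes `cookies()` and `add_cookies()`, as Playwright's async `BrowserContext` does (reachable from a page via `page.context`); the `ContextLike` protocol and the free functions are hypothetical names, and the sketch stores plain JSON, leaving out the encryption called for in section 8.1.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Protocol


class ContextLike(Protocol):
    """Subset of Playwright's async BrowserContext used by this sketch."""

    async def cookies(self) -> List[Dict[str, Any]]: ...

    async def add_cookies(self, cookies: List[Dict[str, Any]]) -> None: ...


async def save_cookies(context: ContextLike, path: Path) -> None:
    """Write the context's cookies to `path` as JSON."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(await context.cookies()), encoding="utf-8")


async def load_cookies(context: ContextLike, path: Path) -> bool:
    """Restore cookies from `path`; return False when the file is missing."""
    if not path.exists():
        return False
    await context.add_cookies(json.loads(path.read_text(encoding="utf-8")))
    return True
```

A `False` return from `load_cookies` would be the natural trigger for falling back to a fresh QR-code login.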
#### 3.3.2 Platform Authentication Implementations

```python
class XiaoHongShuAuth(BaseAuth):
    """Xiaohongshu authentication."""

    # Elements whose presence indicates a logged-in session
    LOGIN_INDICATORS = [
        "[data-testid='avatar']",
        ".creator-center-avatar",
        ".user-info"
    ]

class DouyinAuth(BaseAuth):
    """Douyin authentication."""

    # Elements whose presence indicates a logged-in session
    LOGIN_INDICATORS = [
        ".user-avatar",
        ".creator-workspace",
        ".publish-btn"
    ]
```

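The indicator lists suggest a simple check: the session counts as logged in as soon as any one selector matches. A minimal sketch follows, assuming a page object with an async `query_selector()` method that returns `None` when nothing matches (Playwright's async `Page` has such a method); `PageLike` and `is_logged_in` are hypothetical names.

```python
from typing import Any, Optional, Protocol, Sequence


class PageLike(Protocol):
    """Subset of Playwright's async Page used by this sketch."""

    async def query_selector(self, selector: str) -> Optional[Any]: ...


async def is_logged_in(page: PageLike, indicators: Sequence[str]) -> bool:
    """Return True as soon as any login indicator is found on the page."""
    for selector in indicators:
        if await page.query_selector(selector) is not None:
            return True
    return False
```

Trying several selectors makes the check tolerant of a single selector going stale after a platform redesign.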
### 3.4 Utilities (utils/)

#### 3.4.1 Browser Manager

```python
class BrowserManager:
    """Browser instance management."""

    def __init__(self, headless: bool = False):
        self.headless = headless
        self.playwright = None
        self.browsers = {}

    async def get_browser(self,
                          platform: PlatformType,
                          user_data_dir: Optional[str] = None) -> Browser: ...

    async def inject_stealth_script(self, page: Page) -> None:
        """Inject the anti-detection script."""
```

#### 3.4.2 Human Behavior Simulator

```python
class HumanBehaviorSimulator:
    """Human behavior simulation."""

    def __init__(self, config: HumanBehaviorConfig):
        self.typing_speed = config.typing_speed  # 80-150 ms per character
        self.pause_probability = config.pause_probability  # 15%
        self.random_delay_range = config.random_delay_range  # 0.3-0.8 s

    async def human_type(self, page: Page, selector: str, text: str) -> None:
        """Simulate human typing."""

    async def human_click(self, page: Page, selector: str) -> None:
        """Simulate a human click."""

    async def random_scroll(self, page: Page, distance: int) -> None:
        """Scroll by a randomized amount."""
```

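The three parameters above combine naturally into a per-character delay schedule. The sketch below is one possible reading, not the project's implementation: every keystroke gets a delay drawn from the typing-speed range, and with the pause probability an extra delay from the random-delay range is added. The function name `typing_delays` and the injectable `rng` are assumptions.

```python
import random
from typing import List, Optional, Tuple


def typing_delays(
    text: str,
    speed_ms: Tuple[int, int] = (80, 150),
    pause_probability: float = 0.15,
    pause_range_s: Tuple[float, float] = (0.3, 0.8),
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one delay in seconds for each character of `text`."""
    rng = rng or random.Random()
    delays = []
    for _ in text:
        delay = rng.uniform(*speed_ms) / 1000.0  # per-keystroke delay
        if rng.random() < pause_probability:
            delay += rng.uniform(*pause_range_s)  # occasional longer pause
        delays.append(delay)
    return delays
```

A `human_type` implementation could then send one character at a time and sleep for the matching delay between keystrokes.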
#### 3.4.3 Media Handler

```python
class MediaHandler:
    """Media file handling."""

    @staticmethod
    def validate_image(file_path: str) -> bool:
        """Validate image format and size."""

    @staticmethod
    def validate_video(file_path: str) -> bool:
        """Validate video format and size."""

    @staticmethod
    def compress_image(file_path: str, target_size: int) -> str:
        """Compress an image."""

    @staticmethod
    def compress_video(file_path: str, target_size: int) -> str:
        """Compress a video."""
```

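Image validation could start with a cheap extension and size check before any decoding. The sketch below assumes a 10 MB limit (mirroring `MAX_IMAGE_SIZE` in section 3.5.1) and an illustrative extension list that the proposal does not specify; it does not open or decode the file.

```python
import os

# Assumed values: the size mirrors MAX_IMAGE_SIZE from section 3.5.1,
# the extension set is illustrative only.
MAX_IMAGE_SIZE = 10 * 1024 * 1024
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def validate_image(file_path: str, max_size: int = MAX_IMAGE_SIZE) -> bool:
    """Check extension and file size only; the image is not decoded."""
    ext = os.path.splitext(file_path)[1].lower()
    if ext not in IMAGE_EXTENSIONS:
        return False
    if not os.path.isfile(file_path):
        return False
    # Reject empty files as well as files above the limit.
    return 0 < os.path.getsize(file_path) <= max_size
```

A stricter validator would also decode the file to confirm that its contents match the extension.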
### 3.5 Configuration (config/)

#### 3.5.1 System Settings

```python
class Settings:
    """System settings."""

    # Browser settings
    BROWSER_HEADLESS = False
    BROWSER_TIMEOUT = 30000  # milliseconds (30 seconds)

    # Concurrency settings
    MAX_CONCURRENT_TASKS = 3
    TASK_RETRY_COUNT = 3

    # Anti-detection settings
    HUMAN_TYPING_SPEED = (80, 150)  # ms per character
    RANDOM_PAUSE_PROBABILITY = 0.15
    RANDOM_DELAY_RANGE = (0.3, 0.8)  # seconds

    # File size limits
    MAX_IMAGE_SIZE = 10 * 1024 * 1024  # 10 MB
    MAX_VIDEO_SIZE = 500 * 1024 * 1024  # 500 MB
```

#### 3.5.2 Platform Settings

```python
class PlatformConfig:
    """Platform-specific settings."""

    XIAOHONGSHU = {
        'login_url': 'https://creator.xiaohongshu.com/',
        'image_note_url': 'https://creator.xiaohongshu.com/publish/publish',
        'video_note_url': 'https://creator.xiaohongshu.com/publish/video',
        'max_images': 9,
        'max_video_duration': 300,  # seconds (5 minutes)
    }

    DOUYIN = {
        'login_url': 'https://creator.douyin.com/creator-micro/home',
        'upload_url': 'https://creator.douyin.com/creator-micro/content/upload',
        'max_video_duration': 600,  # seconds (10 minutes)
        'supported_formats': ['mp4', 'mov', 'avi'],
    }
```

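## 4. Data Model Design

### 4.1 Content Models

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Content:
    """Base class for content."""
    title: str
    description: str
    tags: List[str]
    visibility: str = "public"  # public, private, friends

@dataclass
class ImageNote(Content):
    """Image note."""
    # Fields declared after the defaulted `visibility` need defaults too,
    # otherwise @dataclass raises TypeError.
    images: List[str] = field(default_factory=list)  # list of image paths
    cover_image: Optional[str] = None

@dataclass
class VideoContent(Content):
    """Video content."""
    video_path: str = ""  # default needed for the same reason as `images`
    cover_image: Optional[str] = None
    duration: Optional[int] = None
```
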
### 4.2 Account Models

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class AccountInfo:
    """Account information."""
    platform: PlatformType
    username: str
    cookie_file: str
    is_active: bool = True

@dataclass
class PublishTask:
    """Publish task."""
    id: str
    platform: PlatformType
    account: AccountInfo
    content: Content
    scheduled_time: Optional[datetime] = None

@dataclass
class PublishResult:
    """Publish result."""
    task_id: str
    success: bool
    message: str
    published_url: Optional[str] = None
    timestamp: Optional[datetime] = None
```

## 5. API Design

### 5.1 Simple API

```python
# Initialize the publisher
publisher = Publisher(headless=False)

# Log in to Xiaohongshu
await publisher.login_platform("xiaohongshu", "user1")

# Publish an image note
note = ImageNote(
    title="My travel diary",
    description="Visited a beautiful place today...",
    images=["path/to/image1.jpg", "path/to/image2.jpg"],
    tags=["travel", "scenery", "sharing"]
)

result = await publisher.publish("xiaohongshu", note, "user1")

# Batch publishing; `video` is a VideoContent built the same way as `note`.
# Keyword arguments follow the PublishTask fields in section 4.2; the ids are illustrative.
tasks = [
    PublishTask(id="task-1", platform="xiaohongshu", account="user1", content=note),
    PublishTask(id="task-2", platform="douyin", account="user2", content=video),
]
results = await publisher.batch_publish(tasks)
```

### 5.2 Advanced API

```python
# Configuration-driven publishing
config = {
    "platforms": ["xiaohongshu", "douyin"],
    "accounts": {
        "xiaohongshu": ["user1", "user2"],
        "douyin": ["user1"]
    },
    "schedule": {
        "type": "interval",  # interval, fixed_time
        "interval": 3600,    # seconds (publish once per hour)
        "start_time": "09:00",
        "end_time": "21:00"
    }
}

scheduler = TaskScheduler(config)
await scheduler.start_auto_publish(content_queue)
```

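To make the `schedule` block concrete, the sketch below expands an `interval` schedule into the publish times for one day. It is an interpretation rather than the project's scheduler: `publish_slots` is a hypothetical helper, `interval` is read as seconds, and `end_time` is treated as inclusive.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List


def publish_slots(schedule: Dict[str, Any], day: datetime) -> List[datetime]:
    """List the publish times on `day` for an "interval" schedule."""
    if schedule["type"] != "interval":
        raise ValueError("only the 'interval' type is sketched here")
    step = timedelta(seconds=schedule["interval"])
    if step <= timedelta(0):
        raise ValueError("interval must be positive")

    start_h, start_m = map(int, schedule["start_time"].split(":"))
    end_h, end_m = map(int, schedule["end_time"].split(":"))
    current = day.replace(hour=start_h, minute=start_m, second=0, microsecond=0)
    end = day.replace(hour=end_h, minute=end_m, second=0, microsecond=0)

    slots = []
    while current <= end:  # end_time is treated as inclusive (assumption)
        slots.append(current)
        current += step
    return slots
```

With the configuration shown above (hourly between 09:00 and 21:00), this yields 13 slots per day, one on each full hour from 09:00 through 21:00.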
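## 6. Anti-Automation-Detection Strategy

### 6.1 Multi-Layer Detection Evasion

1. **Browser fingerprint masking**
   - Use the stealth.min.js script
   - Override properties such as webdriver and navigator
   - Randomize browser version and plugin information

2. **Behavior simulation**
   - Random typing speed (80-150 ms per character)
   - Simulate realistic clicks and scrolling
   - Random pauses and intervals between actions

3. **Environment masking**
   - Use a real User-Agent
   - Randomize the viewport size
   - Simulate a realistic network environment
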
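The environment-masking items can be sketched as a helper that picks a User-Agent and viewport at random for each browser context. The candidate lists, the `zh-CN` locale and the function name are illustrative assumptions; the returned keys match keyword arguments that Playwright's `browser.new_context()` accepts (`user_agent`, `viewport`, `locale`).

```python
import random
from typing import Any, Dict, Optional

# Illustrative values only; a real deployment would maintain its own lists.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]
VIEWPORTS = [(1366, 768), (1440, 900), (1920, 1080)]


def random_context_options(rng: Optional[random.Random] = None) -> Dict[str, Any]:
    """Build keyword arguments for a new browser context."""
    rng = rng or random.Random()
    width, height = rng.choice(VIEWPORTS)
    return {
        "user_agent": rng.choice(USER_AGENTS),
        "viewport": {"width": width, "height": height},
        "locale": "zh-CN",  # assumed locale for both platforms
    }
```

The result could be passed as `await browser.new_context(**random_context_options())`. Keeping one choice per account across sessions is probably safer than re-rolling on every launch, since a fingerprint that changes constantly can itself look unusual.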
### 6.2 Smart Retry Mechanism

```python
class RetryStrategy:
    """Smart retry strategy."""

    def __init__(self):
        self.max_retries = 3
        self.backoff_factor = 2
        self.retry_conditions = [
            NetworkError,
            TimeoutError,
            ElementNotFoundError,
            LoginRequiredError
        ]

    async def execute_with_retry(self, func, *args, **kwargs):
        """Execute `func`, retrying on the conditions above."""
```

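A `backoff_factor` of 2 suggests exponential backoff. A minimal standalone sketch follows, assuming a 1-second base delay and using the built-in `TimeoutError` and `ConnectionError` in place of the project-specific exception classes, which the proposal does not define.

```python
import asyncio
from typing import Any, Awaitable, Callable, Tuple, Type


async def execute_with_retry(
    func: Callable[..., Awaitable[Any]],
    *args: Any,
    max_retries: int = 3,
    backoff_factor: float = 2.0,
    base_delay: float = 1.0,  # assumed; the proposal gives no base delay
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    **kwargs: Any,
) -> Any:
    """Call `func`, retrying up to `max_retries` times with exponential backoff."""
    attempt = 0
    while True:
        try:
            return await func(*args, **kwargs)
        except retry_on:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            # Delay grows as base_delay * backoff_factor ** attempt.
            await asyncio.sleep(base_delay * backoff_factor ** attempt)
            attempt += 1
```

With the defaults, a call is attempted at most four times (one initial try plus three retries), waiting 1, 2 and 4 seconds between attempts. Exceptions outside `retry_on` propagate immediately.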
## 7. Error Handling and Logging

### 7.1 Exception Types

```python
class SocialMediaError(Exception):
    """Base exception class."""

class LoginFailedError(SocialMediaError):
    """Login failed."""

class UploadFailedError(SocialMediaError):
    """Upload failed."""

class ContentRejectedError(SocialMediaError):
    """Content was rejected by the platform."""

class RateLimitError(SocialMediaError):
    """Rate limit was hit."""
```

### 7.2 Structured Logging

```python
# The format markup and logger.add() options below follow loguru's API
from loguru import logger

# Log level and format
LOG_FORMAT = (
    "<green>{time:YYYY-MM-DD HH:mm:ss}</green> | "
    "<level>{level: <8}</level> | "
    "<cyan>{name}</cyan>:<cyan>{function}</cyan>:<cyan>{line}</cyan> - "
    "<level>{message}</level>"
)

# Log file configuration
logger.add(
    "logs/{time:YYYY-MM-DD}.log",
    format=LOG_FORMAT,  # apply the format defined above
    rotation="1 day",
    retention="30 days",
    compression="zip",
    level="INFO"
)
```

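## 8. Security Considerations

### 8.1 Data Security
- Store cookie files encrypted
- Keep sensitive information out of the logs
- Clean up temporary files regularly

### 8.2 Access Control
- Rate-limit API access
- Enforce a minimum interval between operations on the same account
- Detect abnormal access patterns
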
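The per-account interval control could be sketched as a small throttle that remembers when each account last acted. `AccountThrottle` and its methods are hypothetical names, and the clock is injectable so the behavior can be checked without real waiting.

```python
import time
from typing import Callable, Dict


class AccountThrottle:
    """Enforces a minimum interval between operations per account."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic):
        self.min_interval = min_interval
        self._clock = clock
        self._last: Dict[str, float] = {}

    def wait_time(self, account: str) -> float:
        """Seconds the caller should still wait before acting for `account`."""
        last = self._last.get(account)
        if last is None:
            return 0.0  # no operation recorded yet for this account
        return max(0.0, self.min_interval - (self._clock() - last))

    def record(self, account: str) -> None:
        """Mark that an operation for `account` has just happened."""
        self._last[account] = self._clock()
```

A publisher could call `await asyncio.sleep(throttle.wait_time(account))` before each operation and `throttle.record(account)` right after it.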
## 9. Deployment and Usage

### 9.1 Requirements
```txt
Python >= 3.8
Playwright >= 1.40.0
Chrome/Chromium browser
```

### 9.2 Installation
```bash
# Clone the project
git clone https://github.com/your-repo/social_media_auto_publisher.git

# Install dependencies
pip install -r requirements.txt
playwright install chromium

# Set up the configuration
cp config/settings.example.py config/settings.py
```

### 9.3 Quick Start
```python
import asyncio

from social_media_auto_publisher import Publisher

async def main():
    # Initialize the publisher
    publisher = Publisher(headless=False)

    # Log in to the platform
    await publisher.setup_platform("xiaohongshu")

    # Publish content
    result = await publisher.publish_image_note(
        title="Test note",
        description="This is a test note",
        images=["test.jpg"],
        tags=["test", "automation"]
    )

    print(f"Publish result: {result}")

if __name__ == "__main__":
    asyncio.run(main())
```

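## 10. Development Plan

### 10.1 Phase 1 (2 weeks)
- [ ] Set up the base architecture
- [ ] Xiaohongshu login module
- [ ] Xiaohongshu image note publishing
- [ ] Basic anti-detection mechanisms

### 10.2 Phase 2 (2 weeks)
- [ ] Douyin login module
- [ ] Douyin video publishing
- [ ] Batch publishing
- [ ] Error handling improvements

### 10.3 Phase 3 (1 week)
- [ ] Task scheduling system
- [ ] Complete the configuration management
- [ ] Documentation and examples
- [ ] Test cases

### 10.4 Phase 4 (1 week)
- [ ] Performance optimization
- [ ] Security hardening
- [ ] Deployment scripts
- [ ] User manual
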
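## 11. Technical Challenges and Solutions

### 11.1 Platform Anti-Scraping Mechanisms
**Challenge**: Platforms update their anti-automation measures frequently.
**Solution**:
- Set up a rapid-response process so selectors and scripts are updated promptly
- Keep multiple fallback strategies to improve fault tolerance
- Stay in sync with the community and share the latest anti-detection techniques

### 11.2 Cross-Platform Compatibility
**Challenge**: Page structures and APIs differ between platforms.
**Solution**:
- Use the adapter pattern to isolate platform differences
- Abstract a unified operation interface
- Provide a flexible configuration system for platform-specific features

### 11.3 Stability and Reliability
**Challenge**: Failures caused by network fluctuations, platform changes and similar events.
**Solution**:
- Robust retry and error-recovery mechanisms
- Detailed logging and monitoring
- Graceful degradation
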
## 12. Extensibility

### 12.1 Adding New Platforms
Support for a new social platform can be added by implementing the `PlatformAdapter` interface:

```python
class BilibiliAdapter(PlatformAdapter):
    """Bilibili adapter."""
    async def login(self, account: AccountInfo) -> bool: ...
    async def publish_video(self, video: VideoContent) -> bool: ...
```

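### 12.2 Feature Extensions
- Intelligent content generation (AI copywriting, image processing)
- Data analysis and reporting
- Scheduled publishing and content planning
- Unified management of multiple accounts

### 12.3 API Extensions
- RESTful API service
- Web admin interface
- Mobile support
- Third-party integrations

---

This proposal builds on experience from the existing `xiaohongshu_note_publisher` project and designs a more complete, extensible multi-platform system for automated social media publishing. Its modular design and modern architecture meet current needs while remaining easy to extend and maintain.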