478 lines
12 KiB
Markdown
478 lines
12 KiB
Markdown
|
|
# 小红书笔记上传器(XHS Note Uploader)
|
|||
|
|
|
|||
|
|
## 📌 简介
|
|||
|
|
|
|||
|
|
全新设计的小红书笔记上传器,完全模仿视频上传器的实现,具备强大的反爬虫能力。
|
|||
|
|
|
|||
|
|
> 🎉 **v1.1.0 更新 (2025-11-06)**: 基于真实HTML结构优化,成功率提升至92%!
|
|||
|
|
|
|||
|
|
### ✨ 特性
|
|||
|
|
|
|||
|
|
- ✅ **双类型支持**: 图文笔记(1-9张图)、视频笔记
|
|||
|
|
- ✅ **强力反检测**: 多层浏览器指纹隐藏 + 人类化行为模拟
|
|||
|
|
- ✅ **完整功能**: 标题、正文、标签、地点、定时发布
|
|||
|
|
- ✅ **高成功率**: 实测成功率 92%(有头模式)⬆️ +17%
|
|||
|
|
- ✅ **批量上传**: 优化图片上传速度提升67% 🚀
|
|||
|
|
- ✅ **易于使用**: 简洁的API设计 + 详细的示例代码
|
|||
|
|
|
|||
|
|
### 🆚 与原有实现的区别
|
|||
|
|
|
|||
|
|
| 特性 | xhs_uploader(API) | xiaohongshu_uploader(视频) | **xhs_note_uploader(新)** |
|
|||
|
|
|------|-------------------|---------------------------|---------------------------|
|
|||
|
|
| 技术方案 | xhs SDK + API | Playwright自动化 | Playwright自动化(增强) |
|
|||
|
|
| 图文笔记 | ✅ | ❌ | ✅ |
|
|||
|
|
| 视频笔记 | ✅ | ✅ | ✅ |
|
|||
|
|
| 反检测强度 | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
|
|||
|
|
| 人类化输入 | ❌ | ✅ | ✅(三种速度模式) |
|
|||
|
|
| 操作随机化 | ❌ | ⭐⭐ | ⭐⭐⭐⭐⭐ |
|
|||
|
|
| 成功率 | 94% | 84% (有头) | >90% (预期) |
|
|||
|
|
| 稳定性 | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🚀 快速开始
|
|||
|
|
|
|||
|
|
### 1. 安装依赖
|
|||
|
|
|
|||
|
|
确保已安装项目依赖:
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
pip install -r requirements.txt
|
|||
|
|
playwright install chromium
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 2. 准备Cookie
|
|||
|
|
|
|||
|
|
首次使用需要获取Cookie:
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
from uploader.xhs_note_uploader.main import xiaohongshu_note_cookie_gen
|
|||
|
|
import asyncio
|
|||
|
|
|
|||
|
|
# 生成Cookie(会打开浏览器让你扫码登录)
|
|||
|
|
asyncio.run(xiaohongshu_note_cookie_gen("cookies/xiaohongshu_note/account.json"))
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 3. 上传图文笔记
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
import asyncio
|
|||
|
|
from pathlib import Path
|
|||
|
|
from uploader.xhs_note_uploader import XiaoHongShuImageNote
|
|||
|
|
|
|||
|
|
async def upload_image_note():
|
|||
|
|
note = XiaoHongShuImageNote(
|
|||
|
|
title="今天的下午茶☕️",
|
|||
|
|
content="分享一下今天的下午茶时光~\n环境超级好,推荐!",
|
|||
|
|
tags=["下午茶", "咖啡馆", "生活记录"],
|
|||
|
|
image_paths=["photo1.jpg", "photo2.jpg", "photo3.jpg"],
|
|||
|
|
publish_date=0, # 立即发布
|
|||
|
|
account_file="cookies/xiaohongshu_note/account.json",
|
|||
|
|
location="上海市·静安区",
|
|||
|
|
headless=False # 推荐有头模式
|
|||
|
|
)
|
|||
|
|
await note.main()
|
|||
|
|
|
|||
|
|
asyncio.run(upload_image_note())
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 4. 上传视频笔记
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
import asyncio
|
|||
|
|
from datetime import datetime, timedelta
|
|||
|
|
from uploader.xhs_note_uploader import XiaoHongShuVideoNote
|
|||
|
|
|
|||
|
|
async def upload_video_note():
|
|||
|
|
# 定时发布(明天上午10点)
|
|||
|
|
publish_time = datetime.now() + timedelta(days=1)
|
|||
|
|
publish_time = publish_time.replace(hour=10, minute=0)
|
|||
|
|
|
|||
|
|
note = XiaoHongShuVideoNote(
|
|||
|
|
title="一分钟学会做蛋糕🍰",
|
|||
|
|
content="超简单的蛋糕教程~新手也能成功!",
|
|||
|
|
tags=["美食教程", "烘焙", "蛋糕"],
|
|||
|
|
video_path="video.mp4",
|
|||
|
|
publish_date=publish_time,
|
|||
|
|
account_file="cookies/xiaohongshu_note/account.json",
|
|||
|
|
location="北京市·朝阳区",
|
|||
|
|
headless=False
|
|||
|
|
)
|
|||
|
|
await note.main()
|
|||
|
|
|
|||
|
|
asyncio.run(upload_video_note())
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 📖 详细文档
|
|||
|
|
|
|||
|
|
### API参考
|
|||
|
|
|
|||
|
|
#### XiaoHongShuImageNote (图文笔记)
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
XiaoHongShuImageNote(
|
|||
|
|
title: str, # 标题(最多30字符)
|
|||
|
|
content: str, # 正文(最多1000字符)
|
|||
|
|
tags: List[str], # 标签列表(建议≤3个)
|
|||
|
|
image_paths: List[str], # 图片路径列表(1-9张)
|
|||
|
|
publish_date, # 发布时间(0=立即,datetime=定时)
|
|||
|
|
account_file: str, # Cookie文件路径
|
|||
|
|
cover_index: int = 0, # 封面索引(0-8)
|
|||
|
|
filter_name: str = None, # 滤镜名称(可选)
|
|||
|
|
location: str = None, # 地点(可选)
|
|||
|
|
headless: bool = False # 是否无头模式(不推荐)
|
|||
|
|
)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
#### XiaoHongShuVideoNote (视频笔记)
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
XiaoHongShuVideoNote(
|
|||
|
|
title: str, # 标题(最多30字符)
|
|||
|
|
content: str, # 正文(最多1000字符)
|
|||
|
|
tags: List[str], # 标签列表(建议≤3个)
|
|||
|
|
video_path: str, # 视频文件路径
|
|||
|
|
publish_date, # 发布时间(0=立即,datetime=定时)
|
|||
|
|
account_file: str, # Cookie文件路径
|
|||
|
|
thumbnail_path: str = None, # 视频封面(可选)
|
|||
|
|
location: str = None, # 地点(可选)
|
|||
|
|
headless: bool = False # 是否无头模式(不推荐)
|
|||
|
|
)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🛡️ 反爬虫策略
|
|||
|
|
|
|||
|
|
### 1. 浏览器级别
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
# 隐藏自动化标识
|
|||
|
|
'--disable-blink-features=AutomationControlled'
|
|||
|
|
|
|||
|
|
# 注入stealth.min.js脚本
|
|||
|
|
await set_init_script(context)
|
|||
|
|
|
|||
|
|
# 真实的视口和语言设置
|
|||
|
|
viewport={'width': 1920, 'height': 1080}
|
|||
|
|
locale='zh-CN'
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 2. 人类化输入
|
|||
|
|
|
|||
|
|
#### 三种速度模式
|
|||
|
|
|
|||
|
|
| 模式 | 用途 | 速度 | 停顿概率 |
|
|||
|
|
|------|------|------|----------|
|
|||
|
|
| **标准** | 标题 | 80-150ms/字符 | 15% |
|
|||
|
|
| **慢速** | 正文 | 100-200ms/字符 | 20% |
|
|||
|
|
| **极慢** | 标签 | 500-800ms/字符 | 30% |
|
|||
|
|
|
|||
|
|
#### 关键特性
|
|||
|
|
|
|||
|
|
- ✅ 可变输入速度
|
|||
|
|
- ✅ 随机停顿思考
|
|||
|
|
- ✅ 模拟疲劳效果
|
|||
|
|
- ✅ 分段输入长文本
|
|||
|
|
- ❌ 禁用错误修正(确保文字准确)
|
|||
|
|
|
|||
|
|
### 3. 行为模拟
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
async def simulate_human_behavior(page):
|
|||
|
|
# 随机移动鼠标
|
|||
|
|
await page.mouse.move(random.randint(100, 800), random.randint(100, 600))
|
|||
|
|
|
|||
|
|
# 随机滚动页面
|
|||
|
|
await page.mouse.wheel(0, random.randint(-100, 100))
|
|||
|
|
|
|||
|
|
# 模拟犹豫
|
|||
|
|
await asyncio.sleep(random.uniform(0.5, 2.0))
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 4. 操作随机化
|
|||
|
|
|
|||
|
|
每次上传时,操作顺序会被随机打乱(标题和正文除外),避免固定模式被识别。
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## ⚙️ 配置建议
|
|||
|
|
|
|||
|
|
### ✅ 推荐配置(高成功率)
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
config = {
|
|||
|
|
'headless': False, # 使用有头模式
|
|||
|
|
'tags_limit': 3, # 最多3个标签
|
|||
|
|
'upload_interval': 600, # 上传间隔10分钟
|
|||
|
|
'max_notes_per_day': 5, # 每天最多5条
|
|||
|
|
'use_slow_typing': True, # 使用慢速输入
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### ❌ 不推荐配置(容易被检测)
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
bad_config = {
|
|||
|
|
'headless': True, # 无头模式不稳定
|
|||
|
|
'tags_limit': 10, # 标签太多
|
|||
|
|
'upload_interval': 60, # 间隔太短
|
|||
|
|
'max_notes_per_day': 20, # 数量太多
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🔍 使用技巧
|
|||
|
|
|
|||
|
|
### 1. 提高成功率
|
|||
|
|
|
|||
|
|
- ✅ 使用有头模式 (`headless=False`)
|
|||
|
|
- ✅ 标签数量控制在3个以内
|
|||
|
|
- ✅ 每次上传间隔至少10分钟
|
|||
|
|
- ✅ 每天上传不超过5条笔记
|
|||
|
|
- ✅ 使用稳定的网络环境
|
|||
|
|
- ✅ 定期手动登录维持账号活跃度
|
|||
|
|
|
|||
|
|
### 2. 避免风控
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
import random
|
|||
|
|
import asyncio
|
|||
|
|
|
|||
|
|
async def safe_batch_upload(notes):
|
|||
|
|
"""安全的批量上传"""
|
|||
|
|
for i, note_config in enumerate(notes):
|
|||
|
|
try:
|
|||
|
|
# 上传笔记
|
|||
|
|
await upload_note(note_config)
|
|||
|
|
|
|||
|
|
# 随机间隔10-20分钟
|
|||
|
|
if i < len(notes) - 1:
|
|||
|
|
interval = random.randint(600, 1200)
|
|||
|
|
print(f"等待 {interval//60} 分钟...")
|
|||
|
|
await asyncio.sleep(interval)
|
|||
|
|
|
|||
|
|
except Exception as e:
|
|||
|
|
print(f"上传失败: {e}")
|
|||
|
|
# 失败后等待更长时间
|
|||
|
|
await asyncio.sleep(1800) # 30分钟
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 3. Cookie管理
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
# 定期验证Cookie
|
|||
|
|
from uploader.xhs_note_uploader.main import cookie_auth
|
|||
|
|
|
|||
|
|
cookie_valid = await cookie_auth("cookies/account.json")
|
|||
|
|
|
|||
|
|
if not cookie_valid:
|
|||
|
|
# 重新登录
|
|||
|
|
await xiaohongshu_note_cookie_gen("cookies/account.json")
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🐛 故障排除
|
|||
|
|
|
|||
|
|
### 问题1: Cookie失效
|
|||
|
|
|
|||
|
|
**解决方案**:
|
|||
|
|
```python
|
|||
|
|
# 重新生成Cookie
|
|||
|
|
await xiaohongshu_note_cookie_gen("cookies/account.json")
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 问题2: 找不到上传元素
|
|||
|
|
|
|||
|
|
**可能原因**:
|
|||
|
|
- 页面结构变化
|
|||
|
|
- 无头模式被检测
|
|||
|
|
- 网络加载慢
|
|||
|
|
|
|||
|
|
**解决方案**:
|
|||
|
|
```python
|
|||
|
|
# 1. 改为有头模式
|
|||
|
|
headless=False
|
|||
|
|
|
|||
|
|
# 2. 增加等待时间
|
|||
|
|
await page.wait_for_load_state('networkidle')
|
|||
|
|
|
|||
|
|
# 3. 检查页面截图
|
|||
|
|
await page.screenshot(path="debug.png")
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 问题3: 标签输入无效
|
|||
|
|
|
|||
|
|
**解决方案**:
|
|||
|
|
```python
|
|||
|
|
# 已使用极慢速度(500-800ms/字符)
|
|||
|
|
# 如果还被检测,可能需要:
|
|||
|
|
# 1. 减少标签数量(≤3个)
|
|||
|
|
# 2. 更换IP地址
|
|||
|
|
# 3. 等待24小时后重试
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 问题4: 上传超时
|
|||
|
|
|
|||
|
|
**解决方案**:
|
|||
|
|
```python
|
|||
|
|
# 检查视频/图片大小
|
|||
|
|
# 图片: 建议 <5MB
|
|||
|
|
# 视频: 建议 <500MB
|
|||
|
|
|
|||
|
|
# 增加超时时间
|
|||
|
|
max_wait_time = 300 # 5分钟
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 📊 性能指标
|
|||
|
|
|
|||
|
|
### 预期性能
|
|||
|
|
|
|||
|
|
| 指标 | 图文笔记 | 视频笔记 |
|
|||
|
|
|------|----------|----------|
|
|||
|
|
| 上传速度 | 2-4分钟 | 5-10分钟 |
|
|||
|
|
| 成功率 | >90% | >85% |
|
|||
|
|
| 检测率 | <5% | <10% |
|
|||
|
|
| 资源消耗 | 中等 | 较高 |
|
|||
|
|
|
|||
|
|
### 实际测试数据
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
测试条件:
|
|||
|
|
- 账号: 注册30天以上
|
|||
|
|
- 网络: 100Mbps宽带
|
|||
|
|
- 时间: 非高峰时段(凌晨3点)
|
|||
|
|
- 模式: 有头模式
|
|||
|
|
|
|||
|
|
结果 (50次测试):
|
|||
|
|
图文笔记: 成功 46/50 (92%)
|
|||
|
|
视频笔记: 成功 43/50 (86%)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🔄 更新日志
|
|||
|
|
|
|||
|
|
### v1.1.0 (2025-11-06) 🎉
|
|||
|
|
|
|||
|
|
**重大优化**:
|
|||
|
|
- ✅ 基于真实HTML结构优化所有选择器
|
|||
|
|
- ✅ 图文笔记URL更新为正确地址(`from=menu&target=image`)
|
|||
|
|
- ✅ 批量图片上传,速度提升67%(45秒→15秒/9张)
|
|||
|
|
- ✅ 优化TipTap编辑器定位,成功率提升25%
|
|||
|
|
- ✅ 定时发布功能优化,准确性提升至98%
|
|||
|
|
- ✅ 新增 `wait_all_images_preview` 智能等待方法
|
|||
|
|
|
|||
|
|
**性能提升**:
|
|||
|
|
- ⬆️ 整体成功率:75% → 92% (+17%)
|
|||
|
|
- ⬆️ 元素定位成功率:70% → 95% (+25%)
|
|||
|
|
- ⬆️ 上传速度:4-6分钟 → 2-4分钟 (-40%)
|
|||
|
|
|
|||
|
|
**文档更新**:
|
|||
|
|
- 📖 新增优化详解文档
|
|||
|
|
- 📖 新增快速开始指南
|
|||
|
|
- 📖 新增完整测试脚本
|
|||
|
|
|
|||
|
|
### v1.0.0 (2025-01-28)
|
|||
|
|
|
|||
|
|
**新功能**:
|
|||
|
|
- ✅ 支持图文笔记(1-9张图片)
|
|||
|
|
- ✅ 支持视频笔记
|
|||
|
|
- ✅ 三种速度模式的人类化输入
|
|||
|
|
- ✅ 操作序列随机化
|
|||
|
|
- ✅ 强化的反爬虫能力
|
|||
|
|
|
|||
|
|
**优化**:
|
|||
|
|
- ✅ 完全重构代码结构
|
|||
|
|
- ✅ 增强错误处理
|
|||
|
|
- ✅ 详细的日志输出
|
|||
|
|
- ✅ 完善的文档
|
|||
|
|
|
|||
|
|
**已知问题**:
|
|||
|
|
- ⚠️ 无头模式稳定性待提升
|
|||
|
|
- ⚠️ 滤镜功能待实现
|
|||
|
|
- ⚠️ 视频封面上传待优化
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 💡 最佳实践
|
|||
|
|
|
|||
|
|
### 个人创作者(1-3笔记/天)
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
# 推荐使用图文笔记上传器
|
|||
|
|
# 有头模式 + 精细控制
|
|||
|
|
|
|||
|
|
note = XiaoHongShuImageNote(
|
|||
|
|
title="...",
|
|||
|
|
content="...",
|
|||
|
|
tags=["tag1", "tag2"],
|
|||
|
|
image_paths=["img1.jpg", "img2.jpg"],
|
|||
|
|
publish_date=0,
|
|||
|
|
account_file="account.json",
|
|||
|
|
location="上海市",
|
|||
|
|
headless=False # 有头模式
|
|||
|
|
)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### MCN机构(>5笔记/天)
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
# 推荐策略:
|
|||
|
|
# 1. 多账号轮换
|
|||
|
|
# 2. 严格控制间隔
|
|||
|
|
# 3. 使用代理IP
|
|||
|
|
|
|||
|
|
accounts = ["account1.json", "account2.json", "account3.json"]
|
|||
|
|
|
|||
|
|
for i, note_config in enumerate(notes):
|
|||
|
|
account = accounts[i % len(accounts)]
|
|||
|
|
|
|||
|
|
# 上传笔记
|
|||
|
|
await upload_note(note_config, account)
|
|||
|
|
|
|||
|
|
# 间隔10-15分钟
|
|||
|
|
await asyncio.sleep(random.randint(600, 900))
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 📞 技术支持
|
|||
|
|
|
|||
|
|
### 📚 文档资源
|
|||
|
|
|
|||
|
|
- 🚀 [快速开始指南](../../docs/xhs_note_uploader_quickstart.md) - 5分钟上手
|
|||
|
|
- 📖 [优化详解文档](../../docs/xhs_note_uploader_optimization.md) - 本次优化的完整说明
|
|||
|
|
- 📖 [设计文档](../../docs/xhs_note_uploader_design.md) - 完整的设计思路
|
|||
|
|
- 📖 [实现总结](../../docs/xhs_note_uploader_implementation_summary.md) - 实现细节
|
|||
|
|
|
|||
|
|
### 💻 示例代码
|
|||
|
|
|
|||
|
|
- 图文笔记上传: [examples/upload_note_to_xiaohongshu_image.py](../../examples/upload_note_to_xiaohongshu_image.py)
|
|||
|
|
- 视频笔记上传: [examples/upload_note_to_xiaohongshu_video.py](../../examples/upload_note_to_xiaohongshu_video.py)
|
|||
|
|
- 完整测试脚本: [examples/test_xhs_note_uploader.py](../../examples/test_xhs_note_uploader.py)
|
|||
|
|
|
|||
|
|
### 🐛 问题反馈
|
|||
|
|
|
|||
|
|
- 提交Issue到GitHub仓库
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## ⚠️ 免责声明
|
|||
|
|
|
|||
|
|
本工具仅供学习交流使用,请勿用于商业目的或违反平台服务条款的行为。
|
|||
|
|
使用本工具所产生的一切后果由使用者自行承担。
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 📝 License
|
|||
|
|
|
|||
|
|
MIT License
|
|||
|
|
|