# Project Cleanup Guide & Technical Debt Analysis

> Last updated: 2024-12-10

---

## 1. Obsolete Modules to Clean Up

### 1.1 Safe to Delete (Unreferenced)

| Directory/File | Size | Description | Recommendation |
|----------------|------|-------------|----------------|
| `infrastructure/` | 64 KB | Database infrastructure; the Python side no longer accesses the database directly | 🗑️ Delete |
| `core/xhs_spider/` | 224 KB | Legacy Xiaohongshu crawler, replaced by MediaCrawler | 🗑️ Delete |
| `document/` | 88 KB | Document-processing module, not referenced by the main flow | 🗑️ Delete |
| `core/document/` | 152 KB | Legacy document processing | 🗑️ Delete |
| `node_modules/` | 17 MB | Node.js dependencies, currently unused | 🗑️ Delete |

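Before deleting anything, it may be worth re-checking the "unreferenced" claim. The sketch below is one possible way to do that: it parses every `.py` file with the standard `ast` module and reports files outside a package that still import it. The function names and the `TARGETS` tuple are illustrative assumptions, not part of the project, and the check only sees static absolute imports (relative imports and dynamic `importlib` calls are not detected).

```python
import ast
from pathlib import Path

# Packages proposed for deletion in the table above, written as dotted import names.
TARGETS = ("infrastructure", "core.xhs_spider", "document", "core.document")


def imported_names(source: str) -> set[str]:
    """Return every absolute dotted name imported by a piece of Python source."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module)
            # `from core import xhs_spider` counts as importing core.xhs_spider
            names.update(f"{node.module}.{alias.name}" for alias in node.names)
    return names


def find_references(root: Path, targets=TARGETS) -> dict[str, list[str]]:
    """Map each target package to the files outside it that still import it."""
    refs: dict[str, list[str]] = {t: [] for t in targets}
    for path in sorted(root.rglob("*.py")):
        rel = path.relative_to(root).as_posix()
        try:
            names = imported_names(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # unparsable files are skipped, not treated as clean
        for target in targets:
            if rel.startswith(target.replace(".", "/") + "/"):
                continue  # a package importing itself does not count
            if any(n == target or n.startswith(target + ".") for n in names):
                refs[target].append(rel)
    return refs


if __name__ == "__main__":
    for package, files in find_references(Path("/root/TravelContentCreator")).items():
        print(package, "->", files or "no references found")
```

An empty list for a package is consistent with the table, but it is not proof: string-based imports and non-Python callers would need a separate text search.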
### 1.2 Disabled Routes Still Present

```python
# Routes already commented out in api/main.py:
# app.include_router(tweet.router, ...)  # depends on legacy modules
# app.include_router(poster.router, ...)  # depends on legacy modules
# app.include_router(document.router, ...)  # depends on legacy modules
# app.include_router(data.router, ...)  # depends on the database
# app.include_router(integration.router, ...)
# app.include_router(content_integration.router, ...)
```

The corresponding files are candidates for archiving:

- `api/routers/tweet.py` (17 KB)
- `api/routers/poster.py` (5.9 KB)
- `api/routers/document.py` (12 KB)
- `api/routers/data.py` (13 KB)
- `api/routers/integration.py` (14 KB)
- `api/routers/content_integration.py` (6.7 KB)

### 1.3 Oversized Legacy Service Files

| File | Lines | Description |
|------|-------|-------------|
| `api/services/poster.py` | **3031** | Legacy poster service, replaced by `domain/poster/` |
| `api/services/database_service.py` | 1054 | Database service, no longer needed on the Python side |
| `api/services/integration_service.py` | 795 | Legacy integration service |
| `api/services/tweet.py` | 756 | Legacy tweet service |

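The line counts above can be re-measured with a short script. This is a minimal sketch: `oversized_files` is a hypothetical helper, and the 700-line default threshold is an assumption chosen only because every file in the table exceeds it.

```python
from pathlib import Path


def oversized_files(root: Path, min_lines: int = 700) -> list[tuple[str, int]]:
    """Return (relative path, line count) for .py files with at least min_lines lines."""
    results = []
    for path in root.rglob("*.py"):
        # Count lines in binary mode so odd encodings cannot raise decode errors
        with path.open("rb") as fh:
            count = sum(1 for _ in fh)
        if count >= min_lines:
            results.append((path.relative_to(root).as_posix(), count))
    # Largest files first; ties are broken by path for a stable order
    return sorted(results, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for rel_path, lines in oversized_files(Path("/root/TravelContentCreator")):
        print(f"{lines:6d}  {rel_path}")
```

Running it periodically would also catch new files that grow past the threshold, such as the template file discussed later in this document.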
### 1.4 Temporary File Directories

| Directory | Size | Description |
|-----------|------|-------------|
| `result/` | **943 MB** | Cache of generated results |
| `assets/` | 215 MB | Asset files |
| `data/` | 8.1 MB | Data files |

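To track how these directories grow over time, the sizes can be recomputed from Python. The helpers below are an illustrative sketch (the names `dir_size_bytes` and `human` are assumptions). They sum apparent file sizes, so the result can differ slightly from what `du` reports as disk usage.

```python
import os
from pathlib import Path


def dir_size_bytes(directory: Path) -> int:
    """Sum the sizes of all regular files under directory (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(directory):
        for name in filenames:
            file_path = Path(dirpath) / name
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def human(n_bytes: int) -> str:
    """Format a byte count with binary units (1 MB = 1024 * 1024 bytes here)."""
    size = float(n_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            return f"{size:.1f} {unit}"
        size /= 1024


if __name__ == "__main__":
    project = Path("/root/TravelContentCreator")
    for name in ("result", "assets", "data"):
        print(f"{name}/: {human(dir_size_bytes(project / name))}")
```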

---

## 2. Current Technical Debt

### 2.1 High Priority 🔴

| Issue | Impact | Recommendation |
|-------|--------|----------------|
| **Oversized file** `api/services/poster.py` (3031 lines) | Hard to maintain | Superseded by the new service; can be deleted |
| **Oversized file** `poster/templates/vibrant_template.py` (78 KB) | High complexity | Split and refactor |
| **Leftover database services** | Cluttered codebase | Delete `infrastructure/` and `database_service.py` |
| **Accumulated temporary files** (943 MB) | Disk usage | Clean up `result/` on a regular schedule |

### 2.2 Medium Priority 🟡

| Issue | Impact | Recommendation |
|-------|--------|----------------|
| **Duplicate poster modules** | `api/services/poster.py` vs `domain/poster/` | Consolidate in the domain layer |
| **Leftover legacy crawler** `core/xhs_spider/` | Redundant code | Delete; use MediaCrawler |
| **Disabled router files** | 6 unused files | Archive or delete |
| **Too many documents** (22 MD files) | Hard to maintain | Merge and trim |

### 2.3 Low Priority 🟢

| Issue | Impact | Recommendation |
|-------|--------|----------------|
| Scattered configuration files (9 JSON files) | Complex to manage | Consolidate configuration |
| Inconsistent logging configuration | Hard to debug | Standardize the log format |
| Insufficient test coverage | Quality risk | Add tests |

---

## 3. Documentation Cleanup Recommendations

### 3.1 Core Documents to Keep

| Document | Description |
|----------|-------------|
| `PROJECT_OVERVIEW.md` | Project overview ⭐ |
| `PROJECT_STATUS.md` | Detailed status |
| `HOTSPOT_MODULE.md` | Hotspot module |
| `NEXT_PHASE_PLAN.md` | Next-phase plan |
| `TECHNICAL_DEBT.md` | Technical debt |

### 3.2 Documents That Can Be Archived

| Document | Size | Reason |
|----------|------|--------|
| `ARCHITECTURE_V2_MAINTAINABLE.md` | 48 KB | Design document, already implemented |
| `ARCHITECTURE_REDESIGN.md` | 44 KB | Design document, already implemented |
| `ARCHITECTURE_ANALYSIS.md` | 38 KB | Analysis document, outdated |
| `MIGRATION_PLAN.md` | 6 KB | Migration plan, completed |
| `MIGRATION_PROGRESS.md` | 6 KB | Migration progress log, completed |
| `PSD_API.md` | 1 B | Empty file |

---

## 4. Poster Module Analysis

### 4.1 Current Architecture

```
Where the poster-related code lives:

1. domain/poster/                 # ✅ New lightweight service (recommended)
   ├── poster_service.py          # 245 lines, lightweight service entry point
   ├── poster_renderer.py         # Renderer
   ├── template_manager.py        # Template management
   └── fabric_generator.py        # Fabric.js JSON output

2. poster/templates/              # Template implementations
   ├── base_template.py           # Base class
   ├── vibrant_template.py        # 1800+ lines, vibrant style ⚠️ too large
   ├── business_template.py       # Business style
   └── collage_template.py        # Collage style

3. api/services/poster.py         # ❌ Legacy service (3031 lines, deprecated)
```

### 4.2 Problem Analysis

| Problem | Details |
|---------|---------|
| **`vibrant_template.py` is too large** | 78 KB / 1800+ lines, with a large amount of hard-coded logic |
| **Duplicated code** | `api/services/poster.py` and `domain/poster/` overlap in functionality |
| **Template coupling** | Templates are tightly coupled to the rendering logic |

### 4.3 Refactoring Recommendations

See Option C in `docs/POSTER_REFACTOR_PLAN.md`:

1. **Split `vibrant_template.py`**
   - Layout calculation → `layout_calculator.py`
   - Text rendering → `text_renderer.py`
   - Effects processing → `effects.py`

2. **Remove the legacy service**
   - Delete `api/services/poster.py`
   - Use `domain/poster/poster_service.py` everywhere

3. **Make templates configuration-driven**
   - Extract hard-coded parameters into YAML configuration

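As a rough illustration of the configuration-driven step, the sketch below shows one way hard-coded template parameters could be collected into a typed object built from a parsed configuration mapping. Everything in it is hypothetical: the class name, the field names, and the default values are placeholders, not values taken from `vibrant_template.py`. The mapping would typically come from a YAML loader such as PyYAML's `yaml.safe_load`, which is assumed here and not imported so the example stays standard-library only.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class VibrantTemplateConfig:
    """Hypothetical template parameters; real names and defaults will differ."""

    canvas_width: int = 900
    canvas_height: int = 1200
    title_font_size: int = 72
    accent_color: str = "#FF5A5F"

    @classmethod
    def from_mapping(cls, data: Mapping[str, Any]) -> "VibrantTemplateConfig":
        """Build a config from a parsed mapping, rejecting unknown keys early."""
        known = {f.name for f in fields(cls)}
        unknown = set(data) - known
        if unknown:
            raise ValueError(f"unknown template options: {sorted(unknown)}")
        # Keys missing from the mapping fall back to the dataclass defaults
        return cls(**data)


if __name__ == "__main__":
    # Stand-in for the dict a YAML loader would return
    overrides = {"title_font_size": 64, "accent_color": "#00A699"}
    print(VibrantTemplateConfig.from_mapping(overrides))
```

Rejecting unknown keys makes typos in the configuration file fail loudly instead of being silently ignored. The sketch does not validate value types, which a real implementation would probably want.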

---

## 5. Cleanup Execution Plan

### Phase 1: Safe Cleanup (No Risk)

```bash
# 1. Remove the unused Node.js dependencies and the empty doc file
rm -rf /root/TravelContentCreator/node_modules
rm /root/TravelContentCreator/docs/PSD_API.md

# 2. Clean up temporary files (keep roughly the last 7 days)
#    -type f limits deletion to files, so non-empty directories do not raise errors
find /root/TravelContentCreator/result -type f -mtime +7 -delete
```

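Because deletion under `result/` cannot be undone, a dry run can be useful before running the real cleanup. The following is a minimal sketch in Python, assuming a hypothetical helper named `prune_old_files`. It is not an exact equivalent of `find -mtime +7`, which rounds file age down to whole days, so files close to the cutoff may be treated differently.

```python
import time
from pathlib import Path


def prune_old_files(directory: Path, max_age_days: float = 7, dry_run: bool = True) -> list[Path]:
    """Return files older than max_age_days; delete them only when dry_run is False."""
    cutoff = time.time() - max_age_days * 86400
    stale = sorted(
        p for p in directory.rglob("*")
        if p.is_file() and not p.is_symlink() and p.stat().st_mtime < cutoff
    )
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale


if __name__ == "__main__":
    # Dry run by default: only lists what would be removed
    for candidate in prune_old_files(Path("/root/TravelContentCreator/result")):
        print("would delete:", candidate)
```

Passing `dry_run=False` performs the deletion. Directories are left in place, matching the files-only behavior of the shell command.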
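### Phase 2: Archive Legacy Modules

```bash
# Run from the project root so the relative paths below resolve
cd /root/TravelContentCreator

# Create the archive directory (with a core/ subdirectory to avoid name clashes)
mkdir -p _archived/core

# Move the legacy modules, keeping their original relative paths;
# moving both document/ and core/document/ straight into _archived/ would collide
mv infrastructure/ _archived/
mv document/ _archived/
mv core/xhs_spider/ _archived/core/
mv core/document/ _archived/core/
```
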
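### Phase 3: Clean Up Legacy Services

```bash
# Run from the project root; mirror the source layout inside _archived/
# so that services/tweet.py and routers/tweet.py do not overwrite each other
cd /root/TravelContentCreator
mkdir -p _archived/api/services _archived/api/routers

# Archive the legacy service files
mv api/services/poster.py _archived/api/services/
mv api/services/database_service.py _archived/api/services/
mv api/services/integration_service.py _archived/api/services/
mv api/services/tweet.py _archived/api/services/

# Archive the disabled routers (all six files listed in section 1.2)
mv api/routers/tweet.py _archived/api/routers/
mv api/routers/poster.py _archived/api/routers/
mv api/routers/document.py _archived/api/routers/
mv api/routers/data.py _archived/api/routers/
mv api/routers/integration.py _archived/api/routers/
mv api/routers/content_integration.py _archived/api/routers/
```
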
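The archive steps involve files that share a name (for example a `tweet.py` under both `api/services/` and `api/routers/`), so a flat archive directory risks one file overwriting another. One way to guard against that is to preserve each file's relative path and refuse to overwrite. The helper below is a sketch under those assumptions; the name `archive` and its signature are illustrative only.

```python
import shutil
from pathlib import Path


def archive(root: Path, relative_paths: list[str], archive_dir: str = "_archived") -> list[Path]:
    """Move each path under root into root/archive_dir, preserving its relative path."""
    moved = []
    for rel in relative_paths:
        src = root / rel
        dst = root / archive_dir / rel
        if not src.exists():
            continue  # already archived or never present
        if dst.exists():
            raise FileExistsError(f"refusing to overwrite {dst}")
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        moved.append(dst)
    return moved


if __name__ == "__main__":
    archive(
        Path("/root/TravelContentCreator"),
        ["api/services/tweet.py", "api/routers/tweet.py"],
    )
```

Because missing sources are skipped, the helper can be re-run safely after a partial run.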

---

## 6. Directory Structure After Cleanup

```
TravelContentCreator/
├── api/                     # API layer
│   ├── main.py
│   └── routers/
│       ├── aigc.py          # V2 AIGC endpoints
│       ├── hotspot.py       # Hotspot data
│       ├── prompt.py        # Prompt management
│       └── reference.py     # References
│
├── domain/                  # Domain layer
│   ├── aigc/                # AIGC engine
│   ├── hotspot/             # Hotspot data
│   ├── poster/              # Poster service
│   └── prompt/              # Prompt management
│
├── poster/                  # Poster templates
│   └── templates/
│
├── prompts/                 # Prompt definitions
├── config/                  # Configuration
├── libs/                    # External libraries
│   └── MediaCrawler/
│
├── docs/                    # Documentation (trimmed)
├── tests/                   # Tests
└── _archived/               # Archive (safe to delete)
```

---

## 7. Estimated Disk Space Savings

| Action | Estimated space freed |
|--------|-----------------------|
| Delete `node_modules/` | 17 MB |
| Clean up `result/` (files older than 7 days) | ~800 MB |
| Archive legacy modules (freed once `_archived/` is deleted) | ~500 KB |
| **Total** | **~820 MB** |