jinye_huang/TravelContentCreator


jinye_huang 67722a5c72 Changed how data is passed between components; it is now better suited to being wrapped and called as an API

2025-04-22 18:14:31 +08:00


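Travel Content Creator

This is an AI-based tool for generating travel content automatically: given information about a scenic spot, it produces travel posts and promotional posters.

Features

Automatic topic generation: generates travel topics from the supplied scenic-spot information and the configured prompt templates (output as JSON).
Content creation: generates the text content (title and body) from a topic and the configured prompt templates.
Poster production: combines scenic-spot images with the generated text to create promotional posters.
Batch processing: can generate multiple topics, and multiple content variants per topic, in a single run.
Modular design: the core functions (configuration loading, prompt management, AI interaction, topic selection, content generation, poster production) are separated for easier maintenance and extension.
Configuration-driven: all runtime parameters are managed centrally in poster_gen_config.json.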

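Installation

Requirements

Python 3.6+
Install the dependencies:

# Install the dependencies
pip install numpy pandas opencv-python pillow openai
# You may also need a client library for your AI model, such as requests or a model-specific SDK
# pip install requests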

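Project Dependencies

OpenCV (cv2): image processing
NumPy: data operations
Pandas: data processing
PIL (Pillow): image processing and drawing
OpenAI: AI model interaction (its interface is typically used even when calling a local model or another API)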

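Core Components and Directory Structure

main.py: Project entry point and coordinator. Loads the configuration, generates the run ID (run_id), initializes shared resources (such as the AI Agent), and calls the functions in the utils modules in order to run the main flow (topic generation -> content and poster generation coordination).
poster_gen_config.json: (must be created by the user from example_config.json) The core configuration file.
core/: Core algorithm and feature modules
- ai_agent.py: AI agent. Encapsulates the low-level interaction with the large language model API (sending requests, receiving responses).
- topic_parser.py: Topic parser. Parses the topic JSON data returned by the AI model.
- contentGen.py: Content processor. Structures the raw AI-generated post content and extracts the elements suitable for posters.
- posterGen.py: Poster generator. Combines images and text elements into the final poster image, handling fonts, layout, and so on.
- simple_collage.py: Image collage tool. Provides image preprocessing and collage functions.
utils/: Utility and helper modules
- resource_loader.py: Resource loader. Loads the raw resource files the project needs.
- prompt_manager.py: Prompt manager. Centralizes the prompt-building logic for the different stages (the content-generation prompt construction has been fixed to correctly distinguish file names from descriptive text in the topic JSON).
- tweet_generator.py: Main flow execution and data structures. Contains the core functions for topic generation (run_topic_generation_pipeline), per-topic content generation (generate_content_for_topic), and per-topic poster generation (generate_posters_for_topic), plus the related data classes (the content file save path has been fixed).
genPrompts/: Directory of content-generation prompt templates.
SelectPrompt/: Directory of topic-generation prompt templates (its systemPrompt.txt has been updated to require JSON output).
resource/: Holds data resources such as the basic scenic-spot information .txt files.
examples/: Usage examples and test scripts.
result/: Default directory for saved output.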

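Project Flow and Technical Details

Load the configuration and generate the run ID: main.py reads poster_gen_config.json and generates the run_id for this run.
Topic generation:
- main.py calls utils.tweet_generator.run_topic_generation_pipeline(config, run_id), passing the configuration and run_id.
- Inside this function:
  - Calls utils.prompt_manager.PromptManager.get_topic_prompts() to build the prompts (the system prompt requires the AI to output JSON).
  - Initializes core.ai_agent.AI_Agent.
  - Calls the AI Agent to generate the topic text (expected to be a JSON string).
  - Parses the JSON string with core.topic_parser.parse_topics().
  - Saves the topic results (tweet_topic.json) to the result/<run_id>/ directory.
  - Closes the AI Agent used for this stage.
- Returns run_id and a tweet_topic_record containing the topic list to main.py.
Content and poster generation coordination: (runs in main.generate_content_and_posters_step)
- Initializes utils.prompt_manager.PromptManager.
- Initializes one shared core.ai_agent.AI_Agent instance that all subsequent content generation uses.
- Iterates over each topic (topic_item) returned by the previous step:
  - Content generation: calls utils.tweet_generator.generate_content_for_topic(), passing the shared AI Agent, the Prompt Manager, the configuration, the current topic information, run_id, and topic_index.
    - This function loops to generate all content variants for the topic. On each iteration it calls prompt_manager.get_content_prompts() to get the specific prompts (built with the fixed logic, which handles the fields in topic_item correctly) and runs generate_single_content with the shared AI Agent.
    - generate_single_content saves the generated article (article.json) to the result/<run_id>/<topic_index>_<variant_index>/ directory.
  - Poster generation: if content generation succeeded, calls utils.tweet_generator.generate_posters_for_topic(), passing the configuration, the topic information, the list of generated content, run_id, and topic_index.
    - This function initializes ContentGenerator and PosterGenerator, calls the functions in the core modules to process the images and text, and writes all poster variants to the corresponding result/<run_id>/<topic_index>_<variant_index>/ directory.
- After all topics have been processed, the shared AI Agent is closed in main.generate_content_and_posters_step.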
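
The control flow described above can be sketched without any AI calls. This is only an illustration under stated assumptions: make_run_id, variant_dir, and run_pipeline are hypothetical names, the run_id format and the 1-based indices are guesses, and the two callables stand in for generate_content_for_topic and generate_posters_for_topic.

```python
import uuid
from datetime import datetime


def make_run_id():
    # Hypothetical run_id format: a timestamp plus a short random suffix.
    return datetime.now().strftime("%Y%m%d_%H%M%S") + "_" + uuid.uuid4().hex[:6]


def variant_dir(output_dir, run_id, topic_index, variant_index):
    # Mirrors the result/<run_id>/<topic_index>_<variant_index>/ layout.
    return f"{output_dir}/{run_id}/{topic_index}_{variant_index}"


def run_pipeline(topics, variants, generate_content, generate_posters,
                 output_dir="result"):
    """Walk every topic, generate its content variants, then its posters.

    generate_content(topic_item, variants) returns a list of contents;
    generate_posters(topic_item, contents) consumes that list.
    """
    run_id = make_run_id()
    produced = []
    # Assumption: topic and variant indices start at 1.
    for topic_index, topic_item in enumerate(topics, start=1):
        contents = generate_content(topic_item, variants)
        if not contents:
            # Posters are only attempted when content generation succeeded.
            continue
        generate_posters(topic_item, contents)
        for variant_index in range(1, len(contents) + 1):
            produced.append(
                variant_dir(output_dir, run_id, topic_index, variant_index)
            )
    return run_id, produced
```

Passing the generation steps in as callables keeps the loop testable with fakes, which is in the same spirit as sharing one AI Agent across all topics instead of creating one per call.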

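Usage

Prepare the scenic-spot resource information (.txt files) and place it in the resource/Object/ directory (or another path specified in the configuration). Make sure the file names match the object names the AI may choose during topic generation (matching currently uses a containment check).
Prepare the image resources, organize the image directory structure as described in the "Getting Started" section of README.md, and make sure image_base_dir in poster_gen_config.json points to the correct image root directory.
Copy example_config.json to poster_gen_config.json and edit it for your API key, model endpoint, file paths, image root directory, and so on. Make sure the Style/ and Demand/ directories under prompts_dir contain the file names (with the .txt suffix) that the topic-generation AI may output. Make sure the Refer/ directory contains all required reference files.

Run the full flow: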

python main.py --config poster_gen_config.json

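Run in stages: (see the scripts in the examples/ directory; note that run_id must be passed)

# Stage 1: generate topics only
# examples/run_step1_topics.py must be modified to accept or generate a run_id and pass it on
# python examples/run_step1_topics.py

# Stage 2: process the generated topics
# examples/run_step2_content_posters.py must be modified to accept a run_id argument
# python examples/run_step2_content_posters.py <your_run_id>

(Note: the example scripts may need updating to accommodate the change in how run_id is passed.)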

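Configuration Reference

Configuration file: poster_gen_config.json

Required settings:

api_url: Large language model API URL (or a preset name such as 'vllm', 'ali', 'kimi', 'doubao', 'deepseek')
api_key: API key
model: Name of the model to use
topic_system_prompt: Path to the topic-generation system prompt file (should be the version that requires JSON output)
topic_user_prompt: Path to the base user prompt file for topic generation
content_system_prompt: Path to the content-generation system prompt file
resource_dir: A list describing the resource files. Each element of the list is a dictionary containing:
- type: Resource type. Currently supported: "Object" (scenic spot/object information), "Description" (the corresponding description files), "Product" (associated product information).
- file_path: A list of the full paths of all resource files of this type.
  - For the "Object" type, the program looks up the matching file in this list by the object name in the topic.
  - For the "Description" type, the program looks up the corresponding description file in this list by the object name in the topic (the file name should contain the object name so that it can be matched).
  - For the "Product" type, the program looks up the matching file in this list by the product name in the topic.
- num: (optional, currently appears to be unused) Number of files.
prompts_dir: Path to the directory holding prompt fragments such as Demand/Style/Refer
output_dir: Path to the directory where output is saved
image_base_dir: Absolute or relative path to the image resource root directory (used to locate source images)
poster_assets_base_dir: Absolute or relative path to the poster assets root directory (used to locate fonts, borders, stickers, text backgrounds, and so on)
num: (topic stage) Number of topics to generate
variants: (content-generation stage) Number of variants to generate per topic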
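
The name-based lookup in resource_dir can be illustrated with a small sketch. It assumes that "matching" means the file name (without extension) contains the object or product name; find_resource_files is a hypothetical helper, not the project's actual loader, and the paths below are made up.

```python
import os


def find_resource_files(resource_dir, resource_type, name):
    """Return the paths of the given type whose file name contains `name`."""
    matches = []
    for entry in resource_dir:
        if entry.get("type") != resource_type:
            continue
        for path in entry.get("file_path", []):
            # Compare against the file name without directory or extension.
            stem = os.path.splitext(os.path.basename(path))[0]
            if name in stem:
                matches.append(path)
    return matches


# Illustrative resource_dir value in the documented shape.
resource_dir = [
    {"type": "Object",
     "file_path": ["resource/Object/West Lake.txt",
                   "resource/Object/Great Wall.txt"]},
    {"type": "Description",
     "file_path": ["resource/Description/Great Wall description.txt"]},
]

object_files = find_resource_files(resource_dir, "Object", "Great Wall")
description_files = find_resource_files(resource_dir, "Description", "Great Wall")
```

A containment check is forgiving about suffixes such as "description", but it can also match more than one file when one name is a substring of another, so distinct file names are safer.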

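Optional settings:

date: Date marker (used in the topic-generation prompt; defaults to empty)
topic_temperature, topic_top_p, topic_presence_penalty: API parameters for topic generation (defaults: 0.2, 0.5, 1.5)
content_temperature, content_top_p, content_presence_penalty: API parameters for content generation (defaults: 0.3, 0.4, 1.5)
request_timeout: Timeout for AI API requests (in seconds; default 30)
max_retries: Maximum number of retries on a request timeout or a retryable network error (default 3)
camera_image_subdir: Name of the subdirectory holding the original photos (relative to image_base_dir; default "相机", meaning "camera"). Note: this setting is no longer used to locate description files.
modify_image_subdir: Name of the subdirectory holding processed images and images used for collages (relative to image_base_dir; default "modify")
output_collage_subdir: Name of the subdirectory, inside each variant's output directory, that holds the collage images (default "collage_img")
output_poster_subdir: Name of the subdirectory, inside each variant's output directory, that holds the final poster (default "poster")
output_poster_filename: File name of the final output poster (default "poster.jpg")
poster_target_size: Target poster size [width, height] (default [900, 1200])
text_possibility: Probability that the second, additional text block appears on the poster (default 0.3)

The project provides an example configuration file, example_config.json; be sure to copy it and edit the copy.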
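
As a rough illustration of how the required and optional settings fit together, the sketch below checks the required keys and fills in the documented defaults. apply_defaults and load_config are hypothetical helpers; the key names and default values are taken from the lists above, but the project's own loading code may behave differently.

```python
import copy
import json

REQUIRED_KEYS = [
    "api_url", "api_key", "model",
    "topic_system_prompt", "topic_user_prompt", "content_system_prompt",
    "resource_dir", "prompts_dir", "output_dir",
    "image_base_dir", "poster_assets_base_dir", "num", "variants",
]

OPTIONAL_DEFAULTS = {
    "date": "",
    "topic_temperature": 0.2, "topic_top_p": 0.5, "topic_presence_penalty": 1.5,
    "content_temperature": 0.3, "content_top_p": 0.4,
    "content_presence_penalty": 1.5,
    "request_timeout": 30, "max_retries": 3,
    "camera_image_subdir": "相机", "modify_image_subdir": "modify",
    "output_collage_subdir": "collage_img", "output_poster_subdir": "poster",
    "output_poster_filename": "poster.jpg",
    "poster_target_size": [900, 1200],
    "text_possibility": 0.3,
}


def apply_defaults(raw):
    """Reject configs with missing required keys, then fill in defaults."""
    missing = [key for key in REQUIRED_KEYS if key not in raw]
    if missing:
        raise KeyError("missing required settings: " + ", ".join(missing))
    # Deep copy so the list default is never shared between configs.
    config = copy.deepcopy(OPTIONAL_DEFAULTS)
    config.update(raw)
    return config


def load_config(path):
    """Read a JSON config file and apply the documented defaults."""
    with open(path, "r", encoding="utf-8") as f:
        return apply_defaults(json.load(f))
```

Failing fast on a missing required key gives a clearer error than a KeyError raised deep inside the generation flow.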

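Notes

- Make sure all dependencies are installed, especially the openai library.
- Topic generation depends on the AI model strictly outputting valid JSON. If the AI output is malformed, topic parsing fails.
- Content generation depends on the style and target_audience file names that the AI provides in the topic JSON exactly matching the actual file names (including .txt) in the Style/ and Demand/ directories under prompts_dir. Check these directories and file names.
- The image directory structure and naming must strictly follow the expected layout so that the program can find the images for each scenic spot.
- The quality of the AI-generated content depends largely on the prompt design and on the quality of the input resource information.
- Double-check the API key, URL, and file path settings.
- If you run into problems, check the log output and error messages printed by the program.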
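
The two JSON-related notes above can be checked before content generation starts. The sketch below is not the project's topic_parser; it assumes the model returns a JSON array of topic objects and that style maps to Style/ and target_audience maps to Demand/. Both function names are hypothetical.

```python
import json
import os


def parse_topics_strict(raw_text):
    """Parse the model output as JSON and require a list of objects."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"AI output is not valid JSON: {exc}") from exc
    if not isinstance(data, list) or not all(isinstance(t, dict) for t in data):
        raise ValueError("expected a JSON array of topic objects")
    return data


def missing_prompt_files(topics, prompts_dir):
    """List Style/ and Demand/ files that topics reference but disk lacks."""
    missing = []
    for topic in topics:
        # Assumed mapping: style -> Style/, target_audience -> Demand/.
        for field, subdir in (("style", "Style"), ("target_audience", "Demand")):
            filename = topic.get(field)
            if not filename:
                continue
            path = os.path.join(prompts_dir, subdir, filename)
            if not os.path.isfile(path):
                missing.append(path)
    return missing
```

Running missing_prompt_files right after topic generation surfaces a mismatched file name once, up front, instead of partway through a batch of variants.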

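API Integration / Streaming Usage

The AI_Agent class provides a work_stream method for receiving AI-generated content as a stream. The method returns a Python generator, which you can iterate to receive the generated text chunk by chunk.

Processing stream chunks while also getting the full result:

If you need to process (or forward) each text chunk in real time and also want the complete concatenated result after the stream ends, concatenate the chunks as you iterate the generator.

Usage example: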

import os
import sys
import json

# Assumes the Python path is set up correctly
from core.ai_agent import AI_Agent

# 1. Load the configuration (similar to main.py)
config_path = "poster_gen_config.json"
config = {}
try:
    with open(config_path, 'r', encoding='utf-8') as f:
        config = json.load(f)
except Exception as e:
    print(f"Error loading config: {e}")
    sys.exit(1)

# 2. Initialize the AI Agent (reading the timeout/retry settings)
ai_agent = None
try:
    request_timeout = config.get("request_timeout", 30)
    max_retries = config.get("max_retries", 3)
    ai_agent = AI_Agent(
        config["api_url"],
        config["model"],
        config["api_key"],
        timeout=request_timeout,
        max_retries=max_retries
    )

    # 3. Define the prompts and parameters
    system_prompt = "You are a travel writer."
    user_prompt = "Describe the Great Wall of China in about 50 words."
    temperature = config.get("content_temperature", 0.7)
    top_p = config.get("content_top_p", 0.9)
    presence_penalty = config.get("content_presence_penalty", 1.0)
    file_folder = None  # Optional directory of reference files

    # 4. Call work_stream to get the generator
    stream_generator = ai_agent.work_stream(
        system_prompt,
        user_prompt,
        file_folder,
        temperature,
        top_p,
        presence_penalty
    )

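    # 5. Iterate the generator, handling each chunk and building the full result
    print("Streaming response:")
    full_response = ""  # Start with an empty string for concatenation
    for chunk in stream_generator:
        # Handle the chunk as it arrives, e.g. print it or send it to a client
        print(chunk, end="", flush=True)
        # Append it to the full result
        full_response += chunk

    print("\n--- Stream finished ---")

    # 6. Use the concatenated full result
    print("\n--- Reconstructed Full Response ---")
    print(full_response)
    # You can process full_response further here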

except Exception as e:
    print(f"An error occurred: {e}")
finally:
    if ai_agent:
        ai_agent.close()

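This pattern lets you use the streamed output flexibly while still having access to the final complete text when you need it. See examples/test_stream.py for a runnable example (it also includes the concatenation logic, with the final print commented out by default).

Note: If you only need the final complete result and no streaming, call ai_agent.work(...) directly; it handles the concatenation internally and returns the result string.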
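
The consume-and-concatenate pattern can be tried without an API key by swapping in a stand-in generator. In this sketch fake_stream plays the role of ai_agent.work_stream(...), and consume_stream is a hypothetical helper rather than part of AI_Agent.

```python
def fake_stream():
    # Stand-in for ai_agent.work_stream(...): yields text chunks one by one.
    yield from ["The Great ", "Wall ", "is long."]


def consume_stream(chunks, on_chunk=None):
    """Forward each chunk to on_chunk (if given) and return the joined text."""
    parts = []
    for chunk in chunks:
        if on_chunk is not None:
            on_chunk(chunk)  # e.g. print it or push it to a client
        parts.append(chunk)
    # Joining once at the end avoids repeated string concatenation.
    return "".join(parts)


seen = []
full_text = consume_stream(fake_stream(), on_chunk=seen.append)
```

Because a generator can only be iterated once, the full text has to be collected during that single pass, which is why the helper does both jobs in one loop.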

Refactoring Complete: Decoupling Generation and Output Handling

To enhance the flexibility and extensibility of this tool, a refactoring effort has been completed to separate the core content/image generation logic from the output handling (previously, saving results directly to the local filesystem).

Motivation:

API Integration: Allow generated results (topics, text content, image URLs/data) to be easily returned via an API endpoint instead of only being saved locally.
Alternative Storage: Enable saving results to different backends like databases, cloud storage (e.g., S3, OSS), etc.
Modularity: Improve code structure by separating concerns.

Approach Taken:

Modified Core Functions: Functions responsible for generating topics, content, and posters (primarily in utils/tweet_generator.py, core/simple_collage.py, core/posterGen.py) have been updated:
- Topic generation now returns the generated data (run_id, topics_list, prompts).
- Image generation functions (simple_collage.process_directory, posterGen.create_poster) now return PIL Image objects instead of saving files.
- Content and poster generation workflows accept an OutputHandler instance to process results (content JSON, prompts, configurations, image data) immediately after generation.
Introduced Output Handlers: An "Output Handler" pattern has been implemented (utils/output_handler.py).
- An abstract base class (OutputHandler) defines methods for processing different types of results.
- A concrete implementation (FileSystemOutputHandler) replicates the original behavior of saving all results to the ./result/{run_id}/... directory structure.
Updated Main Workflow: The main script (main.py) now:
- Instantiates a specific OutputHandler (currently FileSystemOutputHandler).
- Calls the generation functions, passing the OutputHandler where needed.
- Uses the OutputHandler to process data returned by the topic generation step.
Reduced Config Dependency: Core logic functions (PromptManager, generate_content_for_topic, generate_posters_for_topic, etc.) now receive necessary configuration values as specific parameters rather than relying on the entire config dictionary, making them more independent and testable.
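
A minimal sketch of the Output Handler pattern described above follows. The class names match the text, but the method names (handle_topic_results, handle_content_variant) and their signatures are assumptions made for illustration; the real interface is defined in utils/output_handler.py and may differ.

```python
import json
import os
from abc import ABC, abstractmethod


class OutputHandler(ABC):
    """Receives results as soon as they are generated."""

    @abstractmethod
    def handle_topic_results(self, run_id, topics_list):
        """Process the topic list produced by the topic generation step."""

    @abstractmethod
    def handle_content_variant(self, run_id, topic_index, variant_index, content):
        """Process one generated content variant (a JSON-serializable dict)."""


class FileSystemOutputHandler(OutputHandler):
    """Writes results under <base_dir>/<run_id>/..., like the original code."""

    def __init__(self, base_dir="result"):
        self.base_dir = base_dir

    def _write_json(self, directory, filename, data):
        os.makedirs(directory, exist_ok=True)
        path = os.path.join(directory, filename)
        with open(path, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
        return path

    def handle_topic_results(self, run_id, topics_list):
        run_dir = os.path.join(self.base_dir, run_id)
        return self._write_json(run_dir, "tweet_topic.json", topics_list)

    def handle_content_variant(self, run_id, topic_index, variant_index, content):
        variant_dir = os.path.join(
            self.base_dir, run_id, f"{topic_index}_{variant_index}"
        )
        return self._write_json(variant_dir, "article.json", content)
```

With this shape, the generation functions only ever call the abstract methods, so swapping the storage backend means passing a different handler instance rather than editing the generation code.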

Future Possibilities:

This refactoring makes it straightforward to add new output handlers in the future, such as:

ApiOutputHandler: Formats results for API responses.
DatabaseOutputHandler: Stores results in a database.
CloudStorageOutputHandler: Uploads results (especially images) to cloud storage and potentially stores metadata elsewhere.
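
To make the first of these possibilities concrete, here is a hedged sketch of an in-memory handler that builds a JSON response. The method names and the response shape are hypothetical, and in the project this class would presumably subclass OutputHandler; it is written standalone here so that it runs on its own. Image data is taken as raw bytes (for example an encoded PNG) and returned as base64.

```python
import base64
import json


class ApiOutputHandler:
    """Collects results in memory instead of writing them to disk."""

    def __init__(self):
        self.response = {"run_id": None, "topics": [], "variants": {}}

    def handle_topic_results(self, run_id, topics_list):
        self.response["run_id"] = run_id
        self.response["topics"] = topics_list

    def handle_generated_image(self, topic_index, variant_index, image_bytes):
        # Keyed like the <topic_index>_<variant_index> directory names.
        key = f"{topic_index}_{variant_index}"
        encoded = base64.b64encode(image_bytes).decode("ascii")
        self.response["variants"].setdefault(key, {})["poster_base64"] = encoded

    def to_json(self):
        """Serialize everything collected so far as a JSON string."""
        return json.dumps(self.response, ensure_ascii=False)


handler = ApiOutputHandler()
handler.handle_topic_results("run1", [{"object": "Great Wall"}])
handler.handle_generated_image(1, 1, b"abc")
payload = json.loads(handler.to_json())
```

Base64 keeps the response a single JSON document; for large posters, returning a URL from a storage-backed handler would likely be the better trade-off.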

Configuration File (Configuration)

The main configuration file is poster_gen_config.json (you can copy example_config.json and modify it). It mainly contains the following parts:
