Phase 7 Task 7: Plugin and integration system

- Create the plugin_manager.py module
  - PluginManager: main plugin management class
  - ChromeExtensionHandler: Chrome extension handling
  - BotHandler: Feishu/DingTalk/Slack bot handling
  - WebhookIntegration: Zapier/Make webhook integration
  - WebDAVSync: WebDAV sync management
- Create the complete Chrome extension code
  - manifest.json, background.js, content.js, content.css
  - popup.html/js: popup window UI
  - options.html/js: settings page
  - Supports web page clipping, saving selected text, and project selection
- Update schema.sql with plugin-related database tables
  - plugins: plugin configuration table
  - bot_sessions: bot session table
  - webhook_endpoints: webhook endpoint table
  - webdav_syncs: WebDAV sync configuration table
  - plugin_activity_logs: plugin activity log table
- Update main.py with plugin-related API endpoints
  - GET/POST /api/v1/plugins - plugin management
  - POST /api/v1/plugins/chrome/clip - save a web page from the Chrome extension
  - POST /api/v1/bots/webhook/{platform} - receive bot messages
  - GET /api/v1/bots/sessions - list bot sessions
  - POST /api/v1/webhook-endpoints - create a webhook endpoint
  - POST /webhook/{type}/{token} - receive external webhooks
  - POST /api/v1/webdav-syncs - WebDAV sync configuration
  - POST /api/v1/webdav-syncs/{id}/test - test the WebDAV connection
  - POST /api/v1/webdav-syncs/{id}/sync - trigger a WebDAV sync
- Update requirements.txt with plugin dependencies
  - beautifulsoup4: HTML parsing
  - webdavclient3: WebDAV client
- Update development progress in STATUS.md and README.md
2026-02-23 12:09:15 +08:00
parent 08535e54ba
commit 797ca58e8e
27 changed files with 7350 additions and 11 deletions
--- a/backend/docs/multimodal_api.md
+++ b/backend/docs/multimodal_api.md
@@ -0,0 +1,308 @@
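+# InsightFlow Phase 7 - Multimodal Support API Documentation
+
+## Overview
+
+The Phase 7 multimodal support module adds video and image processing to InsightFlow. It supports:
+
+1. **Video processing**: audio extraction, keyframe extraction, and OCR
+2. **Image processing**: recognizing the content of whiteboards, PPT slides, handwritten notes, and similar images
+3. **Multimodal entity linking**: cross-modal entity alignment and knowledge fusion
+
+## New API endpoints
+
+### Video processing
+
+#### Upload a video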
+```
+POST /api/v1/projects/{project_id}/upload-video
+```
+
+**Parameters:**
+- `file` (required): the video file
+- `extract_interval` (optional): keyframe extraction interval in seconds; defaults to 5
+
+**Response:**
+```json
+{
+  "video_id": "abc123",
+  "project_id": "proj456",
+  "filename": "meeting.mp4",
+  "status": "completed",
+  "audio_extracted": true,
+  "frame_count": 24,
+  "ocr_text_preview": "Meeting content preview...",
+  "message": "Video processed successfully"
+}
+```
+
+#### List project videos
+```
+GET /api/v1/projects/{project_id}/videos
+```
+
+**Response:**
+```json
+[
+  {
+    "id": "abc123",
+    "filename": "meeting.mp4",
+    "duration": 120.5,
+    "fps": 30.0,
+    "resolution": {"width": 1920, "height": 1080},
+    "ocr_preview": "Meeting content...",
+    "status": "completed",
+    "created_at": "2024-01-15T10:30:00"
+  }
+]
+```
+
+#### Get video keyframes
+```
+GET /api/v1/videos/{video_id}/frames
+```
+
+**Response:**
+```json
+[
+  {
+    "id": "frame001",
+    "frame_number": 1,
+    "timestamp": 0.0,
+    "image_url": "/tmp/frames/video123/frame_000001_0.00.jpg",
+    "ocr_text": "First page content...",
+    "entities": [{"name": "Project Alpha", "type": "PROJECT"}]
+  }
+]
+```
+
+### Image processing
+
+#### Upload an image
+```
+POST /api/v1/projects/{project_id}/upload-image
+```
+
+**Parameters:**
+- `file` (required): the image file
+- `detect_type` (optional): whether to detect the image type automatically; defaults to true
+
+**Response:**
+```json
+{
+  "image_id": "img789",
+  "project_id": "proj456",
+  "filename": "whiteboard.jpg",
+  "image_type": "whiteboard",
+  "ocr_text_preview": "Whiteboard content...",
+  "description": "This is a whiteboard image. Content summary: ...",
+  "entity_count": 5,
+  "status": "completed"
+}
+```
+
+#### Upload images in batch
+```
+POST /api/v1/projects/{project_id}/upload-images-batch
+```
+
+**Parameters:**
+- `files` (required): multiple image files
+
+**Response:**
+```json
+{
+  "project_id": "proj456",
+  "total_count": 3,
+  "success_count": 3,
+  "failed_count": 0,
+  "results": [
+    {
+      "image_id": "img001",
+      "status": "success",
+      "image_type": "ppt",
+      "entity_count": 4
+    }
+  ]
+}
+```
+
+#### List project images
+```
+GET /api/v1/projects/{project_id}/images
+```
+
+### Multimodal entity linking
+
+#### Cross-modal entity alignment
+```
+POST /api/v1/projects/{project_id}/multimodal/align
+```
+
+**Parameters:**
+- `threshold` (optional): similarity threshold; defaults to 0.85
+
+**Response:**
+```json
+{
+  "project_id": "proj456",
+  "aligned_count": 5,
+  "links": [
+    {
+      "link_id": "link001",
+      "source_entity_id": "ent001",
+      "target_entity_id": "ent002",
+      "source_modality": "video",
+      "target_modality": "document",
+      "link_type": "same_as",
+      "confidence": 0.95,
+      "evidence": "Cross-modal alignment: exact"
+    }
+  ],
+  "message": "Successfully aligned 5 cross-modal entity pairs"
+}
+```
+
+#### Get multimodal statistics
+```
+GET /api/v1/projects/{project_id}/multimodal/stats
+```
+
+**Response:**
+```json
+{
+  "project_id": "proj456",
+  "video_count": 3,
+  "image_count": 10,
+  "multimodal_entity_count": 25,
+  "cross_modal_links": 8,
+  "modality_distribution": {
+    "audio": 15,
+    "video": 8,
+    "image": 12,
+    "document": 20
+  }
+}
+```
+
+#### Get multimodal mentions of an entity
+```
+GET /api/v1/entities/{entity_id}/multimodal-mentions
+```
+
+**Response:**
+```json
+[
+  {
+    "id": "mention001",
+    "entity_id": "ent001",
+    "entity_name": "Project Alpha",
+    "modality": "video",
+    "source_id": "video123",
+    "source_type": "video_frame",
+    "text_snippet": "Project Alpha progress",
+    "confidence": 1.0,
+    "created_at": "2024-01-15T10:30:00"
+  }
+]
+```
+
+#### Suggest multimodal entity merges
+```
+GET /api/v1/projects/{project_id}/multimodal/suggest-merges
+```
+
+**Response:**
+```json
+{
+  "project_id": "proj456",
+  "suggestion_count": 3,
+  "suggestions": [
+    {
+      "entity1": {"id": "ent001", "name": "K8s", "type": "TECH"},
+      "entity2": {"id": "ent002", "name": "Kubernetes", "type": "TECH"},
+      "similarity": 0.95,
+      "match_type": "alias_match",
+      "suggested_action": "merge"
+    }
+  ]
+}
+```
+
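+## Database schema
+
+### videos table
+Stores video file information
+- `id`: video ID
+- `project_id`: ID of the owning project
+- `filename`: file name
+- `duration`: video duration (seconds)
+- `fps`: frame rate
+- `resolution`: resolution (JSON)
+- `audio_transcript_id`: ID of the associated audio transcript
+- `full_ocr_text`: OCR text of all frames, concatenated
+- `extracted_entities`: extracted entities (JSON)
+- `extracted_relations`: extracted relations (JSON)
+- `status`: processing status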
+
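+### video_frames table
+Stores video keyframe information
+- `id`: frame ID
+- `video_id`: ID of the owning video
+- `frame_number`: frame sequence number
+- `timestamp`: timestamp (seconds)
+- `image_url`: image URL or path
+- `ocr_text`: OCR text
+- `extracted_entities`: entities extracted from this frame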
+
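+### images table
+Stores image file information
+- `id`: image ID
+- `project_id`: ID of the owning project
+- `filename`: file name
+- `ocr_text`: OCR text
+- `description`: image description
+- `extracted_entities`: extracted entities
+- `extracted_relations`: extracted relations
+- `status`: processing status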
+
+### multimodal_mentions table
+Stores entity mentions across modalities
+- `id`: mention ID
+- `project_id`: ID of the owning project
+- `entity_id`: entity ID
+- `modality`: modality type (audio/video/image/document)
+- `source_id`: source ID
+- `source_type`: source type
+- `text_snippet`: text snippet
+- `confidence`: confidence score
+
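+### multimodal_entity_links table
+Stores cross-modal entity links
+- `id`: link ID
+- `entity_id`: entity ID
+- `linked_entity_id`: linked entity ID
+- `link_type`: link type (same_as/related_to/part_of)
+- `confidence`: confidence score
+- `evidence`: evidence for the link
+- `modalities`: list of modalities involved
+
+## Installing dependencies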
+
+```bash
+pip install ffmpeg-python pillow opencv-python pytesseract
+```
+
+Note: the OCR features require the Tesseract OCR engine to be installed:
+- Ubuntu/Debian: `sudo apt-get install tesseract-ocr tesseract-ocr-chi-sim`
+- macOS: `brew install tesseract tesseract-lang`
+- Windows: download the installer from https://github.com/UB-Mannheim/tesseract/wiki
+
+## Environment variables
+
+```bash
+# Optional: custom temporary directory
+export INSIGHTFLOW_TEMP_DIR=/path/to/temp
+
+# Optional: Tesseract path (Windows); quoted because the path contains a space and backslashes
+export TESSERACT_CMD='C:\Program Files\Tesseract-OCR\tesseract.exe'
+```
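
Note on the cross-modal alignment endpoint documented in the diff above: the document gives a `threshold` parameter (default 0.85) and a response made of `same_as` links, but it does not say how similarity is computed. The sketch below is one possible reading, not the project's implementation. The `Entity` class, the `align_entities` function, and the use of `difflib.SequenceMatcher` as the similarity metric are all assumptions made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity record; field names mirror the documented response."""
    id: str
    name: str
    modality: str  # audio, video, image, or document


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (an assumed metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align_entities(entities, threshold=0.85):
    """Link pairs of entities from different modalities whose names are similar enough."""
    links = []
    for left, right in combinations(entities, 2):
        if left.modality == right.modality:
            continue  # only cross-modal pairs are candidates
        score = name_similarity(left.name, right.name)
        if score >= threshold:
            links.append({
                "source_entity_id": left.id,
                "target_entity_id": right.id,
                "source_modality": left.modality,
                "target_modality": right.modality,
                "link_type": "same_as",
                "confidence": round(score, 2),
            })
    return links


if __name__ == "__main__":
    sample = [
        Entity("ent001", "Project Alpha", "video"),
        Entity("ent002", "project alpha", "document"),
        Entity("ent003", "Kubernetes", "image"),
    ]
    for link in align_entities(sample):
        print(link)
```

A plain string metric like this one cannot pair an abbreviation such as `K8s` with `Kubernetes`, so the `alias_match` suggestions shown for the suggest-merges endpoint would need a separate mechanism, for example an alias table.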