Phase 7 Task 7: Plugin and integration system
- Create the plugin_manager.py module
  - PluginManager: main plugin management class
  - ChromeExtensionHandler: Chrome extension handling
  - BotHandler: Feishu/DingTalk/Slack bot handling
  - WebhookIntegration: Zapier/Make webhook integration
  - WebDAVSync: WebDAV sync management
- Create the complete Chrome extension code
  - manifest.json, background.js, content.js, content.css
  - popup.html/js: popup UI
  - options.html/js: settings page
  - Supports web clipping, saving selected text, and project selection
- Update schema.sql with plugin-related database tables
  - plugins: plugin configuration table
  - bot_sessions: bot session table
  - webhook_endpoints: webhook endpoint table
  - webdav_syncs: WebDAV sync configuration table
  - plugin_activity_logs: plugin activity log table
- Update main.py with plugin-related API endpoints
  - GET/POST /api/v1/plugins - plugin management
  - POST /api/v1/plugins/chrome/clip - save a web page from the Chrome extension
  - POST /api/v1/bots/webhook/{platform} - receive bot messages
  - GET /api/v1/bots/sessions - list bot sessions
  - POST /api/v1/webhook-endpoints - create a webhook endpoint
  - POST /webhook/{type}/{token} - receive external webhooks
  - POST /api/v1/webdav-syncs - configure WebDAV sync
  - POST /api/v1/webdav-syncs/{id}/test - test the WebDAV connection
  - POST /api/v1/webdav-syncs/{id}/sync - trigger a WebDAV sync
- Update requirements.txt with plugin dependencies
  - beautifulsoup4: HTML parsing
  - webdavclient3: WebDAV client
- Update development progress in STATUS.md and README.md
308
backend/docs/multimodal_api.md
Normal file
@@ -0,0 +1,308 @@
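# InsightFlow Phase 7 - Multimodal Support API Documentation

## Overview

The Phase 7 multimodal support module adds video and image processing to InsightFlow. It supports:

1. **Video processing**: audio extraction, keyframe extraction, and OCR
2. **Image processing**: recognition of whiteboards, PPT slides, handwritten notes, and similar content
3. **Multimodal entity linking**: cross-modal entity alignment and knowledge fusion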

## New API endpoints

### Video processing

#### Upload a video
```
POST /api/v1/projects/{project_id}/upload-video
```

**Parameters:**
- `file` (required): the video file
- `extract_interval` (optional): keyframe extraction interval in seconds; defaults to 5

**Response:**
```json
{
  "video_id": "abc123",
  "project_id": "proj456",
  "filename": "meeting.mp4",
  "status": "completed",
  "audio_extracted": true,
  "frame_count": 24,
  "ocr_text_preview": "Meeting content preview...",
  "message": "Video processed successfully"
}
```
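
To make the effect of `extract_interval` concrete, the sketch below lists the times at which keyframes would be sampled for a video of a given duration. It assumes sampling starts at 0 s and repeats every `extract_interval` seconds until the end of the video; the actual extraction logic is not described in this document, and `keyframe_timestamps` is a hypothetical helper, not part of InsightFlow.

```python
def keyframe_timestamps(duration: float, extract_interval: float = 5.0) -> list[float]:
    """Return sample times in seconds: 0, interval, 2 * interval, ... up to duration.

    Assumption: frames are taken at fixed intervals starting at 0 s.
    """
    if duration < 0:
        raise ValueError("duration must be non-negative")
    if extract_interval <= 0:
        raise ValueError("extract_interval must be positive")

    timestamps = []
    index = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while index * extract_interval <= duration:
        timestamps.append(round(float(index * extract_interval), 2))
        index += 1
    return timestamps
```

Under this assumption, a smaller interval yields more frames and therefore more OCR work per video, so the default of 5 seconds is a trade-off between coverage and processing time.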

#### List project videos
```
GET /api/v1/projects/{project_id}/videos
```

**Response:**
```json
[
  {
    "id": "abc123",
    "filename": "meeting.mp4",
    "duration": 120.5,
    "fps": 30.0,
    "resolution": {"width": 1920, "height": 1080},
    "ocr_preview": "Meeting content...",
    "status": "completed",
    "created_at": "2024-01-15T10:30:00"
  }
]
```

#### Get video keyframes
```
GET /api/v1/videos/{video_id}/frames
```

**Response:**
```json
[
  {
    "id": "frame001",
    "frame_number": 1,
    "timestamp": 0.0,
    "image_url": "/tmp/frames/video123/frame_000001_0.00.jpg",
    "ocr_text": "First page content...",
    "entities": [{"name": "Project Alpha", "type": "PROJECT"}]
  }
]
```
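
A client will often want to know when in the video each entity appears. The following sketch, which assumes only the response shape shown above, maps each frame timestamp to the entity names found in that frame. `entities_by_timestamp` is a hypothetical helper name.

```python
import json


def entities_by_timestamp(frames: list[dict]) -> dict[float, list[str]]:
    """Map each frame timestamp (seconds) to the entity names found in that frame."""
    result = {}
    for frame in frames:
        names = [entity["name"] for entity in frame.get("entities", [])]
        if names:  # skip frames without any recognized entity
            result[frame["timestamp"]] = names
    return result


# Sample payload copied from the response format above.
sample_frames = json.loads("""
[
  {
    "id": "frame001",
    "frame_number": 1,
    "timestamp": 0.0,
    "image_url": "/tmp/frames/video123/frame_000001_0.00.jpg",
    "ocr_text": "First page content...",
    "entities": [{"name": "Project Alpha", "type": "PROJECT"}]
  }
]
""")

print(entities_by_timestamp(sample_frames))  # {0.0: ['Project Alpha']}
```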

### Image processing

#### Upload an image
```
POST /api/v1/projects/{project_id}/upload-image
```

**Parameters:**
- `file` (required): the image file
- `detect_type` (optional): whether to detect the image type automatically; defaults to true

**Response:**
```json
{
  "image_id": "img789",
  "project_id": "proj456",
  "filename": "whiteboard.jpg",
  "image_type": "whiteboard",
  "ocr_text_preview": "Whiteboard content...",
  "description": "This is a whiteboard image. Content summary: ...",
  "entity_count": 5,
  "status": "completed"
}
```

#### Upload images in batch
```
POST /api/v1/projects/{project_id}/upload-images-batch
```

**Parameters:**
- `files` (required): multiple image files

**Response:**
```json
{
  "project_id": "proj456",
  "total_count": 3,
  "success_count": 3,
  "failed_count": 0,
  "results": [
    {
      "image_id": "img001",
      "status": "success",
      "image_type": "ppt",
      "entity_count": 4
    }
  ]
}
```
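
Because a batch can partially fail, a caller may want to check the counters and pick out the failed items. The sketch below works on the response shape shown above. It assumes that any `status` other than `"success"` marks a failed item, which this document does not state explicitly; both helper names are hypothetical.

```python
def counts_are_consistent(batch: dict) -> bool:
    """Check that the success and failure counters add up to the total."""
    return batch["total_count"] == batch["success_count"] + batch["failed_count"]


def failed_uploads(batch: dict) -> list[dict]:
    """Return result items whose status is not "success" (assumed failure marker)."""
    return [item for item in batch.get("results", []) if item.get("status") != "success"]


# Counters taken from the example response; the results list is abbreviated there too.
sample_batch = {
    "project_id": "proj456",
    "total_count": 3,
    "success_count": 3,
    "failed_count": 0,
    "results": [
        {"image_id": "img001", "status": "success", "image_type": "ppt", "entity_count": 4}
    ],
}
```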

#### List project images
```
GET /api/v1/projects/{project_id}/images
```

### Multimodal entity linking

#### Align entities across modalities
```
POST /api/v1/projects/{project_id}/multimodal/align
```

**Parameters:**
- `threshold` (optional): similarity threshold; defaults to 0.85

**Response:**
```json
{
  "project_id": "proj456",
  "aligned_count": 5,
  "links": [
    {
      "link_id": "link001",
      "source_entity_id": "ent001",
      "target_entity_id": "ent002",
      "source_modality": "video",
      "target_modality": "document",
      "link_type": "same_as",
      "confidence": 0.95,
      "evidence": "Cross-modal alignment: exact"
    }
  ]
,
  "message": "Successfully aligned 5 cross-modal entity pairs"
}
```
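
To illustrate how a similarity threshold can gate cross-modal links, here is a minimal sketch that compares entity names with `difflib.SequenceMatcher` from the Python standard library. This is a stand-in metric chosen for illustration; the similarity measure InsightFlow actually uses is not described in this document, and `align_entities` and the input record shape are assumptions.

```python
from difflib import SequenceMatcher


def align_entities(source: list[dict], target: list[dict], threshold: float = 0.85) -> list[dict]:
    """Link entities from two modalities whose names are similar enough."""
    links = []
    for src in source:
        for tgt in target:
            # Case-insensitive name similarity in [0, 1]; illustrative metric only.
            score = SequenceMatcher(None, src["name"].lower(), tgt["name"].lower()).ratio()
            if score >= threshold:
                links.append({
                    "source_entity_id": src["id"],
                    "target_entity_id": tgt["id"],
                    "source_modality": src["modality"],
                    "target_modality": tgt["modality"],
                    "link_type": "same_as",
                    "confidence": round(score, 2),
                })
    return links


video_entities = [{"id": "ent001", "name": "Project Alpha", "modality": "video"}]
document_entities = [
    {"id": "ent002", "name": "project alpha", "modality": "document"},
    {"id": "ent003", "name": "Budget", "modality": "document"},
]
links = align_entities(video_entities, document_entities)
```

With the default threshold, only the `ent001`/`ent002` pair is linked. Lowering the threshold admits weaker matches, which raises recall at the cost of more false links.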

#### Get multimodal statistics
```
GET /api/v1/projects/{project_id}/multimodal/stats
```

**Response:**
```json
{
  "project_id": "proj456",
  "video_count": 3,
  "image_count": 10,
  "multimodal_entity_count": 25,
  "cross_modal_links": 8,
  "modality_distribution": {
    "audio": 15,
    "video": 8,
    "image": 12,
    "document": 20
  }
}
```
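
The raw counts in `modality_distribution` are easier to compare as percentages. The sketch below converts them, assuming only the response shape shown above; `modality_shares` is a hypothetical helper name.

```python
def modality_shares(stats: dict) -> dict[str, float]:
    """Convert modality counts into percentages rounded to one decimal place."""
    distribution = stats["modality_distribution"]
    total = sum(distribution.values())
    if total == 0:
        return {modality: 0.0 for modality in distribution}
    return {
        modality: round(100 * count / total, 1)
        for modality, count in distribution.items()
    }


sample_stats = {
    "project_id": "proj456",
    "modality_distribution": {"audio": 15, "video": 8, "image": 12, "document": 20},
}
shares = modality_shares(sample_stats)
```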

#### Get multimodal mentions of an entity
```
GET /api/v1/entities/{entity_id}/multimodal-mentions
```

**Response:**
```json
[
  {
    "id": "mention001",
    "entity_id": "ent001",
    "entity_name": "Project Alpha",
    "modality": "video",
    "source_id": "video123",
    "source_type": "video_frame",
    "text_snippet": "Project Alpha progress",
    "confidence": 1.0,
    "created_at": "2024-01-15T10:30:00"
  }
]
```

#### Suggest multimodal entity merges
```
GET /api/v1/projects/{project_id}/multimodal/suggest-merges
```

**Response:**
```json
{
  "project_id": "proj456",
  "suggestion_count": 3,
  "suggestions": [
    {
      "entity1": {"id": "ent001", "name": "K8s", "type": "TECH"},
      "entity2": {"id": "ent002", "name": "Kubernetes", "type": "TECH"},
      "similarity": 0.95,
      "match_type": "alias_match",
      "suggested_action": "merge"
    }
  ]
}
```
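
Suggestions are advisory, so a client would typically filter them before acting. The sketch below keeps only suggestions marked `merge` whose similarity clears a cutoff and whose entities share a type. The cutoff of 0.9 and the same-type rule are assumptions for illustration, and `mergeable_pairs` is a hypothetical helper name.

```python
def mergeable_pairs(response: dict, min_similarity: float = 0.9) -> list[tuple[str, str]]:
    """Return (entity1_id, entity2_id) pairs that look safe to merge."""
    pairs = []
    for suggestion in response.get("suggestions", []):
        first, second = suggestion["entity1"], suggestion["entity2"]
        if (
            suggestion["suggested_action"] == "merge"
            and suggestion["similarity"] >= min_similarity
            and first["type"] == second["type"]  # assumed safety rule
        ):
            pairs.append((first["id"], second["id"]))
    return pairs


sample_response = {
    "project_id": "proj456",
    "suggestions": [
        {
            "entity1": {"id": "ent001", "name": "K8s", "type": "TECH"},
            "entity2": {"id": "ent002", "name": "Kubernetes", "type": "TECH"},
            "similarity": 0.95,
            "match_type": "alias_match",
            "suggested_action": "merge",
        }
    ],
}
```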

## Database table structure

### videos table
Stores video file information.
- `id`: video ID
- `project_id`: ID of the owning project
- `filename`: file name
- `duration`: video duration in seconds
- `fps`: frame rate
- `resolution`: resolution (JSON)
- `audio_transcript_id`: ID of the associated audio transcript
- `full_ocr_text`: merged OCR text from all frames
- `extracted_entities`: extracted entities (JSON)
- `extracted_relations`: extracted relations (JSON)
- `status`: processing status

### video_frames table
Stores video keyframe information.
- `id`: frame ID
- `video_id`: ID of the owning video
- `frame_number`: frame sequence number
- `timestamp`: timestamp in seconds
- `image_url`: image URL or path
- `ocr_text`: OCR text
- `extracted_entities`: entities extracted from this frame

### images table
Stores image file information.
- `id`: image ID
- `project_id`: ID of the owning project
- `filename`: file name
- `ocr_text`: OCR text
- `description`: image description
- `extracted_entities`: extracted entities
- `extracted_relations`: extracted relations
- `status`: processing status

### multimodal_mentions table
Stores entity mentions across modalities.
- `id`: mention ID
- `project_id`: ID of the owning project
- `entity_id`: entity ID
- `modality`: modality type (audio/video/image/document)
- `source_id`: source ID
- `source_type`: source type
- `text_snippet`: text snippet
- `confidence`: confidence score

### multimodal_entity_links table
Stores cross-modal entity links.
- `id`: link ID
- `entity_id`: entity ID
- `linked_entity_id`: ID of the linked entity
- `link_type`: link type (same_as/related_to/part_of)
- `confidence`: confidence score
- `evidence`: evidence for the link
- `modalities`: list of modalities involved
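
As a rough illustration of how the `multimodal_entity_links` columns listed above could be declared, the sketch below creates the table in an in-memory SQLite database. SQLite, the column types, the `CHECK` constraint, and storing `modalities` as JSON text are all assumptions; the real definitions live in the project's schema.sql, which is not reproduced here.

```python
import json
import sqlite3

# Column types and constraints are illustrative assumptions.
CREATE_LINKS_TABLE = """
CREATE TABLE IF NOT EXISTS multimodal_entity_links (
    id               TEXT PRIMARY KEY,
    entity_id        TEXT NOT NULL,
    linked_entity_id TEXT NOT NULL,
    link_type        TEXT NOT NULL
                     CHECK (link_type IN ('same_as', 'related_to', 'part_of')),
    confidence       REAL,
    evidence         TEXT,
    modalities       TEXT  -- JSON-encoded list such as ["video", "document"]
)
"""


def insert_link(conn: sqlite3.Connection, link: dict) -> None:
    """Insert one link row, serializing the modalities list as JSON text."""
    conn.execute(
        "INSERT INTO multimodal_entity_links VALUES (?, ?, ?, ?, ?, ?, ?)",
        (
            link["id"],
            link["entity_id"],
            link["linked_entity_id"],
            link["link_type"],
            link["confidence"],
            link["evidence"],
            json.dumps(link["modalities"]),
        ),
    )


conn = sqlite3.connect(":memory:")
conn.execute(CREATE_LINKS_TABLE)
insert_link(conn, {
    "id": "link001",
    "entity_id": "ent001",
    "linked_entity_id": "ent002",
    "link_type": "same_as",
    "confidence": 0.95,
    "evidence": "Cross-modal alignment: exact",
    "modalities": ["video", "document"],
})
row = conn.execute("SELECT link_type, modalities FROM multimodal_entity_links").fetchone()
```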

## Installing dependencies

```bash
pip install ffmpeg-python pillow opencv-python pytesseract
```

Note: OCR requires the Tesseract OCR engine to be installed:
- Ubuntu/Debian: `sudo apt-get install tesseract-ocr tesseract-ocr-chi-sim`
- macOS: `brew install tesseract tesseract-lang`
- Windows: download the installer from https://github.com/UB-Mannheim/tesseract/wiki

## Environment variables

```bash
# Optional: custom temporary directory
export INSIGHTFLOW_TEMP_DIR=/path/to/temp

# Optional: path to Tesseract (Windows)
export TESSERACT_CMD='C:\Program Files\Tesseract-OCR\tesseract.exe'
```
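
On the Python side, these variables could be read as in the sketch below. The fallback values (the system temporary directory and a `tesseract` binary on `PATH`) are assumptions, as is the helper name `load_multimodal_settings`; how InsightFlow itself consumes the variables is not shown in this document.

```python
import os
import tempfile


def load_multimodal_settings(env=None) -> dict:
    """Read the optional settings, falling back to assumed defaults."""
    env = os.environ if env is None else env
    return {
        # Fall back to the system temp directory when the variable is unset or empty.
        "temp_dir": env.get("INSIGHTFLOW_TEMP_DIR") or tempfile.gettempdir(),
        # Fall back to a `tesseract` executable found on PATH.
        "tesseract_cmd": env.get("TESSERACT_CMD") or "tesseract",
    }


settings = load_multimodal_settings()
# With pytesseract installed, the path can then be applied with:
#   pytesseract.pytesseract.tesseract_cmd = settings["tesseract_cmd"]
```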