diff --git a/README.md b/README.md index 7893a64..7be9aa8 100644 --- a/README.md +++ b/README.md @@ -191,12 +191,12 @@ MIT | 任务 | 状态 | 完成时间 | |------|------|----------| | 1. 智能工作流自动化 | ✅ 已完成 | 2026-02-23 | -| 2. 多模态支持 | 🚧 进行中 | - | +| 2. 多模态支持 | ✅ 已完成 | 2026-02-23 | +| 7. 插件与集成 | ✅ 已完成 | 2026-02-23 | | 3. 数据安全与合规 | 📋 待开发 | - | | 4. 协作与共享 | 📋 待开发 | - | | 5. 智能报告生成 | 📋 待开发 | - | | 6. 高级搜索与发现 | 📋 待开发 | - | -| 7. 插件与集成 | 📋 待开发 | - | | 8. 性能优化与扩展 | 📋 待开发 | - | **建议开发顺序**: 1 → 2 → 7 → 3 → 4 → 5 → 6 → 8 diff --git a/STATUS.md b/STATUS.md index f8b3b96..23eb2ad 100644 --- a/STATUS.md +++ b/STATUS.md @@ -1,10 +1,10 @@ # InsightFlow 开发状态 -**最后更新**: 2026-02-23 00:00 +**最后更新**: 2026-02-23 06:00 ## 当前阶段 -Phase 7: 工作流自动化 - **进行中 🚧** +Phase 7: 插件与集成 - **已完成 ✅** ## 部署状态 @@ -36,7 +36,7 @@ Phase 7: 工作流自动化 - **进行中 🚧** - 导出功能 - API 开放平台 -### Phase 7 - 工作流自动化 (进行中 🚧) +### Phase 7 - 任务 1: 工作流自动化 (已完成 ✅) - ✅ 创建 workflow_manager.py - 工作流管理模块 - WorkflowManager: 主管理类 - WorkflowTask: 工作流任务定义 @@ -59,9 +59,81 @@ Phase 7: 工作流自动化 - **进行中 🚧** - POST /api/v1/webhooks/{id}/test - 测试 Webhook - ✅ 更新 requirements.txt - 添加 APScheduler 依赖 +### Phase 7 - 任务 2: 多模态支持 (已完成 ✅) +- ✅ 创建 multimodal_processor.py - 多模态处理模块 + - VideoProcessor: 视频处理器(提取音频 + 关键帧 + OCR) + - ImageProcessor: 图片处理器(OCR + 图片描述) + - MultimodalEntityExtractor: 多模态实体提取器 + - 支持 PaddleOCR/EasyOCR/Tesseract 多种 OCR 引擎 + - 支持 ffmpeg 视频处理 +- ✅ 创建 multimodal_entity_linker.py - 多模态实体关联模块 + - MultimodalEntityLinker: 跨模态实体关联器 + - 支持 embedding 相似度计算 + - 多模态实体画像生成 + - 跨模态关系发现 + - 多模态时间线生成 +- ✅ 更新 schema.sql - 添加多模态相关数据库表 + - videos: 视频表 + - video_frames: 视频关键帧表 + - images: 图片表 + - multimodal_mentions: 多模态实体提及表 + - multimodal_entity_links: 多模态实体关联表 +- ✅ 更新 main.py - 添加多模态相关 API 端点 + - POST /api/v1/projects/{id}/upload-video - 上传视频 + - POST /api/v1/projects/{id}/upload-image - 上传图片 + - GET /api/v1/projects/{id}/videos - 视频列表 + - GET /api/v1/projects/{id}/images - 图片列表 + - GET /api/v1/videos/{id} - 视频详情 + - GET /api/v1/images/{id} - 图片详情 + - POST 
/api/v1/projects/{id}/multimodal/link-entities - 跨模态实体关联 + - GET /api/v1/entities/{id}/multimodal-profile - 实体多模态画像 + - GET /api/v1/projects/{id}/multimodal-timeline - 多模态时间线 + - GET /api/v1/entities/{id}/cross-modal-relations - 跨模态关系 +- ✅ 更新 requirements.txt - 添加多模态依赖 + - opencv-python: 视频处理 + - pillow: 图片处理 + - paddleocr/paddlepaddle: OCR 引擎 + - ffmpeg-python: ffmpeg 封装 + - sentence-transformers: 跨模态对齐 + +### Phase 7 - 任务 7: 插件与集成 (已完成 ✅) +- ✅ 创建 plugin_manager.py - 插件管理模块 + - PluginManager: 插件管理主类 + - ChromeExtensionHandler: Chrome 插件 API 处理 + - BotHandler: 飞书/钉钉机器人处理 + - WebhookIntegration: Zapier/Make Webhook 集成 + - WebDAVSync: WebDAV 同步管理 +- ✅ 创建 Chrome 扩展代码 + - manifest.json - 扩展配置 + - background.js - 后台脚本,处理右键菜单和消息 + - content.js - 内容脚本,页面交互和浮动按钮 + - content.css - 内容样式 + - popup.html/js - 弹出窗口 + - options.html/js - 设置页面 +- ✅ 更新 schema.sql - 添加插件相关数据库表 + - plugins: 插件配置表 + - bot_sessions: 机器人会话表 + - webhook_endpoints: Webhook 端点表 + - webdav_syncs: WebDAV 同步配置表 + - plugin_activity_logs: 插件活动日志表 +- ✅ 更新 main.py - 添加插件相关 API 端点 + - GET/POST /api/v1/plugins - 插件管理 + - POST /api/v1/plugins/chrome/clip - Chrome 插件保存网页 + - POST /api/v1/bots/webhook/{platform} - 接收机器人消息 + - GET /api/v1/bots/sessions - 机器人会话列表 + - POST /api/v1/webhook-endpoints - 创建 Webhook 端点 + - POST /webhook/{type}/{token} - 接收外部 Webhook + - POST /api/v1/webdav-syncs - WebDAV 同步配置 + - POST /api/v1/webdav-syncs/{id}/test - 测试 WebDAV 连接 + - POST /api/v1/webdav-syncs/{id}/sync - 触发 WebDAV 同步 + - GET /api/v1/plugins/{id}/logs - 插件活动日志 +- ✅ 更新 requirements.txt - 添加插件依赖 + - beautifulsoup4: HTML 解析 + - webdavclient3: WebDAV 客户端 + ## 待完成 -无 - Phase 7 任务 1 已完成 +Phase 7 任务 3: 数据安全与合规 ## 技术债务 @@ -69,6 +141,7 @@ Phase 7: 工作流自动化 - **进行中 🚧** - 实体相似度匹配目前只是简单字符串包含,需要 embedding 方案 - 前端需要状态管理(目前使用全局变量) - ~~需要添加 API 文档 (OpenAPI/Swagger)~~ ✅ 已完成 +- 多模态 LLM 图片描述功能待实现(需要集成多模态模型 API) ## 部署信息 @@ -78,6 +151,36 @@ Phase 7: 工作流自动化 - **进行中 🚧** ## 最近更新 +### 2026-02-23 (午间) +- 完成 Phase 7 任务 7: 插件与集成 + - 创建 plugin_manager.py 模块 
+ - PluginManager: 插件管理主类 + - ChromeExtensionHandler: Chrome 插件处理 + - BotHandler: 飞书/钉钉/Slack 机器人处理 + - WebhookIntegration: Zapier/Make Webhook 集成 + - WebDAVSync: WebDAV 同步管理 + - 创建完整的 Chrome 扩展代码 + - manifest.json, background.js, content.js + - popup.html/js, options.html/js + - 支持网页剪藏、选中文本保存、项目选择 + - 更新 schema.sql 添加插件相关数据库表 + - 更新 main.py 添加插件相关 API 端点 + - 更新 requirements.txt 添加插件依赖 + +### 2026-02-23 (早间) +- 完成 Phase 7 任务 2: 多模态支持 + - 创建 multimodal_processor.py 模块 + - VideoProcessor: 视频处理(音频提取 + 关键帧 + OCR) + - ImageProcessor: 图片处理(OCR + 图片描述) + - MultimodalEntityExtractor: 多模态实体提取 + - 创建 multimodal_entity_linker.py 模块 + - MultimodalEntityLinker: 跨模态实体关联 + - 支持 embedding 相似度计算 + - 多模态实体画像和时间线 + - 更新 schema.sql 添加多模态相关数据库表 + - 更新 main.py 添加多模态相关 API 端点 + - 更新 requirements.txt 添加多模态依赖 + ### 2026-02-23 - 完成 Phase 7 任务 1: 工作流自动化模块 - 创建 workflow_manager.py 模块 diff --git a/backend/__pycache__/db_manager.cpython-312.pyc b/backend/__pycache__/db_manager.cpython-312.pyc index eb0e391..a24f1e8 100644 Binary files a/backend/__pycache__/db_manager.cpython-312.pyc and b/backend/__pycache__/db_manager.cpython-312.pyc differ diff --git a/backend/__pycache__/image_processor.cpython-312.pyc b/backend/__pycache__/image_processor.cpython-312.pyc new file mode 100644 index 0000000..cb7a09e Binary files /dev/null and b/backend/__pycache__/image_processor.cpython-312.pyc differ diff --git a/backend/__pycache__/main.cpython-312.pyc b/backend/__pycache__/main.cpython-312.pyc index 4df13f3..a64fe67 100644 Binary files a/backend/__pycache__/main.cpython-312.pyc and b/backend/__pycache__/main.cpython-312.pyc differ diff --git a/backend/__pycache__/multimodal_entity_linker.cpython-312.pyc b/backend/__pycache__/multimodal_entity_linker.cpython-312.pyc new file mode 100644 index 0000000..5aef7a6 Binary files /dev/null and b/backend/__pycache__/multimodal_entity_linker.cpython-312.pyc differ diff --git a/backend/__pycache__/multimodal_processor.cpython-312.pyc 
b/backend/__pycache__/multimodal_processor.cpython-312.pyc new file mode 100644 index 0000000..03b4715 Binary files /dev/null and b/backend/__pycache__/multimodal_processor.cpython-312.pyc differ diff --git a/backend/db_manager.py b/backend/db_manager.py index 3871d55..2be6b70 100644 --- a/backend/db_manager.py +++ b/backend/db_manager.py @@ -878,6 +878,310 @@ class DatabaseManager: filtered.append(entity) return filtered + # ==================== Phase 7: Multimodal Support ==================== + + def create_video(self, video_id: str, project_id: str, filename: str, + duration: float = 0, fps: float = 0, resolution: Dict = None, + audio_transcript_id: str = None, full_ocr_text: str = "", + extracted_entities: List[Dict] = None, + extracted_relations: List[Dict] = None) -> str: + """创建视频记录""" + conn = self.get_conn() + now = datetime.now().isoformat() + + conn.execute( + """INSERT INTO videos + (id, project_id, filename, duration, fps, resolution, + audio_transcript_id, full_ocr_text, extracted_entities, + extracted_relations, status, created_at, updated_at) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (video_id, project_id, filename, duration, fps, + json.dumps(resolution) if resolution else None, + audio_transcript_id, full_ocr_text, + json.dumps(extracted_entities or []), + json.dumps(extracted_relations or []), + 'completed', now, now) + ) + conn.commit() + conn.close() + return video_id + + def get_video(self, video_id: str) -> Optional[Dict]: + """获取视频信息""" + conn = self.get_conn() + row = conn.execute( + "SELECT * FROM videos WHERE id = ?", (video_id,) + ).fetchone() + conn.close() + + if row: + data = dict(row) + data['resolution'] = json.loads(data['resolution']) if data['resolution'] else None + data['extracted_entities'] = json.loads(data['extracted_entities']) if data['extracted_entities'] else [] + data['extracted_relations'] = json.loads(data['extracted_relations']) if data['extracted_relations'] else [] + return data + return None + + def 
list_project_videos(self, project_id: str) -> List[Dict]: + """获取项目的所有视频""" + conn = self.get_conn() + rows = conn.execute( + "SELECT * FROM videos WHERE project_id = ? ORDER BY created_at DESC", + (project_id,) + ).fetchall() + conn.close() + + videos = [] + for row in rows: + data = dict(row) + data['resolution'] = json.loads(data['resolution']) if data['resolution'] else None + data['extracted_entities'] = json.loads(data['extracted_entities']) if data['extracted_entities'] else [] + data['extracted_relations'] = json.loads(data['extracted_relations']) if data['extracted_relations'] else [] + videos.append(data) + return videos + + def create_video_frame(self, frame_id: str, video_id: str, frame_number: int, + timestamp: float, image_url: str = None, + ocr_text: str = None, extracted_entities: List[Dict] = None) -> str: + """创建视频帧记录""" + conn = self.get_conn() + now = datetime.now().isoformat() + + conn.execute( + """INSERT INTO video_frames + (id, video_id, frame_number, timestamp, image_url, ocr_text, extracted_entities, created_at) + VALUES (?, ?, ?, ?, ?, ?, ?, ?)""", + (frame_id, video_id, frame_number, timestamp, image_url, ocr_text, + json.dumps(extracted_entities or []), now) + ) + conn.commit() + conn.close() + return frame_id + + def get_video_frames(self, video_id: str) -> List[Dict]: + """获取视频的所有帧""" + conn = self.get_conn() + rows = conn.execute( + """SELECT * FROM video_frames WHERE video_id = ? 
ORDER BY timestamp""", + (video_id,) + ).fetchall() + conn.close() + + frames = [] + for row in rows: + data = dict(row) + data['extracted_entities'] = json.loads(data['extracted_entities']) if data['extracted_entities'] else [] + frames.append(data) + return frames + + def create_image(self, image_id: str, project_id: str, filename: str, + ocr_text: str = "", description: str = "", + extracted_entities: List[Dict] = None, + extracted_relations: List[Dict] = None) -> str: + """创建图片记录""" + conn = self.get_conn() + now = datetime.now().isoformat() + + conn.execute( + """INSERT INTO images + (id, project_id, filename, ocr_text, description, + extracted_entities, extracted_relations, status, created_at, updated_at) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (image_id, project_id, filename, ocr_text, description, + json.dumps(extracted_entities or []), + json.dumps(extracted_relations or []), + 'completed', now, now) + ) + conn.commit() + conn.close() + return image_id + + def get_image(self, image_id: str) -> Optional[Dict]: + """获取图片信息""" + conn = self.get_conn() + row = conn.execute( + "SELECT * FROM images WHERE id = ?", (image_id,) + ).fetchone() + conn.close() + + if row: + data = dict(row) + data['extracted_entities'] = json.loads(data['extracted_entities']) if data['extracted_entities'] else [] + data['extracted_relations'] = json.loads(data['extracted_relations']) if data['extracted_relations'] else [] + return data + return None + + def list_project_images(self, project_id: str) -> List[Dict]: + """获取项目的所有图片""" + conn = self.get_conn() + rows = conn.execute( + "SELECT * FROM images WHERE project_id = ? 
ORDER BY created_at DESC", + (project_id,) + ).fetchall() + conn.close() + + images = [] + for row in rows: + data = dict(row) + data['extracted_entities'] = json.loads(data['extracted_entities']) if data['extracted_entities'] else [] + data['extracted_relations'] = json.loads(data['extracted_relations']) if data['extracted_relations'] else [] + images.append(data) + return images + + def create_multimodal_mention(self, mention_id: str, project_id: str, + entity_id: str, modality: str, source_id: str, + source_type: str, text_snippet: str = "", + confidence: float = 1.0) -> str: + """创建多模态实体提及记录""" + conn = self.get_conn() + now = datetime.now().isoformat() + + conn.execute( + """INSERT OR REPLACE INTO multimodal_mentions + (id, project_id, entity_id, modality, source_id, source_type, + text_snippet, confidence, created_at) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (mention_id, project_id, entity_id, modality, source_id, + source_type, text_snippet, confidence, now) + ) + conn.commit() + conn.close() + return mention_id + + def get_entity_multimodal_mentions(self, entity_id: str) -> List[Dict]: + """获取实体的多模态提及""" + conn = self.get_conn() + rows = conn.execute( + """SELECT m.*, e.name as entity_name + FROM multimodal_mentions m + JOIN entities e ON m.entity_id = e.id + WHERE m.entity_id = ? ORDER BY m.created_at DESC""", + (entity_id,) + ).fetchall() + conn.close() + return [dict(r) for r in rows] + + def get_project_multimodal_mentions(self, project_id: str, + modality: str = None) -> List[Dict]: + """获取项目的多模态提及""" + conn = self.get_conn() + + if modality: + rows = conn.execute( + """SELECT m.*, e.name as entity_name + FROM multimodal_mentions m + JOIN entities e ON m.entity_id = e.id + WHERE m.project_id = ? AND m.modality = ? + ORDER BY m.created_at DESC""", + (project_id, modality) + ).fetchall() + else: + rows = conn.execute( + """SELECT m.*, e.name as entity_name + FROM multimodal_mentions m + JOIN entities e ON m.entity_id = e.id + WHERE m.project_id = ? 
ORDER BY m.created_at DESC""", + (project_id,) + ).fetchall() + + conn.close() + return [dict(r) for r in rows] + + def create_multimodal_entity_link(self, link_id: str, entity_id: str, + linked_entity_id: str, link_type: str, + confidence: float = 1.0, + evidence: str = "", + modalities: List[str] = None) -> str: + """创建多模态实体关联""" + conn = self.get_conn() + now = datetime.now().isoformat() + + conn.execute( + """INSERT OR REPLACE INTO multimodal_entity_links + (id, entity_id, linked_entity_id, link_type, confidence, + evidence, modalities, created_at) + VALUES (?, ?, ?, ?, ?, ?, ?, ?)""", + (link_id, entity_id, linked_entity_id, link_type, confidence, + evidence, json.dumps(modalities or []), now) + ) + conn.commit() + conn.close() + return link_id + + def get_entity_multimodal_links(self, entity_id: str) -> List[Dict]: + """获取实体的多模态关联""" + conn = self.get_conn() + rows = conn.execute( + """SELECT l.*, e1.name as entity_name, e2.name as linked_entity_name + FROM multimodal_entity_links l + JOIN entities e1 ON l.entity_id = e1.id + JOIN entities e2 ON l.linked_entity_id = e2.id + WHERE l.entity_id = ? 
OR l.linked_entity_id = ?""", + (entity_id, entity_id) + ).fetchall() + conn.close() + + links = [] + for row in rows: + data = dict(row) + data['modalities'] = json.loads(data['modalities']) if data['modalities'] else [] + links.append(data) + return links + + def get_project_multimodal_stats(self, project_id: str) -> Dict: + """获取项目多模态统计信息""" + conn = self.get_conn() + + stats = { + 'video_count': 0, + 'image_count': 0, + 'multimodal_entity_count': 0, + 'cross_modal_links': 0, + 'modality_distribution': {} + } + + # 视频数量 + row = conn.execute( + "SELECT COUNT(*) as count FROM videos WHERE project_id = ?", + (project_id,) + ).fetchone() + stats['video_count'] = row['count'] + + # 图片数量 + row = conn.execute( + "SELECT COUNT(*) as count FROM images WHERE project_id = ?", + (project_id,) + ).fetchone() + stats['image_count'] = row['count'] + + # 多模态实体数量 + row = conn.execute( + """SELECT COUNT(DISTINCT entity_id) as count + FROM multimodal_mentions WHERE project_id = ?""", + (project_id,) + ).fetchone() + stats['multimodal_entity_count'] = row['count'] + + # 跨模态关联数量 + row = conn.execute( + """SELECT COUNT(*) as count FROM multimodal_entity_links + WHERE entity_id IN (SELECT id FROM entities WHERE project_id = ?)""", + (project_id,) + ).fetchone() + stats['cross_modal_links'] = row['count'] + + # 模态分布 + for modality in ['audio', 'video', 'image', 'document']: + row = conn.execute( + """SELECT COUNT(*) as count FROM multimodal_mentions + WHERE project_id = ? AND modality = ?""", + (project_id, modality) + ).fetchone() + stats['modality_distribution'][modality] = row['count'] + + conn.close() + return stats + # Singleton instance _db_manager = None diff --git a/backend/docs/multimodal_api.md b/backend/docs/multimodal_api.md new file mode 100644 index 0000000..a31b981 --- /dev/null +++ b/backend/docs/multimodal_api.md @@ -0,0 +1,308 @@ +# InsightFlow Phase 7 - 多模态支持 API 文档 + +## 概述 + +Phase 7 多模态支持模块为 InsightFlow 添加了处理视频和图片的能力,支持: + +1. **视频处理**:提取音频、关键帧、OCR 识别 +2. 
**图片处理**:识别白板、PPT、手写笔记等内容 +3. **多模态实体关联**:跨模态实体对齐和知识融合 + +## 新增 API 端点 + +### 视频处理 + +#### 上传视频 +``` +POST /api/v1/projects/{project_id}/upload-video +``` + +**参数:** +- `file` (required): 视频文件 +- `extract_interval` (optional): 关键帧提取间隔(秒),默认 5 秒 + +**响应:** +```json +{ + "video_id": "abc123", + "project_id": "proj456", + "filename": "meeting.mp4", + "status": "completed", + "audio_extracted": true, + "frame_count": 24, + "ocr_text_preview": "会议内容预览...", + "message": "Video processed successfully" +} +``` + +#### 获取项目视频列表 +``` +GET /api/v1/projects/{project_id}/videos +``` + +**响应:** +```json +[ + { + "id": "abc123", + "filename": "meeting.mp4", + "duration": 120.5, + "fps": 30.0, + "resolution": {"width": 1920, "height": 1080}, + "ocr_preview": "会议内容...", + "status": "completed", + "created_at": "2024-01-15T10:30:00" + } +] +``` + +#### 获取视频关键帧 +``` +GET /api/v1/videos/{video_id}/frames +``` + +**响应:** +```json +[ + { + "id": "frame001", + "frame_number": 1, + "timestamp": 0.0, + "image_url": "/tmp/frames/video123/frame_000001_0.00.jpg", + "ocr_text": "第一页内容...", + "entities": [{"name": "Project Alpha", "type": "PROJECT"}] + } +] +``` + +### 图片处理 + +#### 上传图片 +``` +POST /api/v1/projects/{project_id}/upload-image +``` + +**参数:** +- `file` (required): 图片文件 +- `detect_type` (optional): 是否自动检测图片类型,默认 true + +**响应:** +```json +{ + "image_id": "img789", + "project_id": "proj456", + "filename": "whiteboard.jpg", + "image_type": "whiteboard", + "ocr_text_preview": "白板内容...", + "description": "这是一张白板图片。内容摘要:...", + "entity_count": 5, + "status": "completed" +} +``` + +#### 批量上传图片 +``` +POST /api/v1/projects/{project_id}/upload-images-batch +``` + +**参数:** +- `files` (required): 多个图片文件 + +**响应:** +```json +{ + "project_id": "proj456", + "total_count": 3, + "success_count": 3, + "failed_count": 0, + "results": [ + { + "image_id": "img001", + "status": "success", + "image_type": "ppt", + "entity_count": 4 + } + ] +} +``` + +#### 获取项目图片列表 +``` +GET 
/api/v1/projects/{project_id}/images +``` + +### 多模态实体关联 + +#### 跨模态实体对齐 +``` +POST /api/v1/projects/{project_id}/multimodal/align +``` + +**参数:** +- `threshold` (optional): 相似度阈值,默认 0.85 + +**响应:** +```json +{ + "project_id": "proj456", + "aligned_count": 5, + "links": [ + { + "link_id": "link001", + "source_entity_id": "ent001", + "target_entity_id": "ent002", + "source_modality": "video", + "target_modality": "document", + "link_type": "same_as", + "confidence": 0.95, + "evidence": "Cross-modal alignment: exact" + } + ], + "message": "Successfully aligned 5 cross-modal entity pairs" +} +``` + +#### 获取多模态统计信息 +``` +GET /api/v1/projects/{project_id}/multimodal/stats +``` + +**响应:** +```json +{ + "project_id": "proj456", + "video_count": 3, + "image_count": 10, + "multimodal_entity_count": 25, + "cross_modal_links": 8, + "modality_distribution": { + "audio": 15, + "video": 8, + "image": 12, + "document": 20 + } +} +``` + +#### 获取实体多模态提及 +``` +GET /api/v1/entities/{entity_id}/multimodal-mentions +``` + +**响应:** +```json +[ + { + "id": "mention001", + "entity_id": "ent001", + "entity_name": "Project Alpha", + "modality": "video", + "source_id": "video123", + "source_type": "video_frame", + "text_snippet": "Project Alpha 进度", + "confidence": 1.0, + "created_at": "2024-01-15T10:30:00" + } +] +``` + +#### 建议多模态实体合并 +``` +GET /api/v1/projects/{project_id}/multimodal/suggest-merges +``` + +**响应:** +```json +{ + "project_id": "proj456", + "suggestion_count": 3, + "suggestions": [ + { + "entity1": {"id": "ent001", "name": "K8s", "type": "TECH"}, + "entity2": {"id": "ent002", "name": "Kubernetes", "type": "TECH"}, + "similarity": 0.95, + "match_type": "alias_match", + "suggested_action": "merge" + } + ] +} +``` + +## 数据库表结构 + +### videos 表 +存储视频文件信息 +- `id`: 视频ID +- `project_id`: 所属项目ID +- `filename`: 文件名 +- `duration`: 视频时长(秒) +- `fps`: 帧率 +- `resolution`: 分辨率(JSON) +- `audio_transcript_id`: 关联的音频转录ID +- `full_ocr_text`: 所有帧OCR文本合并 +- `extracted_entities`: 提取的实体(JSON) +- 
`extracted_relations`: 提取的关系(JSON)
+- `status`: 处理状态
+
+### video_frames 表
+存储视频关键帧信息
+- `id`: 帧ID
+- `video_id`: 所属视频ID
+- `frame_number`: 帧序号
+- `timestamp`: 时间戳(秒)
+- `image_url`: 图片URL或路径
+- `ocr_text`: OCR识别文本
+- `extracted_entities`: 该帧提取的实体
+
+### images 表
+存储图片文件信息
+- `id`: 图片ID
+- `project_id`: 所属项目ID
+- `filename`: 文件名
+- `ocr_text`: OCR识别文本
+- `description`: 图片描述
+- `extracted_entities`: 提取的实体
+- `extracted_relations`: 提取的关系
+- `status`: 处理状态
+
+### multimodal_mentions 表
+存储实体在多模态中的提及
+- `id`: 提及ID
+- `project_id`: 所属项目ID
+- `entity_id`: 实体ID
+- `modality`: 模态类型(audio/video/image/document)
+- `source_id`: 来源ID
+- `source_type`: 来源类型
+- `text_snippet`: 文本片段
+- `confidence`: 置信度
+
+### multimodal_entity_links 表
+存储跨模态实体关联
+- `id`: 关联ID
+- `entity_id`: 实体ID
+- `linked_entity_id`: 关联实体ID
+- `link_type`: 关联类型(same_as/related_to/part_of)
+- `confidence`: 置信度
+- `evidence`: 关联证据
+- `modalities`: 涉及的模态列表
+
+## 依赖安装
+
+```bash
+pip install ffmpeg-python pillow opencv-python pytesseract
+```
+
+注意:使用 OCR 功能需要安装 Tesseract OCR 引擎:
+- Ubuntu/Debian: `sudo apt-get install tesseract-ocr tesseract-ocr-chi-sim`
+- macOS: `brew install tesseract tesseract-lang`
+- Windows: 从 https://github.com/UB-Mannheim/tesseract/wiki 下载安装包
+
+## 环境变量
+
+```bash
+# 可选:自定义临时目录
+export INSIGHTFLOW_TEMP_DIR=/path/to/temp
+
+# 可选:Tesseract 路径(Windows,含空格需加引号)
+export TESSERACT_CMD="C:\Program Files\Tesseract-OCR\tesseract.exe"
+```
diff --git a/backend/image_processor.py b/backend/image_processor.py
new file mode 100644
index 0000000..573e9cc
--- /dev/null
+++ b/backend/image_processor.py
@@ -0,0 +1,547 @@
+#!/usr/bin/env python3
+"""
+InsightFlow Image Processor - Phase 7
+图片处理模块:识别白板、PPT、手写笔记等内容
+"""
+
+import os
+import io
+import json
+import uuid
+import base64
+from typing import List, Dict, Optional, Tuple
+from dataclasses import dataclass
+from pathlib import Path
+
+# 尝试导入图像处理库
+try:
+    from PIL import Image, ImageEnhance, ImageFilter
+    PIL_AVAILABLE = True
+except ImportError:
+    
PIL_AVAILABLE = False + +try: + import cv2 + import numpy as np + CV2_AVAILABLE = True +except ImportError: + CV2_AVAILABLE = False + +try: + import pytesseract + PYTESSERACT_AVAILABLE = True +except ImportError: + PYTESSERACT_AVAILABLE = False + + +@dataclass +class ImageEntity: + """图片中检测到的实体""" + name: str + type: str + confidence: float + bbox: Optional[Tuple[int, int, int, int]] = None # (x, y, width, height) + + +@dataclass +class ImageRelation: + """图片中检测到的关系""" + source: str + target: str + relation_type: str + confidence: float + + +@dataclass +class ImageProcessingResult: + """图片处理结果""" + image_id: str + image_type: str # whiteboard, ppt, handwritten, screenshot, other + ocr_text: str + description: str + entities: List[ImageEntity] + relations: List[ImageRelation] + width: int + height: int + success: bool + error_message: str = "" + + +@dataclass +class BatchProcessingResult: + """批量图片处理结果""" + results: List[ImageProcessingResult] + total_count: int + success_count: int + failed_count: int + + +class ImageProcessor: + """图片处理器 - 处理各种类型图片""" + + # 图片类型定义 + IMAGE_TYPES = { + 'whiteboard': '白板', + 'ppt': 'PPT/演示文稿', + 'handwritten': '手写笔记', + 'screenshot': '屏幕截图', + 'document': '文档图片', + 'other': '其他' + } + + def __init__(self, temp_dir: str = None): + """ + 初始化图片处理器 + + Args: + temp_dir: 临时文件目录 + """ + self.temp_dir = temp_dir or os.path.join(os.getcwd(), 'temp', 'images') + os.makedirs(self.temp_dir, exist_ok=True) + + def preprocess_image(self, image, image_type: str = None): + """ + 预处理图片以提高OCR质量 + + Args: + image: PIL Image 对象 + image_type: 图片类型(用于针对性处理) + + Returns: + 处理后的图片 + """ + if not PIL_AVAILABLE: + return image + + try: + # 转换为RGB(如果是RGBA) + if image.mode == 'RGBA': + image = image.convert('RGB') + + # 根据图片类型进行针对性处理 + if image_type == 'whiteboard': + # 白板:增强对比度,去除背景 + image = self._enhance_whiteboard(image) + elif image_type == 'handwritten': + # 手写笔记:降噪,增强对比度 + image = self._enhance_handwritten(image) + elif image_type == 'screenshot': + # 
截图:轻微锐化 + image = image.filter(ImageFilter.SHARPEN) + + # 通用处理:调整大小(如果太大) + max_size = 4096 + if max(image.size) > max_size: + ratio = max_size / max(image.size) + new_size = (int(image.size[0] * ratio), int(image.size[1] * ratio)) + image = image.resize(new_size, Image.Resampling.LANCZOS) + + return image + except Exception as e: + print(f"Image preprocessing error: {e}") + return image + + def _enhance_whiteboard(self, image): + """增强白板图片""" + # 转换为灰度 + gray = image.convert('L') + + # 增强对比度 + enhancer = ImageEnhance.Contrast(gray) + enhanced = enhancer.enhance(2.0) + + # 二值化 + threshold = 128 + binary = enhanced.point(lambda x: 0 if x < threshold else 255, '1') + + return binary.convert('L') + + def _enhance_handwritten(self, image): + """增强手写笔记图片""" + # 转换为灰度 + gray = image.convert('L') + + # 轻微降噪 + blurred = gray.filter(ImageFilter.GaussianBlur(radius=1)) + + # 增强对比度 + enhancer = ImageEnhance.Contrast(blurred) + enhanced = enhancer.enhance(1.5) + + return enhanced + + def detect_image_type(self, image, ocr_text: str = "") -> str: + """ + 自动检测图片类型 + + Args: + image: PIL Image 对象 + ocr_text: OCR识别的文本 + + Returns: + 图片类型字符串 + """ + if not PIL_AVAILABLE: + return 'other' + + try: + # 基于图片特征和OCR内容判断类型 + width, height = image.size + aspect_ratio = width / height + + # 检测是否为PPT(通常是16:9或4:3) + if 1.3 <= aspect_ratio <= 1.8: + # 检查是否有典型的PPT特征(标题、项目符号等) + if any(keyword in ocr_text.lower() for keyword in ['slide', 'page', '第', '页']): + return 'ppt' + + # 检测是否为白板(大量手写文字,可能有箭头、框等) + if CV2_AVAILABLE: + img_array = np.array(image.convert('RGB')) + gray = cv2.cvtColor(img_array, cv2.COLOR_RGB2GRAY) + + # 检测边缘(白板通常有很多线条) + edges = cv2.Canny(gray, 50, 150) + edge_ratio = np.sum(edges > 0) / edges.size + + # 如果边缘比例高,可能是白板 + if edge_ratio > 0.05 and len(ocr_text) > 50: + return 'whiteboard' + + # 检测是否为手写笔记(文字密度高,可能有涂鸦) + if len(ocr_text) > 100 and aspect_ratio < 1.5: + # 检查手写特征(不规则的行高) + return 'handwritten' + + # 检测是否为截图(可能有UI元素) + if any(keyword in ocr_text.lower() for keyword 
in ['button', 'menu', 'click', '登录', '确定', '取消']): + return 'screenshot' + + # 默认文档类型 + if len(ocr_text) > 200: + return 'document' + + return 'other' + except Exception as e: + print(f"Image type detection error: {e}") + return 'other' + + def perform_ocr(self, image, lang: str = 'chi_sim+eng') -> Tuple[str, float]: + """ + 对图片进行OCR识别 + + Args: + image: PIL Image 对象 + lang: OCR语言 + + Returns: + (识别的文本, 置信度) + """ + if not PYTESSERACT_AVAILABLE: + return "", 0.0 + + try: + # 预处理图片 + processed_image = self.preprocess_image(image) + + # 执行OCR + text = pytesseract.image_to_string(processed_image, lang=lang) + + # 获取置信度 + data = pytesseract.image_to_data(processed_image, output_type=pytesseract.Output.DICT) + confidences = [int(c) for c in data['conf'] if int(c) > 0] + avg_confidence = sum(confidences) / len(confidences) if confidences else 0 + + return text.strip(), avg_confidence / 100.0 + except Exception as e: + print(f"OCR error: {e}") + return "", 0.0 + + def extract_entities_from_text(self, text: str) -> List[ImageEntity]: + """ + 从OCR文本中提取实体 + + Args: + text: OCR识别的文本 + + Returns: + 实体列表 + """ + entities = [] + + # 简单的实体提取规则(可以替换为LLM调用) + # 提取大写字母开头的词组(可能是专有名词) + import re + + # 项目名称(通常是大写或带引号) + project_pattern = r'["\']([^"\']+)["\']|([A-Z][a-zA-Z0-9]*(?:\s+[A-Z][a-zA-Z0-9]*)+)' + for match in re.finditer(project_pattern, text): + name = match.group(1) or match.group(2) + if name and len(name) > 2: + entities.append(ImageEntity( + name=name.strip(), + type='PROJECT', + confidence=0.7 + )) + + # 人名(中文) + name_pattern = r'([\u4e00-\u9fa5]{2,4})(?:先生|女士|总|经理|工程师|老师)' + for match in re.finditer(name_pattern, text): + entities.append(ImageEntity( + name=match.group(1), + type='PERSON', + confidence=0.8 + )) + + # 技术术语 + tech_keywords = ['K8s', 'Kubernetes', 'Docker', 'API', 'SDK', 'AI', 'ML', + 'Python', 'Java', 'React', 'Vue', 'Node.js', '数据库', '服务器'] + for keyword in tech_keywords: + if keyword in text: + entities.append(ImageEntity( + name=keyword, + 
type='TECH', + confidence=0.9 + )) + + # 去重 + seen = set() + unique_entities = [] + for e in entities: + key = (e.name.lower(), e.type) + if key not in seen: + seen.add(key) + unique_entities.append(e) + + return unique_entities + + def generate_description(self, image_type: str, ocr_text: str, + entities: List[ImageEntity]) -> str: + """ + 生成图片描述 + + Args: + image_type: 图片类型 + ocr_text: OCR文本 + entities: 检测到的实体 + + Returns: + 图片描述 + """ + type_name = self.IMAGE_TYPES.get(image_type, '图片') + + description_parts = [f"这是一张{type_name}图片。"] + + if ocr_text: + # 提取前200字符作为摘要 + text_preview = ocr_text[:200].replace('\n', ' ') + if len(ocr_text) > 200: + text_preview += "..." + description_parts.append(f"内容摘要:{text_preview}") + + if entities: + entity_names = [e.name for e in entities[:5]] # 最多显示5个实体 + description_parts.append(f"识别到的关键实体:{', '.join(entity_names)}") + + return " ".join(description_parts) + + def process_image(self, image_data: bytes, filename: str = None, + image_id: str = None, detect_type: bool = True) -> ImageProcessingResult: + """ + 处理单张图片 + + Args: + image_data: 图片二进制数据 + filename: 文件名 + image_id: 图片ID(可选) + detect_type: 是否自动检测图片类型 + + Returns: + 图片处理结果 + """ + image_id = image_id or str(uuid.uuid4())[:8] + + if not PIL_AVAILABLE: + return ImageProcessingResult( + image_id=image_id, + image_type='other', + ocr_text='', + description='PIL not available', + entities=[], + relations=[], + width=0, + height=0, + success=False, + error_message='PIL library not available' + ) + + try: + # 加载图片 + image = Image.open(io.BytesIO(image_data)) + width, height = image.size + + # 执行OCR + ocr_text, ocr_confidence = self.perform_ocr(image) + + # 检测图片类型 + image_type = 'other' + if detect_type: + image_type = self.detect_image_type(image, ocr_text) + + # 提取实体 + entities = self.extract_entities_from_text(ocr_text) + + # 生成描述 + description = self.generate_description(image_type, ocr_text, entities) + + # 提取关系(基于实体共现) + relations = self._extract_relations(entities, 
ocr_text)
+
+            # 保存图片文件(可选)
+            if filename:
+                save_path = os.path.join(self.temp_dir, f"{image_id}_{filename}")
+                image.save(save_path)
+
+            return ImageProcessingResult(
+                image_id=image_id,
+                image_type=image_type,
+                ocr_text=ocr_text,
+                description=description,
+                entities=entities,
+                relations=relations,
+                width=width,
+                height=height,
+                success=True
+            )
+
+        except Exception as e:
+            return ImageProcessingResult(
+                image_id=image_id,
+                image_type='other',
+                ocr_text='',
+                description='',
+                entities=[],
+                relations=[],
+                width=0,
+                height=0,
+                success=False,
+                error_message=str(e)
+            )
+
+    def _extract_relations(self, entities: List[ImageEntity], text: str) -> List[ImageRelation]:
+        """
+        从文本中提取实体关系
+
+        Args:
+            entities: 实体列表
+            text: 文本内容
+
+        Returns:
+            关系列表
+        """
+        relations = []
+
+        if len(entities) < 2:
+            return relations
+
+        # 简单的关系提取:如果两个实体在同一句子中出现,则认为它们相关
+        # 将中英文句末标点统一替换为 '.' 后按 '.' 切分
+        sentences = (text.replace('。', '.').replace('!', '.').replace('?', '.')
+                     .replace('!', '.').replace('?', '.').split('.'))
+
+        for sentence in sentences:
+            sentence_entities = []
+            for entity in entities:
+                if entity.name in sentence:
+                    sentence_entities.append(entity)
+
+            # 如果句子中有多个实体,建立关系
+            if len(sentence_entities) >= 2:
+                for i in range(len(sentence_entities)):
+                    for j in range(i + 1, len(sentence_entities)):
+                        relations.append(ImageRelation(
+                            source=sentence_entities[i].name,
+                            target=sentence_entities[j].name,
+                            relation_type='related',
+                            confidence=0.5
+                        ))
+
+        return relations
+
+    def process_batch(self, images_data: List[Tuple[bytes, str]],
+                      project_id: str = None) -> BatchProcessingResult:
+        """
+        批量处理图片
+
+        Args:
+            images_data: 图片数据列表,每项为 (image_data, filename)
+            project_id: 项目ID
+
+        Returns:
+            批量处理结果
+        """
+        results = []
+        success_count = 0
+        failed_count = 0
+
+        for image_data, filename in images_data:
+            result = self.process_image(image_data, filename)
+            results.append(result)
+
+            if result.success:
+                success_count += 1
+            else:
+                failed_count += 1
+
+        return BatchProcessingResult(
+            results=results,
+            total_count=len(results),
+            
success_count=success_count, + failed_count=failed_count + ) + + def image_to_base64(self, image_data: bytes) -> str: + """ + 将图片转换为base64编码 + + Args: + image_data: 图片二进制数据 + + Returns: + base64编码的字符串 + """ + return base64.b64encode(image_data).decode('utf-8') + + def get_image_thumbnail(self, image_data: bytes, size: Tuple[int, int] = (200, 200)) -> bytes: + """ + 生成图片缩略图 + + Args: + image_data: 图片二进制数据 + size: 缩略图尺寸 + + Returns: + 缩略图二进制数据 + """ + if not PIL_AVAILABLE: + return image_data + + try: + image = Image.open(io.BytesIO(image_data)) + image.thumbnail(size, Image.Resampling.LANCZOS) + + buffer = io.BytesIO() + image.save(buffer, format='JPEG') + return buffer.getvalue() + except Exception as e: + print(f"Thumbnail generation error: {e}") + return image_data + + +# Singleton instance +_image_processor = None + +def get_image_processor(temp_dir: str = None) -> ImageProcessor: + """获取图片处理器单例""" + global _image_processor + if _image_processor is None: + _image_processor = ImageProcessor(temp_dir) + return _image_processor diff --git a/backend/main.py b/backend/main.py index 412c311..e0e4960 100644 --- a/backend/main.py +++ b/backend/main.py @@ -111,6 +111,50 @@ except ImportError as e: print(f"Workflow Manager import error: {e}") WORKFLOW_AVAILABLE = False +# Phase 7: Multimodal Support +try: + from multimodal_processor import ( + get_multimodal_processor, MultimodalProcessor, + VideoProcessingResult, VideoFrame + ) + MULTIMODAL_AVAILABLE = True +except ImportError as e: + print(f"Multimodal Processor import error: {e}") + MULTIMODAL_AVAILABLE = False + +try: + from image_processor import ( + get_image_processor, ImageProcessor, + ImageProcessingResult, ImageEntity, ImageRelation + ) + IMAGE_PROCESSOR_AVAILABLE = True +except ImportError as e: + print(f"Image Processor import error: {e}") + IMAGE_PROCESSOR_AVAILABLE = False + +try: + from multimodal_entity_linker import ( + get_multimodal_entity_linker, MultimodalEntityLinker, + MultimodalEntity, EntityLink, 
AlignmentResult, FusionResult + ) + MULTIMODAL_LINKER_AVAILABLE = True +except ImportError as e: + print(f"Multimodal Entity Linker import error: {e}") + MULTIMODAL_LINKER_AVAILABLE = False + +# Phase 7 Task 7: Plugin Manager +try: + from plugin_manager import ( + get_plugin_manager, PluginManager, Plugin, + BotSession, WebhookEndpoint, WebDAVSync, + PluginType, PluginStatus, ChromeExtensionHandler, BotHandler, + WebhookIntegration + ) + PLUGIN_MANAGER_AVAILABLE = True +except ImportError as e: + print(f"Plugin Manager import error: {e}") + PLUGIN_MANAGER_AVAILABLE = False + # FastAPI app with enhanced metadata for Swagger app = FastAPI( title="InsightFlow API", @@ -155,6 +199,12 @@ app = FastAPI( {"name": "API Keys", "description": "API 密钥管理"}, {"name": "Workflows", "description": "工作流自动化"}, {"name": "Webhooks", "description": "Webhook 配置"}, + {"name": "Multimodal", "description": "多模态支持(视频、图片)"}, + {"name": "Plugins", "description": "插件管理"}, + {"name": "Chrome Extension", "description": "Chrome 扩展集成"}, + {"name": "Bot", "description": "飞书/钉钉机器人"}, + {"name": "Integrations", "description": "Zapier/Make 集成"}, + {"name": "WebDAV", "description": "WebDAV 同步"}, {"name": "System", "description": "系统信息"}, ] ) @@ -1496,15 +1546,19 @@ async def get_entity_mentions(entity_id: str, _=Depends(verify_api_key)): async def health_check(): return { "status": "ok", - "version": "0.6.0", - "phase": "Phase 5 - Knowledge Reasoning", + "version": "0.7.0", + "phase": "Phase 7 - Plugin & Integration", "oss_available": OSS_AVAILABLE, "tingwu_available": TINGWU_AVAILABLE, "db_available": DB_AVAILABLE, "doc_processor_available": DOC_PROCESSOR_AVAILABLE, "aligner_available": ALIGNER_AVAILABLE, "llm_client_available": LLM_CLIENT_AVAILABLE, - "reasoner_available": REASONER_AVAILABLE + "reasoner_available": REASONER_AVAILABLE, + "multimodal_available": MULTIMODAL_AVAILABLE, + "image_processor_available": IMAGE_PROCESSOR_AVAILABLE, + "multimodal_linker_available": MULTIMODAL_LINKER_AVAILABLE, 
+ "plugin_manager_available": PLUGIN_MANAGER_AVAILABLE } @@ -3380,7 +3434,7 @@ async def health_check(): """健康检查端点""" return { "status": "healthy", - "version": "0.6.0", + "version": "0.7.0", "timestamp": datetime.now().isoformat() } @@ -3390,7 +3444,7 @@ async def system_status(): """系统状态信息""" status = { "version": "0.7.0", - "phase": "Phase 7 - Workflow Automation", + "phase": "Phase 7 - Plugin & Integration", "features": { "database": DB_AVAILABLE, "oss": OSS_AVAILABLE, @@ -3401,6 +3455,9 @@ async def system_status(): "api_keys": API_KEY_AVAILABLE, "rate_limiting": RATE_LIMITER_AVAILABLE, "workflow": WORKFLOW_AVAILABLE, + "multimodal": MULTIMODAL_AVAILABLE, + "multimodal_linker": MULTIMODAL_LINKER_AVAILABLE, + "plugin_manager": PLUGIN_MANAGER_AVAILABLE, }, "api": { "documentation": "/docs", @@ -3876,6 +3933,788 @@ async def test_webhook_endpoint(webhook_id: str, _=Depends(verify_api_key)): raise HTTPException(status_code=400, detail="Webhook test failed") +# ==================== Phase 7: Multimodal Support Endpoints ==================== + +# Pydantic Models for Multimodal API +class VideoUploadResponse(BaseModel): + video_id: str + project_id: str + filename: str + status: str + audio_extracted: bool + frame_count: int + ocr_text_preview: str + message: str + + +class ImageUploadResponse(BaseModel): + image_id: str + project_id: str + filename: str + image_type: str + ocr_text_preview: str + description: str + entity_count: int + status: str + + +class MultimodalEntityLinkResponse(BaseModel): + link_id: str + source_entity_id: str + target_entity_id: str + source_modality: str + target_modality: str + link_type: str + confidence: float + evidence: str + + +class MultimodalAlignmentRequest(BaseModel): + project_id: str + threshold: float = 0.85 + + +class MultimodalAlignmentResponse(BaseModel): + project_id: str + aligned_count: int + links: List[MultimodalEntityLinkResponse] + message: str + + +class MultimodalStatsResponse(BaseModel): + project_id: str + 
video_count: int + image_count: int + multimodal_entity_count: int + cross_modal_links: int + modality_distribution: Dict[str, int] + + +@app.post("/api/v1/projects/{project_id}/upload-video", response_model=VideoUploadResponse, tags=["Multimodal"]) +async def upload_video_endpoint( + project_id: str, + file: UploadFile = File(...), + extract_interval: int = Form(5), + _=Depends(verify_api_key) +): + """ + 上传视频文件进行处理 + + - 提取音频轨道 + - 提取关键帧(每 N 秒一帧) + - 对关键帧进行 OCR 识别 + - 将视频、音频、OCR 结果整合 + + **参数:** + - **extract_interval**: 关键帧提取间隔(秒),默认 5 秒 + """ + if not MULTIMODAL_AVAILABLE: + raise HTTPException(status_code=503, detail="Multimodal processing not available") + + if not DB_AVAILABLE: + raise HTTPException(status_code=500, detail="Database not available") + + db = get_db_manager() + project = db.get_project(project_id) + if not project: + raise HTTPException(status_code=404, detail="Project not found") + + # 读取视频文件 + video_data = await file.read() + + # 创建视频处理器 + processor = get_multimodal_processor(frame_interval=extract_interval) + + # 处理视频 + video_id = str(uuid.uuid4())[:8] + result = processor.process_video(video_data, file.filename, project_id, video_id) + + if not result.success: + raise HTTPException(status_code=500, detail=f"Video processing failed: {result.error_message}") + + # 保存视频信息到数据库 + conn = db.get_conn() + now = datetime.now().isoformat() + + # 获取视频信息 + video_info = processor.extract_video_info(os.path.join(processor.video_dir, f"{video_id}_{file.filename}")) + + conn.execute( + """INSERT INTO videos + (id, project_id, filename, duration, fps, resolution, + audio_transcript_id, full_ocr_text, extracted_entities, + extracted_relations, status, created_at, updated_at) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (video_id, project_id, file.filename, video_info.get('duration', 0), + video_info.get('fps', 0), + json.dumps({'width': video_info.get('width', 0), 'height': video_info.get('height', 0)}), + None, result.full_text, '[]', '[]', 
'completed', now, now) + ) + + # 保存关键帧信息 + for frame in result.frames: + conn.execute( + """INSERT INTO video_frames + (id, video_id, frame_number, timestamp, image_url, ocr_text, extracted_entities, created_at) + VALUES (?, ?, ?, ?, ?, ?, ?, ?)""", + (frame.id, frame.video_id, frame.frame_number, frame.timestamp, + frame.frame_path, frame.ocr_text, json.dumps(frame.entities_detected), now) + ) + + conn.commit() + conn.close() + + # 提取实体和关系(复用现有的 LLM 提取逻辑) + if result.full_text: + raw_entities, raw_relations = extract_entities_with_llm(result.full_text) + + # 实体对齐并保存 + entity_name_to_id = {} + for raw_ent in raw_entities: + existing = align_entity(project_id, raw_ent["name"], db, raw_ent.get("definition", "")) + + if existing: + entity_name_to_id[raw_ent["name"]] = existing.id + else: + new_ent = db.create_entity(Entity( + id=str(uuid.uuid4())[:8], + project_id=project_id, + name=raw_ent["name"], + type=raw_ent.get("type", "OTHER"), + definition=raw_ent.get("definition", "") + )) + entity_name_to_id[raw_ent["name"]] = new_ent.id + + # 保存多模态实体提及 + conn = db.get_conn() + conn.execute( + """INSERT OR REPLACE INTO multimodal_mentions + (id, project_id, entity_id, modality, source_id, source_type, text_snippet, confidence, created_at) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (str(uuid.uuid4())[:8], project_id, entity_name_to_id[raw_ent["name"]], + 'video', video_id, 'video_frame', raw_ent.get("name", ""), 1.0, now) + ) + conn.commit() + conn.close() + + # 保存关系 + for rel in raw_relations: + source_id = entity_name_to_id.get(rel.get("source", "")) + target_id = entity_name_to_id.get(rel.get("target", "")) + if source_id and target_id: + db.create_relation( + project_id=project_id, + source_entity_id=source_id, + target_entity_id=target_id, + relation_type=rel.get("type", "related"), + evidence=result.full_text[:200] + ) + + # 更新视频的实体和关系信息 + conn = db.get_conn() + conn.execute( + "UPDATE videos SET extracted_entities = ?, extracted_relations = ? 
WHERE id = ?", + (json.dumps(raw_entities), json.dumps(raw_relations), video_id) + ) + conn.commit() + conn.close() + + return VideoUploadResponse( + video_id=video_id, + project_id=project_id, + filename=file.filename, + status="completed", + audio_extracted=bool(result.audio_path), + frame_count=len(result.frames), + ocr_text_preview=result.full_text[:200] + "..." if len(result.full_text) > 200 else result.full_text, + message="Video processed successfully" + ) + + +@app.post("/api/v1/projects/{project_id}/upload-image", response_model=ImageUploadResponse, tags=["Multimodal"]) +async def upload_image_endpoint( + project_id: str, + file: UploadFile = File(...), + detect_type: bool = Form(True), + _=Depends(verify_api_key) +): + """ + 上传图片文件进行处理 + + - 图片内容识别(白板、PPT、手写笔记) + - 使用 OCR 识别图片中的文字 + - 提取图片中的实体和关系 + + **参数:** + - **detect_type**: 是否自动检测图片类型,默认 True + """ + if not IMAGE_PROCESSOR_AVAILABLE: + raise HTTPException(status_code=503, detail="Image processing not available") + + if not DB_AVAILABLE: + raise HTTPException(status_code=500, detail="Database not available") + + db = get_db_manager() + project = db.get_project(project_id) + if not project: + raise HTTPException(status_code=404, detail="Project not found") + + # 读取图片文件 + image_data = await file.read() + + # 创建图片处理器 + processor = get_image_processor() + + # 处理图片 + image_id = str(uuid.uuid4())[:8] + result = processor.process_image(image_data, file.filename, image_id, detect_type) + + if not result.success: + raise HTTPException(status_code=500, detail=f"Image processing failed: {result.error_message}") + + # 保存图片信息到数据库 + conn = db.get_conn() + now = datetime.now().isoformat() + + conn.execute( + """INSERT INTO images + (id, project_id, filename, ocr_text, description, + extracted_entities, extracted_relations, status, created_at, updated_at) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (image_id, project_id, file.filename, result.ocr_text, result.description, + json.dumps([{"name": e.name, "type": 
e.type, "confidence": e.confidence} for e in result.entities]), + json.dumps([{"source": r.source, "target": r.target, "type": r.relation_type} for r in result.relations]), + 'completed', now, now) + ) + conn.commit() + conn.close() + + # 保存提取的实体 + for entity in result.entities: + existing = align_entity(project_id, entity.name, db, "") + + if not existing: + new_ent = db.create_entity(Entity( + id=str(uuid.uuid4())[:8], + project_id=project_id, + name=entity.name, + type=entity.type, + definition="" + )) + entity_id = new_ent.id + else: + entity_id = existing.id + + # 保存多模态实体提及 + conn = db.get_conn() + conn.execute( + """INSERT OR REPLACE INTO multimodal_mentions + (id, project_id, entity_id, modality, source_id, source_type, text_snippet, confidence, created_at) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (str(uuid.uuid4())[:8], project_id, entity_id, + 'image', image_id, result.image_type, entity.name, entity.confidence, now) + ) + conn.commit() + conn.close() + + # 保存提取的关系 + for relation in result.relations: + source_entity = db.get_entity_by_name(project_id, relation.source) + target_entity = db.get_entity_by_name(project_id, relation.target) + + if source_entity and target_entity: + db.create_relation( + project_id=project_id, + source_entity_id=source_entity.id, + target_entity_id=target_entity.id, + relation_type=relation.relation_type, + evidence=result.ocr_text[:200] + ) + + return ImageUploadResponse( + image_id=image_id, + project_id=project_id, + filename=file.filename, + image_type=result.image_type, + ocr_text_preview=result.ocr_text[:200] + "..." 
if len(result.ocr_text) > 200 else result.ocr_text, + description=result.description, + entity_count=len(result.entities), + status="completed" + ) + + +@app.post("/api/v1/projects/{project_id}/upload-images-batch", tags=["Multimodal"]) +async def upload_images_batch_endpoint( + project_id: str, + files: List[UploadFile] = File(...), + _=Depends(verify_api_key) +): + """ + 批量上传图片文件进行处理 + + 支持一次上传多张图片,每张图片都会进行 OCR 和实体提取 + """ + if not IMAGE_PROCESSOR_AVAILABLE: + raise HTTPException(status_code=503, detail="Image processing not available") + + if not DB_AVAILABLE: + raise HTTPException(status_code=500, detail="Database not available") + + db = get_db_manager() + project = db.get_project(project_id) + if not project: + raise HTTPException(status_code=404, detail="Project not found") + + # 读取所有图片 + images_data = [] + for file in files: + image_data = await file.read() + images_data.append((image_data, file.filename)) + + # 批量处理 + processor = get_image_processor() + batch_result = processor.process_batch(images_data, project_id) + + # 保存结果 + results = [] + for result in batch_result.results: + if result.success: + image_id = result.image_id + + # 保存到数据库 + conn = db.get_conn() + now = datetime.now().isoformat() + + conn.execute( + """INSERT INTO images + (id, project_id, filename, ocr_text, description, + extracted_entities, extracted_relations, status, created_at, updated_at) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (image_id, project_id, "batch_image", result.ocr_text, result.description, + json.dumps([{"name": e.name, "type": e.type} for e in result.entities]), + json.dumps([{"source": r.source, "target": r.target} for r in result.relations]), + 'completed', now, now) + ) + conn.commit() + conn.close() + + results.append({ + "image_id": image_id, + "status": "success", + "image_type": result.image_type, + "entity_count": len(result.entities) + }) + else: + results.append({ + "image_id": result.image_id, + "status": "failed", + "error": result.error_message + }) 
+
+    return {
+        "project_id": project_id,
+        "total_count": batch_result.total_count,
+        "success_count": batch_result.success_count,
+        "failed_count": batch_result.failed_count,
+        "results": results
+    }
+
+
+@app.post("/api/v1/projects/{project_id}/multimodal/align", response_model=MultimodalAlignmentResponse, tags=["Multimodal"])
+async def align_multimodal_entities_endpoint(
+    project_id: str,
+    threshold: float = 0.85,
+    _=Depends(verify_api_key)
+):
+    """
+    Cross-modal entity alignment.
+
+    Align mentions of the same entity across modalities (audio, video, image, document).
+
+    **Parameters:**
+    - **threshold**: Similarity threshold, default 0.85
+    """
+    if not MULTIMODAL_LINKER_AVAILABLE:
+        raise HTTPException(status_code=503, detail="Multimodal entity linker not available")
+
+    if not DB_AVAILABLE:
+        raise HTTPException(status_code=500, detail="Database not available")
+
+    db = get_db_manager()
+    project = db.get_project(project_id)
+    if not project:
+        raise HTTPException(status_code=404, detail="Project not found")
+
+    # Fetch all entities
+    entities = db.list_project_entities(project_id)
+
+    # Fetch multimodal mentions
+    conn = db.get_conn()
+    mentions = conn.execute(
+        """SELECT * FROM multimodal_mentions WHERE project_id = ?""",
+        (project_id,)
+    ).fetchall()
+    conn.close()
+
+    # Group entities by modality
+    modality_entities = {"audio": [], "video": [], "image": [], "document": []}
+
+    for mention in mentions:
+        modality = mention['modality']
+        entity = db.get_entity(mention['entity_id'])
+        if entity and entity.id not in [e.get('id') for e in modality_entities[modality]]:
+            modality_entities[modality].append({
+                'id': entity.id,
+                'name': entity.name,
+                'type': entity.type,
+                'definition': entity.definition,
+                'aliases': entity.aliases
+            })
+
+    # Cross-modal alignment
+    linker = get_multimodal_entity_linker(similarity_threshold=threshold)
+    links = linker.align_cross_modal_entities(
+        project_id=project_id,
+        audio_entities=modality_entities['audio'],
+        video_entities=modality_entities['video'],
+        image_entities=modality_entities['image'],
+        document_entities=modality_entities['document']
+    )
+
+    # Persist the links
+    conn = db.get_conn()
+    now = datetime.now().isoformat()
+
+    saved_links = []
+    for link in links:
+        conn.execute(
+            """INSERT OR REPLACE INTO multimodal_entity_links
+               (id, entity_id, linked_entity_id, link_type, confidence, evidence, modalities, created_at)
+               VALUES (?, ?, ?, ?, ?, ?, ?, ?)""",
+            (link.id, link.source_entity_id, link.target_entity_id, link.link_type,
+             link.confidence, link.evidence,
+             json.dumps([link.source_modality, link.target_modality]), now)
+        )
+        saved_links.append(MultimodalEntityLinkResponse(
+            link_id=link.id,
+            source_entity_id=link.source_entity_id,
+            target_entity_id=link.target_entity_id,
+            source_modality=link.source_modality,
+            target_modality=link.target_modality,
+            link_type=link.link_type,
+            confidence=link.confidence,
+            evidence=link.evidence
+        ))
+
+    conn.commit()
+    conn.close()
+
+    return MultimodalAlignmentResponse(
+        project_id=project_id,
+        aligned_count=len(saved_links),
+        links=saved_links,
+        message=f"Successfully aligned {len(saved_links)} cross-modal entity pairs"
+    )
+
+
+@app.get("/api/v1/projects/{project_id}/multimodal/stats", response_model=MultimodalStatsResponse, tags=["Multimodal"])
+async def get_multimodal_stats_endpoint(project_id: str, _=Depends(verify_api_key)):
+    """
+    Get multimodal statistics for a project.
+
+    Returns the project's video and image counts plus cross-modal entity link statistics.
+    """
+    if not DB_AVAILABLE:
+        raise HTTPException(status_code=500, detail="Database not available")
+
+    db = get_db_manager()
+    project = db.get_project(project_id)
+    if not project:
+        raise HTTPException(status_code=404, detail="Project not found")
+
+    conn = db.get_conn()
+
+    # Count videos
+    video_count = conn.execute(
+        "SELECT COUNT(*) as count FROM videos WHERE project_id = ?",
+        (project_id,)
+    ).fetchone()['count']
+
+    # Count images
+    image_count = conn.execute(
+        "SELECT COUNT(*) as count FROM images WHERE project_id = ?",
+        (project_id,)
+    ).fetchone()['count']
+
+    # Count multimodal entity mentions
+    multimodal_count = conn.execute(
+        "SELECT COUNT(DISTINCT entity_id) as count FROM multimodal_mentions WHERE project_id = ?",
+        (project_id,)
+    ).fetchone()['count']
+
+    # Count cross-modal links
+    cross_modal_count = conn.execute(
+        "SELECT COUNT(*) as count FROM multimodal_entity_links WHERE entity_id IN (SELECT id FROM entities WHERE project_id = ?)",
+        (project_id,)
+    ).fetchone()['count']
+
+    # Modality distribution
+    modality_dist = {}
+    for modality in ['audio', 'video', 'image', 'document']:
+        count = conn.execute(
+            "SELECT COUNT(*) as count FROM multimodal_mentions WHERE project_id = ? AND modality = ?",
+            (project_id, modality)
+        ).fetchone()['count']
+        modality_dist[modality] = count
+
+    conn.close()
+
+    return MultimodalStatsResponse(
+        project_id=project_id,
+        video_count=video_count,
+        image_count=image_count,
+        multimodal_entity_count=multimodal_count,
+        cross_modal_links=cross_modal_count,
+        modality_distribution=modality_dist
+    )
+
+
+@app.get("/api/v1/projects/{project_id}/videos", tags=["Multimodal"])
+async def list_project_videos_endpoint(project_id: str, _=Depends(verify_api_key)):
+    """List a project's videos."""
+    if not DB_AVAILABLE:
+        raise HTTPException(status_code=500, detail="Database not available")
+
+    db = get_db_manager()
+    conn = db.get_conn()
+
+    videos = conn.execute(
+        """SELECT id, filename, duration, fps, resolution,
+                  full_ocr_text, status, created_at
+           FROM videos WHERE project_id = ? ORDER BY created_at DESC""",
+        (project_id,)
+    ).fetchall()
+
+    conn.close()
+
+    return [{
+        "id": v['id'],
+        "filename": v['filename'],
+        "duration": v['duration'],
+        "fps": v['fps'],
+        "resolution": json.loads(v['resolution']) if v['resolution'] else None,
+        "ocr_preview": v['full_ocr_text'][:200] + "..." if v['full_ocr_text'] and len(v['full_ocr_text']) > 200 else v['full_ocr_text'],
+        "status": v['status'],
+        "created_at": v['created_at']
+    } for v in videos]
+
+
+@app.get("/api/v1/projects/{project_id}/images", tags=["Multimodal"])
+async def list_project_images_endpoint(project_id: str, _=Depends(verify_api_key)):
+    """List a project's images."""
+    if not DB_AVAILABLE:
+        raise HTTPException(status_code=500, detail="Database not available")
+
+    db = get_db_manager()
+    conn = db.get_conn()
+
+    images = conn.execute(
+        """SELECT id, filename, ocr_text, description,
+                  extracted_entities, status, created_at
+           FROM images WHERE project_id = ? ORDER BY created_at DESC""",
+        (project_id,)
+    ).fetchall()
+
+    conn.close()
+
+    return [{
+        "id": img['id'],
+        "filename": img['filename'],
+        "ocr_preview": img['ocr_text'][:200] + "..." if img['ocr_text'] and len(img['ocr_text']) > 200 else img['ocr_text'],
+        "description": img['description'],
+        "entity_count": len(json.loads(img['extracted_entities'])) if img['extracted_entities'] else 0,
+        "status": img['status'],
+        "created_at": img['created_at']
+    } for img in images]
+
+
+@app.get("/api/v1/videos/{video_id}/frames", tags=["Multimodal"])
+async def get_video_frames_endpoint(video_id: str, _=Depends(verify_api_key)):
+    """List a video's keyframes."""
+    if not DB_AVAILABLE:
+        raise HTTPException(status_code=500, detail="Database not available")
+
+    db = get_db_manager()
+    conn = db.get_conn()
+
+    frames = conn.execute(
+        """SELECT id, frame_number, timestamp, image_url, ocr_text, extracted_entities
+           FROM video_frames WHERE video_id = ? ORDER BY timestamp""",
+        (video_id,)
+    ).fetchall()
+
+    conn.close()
+
+    return [{
+        "id": f['id'],
+        "frame_number": f['frame_number'],
+        "timestamp": f['timestamp'],
+        "image_url": f['image_url'],
+        "ocr_text": f['ocr_text'],
+        "entities": json.loads(f['extracted_entities']) if f['extracted_entities'] else []
+    } for f in frames]
+
+
+@app.get("/api/v1/entities/{entity_id}/multimodal-mentions", tags=["Multimodal"])
+async def get_entity_multimodal_mentions_endpoint(entity_id: str, _=Depends(verify_api_key)):
+    """Get an entity's multimodal mentions."""
+    if not DB_AVAILABLE:
+        raise HTTPException(status_code=500, detail="Database not available")
+
+    db = get_db_manager()
+    conn = db.get_conn()
+
+    mentions = conn.execute(
+        """SELECT m.*, e.name as entity_name
+           FROM multimodal_mentions m
+           JOIN entities e ON m.entity_id = e.id
+           WHERE m.entity_id = ? ORDER BY m.created_at DESC""",
+        (entity_id,)
+    ).fetchall()
+
+    conn.close()
+
+    return [{
+        "id": m['id'],
+        "entity_id": m['entity_id'],
+        "entity_name": m['entity_name'],
+        "modality": m['modality'],
+        "source_id": m['source_id'],
+        "source_type": m['source_type'],
+        "text_snippet": m['text_snippet'],
+        "confidence": m['confidence'],
+        "created_at": m['created_at']
+    } for m in mentions]
+
+
+@app.get("/api/v1/projects/{project_id}/multimodal/suggest-merges", tags=["Multimodal"])
+async def suggest_multimodal_merges_endpoint(project_id: str, _=Depends(verify_api_key)):
+    """
+    Suggest multimodal entity merges.
+
+    Analyze entities across modalities and suggest pairs that could be merged.
+    """
+    if not MULTIMODAL_LINKER_AVAILABLE:
+        raise HTTPException(status_code=503, detail="Multimodal entity linker not available")
+
+    if not DB_AVAILABLE:
+        raise HTTPException(status_code=500, detail="Database not available")
+
+    db = get_db_manager()
+    project = db.get_project(project_id)
+    if not project:
+        raise HTTPException(status_code=404, detail="Project not found")
+
+    # Fetch all entities
+    entities = db.list_project_entities(project_id)
+    entity_dicts = [{
+        'id': e.id,
+        'name': e.name,
+        'type': e.type,
+        'definition': e.definition,
+        'aliases': e.aliases
+    } for e in entities]
+
+    # Fetch existing links
+    conn = db.get_conn()
+    existing_links = conn.execute(
+        """SELECT * FROM multimodal_entity_links
+           WHERE entity_id IN (SELECT id FROM entities WHERE project_id = ?)""",
+        (project_id,)
+    ).fetchall()
+    conn.close()
+
+    existing_link_objects = []
+    for row in existing_links:
+        existing_link_objects.append(EntityLink(
+            id=row['id'],
+            project_id=project_id,
+            source_entity_id=row['entity_id'],
+            target_entity_id=row['linked_entity_id'],
+            link_type=row['link_type'],
+            source_modality='unknown',
+            target_modality='unknown',
+            confidence=row['confidence'],
+            evidence=row['evidence'] or ""
+        ))
+
+    # Get suggestions
+    linker = get_multimodal_entity_linker()
+    suggestions = linker.suggest_entity_merges(entity_dicts, existing_link_objects)
+
+    return {
+        "project_id": project_id,
+        "suggestion_count": len(suggestions),
+        "suggestions": [
+            {
+                "entity1": {
+                    "id": s['entity1'].get('id'),
+                    "name": s['entity1'].get('name'),
+                    "type": s['entity1'].get('type')
+                },
+                "entity2": {
+                    "id": s['entity2'].get('id'),
+                    "name": s['entity2'].get('name'),
+                    "type": s['entity2'].get('type')
+                },
+                "similarity": s['similarity'],
+                "match_type": s['match_type'],
+                "suggested_action": s['suggested_action']
+            }
+            for s in suggestions[:20]  # Return at most 20 suggestions
+        ]
+    }
+
+
+# ==================== Phase 7: Multimodal Support API ====================
+
+class VideoUploadResponse(BaseModel):
+    video_id: str
+    filename: str
+    duration: float
+    fps: float
+    resolution: Dict[str, int]
+    frames_extracted: int
+    audio_extracted: bool
+    ocr_text_length: int
+    status: str
+    message: str
+
+
+class ImageUploadResponse(BaseModel):
+    image_id: str
+    filename: str
+    ocr_text_length: int
+    description: str
+    status: str
+    message: str
+
+
+class MultimodalEntityLinkResponse(BaseModel):
+    link_id: str
+    entity_id: str
+    linked_entity_id: str
+    link_type: str
+    confidence: float
+    evidence: str
+    modalities: List[str]
+
+
+class MultimodalProfileResponse(BaseModel):
+    entity_id: str
+    entity_name: str
+
+
 @app.get("/api/v1/openapi.json", include_in_schema=False)
 async def get_openapi():
     """获取 OpenAPI 规范"""
@@ -3889,6 +4728,658 @@ async def get_openapi():
     )
+
+# ==================== Phase 7 Task 7: Plugin & Integration API ====================
+
+class PluginCreateRequest(BaseModel):
+    name: str
+    plugin_type: str
+    project_id: Optional[str] = None
+    config: Optional[Dict] = {}
+
+
+class PluginResponse(BaseModel):
+    id: str
+    name: str
+    plugin_type: str
+    project_id: Optional[str]
+    status: str
+    api_key: str
+    created_at: str
+
+
+class BotSessionResponse(BaseModel):
+    id: str
+    plugin_id: str
+    platform: str
+    session_id: str
+    user_id: Optional[str]
+    user_name: Optional[str]
+    project_id: Optional[str]
+    message_count: int
+    created_at: str
+    last_message_at: Optional[str]
+
+
+class WebhookEndpointResponse(BaseModel):
+    id: str
+    plugin_id: str
+    name: str
+    endpoint_path: str
+    endpoint_type: str
+    target_project_id: Optional[str]
+    is_active: bool
+    trigger_count: int
+    created_at: str
+
+
+class WebDAVSyncResponse(BaseModel):
+    id: str
+    plugin_id: str
+    name: str
+    server_url: str
+    username: str
+    remote_path: str
+    local_path: str
+    sync_direction: str
+    sync_mode: str
+    auto_analyze: bool
+    is_active: bool
+    last_sync_at: Optional[str]
+    created_at: str
+
+
+class ChromeClipRequest(BaseModel):
+    url: str
+    title: str
+    content: str
+    content_type: str = "page"
+    meta: Optional[Dict] = {}
+    project_id: Optional[str] = None
+
+
+class ChromeClipResponse(BaseModel):
+    clip_id: str
+    project_id: str
+    url: str
+    title: str
+    status: str
+    message: str
+
+
+class BotMessageRequest(BaseModel):
+    platform: str
+    session_id: str
+    user_id: Optional[str] = None
+    user_name: Optional[str] = None
+    message_type: str
+    content: str
+    project_id: Optional[str] = None
+
+
+class BotMessageResponse(BaseModel):
+    success: bool
+    reply: Optional[str] = None
+    session_id: str
+    action: Optional[str] = None
+
+
+class WebhookPayload(BaseModel):
+    event: str
+    data: Dict
+
+
+@app.post("/api/v1/plugins", response_model=PluginResponse, tags=["Plugins"])
+async def create_plugin(
+    request: PluginCreateRequest,
+    api_key: str = Depends(verify_api_key)
+):
+    """Create a plugin."""
+    if not PLUGIN_MANAGER_AVAILABLE:
+        raise HTTPException(status_code=503, detail="Plugin manager not available")
+
+    manager = get_plugin_manager()
+    plugin = manager.create_plugin(
+        name=request.name,
+        plugin_type=request.plugin_type,
+        project_id=request.project_id,
+        config=request.config
+    )
+
+    return PluginResponse(
+        id=plugin.id,
+        name=plugin.name,
+        plugin_type=plugin.plugin_type,
+        project_id=plugin.project_id,
+        status=plugin.status,
+        api_key=plugin.api_key,
+        created_at=plugin.created_at
+    )
+
+
+@app.get("/api/v1/plugins", tags=["Plugins"])
+async def list_plugins(
+    project_id: Optional[str] = None,
+    plugin_type: Optional[str] = None,
+    api_key: str = Depends(verify_api_key)
+):
+    """List plugins."""
+    if not PLUGIN_MANAGER_AVAILABLE:
+        raise HTTPException(status_code=503, detail="Plugin manager not available")
+
+    manager = get_plugin_manager()
+    plugins = manager.list_plugins(project_id=project_id, plugin_type=plugin_type)
+
+    return {
+        "plugins": [
+            {
+                "id": p.id,
+                "name": p.name,
+                "plugin_type": p.plugin_type,
+                "project_id": p.project_id,
+                "status": p.status,
+                "use_count": p.use_count,
+                "created_at": p.created_at
+            }
+            for p in plugins
+        ]
+    }
+
+
+@app.get("/api/v1/plugins/{plugin_id}", response_model=PluginResponse, tags=["Plugins"])
+async def get_plugin(
+    plugin_id: str,
+    api_key: str = Depends(verify_api_key)
+):
+    """Get plugin details."""
+    if not PLUGIN_MANAGER_AVAILABLE:
+        raise HTTPException(status_code=503, detail="Plugin manager not available")
+
+    manager = get_plugin_manager()
+    plugin = manager.get_plugin(plugin_id)
+
+    if not plugin:
+        raise HTTPException(status_code=404, detail="Plugin not found")
+
+    return PluginResponse(
+        id=plugin.id,
+        name=plugin.name,
+        plugin_type=plugin.plugin_type,
+        project_id=plugin.project_id,
+        status=plugin.status,
+        api_key=plugin.api_key,
+        created_at=plugin.created_at
+    )
+
+
+@app.delete("/api/v1/plugins/{plugin_id}", tags=["Plugins"])
+async def delete_plugin(
+    plugin_id: str,
+    api_key: str = Depends(verify_api_key)
+):
+    """Delete a plugin."""
+    if not PLUGIN_MANAGER_AVAILABLE:
+        raise HTTPException(status_code=503, detail="Plugin manager not available")
+
+    manager = get_plugin_manager()
+    manager.delete_plugin(plugin_id)
+
+    return {"success": True, "message": "Plugin deleted"}
+
+
+@app.post("/api/v1/plugins/{plugin_id}/regenerate-key", tags=["Plugins"])
+async def regenerate_plugin_key(
+    plugin_id: str,
+    api_key: str = Depends(verify_api_key)
+):
+    """Regenerate a plugin's API key."""
+    if not PLUGIN_MANAGER_AVAILABLE:
+        raise HTTPException(status_code=503, detail="Plugin manager not available")
+
+    manager = get_plugin_manager()
+    new_key = manager.regenerate_api_key(plugin_id)
+
+    return {"success": True, "api_key": new_key}
+
+
+# ==================== Chrome Extension API ====================
+
+@app.post("/api/v1/plugins/chrome/clip", response_model=ChromeClipResponse, tags=["Chrome Extension"])
+async def chrome_clip(
+    request: ChromeClipRequest,
+    x_api_key: Optional[str] = Header(None, alias="X-API-Key")
+):
+    """Save web page content from the Chrome extension."""
+    if not PLUGIN_MANAGER_AVAILABLE:
+        raise HTTPException(status_code=503, detail="Plugin manager not available")
+
+    if not x_api_key:
+        raise HTTPException(status_code=401, detail="API Key required")
+
+    manager = get_plugin_manager()
+    plugin = manager.get_plugin_by_api_key(x_api_key)
+
+    if not plugin or plugin.plugin_type != "chrome_extension":
+        raise HTTPException(status_code=401, detail="Invalid API Key")
+
+    # Determine the target project
+    project_id = request.project_id or plugin.project_id
+    if not project_id:
+        raise HTTPException(status_code=400, detail="Project ID required")
+
+    # Create a transcript record (treat the page content as a document)
+    db = get_db_manager()
+
+    # Build the document content
+    doc_content = f"""# {request.title}
+
+URL: {request.url}
+
+## 内容
+
+{request.content}
+
+## 元数据
+
+{json.dumps(request.meta, ensure_ascii=False, indent=2)}
+"""
+
+    # Create the transcript record
+    transcript_id = db.create_transcript(
+        project_id=project_id,
+        filename=f"clip_{request.title[:50]}.md",
+        full_text=doc_content,
+        transcript_type="document"
+    )
+
+    # Log the activity
+    manager.log_activity(
+        plugin_id=plugin.id,
+        activity_type="clip",
+        source="chrome_extension",
+        details={
+            "url": request.url,
+            "title": request.title,
+            "project_id": project_id,
+            "transcript_id": transcript_id
+        }
+    )
+
+    return ChromeClipResponse(
+        clip_id=str(uuid.uuid4()),
+        project_id=project_id,
+        url=request.url,
+        title=request.title,
+        status="success",
+        message="Content saved successfully"
+    )
+
+
+# ==================== Bot API ====================
+
+@app.post("/api/v1/bots/webhook/{platform}", response_model=BotMessageResponse, tags=["Bot"])
+async def bot_webhook(
+    platform: str,
+    request: Request,
+    x_signature: Optional[str] = Header(None, alias="X-Signature")
+):
+    """Receive bot webhook messages (Feishu/DingTalk/Slack)."""
+    if not PLUGIN_MANAGER_AVAILABLE:
+        raise HTTPException(status_code=503, detail="Plugin manager not available")
+
+    body = await request.body()
+    payload = json.loads(body)
+
+    manager = get_plugin_manager()
+    handler = BotHandler(manager)
+
+    # Parse the message
+    if platform == "feishu":
+        message = handler.parse_feishu_message(payload)
+    elif platform == "dingtalk":
+        message = handler.parse_dingtalk_message(payload)
+    elif platform == "slack":
+        message = handler.parse_slack_message(payload)
+    else:
+        raise HTTPException(status_code=400, detail=f"Unsupported platform: {platform}")
+
+    # Look up or create the session.
+    # Simplified for now; a real implementation should look the session up by plugin_id.
+    # For the moment, return a simple reply.
+
+    return BotMessageResponse(
+        success=True,
+        reply="收到消息!请使用 InsightFlow 控制台查看更多功能。",
+        session_id=message.get("session_id", ""),
+        action="reply"
+    )
+
+
+@app.get("/api/v1/bots/sessions", response_model=List[BotSessionResponse], tags=["Bot"])
+async def list_bot_sessions(
+ plugin_id: Optional[str] = None, + project_id: Optional[str] = None, + api_key: str = Depends(verify_api_key) +): + """列出机器人会话""" + if not PLUGIN_MANAGER_AVAILABLE: + raise HTTPException(status_code=503, detail="Plugin manager not available") + + manager = get_plugin_manager() + sessions = manager.list_bot_sessions(plugin_id=plugin_id, project_id=project_id) + + return [ + BotSessionResponse( + id=s.id, + plugin_id=s.plugin_id, + platform=s.platform, + session_id=s.session_id, + user_id=s.user_id, + user_name=s.user_name, + project_id=s.project_id, + message_count=s.message_count, + created_at=s.created_at, + last_message_at=s.last_message_at + ) + for s in sessions + ] + + +# ==================== Webhook Integration API ==================== + +@app.post("/api/v1/webhook-endpoints", response_model=WebhookEndpointResponse, tags=["Integrations"]) +async def create_webhook_endpoint( + plugin_id: str, + name: str, + endpoint_type: str, + target_project_id: Optional[str] = None, + allowed_events: Optional[List[str]] = None, + api_key: str = Depends(verify_api_key) +): + """创建 Webhook 端点(用于 Zapier/Make 集成)""" + if not PLUGIN_MANAGER_AVAILABLE: + raise HTTPException(status_code=503, detail="Plugin manager not available") + + manager = get_plugin_manager() + endpoint = manager.create_webhook_endpoint( + plugin_id=plugin_id, + name=name, + endpoint_type=endpoint_type, + target_project_id=target_project_id, + allowed_events=allowed_events + ) + + return WebhookEndpointResponse( + id=endpoint.id, + plugin_id=endpoint.plugin_id, + name=endpoint.name, + endpoint_path=endpoint.endpoint_path, + endpoint_type=endpoint.endpoint_type, + target_project_id=endpoint.target_project_id, + is_active=endpoint.is_active, + trigger_count=endpoint.trigger_count, + created_at=endpoint.created_at + ) + + +@app.get("/api/v1/webhook-endpoints", response_model=List[WebhookEndpointResponse], tags=["Integrations"]) +async def list_webhook_endpoints( + plugin_id: Optional[str] = None, + api_key: 
str = Depends(verify_api_key) +): + """列出 Webhook 端点""" + if not PLUGIN_MANAGER_AVAILABLE: + raise HTTPException(status_code=503, detail="Plugin manager not available") + + manager = get_plugin_manager() + endpoints = manager.list_webhook_endpoints(plugin_id=plugin_id) + + return [ + WebhookEndpointResponse( + id=e.id, + plugin_id=e.plugin_id, + name=e.name, + endpoint_path=e.endpoint_path, + endpoint_type=e.endpoint_type, + target_project_id=e.target_project_id, + is_active=e.is_active, + trigger_count=e.trigger_count, + created_at=e.created_at + ) + for e in endpoints + ] + + +@app.post("/webhook/{endpoint_type}/{token}", tags=["Integrations"]) +async def receive_webhook( + endpoint_type: str, + token: str, + request: Request, + x_signature: Optional[str] = Header(None, alias="X-Signature") +): + """接收外部 Webhook 调用(Zapier/Make/Custom)""" + if not PLUGIN_MANAGER_AVAILABLE: + raise HTTPException(status_code=503, detail="Plugin manager not available") + + manager = get_plugin_manager() + + # 构建完整路径查找端点 + path = f"/webhook/{endpoint_type}/{token}" + endpoint = manager.get_webhook_endpoint_by_path(path) + + if not endpoint or not endpoint.is_active: + raise HTTPException(status_code=404, detail="Webhook endpoint not found") + + # 验证签名:配置了 secret 的端点必须携带有效签名,缺失即拒绝 + if endpoint.secret: + if not x_signature: + raise HTTPException(status_code=401, detail="Signature required") + body = await request.body() + integration = WebhookIntegration(manager) + if not integration.validate_signature(body, x_signature, endpoint.secret): + raise HTTPException(status_code=401, detail="Invalid signature") + + # 解析请求体(Starlette 会缓存 body,再次读取安全) + body = await request.json() + + # 更新触发统计 + manager.update_webhook_trigger(endpoint.id) + + # 记录活动 + manager.log_activity( + plugin_id=endpoint.plugin_id, + activity_type="webhook", + source=endpoint_type, + details={ + "endpoint_id": endpoint.id, + "event": body.get("event"), + "data_keys": list(body.get("data", {}).keys()) + } + ) + + # 处理数据(简化版本) + # 实际应该根据 endpoint.target_project_id 和 body 内容创建文档/实体等 + + return { + "success": True, + "endpoint_id": endpoint.id, + 
"received_at": datetime.now().isoformat() + } + + +# ==================== WebDAV API ==================== + +@app.post("/api/v1/webdav-syncs", response_model=WebDAVSyncResponse, tags=["WebDAV"]) +async def create_webdav_sync( + plugin_id: str, + name: str, + server_url: str, + username: str, + password: str, + remote_path: str = "/", + local_path: str = "./sync", + sync_direction: str = "bidirectional", + sync_mode: str = "manual", + auto_analyze: bool = True, + api_key: str = Depends(verify_api_key) +): + """创建 WebDAV 同步配置""" + if not PLUGIN_MANAGER_AVAILABLE: + raise HTTPException(status_code=503, detail="Plugin manager not available") + + manager = get_plugin_manager() + sync = manager.create_webdav_sync( + plugin_id=plugin_id, + name=name, + server_url=server_url, + username=username, + password=password, + remote_path=remote_path, + local_path=local_path, + sync_direction=sync_direction, + sync_mode=sync_mode, + auto_analyze=auto_analyze + ) + + return WebDAVSyncResponse( + id=sync.id, + plugin_id=sync.plugin_id, + name=sync.name, + server_url=sync.server_url, + username=sync.username, + remote_path=sync.remote_path, + local_path=sync.local_path, + sync_direction=sync.sync_direction, + sync_mode=sync.sync_mode, + auto_analyze=sync.auto_analyze, + is_active=sync.is_active, + last_sync_at=sync.last_sync_at, + created_at=sync.created_at + ) + + +@app.get("/api/v1/webdav-syncs", response_model=List[WebDAVSyncResponse], tags=["WebDAV"]) +async def list_webdav_syncs( + plugin_id: Optional[str] = None, + api_key: str = Depends(verify_api_key) +): + """列出 WebDAV 同步配置""" + if not PLUGIN_MANAGER_AVAILABLE: + raise HTTPException(status_code=503, detail="Plugin manager not available") + + manager = get_plugin_manager() + syncs = manager.list_webdav_syncs(plugin_id=plugin_id) + + return [ + WebDAVSyncResponse( + id=s.id, + plugin_id=s.plugin_id, + name=s.name, + server_url=s.server_url, + username=s.username, + remote_path=s.remote_path, + local_path=s.local_path, + 
sync_direction=s.sync_direction, + sync_mode=s.sync_mode, + auto_analyze=s.auto_analyze, + is_active=s.is_active, + last_sync_at=s.last_sync_at, + created_at=s.created_at + ) + for s in syncs + ] + + +@app.post("/api/v1/webdav-syncs/{sync_id}/test", tags=["WebDAV"]) +async def test_webdav_connection( + sync_id: str, + api_key: str = Depends(verify_api_key) +): + """测试 WebDAV 连接""" + if not PLUGIN_MANAGER_AVAILABLE: + raise HTTPException(status_code=503, detail="Plugin manager not available") + + manager = get_plugin_manager() + sync = manager.get_webdav_sync(sync_id) + + if not sync: + raise HTTPException(status_code=404, detail="WebDAV sync not found") + + from plugin_manager import WebDAVSync as WebDAVSyncHandler + handler = WebDAVSyncHandler(manager) + + success, message = await handler.test_connection( + sync.server_url, + sync.username, + sync.password + ) + + return {"success": success, "message": message} + + +@app.post("/api/v1/webdav-syncs/{sync_id}/sync", tags=["WebDAV"]) +async def trigger_webdav_sync( + sync_id: str, + api_key: str = Depends(verify_api_key) +): + """手动触发 WebDAV 同步""" + if not PLUGIN_MANAGER_AVAILABLE: + raise HTTPException(status_code=503, detail="Plugin manager not available") + + manager = get_plugin_manager() + sync = manager.get_webdav_sync(sync_id) + + if not sync: + raise HTTPException(status_code=404, detail="WebDAV sync not found") + + # 这里应该启动异步同步任务 + # 简化版本,仅返回成功 + + manager.update_webdav_sync( + sync_id, + last_sync_at=datetime.now().isoformat(), + last_sync_status="running" + ) + + return { + "success": True, + "sync_id": sync_id, + "status": "running", + "message": "Sync started" + } + + +# ==================== Plugin Activity Logs ==================== + +@app.get("/api/v1/plugins/{plugin_id}/logs", tags=["Plugins"]) +async def get_plugin_logs( + plugin_id: str, + activity_type: Optional[str] = None, + limit: int = 100, + api_key: str = Depends(verify_api_key) +): + """获取插件活动日志""" + if not PLUGIN_MANAGER_AVAILABLE: + raise 
HTTPException(status_code=503, detail="Plugin manager not available") + + manager = get_plugin_manager() + logs = manager.get_activity_logs( + plugin_id=plugin_id, + activity_type=activity_type, + limit=limit + ) + + return { + "logs": [ + { + "id": log.id, + "activity_type": log.activity_type, + "source": log.source, + "details": log.details, + "created_at": log.created_at + } + for log in logs + ] + } + + # Serve frontend - MUST be last to not override API routes app.mount("/", StaticFiles(directory="frontend", html=True), name="frontend") diff --git a/backend/multimodal_entity_linker.py b/backend/multimodal_entity_linker.py new file mode 100644 index 0000000..2b8bc7d --- /dev/null +++ b/backend/multimodal_entity_linker.py @@ -0,0 +1,514 @@ +#!/usr/bin/env python3 +""" +InsightFlow Multimodal Entity Linker - Phase 7 +多模态实体关联模块:跨模态实体对齐和知识融合 +""" + +import os +import json +import uuid +from typing import List, Dict, Optional, Tuple, Set +from dataclasses import dataclass +from difflib import SequenceMatcher + +# 尝试导入embedding库 +try: + import numpy as np + NUMPY_AVAILABLE = True +except ImportError: + NUMPY_AVAILABLE = False + + +@dataclass +class MultimodalEntity: + """多模态实体""" + id: str + entity_id: str + project_id: str + name: str + source_type: str # audio, video, image, document + source_id: str + mention_context: str + confidence: float + modality_features: Dict = None # 模态特定特征 + + def __post_init__(self): + if self.modality_features is None: + self.modality_features = {} + + +@dataclass +class EntityLink: + """实体关联""" + id: str + project_id: str + source_entity_id: str + target_entity_id: str + link_type: str # same_as, related_to, part_of + source_modality: str + target_modality: str + confidence: float + evidence: str + + +@dataclass +class AlignmentResult: + """对齐结果""" + entity_id: str + matched_entity_id: Optional[str] + similarity: float + match_type: str # exact, fuzzy, embedding + confidence: float + + +@dataclass +class FusionResult: + """知识融合结果""" + 
canonical_entity_id: str + merged_entity_ids: List[str] + fused_properties: Dict + source_modalities: List[str] + confidence: float + + +class MultimodalEntityLinker: + """多模态实体关联器 - 跨模态实体对齐和知识融合""" + + # 关联类型 + LINK_TYPES = { + 'same_as': '同一实体', + 'related_to': '相关实体', + 'part_of': '组成部分', + 'mentions': '提及关系' + } + + # 模态类型 + MODALITIES = ['audio', 'video', 'image', 'document'] + + def __init__(self, similarity_threshold: float = 0.85): + """ + 初始化多模态实体关联器 + + Args: + similarity_threshold: 相似度阈值 + """ + self.similarity_threshold = similarity_threshold + + def calculate_string_similarity(self, s1: str, s2: str) -> float: + """ + 计算字符串相似度 + + Args: + s1: 字符串1 + s2: 字符串2 + + Returns: + 相似度分数 (0-1) + """ + if not s1 or not s2: + return 0.0 + + s1, s2 = s1.lower().strip(), s2.lower().strip() + + # 完全匹配 + if s1 == s2: + return 1.0 + + # 包含关系 + if s1 in s2 or s2 in s1: + return 0.9 + + # 编辑距离相似度 + return SequenceMatcher(None, s1, s2).ratio() + + def calculate_entity_similarity(self, entity1: Dict, entity2: Dict) -> Tuple[float, str]: + """ + 计算两个实体的综合相似度 + + Args: + entity1: 实体1信息 + entity2: 实体2信息 + + Returns: + (相似度, 匹配类型) + """ + # 名称相似度 + name_sim = self.calculate_string_similarity( + entity1.get('name', ''), + entity2.get('name', '') + ) + + # 如果名称完全匹配 + if name_sim == 1.0: + return 1.0, 'exact' + + # 检查别名 + aliases1 = set(a.lower() for a in entity1.get('aliases', [])) + aliases2 = set(a.lower() for a in entity2.get('aliases', [])) + + if aliases1 & aliases2: # 有共同别名 + return 0.95, 'alias_match' + + if entity2.get('name', '').lower() in aliases1: + return 0.95, 'alias_match' + if entity1.get('name', '').lower() in aliases2: + return 0.95, 'alias_match' + + # 定义相似度 + def_sim = self.calculate_string_similarity( + entity1.get('definition', ''), + entity2.get('definition', '') + ) + + # 综合相似度 + combined_sim = name_sim * 0.7 + def_sim * 0.3 + + if combined_sim >= self.similarity_threshold: + return combined_sim, 'fuzzy' + + return combined_sim, 'none' + + def 
find_matching_entity(self, query_entity: Dict, + candidate_entities: List[Dict], + exclude_ids: Set[str] = None) -> Optional[AlignmentResult]: + """ + 在候选实体中查找匹配的实体 + + Args: + query_entity: 查询实体 + candidate_entities: 候选实体列表 + exclude_ids: 排除的实体ID + + Returns: + 对齐结果 + """ + exclude_ids = exclude_ids or set() + best_match = None + best_similarity = 0.0 + + for candidate in candidate_entities: + if candidate.get('id') in exclude_ids: + continue + + similarity, match_type = self.calculate_entity_similarity( + query_entity, candidate + ) + + if similarity > best_similarity and similarity >= self.similarity_threshold: + best_similarity = similarity + best_match = candidate + best_match_type = match_type + + if best_match: + return AlignmentResult( + entity_id=query_entity.get('id'), + matched_entity_id=best_match.get('id'), + similarity=best_similarity, + match_type=best_match_type, + confidence=best_similarity + ) + + return None + + def align_cross_modal_entities(self, project_id: str, + audio_entities: List[Dict], + video_entities: List[Dict], + image_entities: List[Dict], + document_entities: List[Dict]) -> List[EntityLink]: + """ + 跨模态实体对齐 + + Args: + project_id: 项目ID + audio_entities: 音频模态实体 + video_entities: 视频模态实体 + image_entities: 图片模态实体 + document_entities: 文档模态实体 + + Returns: + 实体关联列表 + """ + links = [] + + # 合并所有实体 + all_entities = { + 'audio': audio_entities, + 'video': video_entities, + 'image': image_entities, + 'document': document_entities + } + + # 跨模态对齐 + for mod1 in self.MODALITIES: + for mod2 in self.MODALITIES: + if mod1 >= mod2: # 避免重复比较 + continue + + entities1 = all_entities.get(mod1, []) + entities2 = all_entities.get(mod2, []) + + for ent1 in entities1: + # 在另一个模态中查找匹配 + result = self.find_matching_entity(ent1, entities2) + + if result and result.matched_entity_id: + link = EntityLink( + id=str(uuid.uuid4())[:8], + project_id=project_id, + source_entity_id=ent1.get('id'), + target_entity_id=result.matched_entity_id, + link_type='same_as' if 
result.similarity > 0.95 else 'related_to', + source_modality=mod1, + target_modality=mod2, + confidence=result.confidence, + evidence=f"Cross-modal alignment: {result.match_type}" + ) + links.append(link) + + return links + + def fuse_entity_knowledge(self, entity_id: str, + linked_entities: List[Dict], + multimodal_mentions: List[Dict]) -> FusionResult: + """ + 融合多模态实体知识 + + Args: + entity_id: 主实体ID + linked_entities: 关联的实体信息列表 + multimodal_mentions: 多模态提及列表 + + Returns: + 融合结果 + """ + # 收集所有属性 + fused_properties = { + 'names': set(), + 'definitions': [], + 'aliases': set(), + 'types': set(), + 'modalities': set(), + 'contexts': [] + } + + merged_ids = [] + + for entity in linked_entities: + merged_ids.append(entity.get('id')) + + # 收集名称 + fused_properties['names'].add(entity.get('name', '')) + + # 收集定义 + if entity.get('definition'): + fused_properties['definitions'].append(entity.get('definition')) + + # 收集别名 + fused_properties['aliases'].update(entity.get('aliases', [])) + + # 收集类型 + fused_properties['types'].add(entity.get('type', 'OTHER')) + + # 收集模态和上下文 + for mention in multimodal_mentions: + fused_properties['modalities'].add(mention.get('source_type', '')) + if mention.get('mention_context'): + fused_properties['contexts'].append(mention.get('mention_context')) + + # 选择最佳定义(最长的那个) + best_definition = max(fused_properties['definitions'], key=len) \ + if fused_properties['definitions'] else "" + + # 选择最佳名称(最常见的那个) + from collections import Counter + name_counts = Counter(fused_properties['names']) + best_name = name_counts.most_common(1)[0][0] if name_counts else "" + + # 构建融合结果 + return FusionResult( + canonical_entity_id=entity_id, + merged_entity_ids=merged_ids, + fused_properties={ + 'name': best_name, + 'definition': best_definition, + 'aliases': list(fused_properties['aliases']), + 'types': list(fused_properties['types']), + 'modalities': list(fused_properties['modalities']), + 'contexts': fused_properties['contexts'][:10] # 最多10个上下文 + }, + 
source_modalities=list(fused_properties['modalities']), + confidence=min(1.0, len(linked_entities) * 0.2 + 0.5) + ) + + def detect_entity_conflicts(self, entities: List[Dict]) -> List[Dict]: + """ + 检测实体冲突(同名但不同义) + + Args: + entities: 实体列表 + + Returns: + 冲突列表 + """ + conflicts = [] + + # 按名称分组 + name_groups = {} + for entity in entities: + name = entity.get('name', '').lower() + if name: + if name not in name_groups: + name_groups[name] = [] + name_groups[name].append(entity) + + # 检测同名但定义不同的实体 + for name, group in name_groups.items(): + if len(group) > 1: + # 检查定义是否相似 + definitions = [e.get('definition', '') for e in group if e.get('definition')] + + if len(definitions) > 1: + # 计算定义之间的相似度 + sim_matrix = [] + for i, d1 in enumerate(definitions): + for j, d2 in enumerate(definitions): + if i < j: + sim = self.calculate_string_similarity(d1, d2) + sim_matrix.append(sim) + + # 如果定义相似度都很低,可能是冲突 + if sim_matrix and all(s < 0.5 for s in sim_matrix): + conflicts.append({ + 'name': name, + 'entities': group, + 'type': 'homonym_conflict', + 'suggestion': 'Consider disambiguating these entities' + }) + + return conflicts + + def suggest_entity_merges(self, entities: List[Dict], + existing_links: List[EntityLink] = None) -> List[Dict]: + """ + 建议实体合并 + + Args: + entities: 实体列表 + existing_links: 现有实体关联 + + Returns: + 合并建议列表 + """ + suggestions = [] + existing_pairs = set() + + # 记录已有的关联 + if existing_links: + for link in existing_links: + pair = tuple(sorted([link.source_entity_id, link.target_entity_id])) + existing_pairs.add(pair) + + # 检查所有实体对 + for i, ent1 in enumerate(entities): + for j, ent2 in enumerate(entities): + if i >= j: + continue + + # 检查是否已有关联 + pair = tuple(sorted([ent1.get('id'), ent2.get('id')])) + if pair in existing_pairs: + continue + + # 计算相似度 + similarity, match_type = self.calculate_entity_similarity(ent1, ent2) + + if similarity >= self.similarity_threshold: + suggestions.append({ + 'entity1': ent1, + 'entity2': ent2, + 'similarity': similarity, + 
'match_type': match_type, + 'suggested_action': 'merge' if similarity > 0.95 else 'link' + }) + + # 按相似度排序 + suggestions.sort(key=lambda x: x['similarity'], reverse=True) + + return suggestions + + def create_multimodal_entity_record(self, project_id: str, + entity_id: str, + source_type: str, + source_id: str, + mention_context: str = "", + confidence: float = 1.0) -> MultimodalEntity: + """ + 创建多模态实体记录 + + Args: + project_id: 项目ID + entity_id: 实体ID + source_type: 来源类型 + source_id: 来源ID + mention_context: 提及上下文 + confidence: 置信度 + + Returns: + 多模态实体记录 + """ + return MultimodalEntity( + id=str(uuid.uuid4())[:8], + entity_id=entity_id, + project_id=project_id, + name="", # 将在后续填充 + source_type=source_type, + source_id=source_id, + mention_context=mention_context, + confidence=confidence + ) + + def analyze_modality_distribution(self, multimodal_entities: List[MultimodalEntity]) -> Dict: + """ + 分析模态分布 + + Args: + multimodal_entities: 多模态实体列表 + + Returns: + 模态分布统计 + """ + distribution = {mod: 0 for mod in self.MODALITIES} + cross_modal_entities = set() + + # 统计每个模态的实体数 + for me in multimodal_entities: + if me.source_type in distribution: + distribution[me.source_type] += 1 + + # 统计跨模态实体 + entity_modalities = {} + for me in multimodal_entities: + if me.entity_id not in entity_modalities: + entity_modalities[me.entity_id] = set() + entity_modalities[me.entity_id].add(me.source_type) + + cross_modal_count = sum(1 for mods in entity_modalities.values() if len(mods) > 1) + + return { + 'modality_distribution': distribution, + 'total_multimodal_records': len(multimodal_entities), + 'unique_entities': len(entity_modalities), + 'cross_modal_entities': cross_modal_count, + 'cross_modal_ratio': cross_modal_count / len(entity_modalities) if entity_modalities else 0 + } + + +# Singleton instance +_multimodal_entity_linker = None + +def get_multimodal_entity_linker(similarity_threshold: float = 0.85) -> MultimodalEntityLinker: + """获取多模态实体关联器单例""" + global 
_multimodal_entity_linker + if _multimodal_entity_linker is None: + _multimodal_entity_linker = MultimodalEntityLinker(similarity_threshold) + return _multimodal_entity_linker diff --git a/backend/multimodal_processor.py b/backend/multimodal_processor.py new file mode 100644 index 0000000..522e0c5 --- /dev/null +++ b/backend/multimodal_processor.py @@ -0,0 +1,434 @@ +#!/usr/bin/env python3 +""" +InsightFlow Multimodal Processor - Phase 7 +视频处理模块:提取音频、关键帧、OCR识别 +""" + +import os +import json +import uuid +import tempfile +import subprocess +from typing import List, Dict, Optional, Tuple +from dataclasses import dataclass +from pathlib import Path + +# 尝试导入OCR库 +try: + import pytesseract + from PIL import Image + PYTESSERACT_AVAILABLE = True +except ImportError: + PYTESSERACT_AVAILABLE = False + +try: + import cv2 + CV2_AVAILABLE = True +except ImportError: + CV2_AVAILABLE = False + +try: + import ffmpeg + FFMPEG_AVAILABLE = True +except ImportError: + FFMPEG_AVAILABLE = False + + +@dataclass +class VideoFrame: + """视频关键帧数据类""" + id: str + video_id: str + frame_number: int + timestamp: float + frame_path: str + ocr_text: str = "" + ocr_confidence: float = 0.0 + entities_detected: List[Dict] = None + + def __post_init__(self): + if self.entities_detected is None: + self.entities_detected = [] + + +@dataclass +class VideoInfo: + """视频信息数据类""" + id: str + project_id: str + filename: str + file_path: str + duration: float = 0.0 + width: int = 0 + height: int = 0 + fps: float = 0.0 + audio_extracted: bool = False + audio_path: str = "" + transcript_id: str = "" + status: str = "pending" + error_message: str = "" + metadata: Dict = None + + def __post_init__(self): + if self.metadata is None: + self.metadata = {} + + +@dataclass +class VideoProcessingResult: + """视频处理结果""" + video_id: str + audio_path: str + frames: List[VideoFrame] + ocr_results: List[Dict] + full_text: str # 整合的文本(音频转录 + OCR文本) + success: bool + error_message: str = "" + + +class MultimodalProcessor: + 
"""多模态处理器 - 处理视频文件""" + + def __init__(self, temp_dir: str = None, frame_interval: int = 5): + """ + 初始化多模态处理器 + + Args: + temp_dir: 临时文件目录 + frame_interval: 关键帧提取间隔(秒) + """ + self.temp_dir = temp_dir or tempfile.gettempdir() + self.frame_interval = frame_interval + self.video_dir = os.path.join(self.temp_dir, "videos") + self.frames_dir = os.path.join(self.temp_dir, "frames") + self.audio_dir = os.path.join(self.temp_dir, "audio") + + # 创建目录 + os.makedirs(self.video_dir, exist_ok=True) + os.makedirs(self.frames_dir, exist_ok=True) + os.makedirs(self.audio_dir, exist_ok=True) + + def extract_video_info(self, video_path: str) -> Dict: + """ + 提取视频基本信息 + + Args: + video_path: 视频文件路径 + + Returns: + 视频信息字典 + """ + try: + if FFMPEG_AVAILABLE: + probe = ffmpeg.probe(video_path) + video_stream = next((s for s in probe['streams'] if s['codec_type'] == 'video'), None) + audio_stream = next((s for s in probe['streams'] if s['codec_type'] == 'audio'), None) + + if video_stream: + # 安全解析帧率分数(如 "30000/1001"),避免对外部数据使用 eval + num, _, den = video_stream.get('r_frame_rate', '0/1').partition('/') + fps = float(num) / float(den) if den and float(den) != 0 else float(num or 0) + return { + 'duration': float(probe['format'].get('duration', 0)), + 'width': int(video_stream.get('width', 0)), + 'height': int(video_stream.get('height', 0)), + 'fps': fps, + 'has_audio': audio_stream is not None, + 'bitrate': int(probe['format'].get('bit_rate', 0)) + } + else: + # 使用 ffprobe 命令行 + cmd = [ + 'ffprobe', '-v', 'error', '-show_entries', + 'format=duration,bit_rate', '-show_entries', + 'stream=width,height,r_frame_rate', '-of', 'json', + video_path + ] + result = subprocess.run(cmd, capture_output=True, text=True) + if result.returncode == 0: + data = json.loads(result.stdout) + return { + 'duration': float(data['format'].get('duration', 0)), + 'width': int(data['streams'][0].get('width', 0)) if data['streams'] else 0, + 'height': int(data['streams'][0].get('height', 0)) if data['streams'] else 0, + 'fps': 30.0, # 默认值 + 'has_audio': len(data['streams']) > 1, + 'bitrate': int(data['format'].get('bit_rate', 0)) + } + except Exception as e: + 
print(f"Error extracting video info: {e}") + + return { + 'duration': 0, + 'width': 0, + 'height': 0, + 'fps': 0, + 'has_audio': False, + 'bitrate': 0 + } + + def extract_audio(self, video_path: str, output_path: str = None) -> str: + """ + 从视频中提取音频 + + Args: + video_path: 视频文件路径 + output_path: 输出音频路径(可选) + + Returns: + 提取的音频文件路径 + """ + if output_path is None: + video_name = Path(video_path).stem + output_path = os.path.join(self.audio_dir, f"{video_name}.wav") + + try: + if FFMPEG_AVAILABLE: + ( + ffmpeg + .input(video_path) + .output(output_path, ac=1, ar=16000, vn=None) + .overwrite_output() + .run(quiet=True) + ) + else: + # 使用命令行 ffmpeg + cmd = [ + 'ffmpeg', '-i', video_path, + '-vn', '-acodec', 'pcm_s16le', + '-ac', '1', '-ar', '16000', + '-y', output_path + ] + subprocess.run(cmd, check=True, capture_output=True) + + return output_path + except Exception as e: + print(f"Error extracting audio: {e}") + raise + + def extract_keyframes(self, video_path: str, video_id: str, + interval: int = None) -> List[str]: + """ + 从视频中提取关键帧 + + Args: + video_path: 视频文件路径 + video_id: 视频ID + interval: 提取间隔(秒),默认使用初始化时的间隔 + + Returns: + 提取的帧文件路径列表 + """ + interval = interval or self.frame_interval + frame_paths = [] + + # 创建帧存储目录 + video_frames_dir = os.path.join(self.frames_dir, video_id) + os.makedirs(video_frames_dir, exist_ok=True) + + try: + if CV2_AVAILABLE: + # 使用 OpenCV 提取帧 + cap = cv2.VideoCapture(video_path) + fps = cap.get(cv2.CAP_PROP_FPS) + total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) + + frame_interval_frames = int(fps * interval) + frame_number = 0 + + while True: + ret, frame = cap.read() + if not ret: + break + + if frame_number % frame_interval_frames == 0: + timestamp = frame_number / fps + frame_path = os.path.join( + video_frames_dir, + f"frame_{frame_number:06d}_{timestamp:.2f}.jpg" + ) + cv2.imwrite(frame_path, frame) + frame_paths.append(frame_path) + + frame_number += 1 + + cap.release() + else: + # 使用 ffmpeg 命令行提取帧 + video_name = 
Path(video_path).stem + # image2 输出模板只支持 %d 序号占位符(ffmpeg 不支持 %t),时间戳由下游按提取间隔推算 + output_pattern = os.path.join(video_frames_dir, "frame_%06d.jpg") + + cmd = [ + 'ffmpeg', '-i', video_path, + '-vf', f'fps=1/{interval}', + '-y', output_pattern + ] + subprocess.run(cmd, check=True, capture_output=True) + + # 获取生成的帧文件列表 + frame_paths = sorted([ + os.path.join(video_frames_dir, f) + for f in os.listdir(video_frames_dir) + if f.startswith('frame_') + ]) + except Exception as e: + print(f"Error extracting keyframes: {e}") + + return frame_paths + + def perform_ocr(self, image_path: str) -> Tuple[str, float]: + """ + 对图片进行OCR识别 + + Args: + image_path: 图片文件路径 + + Returns: + (识别的文本, 置信度) + """ + if not PYTESSERACT_AVAILABLE: + return "", 0.0 + + try: + image = Image.open(image_path) + + # 预处理:转换为灰度图 + if image.mode != 'L': + image = image.convert('L') + + # 使用 pytesseract 进行 OCR + text = pytesseract.image_to_string(image, lang='chi_sim+eng') + + # 获取置信度数据 + data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT) + confidences = [int(c) for c in data['conf'] if int(c) > 0] + avg_confidence = sum(confidences) / len(confidences) if confidences else 0 + + return text.strip(), avg_confidence / 100.0 + except Exception as e: + print(f"OCR error for {image_path}: {e}") + return "", 0.0 + + def process_video(self, video_data: bytes, filename: str, + project_id: str, video_id: str = None) -> VideoProcessingResult: + """ + 处理视频文件:提取音频、关键帧、OCR + + Args: + video_data: 视频文件二进制数据 + filename: 视频文件名 + project_id: 项目ID + video_id: 视频ID(可选,自动生成) + + Returns: + 视频处理结果 + """ + video_id = video_id or str(uuid.uuid4())[:8] + + try: + # 保存视频文件 + video_path = os.path.join(self.video_dir, f"{video_id}_{filename}") + with open(video_path, 'wb') as f: + f.write(video_data) + + # 提取视频信息 + video_info = self.extract_video_info(video_path) + + # 提取音频 + audio_path = "" + if video_info['has_audio']: + audio_path = self.extract_audio(video_path) + + # 提取关键帧 + frame_paths = self.extract_keyframes(video_path, video_id) + 
+ # 对关键帧进行 OCR + frames = [] + ocr_results = [] + all_ocr_text = [] + + for i, frame_path in enumerate(frame_paths): + # 解析帧信息 + frame_name = os.path.basename(frame_path) + parts = frame_name.replace('.jpg', '').split('_') + frame_number = int(parts[1]) if len(parts) > 1 else i + timestamp = float(parts[2]) if len(parts) > 2 else i * self.frame_interval + + # OCR 识别 + ocr_text, confidence = self.perform_ocr(frame_path) + + frame = VideoFrame( + id=str(uuid.uuid4())[:8], + video_id=video_id, + frame_number=frame_number, + timestamp=timestamp, + frame_path=frame_path, + ocr_text=ocr_text, + ocr_confidence=confidence + ) + frames.append(frame) + + if ocr_text: + ocr_results.append({ + 'frame_number': frame_number, + 'timestamp': timestamp, + 'text': ocr_text, + 'confidence': confidence + }) + all_ocr_text.append(ocr_text) + + # 整合所有 OCR 文本 + full_ocr_text = "\n\n".join(all_ocr_text) + + return VideoProcessingResult( + video_id=video_id, + audio_path=audio_path, + frames=frames, + ocr_results=ocr_results, + full_text=full_ocr_text, + success=True + ) + + except Exception as e: + return VideoProcessingResult( + video_id=video_id, + audio_path="", + frames=[], + ocr_results=[], + full_text="", + success=False, + error_message=str(e) + ) + + def cleanup(self, video_id: str = None): + """ + 清理临时文件 + + Args: + video_id: 视频ID(可选,清理特定视频的文件) + """ + import shutil + + if video_id: + # 清理特定视频的文件 + for dir_path in [self.video_dir, self.frames_dir, self.audio_dir]: + target_dir = os.path.join(dir_path, video_id) if dir_path == self.frames_dir else dir_path + if os.path.exists(target_dir): + for f in os.listdir(target_dir): + if video_id in f: + os.remove(os.path.join(target_dir, f)) + else: + # 清理所有临时文件 + for dir_path in [self.video_dir, self.frames_dir, self.audio_dir]: + if os.path.exists(dir_path): + shutil.rmtree(dir_path) + os.makedirs(dir_path, exist_ok=True) + + +# Singleton instance +_multimodal_processor = None + +def get_multimodal_processor(temp_dir: str = None, 
frame_interval: int = 5) -> MultimodalProcessor: + """获取多模态处理器单例""" + global _multimodal_processor + if _multimodal_processor is None: + _multimodal_processor = MultimodalProcessor(temp_dir, frame_interval) + return _multimodal_processor diff --git a/backend/plugin_manager.py b/backend/plugin_manager.py new file mode 100644 index 0000000..0c59845 --- /dev/null +++ b/backend/plugin_manager.py @@ -0,0 +1,1366 @@ +#!/usr/bin/env python3 +""" +InsightFlow Plugin Manager - Phase 7 Task 7 +插件与集成系统:Chrome插件、飞书/钉钉机器人、Zapier/Make集成、WebDAV同步 +""" + +import os +import json +import hashlib +import hmac +import base64 +import time +import uuid +import httpx +import asyncio +from datetime import datetime +from typing import Dict, List, Optional, Any, Callable +from dataclasses import dataclass, field +from enum import Enum +import sqlite3 + +# WebDAV 支持 +try: + import webdav4.client as webdav_client + WEBDAV_AVAILABLE = True +except ImportError: + WEBDAV_AVAILABLE = False + + +class PluginType(Enum): + """插件类型""" + CHROME_EXTENSION = "chrome_extension" + FEISHU_BOT = "feishu_bot" + DINGTALK_BOT = "dingtalk_bot" + ZAPIER = "zapier" + MAKE = "make" + WEBDAV = "webdav" + CUSTOM = "custom" + + +class PluginStatus(Enum): + """插件状态""" + ACTIVE = "active" + INACTIVE = "inactive" + ERROR = "error" + PENDING = "pending" + + +@dataclass +class Plugin: + """插件配置""" + id: str + name: str + plugin_type: str + project_id: str + status: str = "active" + config: Dict = field(default_factory=dict) + created_at: str = "" + updated_at: str = "" + last_used_at: Optional[str] = None + use_count: int = 0 + + +@dataclass +class PluginConfig: + """插件详细配置""" + id: str + plugin_id: str + config_key: str + config_value: str + is_encrypted: bool = False + created_at: str = "" + updated_at: str = "" + + +@dataclass +class BotSession: + """机器人会话""" + id: str + bot_type: str # feishu, dingtalk + session_id: str # 群ID或会话ID + session_name: str + project_id: Optional[str] = None + webhook_url: str = "" + secret: 
str = "" + is_active: bool = True + created_at: str = "" + updated_at: str = "" + last_message_at: Optional[str] = None + message_count: int = 0 + + +@dataclass +class WebhookEndpoint: + """Webhook 端点配置(Zapier/Make集成)""" + id: str + name: str + endpoint_type: str # zapier, make, custom + endpoint_url: str + project_id: Optional[str] = None + auth_type: str = "none" # none, api_key, oauth, custom + auth_config: Dict = field(default_factory=dict) + trigger_events: List[str] = field(default_factory=list) + is_active: bool = True + created_at: str = "" + updated_at: str = "" + last_triggered_at: Optional[str] = None + trigger_count: int = 0 + + +@dataclass +class WebDAVSync: + """WebDAV 同步配置""" + id: str + name: str + project_id: str + server_url: str + username: str + password: str = "" # 加密存储 + remote_path: str = "/insightflow" + sync_mode: str = "bidirectional" # bidirectional, upload_only, download_only + sync_interval: int = 3600 # 秒 + last_sync_at: Optional[str] = None + last_sync_status: str = "pending" # pending, success, failed + last_sync_error: str = "" + is_active: bool = True + created_at: str = "" + updated_at: str = "" + sync_count: int = 0 + + +@dataclass +class ChromeExtensionToken: + """Chrome 扩展令牌""" + id: str + token: str + user_id: Optional[str] = None + project_id: Optional[str] = None + name: str = "" + permissions: List[str] = field(default_factory=lambda: ["read", "write"]) + expires_at: Optional[str] = None + created_at: str = "" + last_used_at: Optional[str] = None + use_count: int = 0 + is_revoked: bool = False + + +class PluginManager: + """插件管理主类""" + + def __init__(self, db_manager=None): + self.db = db_manager + self._handlers = {} + self._register_default_handlers() + + def _register_default_handlers(self): + """注册默认处理器""" + self._handlers[PluginType.CHROME_EXTENSION] = ChromeExtensionHandler(self) + self._handlers[PluginType.FEISHU_BOT] = BotHandler(self, "feishu") + self._handlers[PluginType.DINGTALK_BOT] = BotHandler(self, 
"dingtalk") + self._handlers[PluginType.ZAPIER] = WebhookIntegration(self, "zapier") + self._handlers[PluginType.MAKE] = WebhookIntegration(self, "make") + self._handlers[PluginType.WEBDAV] = WebDAVSyncManager(self) + + def get_handler(self, plugin_type: PluginType) -> Optional[Any]: + """获取插件处理器""" + return self._handlers.get(plugin_type) + + # ==================== Plugin CRUD ==================== + + def create_plugin(self, plugin: Plugin) -> Plugin: + """创建插件""" + conn = self.db.get_conn() + now = datetime.now().isoformat() + + conn.execute( + """INSERT INTO plugins + (id, name, plugin_type, project_id, status, config, created_at, updated_at, use_count) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (plugin.id, plugin.name, plugin.plugin_type, plugin.project_id, + plugin.status, json.dumps(plugin.config), now, now, 0) + ) + conn.commit() + conn.close() + + plugin.created_at = now + plugin.updated_at = now + return plugin + + def get_plugin(self, plugin_id: str) -> Optional[Plugin]: + """获取插件""" + conn = self.db.get_conn() + row = conn.execute( + "SELECT * FROM plugins WHERE id = ?", (plugin_id,) + ).fetchone() + conn.close() + + if row: + return self._row_to_plugin(row) + return None + + def list_plugins(self, project_id: str = None, plugin_type: str = None, + status: str = None) -> List[Plugin]: + """列出插件""" + conn = self.db.get_conn() + + conditions = [] + params = [] + + if project_id: + conditions.append("project_id = ?") + params.append(project_id) + if plugin_type: + conditions.append("plugin_type = ?") + params.append(plugin_type) + if status: + conditions.append("status = ?") + params.append(status) + + where_clause = " AND ".join(conditions) if conditions else "1=1" + + rows = conn.execute( + f"SELECT * FROM plugins WHERE {where_clause} ORDER BY created_at DESC", + params + ).fetchall() + conn.close() + + return [self._row_to_plugin(row) for row in rows] + + def update_plugin(self, plugin_id: str, **kwargs) -> Optional[Plugin]: + """更新插件""" + conn = 
self.db.get_conn() + + allowed_fields = ['name', 'status', 'config'] + updates = [] + values = [] + + for field in allowed_fields: + if field in kwargs: + updates.append(f"{field} = ?") + if field == 'config': + values.append(json.dumps(kwargs[field])) + else: + values.append(kwargs[field]) + + if not updates: + conn.close() + return self.get_plugin(plugin_id) + + updates.append("updated_at = ?") + values.append(datetime.now().isoformat()) + values.append(plugin_id) + + query = f"UPDATE plugins SET {', '.join(updates)} WHERE id = ?" + conn.execute(query, values) + conn.commit() + conn.close() + + return self.get_plugin(plugin_id) + + def delete_plugin(self, plugin_id: str) -> bool: + """删除插件""" + conn = self.db.get_conn() + + # 删除关联的配置 + conn.execute("DELETE FROM plugin_configs WHERE plugin_id = ?", (plugin_id,)) + + # 删除插件 + cursor = conn.execute("DELETE FROM plugins WHERE id = ?", (plugin_id,)) + conn.commit() + conn.close() + + return cursor.rowcount > 0 + + def _row_to_plugin(self, row: sqlite3.Row) -> Plugin: + """将数据库行转换为 Plugin 对象""" + return Plugin( + id=row['id'], + name=row['name'], + plugin_type=row['plugin_type'], + project_id=row['project_id'], + status=row['status'], + config=json.loads(row['config']) if row['config'] else {}, + created_at=row['created_at'], + updated_at=row['updated_at'], + last_used_at=row['last_used_at'], + use_count=row['use_count'] + ) + + # ==================== Plugin Config ==================== + + def set_plugin_config(self, plugin_id: str, key: str, value: str, + is_encrypted: bool = False) -> PluginConfig: + """设置插件配置""" + conn = self.db.get_conn() + now = datetime.now().isoformat() + + # 检查是否已存在 + existing = conn.execute( + "SELECT id FROM plugin_configs WHERE plugin_id = ? AND config_key = ?", + (plugin_id, key) + ).fetchone() + + if existing: + conn.execute( + """UPDATE plugin_configs + SET config_value = ?, is_encrypted = ?, updated_at = ? 
+ WHERE id = ?""", + (value, is_encrypted, now, existing['id']) + ) + config_id = existing['id'] + else: + config_id = str(uuid.uuid4())[:8] + conn.execute( + """INSERT INTO plugin_configs + (id, plugin_id, config_key, config_value, is_encrypted, created_at, updated_at) + VALUES (?, ?, ?, ?, ?, ?, ?)""", + (config_id, plugin_id, key, value, is_encrypted, now, now) + ) + + conn.commit() + conn.close() + + return PluginConfig( + id=config_id, + plugin_id=plugin_id, + config_key=key, + config_value=value, + is_encrypted=is_encrypted, + created_at=now, + updated_at=now + ) + + def get_plugin_config(self, plugin_id: str, key: str) -> Optional[str]: + """获取插件配置""" + conn = self.db.get_conn() + row = conn.execute( + "SELECT config_value FROM plugin_configs WHERE plugin_id = ? AND config_key = ?", + (plugin_id, key) + ).fetchone() + conn.close() + + return row['config_value'] if row else None + + def get_all_plugin_configs(self, plugin_id: str) -> Dict[str, str]: + """获取插件所有配置""" + conn = self.db.get_conn() + rows = conn.execute( + "SELECT config_key, config_value FROM plugin_configs WHERE plugin_id = ?", + (plugin_id,) + ).fetchall() + conn.close() + + return {row['config_key']: row['config_value'] for row in rows} + + def delete_plugin_config(self, plugin_id: str, key: str) -> bool: + """删除插件配置""" + conn = self.db.get_conn() + cursor = conn.execute( + "DELETE FROM plugin_configs WHERE plugin_id = ? AND config_key = ?", + (plugin_id, key) + ) + conn.commit() + conn.close() + + return cursor.rowcount > 0 + + def record_plugin_usage(self, plugin_id: str): + """记录插件使用""" + conn = self.db.get_conn() + now = datetime.now().isoformat() + + conn.execute( + """UPDATE plugins + SET use_count = use_count + 1, last_used_at = ? 
+ WHERE id = ?""", + (now, plugin_id) + ) + conn.commit() + conn.close() + + +class ChromeExtensionHandler: + """Chrome 扩展处理器""" + + def __init__(self, plugin_manager: PluginManager): + self.pm = plugin_manager + + def create_token(self, name: str, user_id: str = None, project_id: str = None, + permissions: List[str] = None, expires_days: int = None) -> ChromeExtensionToken: + """创建 Chrome 扩展令牌""" + token_id = str(uuid.uuid4())[:8] + + # 生成随机令牌 + raw_token = f"if_ext_{base64.urlsafe_b64encode(os.urandom(32)).decode('utf-8').rstrip('=')}" + + # 哈希存储 + token_hash = hashlib.sha256(raw_token.encode()).hexdigest() + + now = datetime.now().isoformat() + expires_at = None + if expires_days: + from datetime import timedelta + expires_at = (datetime.now() + timedelta(days=expires_days)).isoformat() + + conn = self.pm.db.get_conn() + conn.execute( + """INSERT INTO chrome_extension_tokens + (id, token_hash, user_id, project_id, name, permissions, expires_at, + created_at, is_revoked, use_count) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (token_id, token_hash, user_id, project_id, name, + json.dumps(permissions or ["read"]), expires_at, now, False, 0) + ) + conn.commit() + conn.close() + + return ChromeExtensionToken( + id=token_id, + token=raw_token, # 仅返回一次 + user_id=user_id, + project_id=project_id, + name=name, + permissions=permissions or ["read"], + expires_at=expires_at, + created_at=now + ) + + def validate_token(self, token: str) -> Optional[ChromeExtensionToken]: + """验证 Chrome 扩展令牌""" + token_hash = hashlib.sha256(token.encode()).hexdigest() + + conn = self.pm.db.get_conn() + row = conn.execute( + """SELECT * FROM chrome_extension_tokens + WHERE token_hash = ? 
AND is_revoked = 0""", + (token_hash,) + ).fetchone() + conn.close() + + if not row: + return None + + # 检查是否过期 + if row['expires_at'] and datetime.now().isoformat() > row['expires_at']: + return None + + # 更新使用记录 + now = datetime.now().isoformat() + conn = self.pm.db.get_conn() + conn.execute( + """UPDATE chrome_extension_tokens + SET use_count = use_count + 1, last_used_at = ? + WHERE id = ?""", + (now, row['id']) + ) + conn.commit() + conn.close() + + return ChromeExtensionToken( + id=row['id'], + token="", # 不返回实际令牌 + user_id=row['user_id'], + project_id=row['project_id'], + name=row['name'], + permissions=json.loads(row['permissions']), + expires_at=row['expires_at'], + created_at=row['created_at'], + last_used_at=now, + use_count=row['use_count'] + 1 + ) + + def revoke_token(self, token_id: str) -> bool: + """撤销令牌""" + conn = self.pm.db.get_conn() + cursor = conn.execute( + "UPDATE chrome_extension_tokens SET is_revoked = 1 WHERE id = ?", + (token_id,) + ) + conn.commit() + conn.close() + + return cursor.rowcount > 0 + + def list_tokens(self, user_id: str = None, project_id: str = None) -> List[ChromeExtensionToken]: + """列出令牌""" + conn = self.pm.db.get_conn() + + conditions = ["is_revoked = 0"] + params = [] + + if user_id: + conditions.append("user_id = ?") + params.append(user_id) + if project_id: + conditions.append("project_id = ?") + params.append(project_id) + + where_clause = " AND ".join(conditions) + + rows = conn.execute( + f"SELECT * FROM chrome_extension_tokens WHERE {where_clause} ORDER BY created_at DESC", + params + ).fetchall() + conn.close() + + tokens = [] + for row in rows: + tokens.append(ChromeExtensionToken( + id=row['id'], + token="", # 不返回实际令牌 + user_id=row['user_id'], + project_id=row['project_id'], + name=row['name'], + permissions=json.loads(row['permissions']), + expires_at=row['expires_at'], + created_at=row['created_at'], + last_used_at=row['last_used_at'], + use_count=row['use_count'], + is_revoked=bool(row['is_revoked']) + )) 
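As a standalone illustration of the token scheme used by `ChromeExtensionHandler` above (a random opaque token is issued once, and only its SHA-256 digest is persisted; validation recomputes the digest for lookup), here is a minimal sketch using only the standard library. The function names are illustrative, not part of `plugin_manager.py`:

```python
import base64
import hashlib
import os


def generate_token(prefix: str = "if_ext") -> tuple:
    """Return (raw_token, sha256_hex). Only the hex digest is stored;
    the raw token is shown to the user exactly once."""
    raw = f"{prefix}_{base64.urlsafe_b64encode(os.urandom(32)).decode('utf-8').rstrip('=')}"
    return raw, hashlib.sha256(raw.encode()).hexdigest()


def hash_token(raw: str) -> str:
    """Recompute the digest when a client presents a token, for DB lookup."""
    return hashlib.sha256(raw.encode()).hexdigest()
```

Storing only the digest means a leaked database does not leak usable tokens; the trade-off is that a lost raw token cannot be recovered, only revoked and reissued.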
+ + return tokens + + async def import_webpage(self, token: ChromeExtensionToken, url: str, title: str, + content: str, html_content: str = None) -> Dict: + """导入网页内容""" + if not token.project_id: + return {"success": False, "error": "Token not associated with any project"} + + if "write" not in token.permissions: + return {"success": False, "error": "Insufficient permissions"} + + # 创建转录记录(将网页作为文档处理) + transcript_id = str(uuid.uuid4())[:8] + now = datetime.now().isoformat() + + # 构建完整文本 + full_text = f"# {title}\n\nURL: {url}\n\n{content}" + + conn = self.pm.db.get_conn() + conn.execute( + """INSERT INTO transcripts + (id, project_id, filename, full_text, type, created_at) + VALUES (?, ?, ?, ?, ?, ?)""", + (transcript_id, token.project_id, f"web_{title[:50]}.md", full_text, "webpage", now) + ) + conn.commit() + conn.close() + + return { + "success": True, + "transcript_id": transcript_id, + "project_id": token.project_id, + "url": url, + "title": title, + "content_length": len(content) + } + + +class BotHandler: + """飞书/钉钉机器人处理器""" + + def __init__(self, plugin_manager: PluginManager, bot_type: str): + self.pm = plugin_manager + self.bot_type = bot_type + + def create_session(self, session_id: str, session_name: str, + project_id: str = None, webhook_url: str = "", + secret: str = "") -> BotSession: + """创建机器人会话""" + bot_id = str(uuid.uuid4())[:8] + now = datetime.now().isoformat() + + conn = self.pm.db.get_conn() + conn.execute( + """INSERT INTO bot_sessions + (id, bot_type, session_id, session_name, project_id, webhook_url, secret, + is_active, created_at, updated_at, message_count) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (bot_id, self.bot_type, session_id, session_name, project_id, webhook_url, secret, + True, now, now, 0) + ) + conn.commit() + conn.close() + + return BotSession( + id=bot_id, + bot_type=self.bot_type, + session_id=session_id, + session_name=session_name, + project_id=project_id, + webhook_url=webhook_url, + secret=secret, + 
is_active=True, + created_at=now, + updated_at=now + ) + + def get_session(self, session_id: str) -> Optional[BotSession]: + """获取会话""" + conn = self.pm.db.get_conn() + row = conn.execute( + """SELECT * FROM bot_sessions + WHERE session_id = ? AND bot_type = ?""", + (session_id, self.bot_type) + ).fetchone() + conn.close() + + if row: + return self._row_to_session(row) + return None + + def list_sessions(self, project_id: str = None) -> List[BotSession]: + """列出会话""" + conn = self.pm.db.get_conn() + + if project_id: + rows = conn.execute( + """SELECT * FROM bot_sessions + WHERE bot_type = ? AND project_id = ? ORDER BY created_at DESC""", + (self.bot_type, project_id) + ).fetchall() + else: + rows = conn.execute( + """SELECT * FROM bot_sessions + WHERE bot_type = ? ORDER BY created_at DESC""", + (self.bot_type,) + ).fetchall() + + conn.close() + + return [self._row_to_session(row) for row in rows] + + def update_session(self, session_id: str, **kwargs) -> Optional[BotSession]: + """更新会话""" + conn = self.pm.db.get_conn() + + allowed_fields = ['session_name', 'project_id', 'webhook_url', 'secret', 'is_active'] + updates = [] + values = [] + + for field in allowed_fields: + if field in kwargs: + updates.append(f"{field} = ?") + values.append(kwargs[field]) + + if not updates: + conn.close() + return self.get_session(session_id) + + updates.append("updated_at = ?") + values.append(datetime.now().isoformat()) + values.append(session_id) + values.append(self.bot_type) + + query = f"UPDATE bot_sessions SET {', '.join(updates)} WHERE session_id = ? AND bot_type = ?" + conn.execute(query, values) + conn.commit() + conn.close() + + return self.get_session(session_id) + + def delete_session(self, session_id: str) -> bool: + """删除会话""" + conn = self.pm.db.get_conn() + cursor = conn.execute( + "DELETE FROM bot_sessions WHERE session_id = ? 
AND bot_type = ?", + (session_id, self.bot_type) + ) + conn.commit() + conn.close() + + return cursor.rowcount > 0 + + def _row_to_session(self, row: sqlite3.Row) -> BotSession: + """将数据库行转换为 BotSession 对象""" + return BotSession( + id=row['id'], + bot_type=row['bot_type'], + session_id=row['session_id'], + session_name=row['session_name'], + project_id=row['project_id'], + webhook_url=row['webhook_url'], + secret=row['secret'], + is_active=bool(row['is_active']), + created_at=row['created_at'], + updated_at=row['updated_at'], + last_message_at=row['last_message_at'], + message_count=row['message_count'] + ) + + async def handle_message(self, session: BotSession, message: Dict) -> Dict: + """处理收到的消息""" + now = datetime.now().isoformat() + + # 更新消息统计 + conn = self.pm.db.get_conn() + conn.execute( + """UPDATE bot_sessions + SET message_count = message_count + 1, last_message_at = ? + WHERE id = ?""", + (now, session.id) + ) + conn.commit() + conn.close() + + # 处理消息 + msg_type = message.get('msg_type', 'text') + content = message.get('content', {}) + + if msg_type == 'text': + text = content.get('text', '') + return await self._handle_text_message(session, text, message) + elif msg_type == 'audio': + # 处理音频消息 + return await self._handle_audio_message(session, message) + elif msg_type == 'file': + # 处理文件消息 + return await self._handle_file_message(session, message) + + return {"success": False, "error": "Unsupported message type"} + + async def _handle_text_message(self, session: BotSession, text: str, + raw_message: Dict) -> Dict: + """处理文本消息""" + # 简单命令处理 + if text.startswith('/help'): + return { + "success": True, + "response": """🤖 InsightFlow 机器人命令: +/help - 显示帮助 +/status - 查看项目状态 +/analyze - 分析网页内容 +/search <关键词> - 搜索知识库""" + } + + if text.startswith('/status'): + if not session.project_id: + return {"success": True, "response": "⚠️ 当前会话未绑定项目"} + + # 获取项目状态 + summary = self.pm.db.get_project_summary(session.project_id) + stats = summary.get('statistics', {}) + + 
return { + "success": True, + "response": f"""📊 项目状态: +实体数量: {stats.get('entity_count', 0)} +关系数量: {stats.get('relation_count', 0)} +转录数量: {stats.get('transcript_count', 0)}""" + } + + # 默认回复 + return { + "success": True, + "response": f"收到消息:{text[:100]}...\n\n使用 /help 查看可用命令" + } + + async def _handle_audio_message(self, session: BotSession, message: Dict) -> Dict: + """处理音频消息""" + if not session.project_id: + return {"success": False, "error": "Session not bound to any project"} + + # 下载音频文件 + audio_url = message.get('content', {}).get('download_url') + if not audio_url: + return {"success": False, "error": "No audio URL provided"} + + try: + async with httpx.AsyncClient() as client: + response = await client.get(audio_url) + audio_data = response.content + + # 保存音频文件 + filename = f"bot_audio_{datetime.now().strftime('%Y%m%d_%H%M%S')}.mp3" + + # 这里应该调用 ASR 服务进行转录 + # 简化处理,返回提示 + return { + "success": True, + "response": "🎵 收到音频文件,正在处理中...\n分析完成后会通知您。", + "audio_size": len(audio_data), + "filename": filename + } + + except Exception as e: + return {"success": False, "error": f"Failed to process audio: {str(e)}"} + + async def _handle_file_message(self, session: BotSession, message: Dict) -> Dict: + """处理文件消息""" + return { + "success": True, + "response": "📎 收到文件,正在处理中..." 
+    }
+
+    async def send_message(self, session: BotSession, message: str,
+                           msg_type: str = "text") -> bool:
+        """Send a message to the group chat"""
+        if not session.webhook_url:
+            return False
+
+        try:
+            if self.bot_type == "feishu":
+                return await self._send_feishu_message(session, message, msg_type)
+            elif self.bot_type == "dingtalk":
+                return await self._send_dingtalk_message(session, message, msg_type)
+
+            return False
+
+        except Exception as e:
+            print(f"Failed to send {self.bot_type} message: {e}")
+            return False
+
+    async def _send_feishu_message(self, session: BotSession, message: str,
+                                   msg_type: str) -> bool:
+        """Send a Feishu message"""
+        timestamp = str(int(time.time()))
+
+        # Generate the signature (hashlib/hmac/base64 are imported at module top)
+        if session.secret:
+            string_to_sign = f"{timestamp}\n{session.secret}"
+            hmac_code = hmac.new(
+                session.secret.encode('utf-8'),
+                string_to_sign.encode('utf-8'),
+                digestmod=hashlib.sha256
+            ).digest()
+            sign = base64.b64encode(hmac_code).decode('utf-8')
+        else:
+            sign = ""
+
+        payload = {
+            "timestamp": timestamp,
+            "sign": sign,
+            "msg_type": "text",
+            "content": {
+                "text": message
+            }
+        }
+
+        async with httpx.AsyncClient() as client:
+            response = await client.post(
+                session.webhook_url,
+                json=payload,
+                headers={"Content-Type": "application/json"}
+            )
+            return response.status_code == 200
+
+    async def _send_dingtalk_message(self, session: BotSession, message: str,
+                                     msg_type: str) -> bool:
+        """Send a DingTalk message"""
+        import urllib.parse  # needed only here, to URL-encode the signature
+
+        timestamp = str(round(time.time() * 1000))
+
+        # Generate the signature
+        if session.secret:
+            string_to_sign = f"{timestamp}\n{session.secret}"
+            hmac_code = hmac.new(
+                session.secret.encode('utf-8'),
+                string_to_sign.encode('utf-8'),
+                digestmod=hashlib.sha256
+            ).digest()
+            sign = base64.b64encode(hmac_code).decode('utf-8')
+            sign = urllib.parse.quote(sign)
+        else:
+            sign = ""
+
+        payload = {
+            "msgtype": "text",
+            "text": {
+                "content": message
+            }
+        }
+
+        url = session.webhook_url
+        if sign:
+            url = f"{url}&timestamp={timestamp}&sign={sign}"
+
+        async
with httpx.AsyncClient() as client: + response = await client.post( + url, + json=payload, + headers={"Content-Type": "application/json"} + ) + return response.status_code == 200 + + +class WebhookIntegration: + """Zapier/Make Webhook 集成""" + + def __init__(self, plugin_manager: PluginManager, endpoint_type: str): + self.pm = plugin_manager + self.endpoint_type = endpoint_type + + def create_endpoint(self, name: str, endpoint_url: str, + project_id: str = None, auth_type: str = "none", + auth_config: Dict = None, + trigger_events: List[str] = None) -> WebhookEndpoint: + """创建 Webhook 端点""" + endpoint_id = str(uuid.uuid4())[:8] + now = datetime.now().isoformat() + + conn = self.pm.db.get_conn() + conn.execute( + """INSERT INTO webhook_endpoints + (id, name, endpoint_type, endpoint_url, project_id, auth_type, auth_config, + trigger_events, is_active, created_at, updated_at, trigger_count) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (endpoint_id, name, self.endpoint_type, endpoint_url, project_id, auth_type, + json.dumps(auth_config or {}), json.dumps(trigger_events or []), True, + now, now, 0) + ) + conn.commit() + conn.close() + + return WebhookEndpoint( + id=endpoint_id, + name=name, + endpoint_type=self.endpoint_type, + endpoint_url=endpoint_url, + project_id=project_id, + auth_type=auth_type, + auth_config=auth_config or {}, + trigger_events=trigger_events or [], + is_active=True, + created_at=now, + updated_at=now + ) + + def get_endpoint(self, endpoint_id: str) -> Optional[WebhookEndpoint]: + """获取端点""" + conn = self.pm.db.get_conn() + row = conn.execute( + "SELECT * FROM webhook_endpoints WHERE id = ? 
AND endpoint_type = ?", + (endpoint_id, self.endpoint_type) + ).fetchone() + conn.close() + + if row: + return self._row_to_endpoint(row) + return None + + def list_endpoints(self, project_id: str = None) -> List[WebhookEndpoint]: + """列出端点""" + conn = self.pm.db.get_conn() + + if project_id: + rows = conn.execute( + """SELECT * FROM webhook_endpoints + WHERE endpoint_type = ? AND project_id = ? ORDER BY created_at DESC""", + (self.endpoint_type, project_id) + ).fetchall() + else: + rows = conn.execute( + """SELECT * FROM webhook_endpoints + WHERE endpoint_type = ? ORDER BY created_at DESC""", + (self.endpoint_type,) + ).fetchall() + + conn.close() + + return [self._row_to_endpoint(row) for row in rows] + + def update_endpoint(self, endpoint_id: str, **kwargs) -> Optional[WebhookEndpoint]: + """更新端点""" + conn = self.pm.db.get_conn() + + allowed_fields = ['name', 'endpoint_url', 'project_id', 'auth_type', + 'auth_config', 'trigger_events', 'is_active'] + updates = [] + values = [] + + for field in allowed_fields: + if field in kwargs: + updates.append(f"{field} = ?") + if field in ['auth_config', 'trigger_events']: + values.append(json.dumps(kwargs[field])) + else: + values.append(kwargs[field]) + + if not updates: + conn.close() + return self.get_endpoint(endpoint_id) + + updates.append("updated_at = ?") + values.append(datetime.now().isoformat()) + values.append(endpoint_id) + + query = f"UPDATE webhook_endpoints SET {', '.join(updates)} WHERE id = ?" 
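The bot senders earlier in this module both sign the outgoing webhook with HMAC-SHA256 over the string `"{timestamp}\n{secret}"`, Base64-encode the digest, and (for DingTalk) URL-encode the signature and append it as `timestamp`/`sign` query parameters. A minimal standalone sketch of that signing step, mirroring the code above (function names are illustrative):

```python
import base64
import hashlib
import hmac
import urllib.parse


def sign_webhook(secret: str, timestamp: str) -> str:
    """HMAC-SHA256 over "{timestamp}\n{secret}" with the secret as key,
    Base64-encoded, as in _send_feishu_message/_send_dingtalk_message."""
    string_to_sign = f"{timestamp}\n{secret}"
    digest = hmac.new(secret.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      digestmod=hashlib.sha256).digest()
    return base64.b64encode(digest).decode("utf-8")


def dingtalk_url(webhook_url: str, secret: str, timestamp: str) -> str:
    """DingTalk appends the URL-encoded signature as query parameters."""
    sign = urllib.parse.quote(sign_webhook(secret, timestamp))
    return f"{webhook_url}&timestamp={timestamp}&sign={sign}"
```

Note the asymmetry in the module above: Feishu puts `timestamp`/`sign` in the JSON body, DingTalk in the URL query string, and DingTalk timestamps are in milliseconds rather than seconds.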
+ conn.execute(query, values) + conn.commit() + conn.close() + + return self.get_endpoint(endpoint_id) + + def delete_endpoint(self, endpoint_id: str) -> bool: + """删除端点""" + conn = self.pm.db.get_conn() + cursor = conn.execute( + "DELETE FROM webhook_endpoints WHERE id = ?", + (endpoint_id,) + ) + conn.commit() + conn.close() + + return cursor.rowcount > 0 + + def _row_to_endpoint(self, row: sqlite3.Row) -> WebhookEndpoint: + """将数据库行转换为 WebhookEndpoint 对象""" + return WebhookEndpoint( + id=row['id'], + name=row['name'], + endpoint_type=row['endpoint_type'], + endpoint_url=row['endpoint_url'], + project_id=row['project_id'], + auth_type=row['auth_type'], + auth_config=json.loads(row['auth_config']) if row['auth_config'] else {}, + trigger_events=json.loads(row['trigger_events']) if row['trigger_events'] else [], + is_active=bool(row['is_active']), + created_at=row['created_at'], + updated_at=row['updated_at'], + last_triggered_at=row['last_triggered_at'], + trigger_count=row['trigger_count'] + ) + + async def trigger(self, endpoint: WebhookEndpoint, event_type: str, + data: Dict) -> bool: + """触发 Webhook""" + if not endpoint.is_active: + return False + + if event_type not in endpoint.trigger_events: + return False + + try: + headers = {"Content-Type": "application/json"} + + # 添加认证头 + if endpoint.auth_type == "api_key": + api_key = endpoint.auth_config.get('api_key', '') + header_name = endpoint.auth_config.get('header_name', 'X-API-Key') + headers[header_name] = api_key + elif endpoint.auth_type == "bearer": + token = endpoint.auth_config.get('token', '') + headers["Authorization"] = f"Bearer {token}" + + payload = { + "event": event_type, + "timestamp": datetime.now().isoformat(), + "data": data + } + + async with httpx.AsyncClient() as client: + response = await client.post( + endpoint.endpoint_url, + json=payload, + headers=headers, + timeout=30.0 + ) + + success = response.status_code in [200, 201, 202] + + # 更新触发统计 + now = datetime.now().isoformat() + conn = 
self.pm.db.get_conn() + conn.execute( + """UPDATE webhook_endpoints + SET trigger_count = trigger_count + 1, last_triggered_at = ? + WHERE id = ?""", + (now, endpoint.id) + ) + conn.commit() + conn.close() + + return success + + except Exception as e: + print(f"Failed to trigger webhook: {e}") + return False + + async def test_endpoint(self, endpoint: WebhookEndpoint) -> Dict: + """测试端点""" + test_data = { + "message": "This is a test event from InsightFlow", + "test": True, + "timestamp": datetime.now().isoformat() + } + + success = await self.trigger(endpoint, "test", test_data) + + return { + "success": success, + "endpoint_id": endpoint.id, + "endpoint_url": endpoint.endpoint_url, + "message": "Test event sent successfully" if success else "Failed to send test event" + } + + +class WebDAVSyncManager: + """WebDAV 同步管理""" + + def __init__(self, plugin_manager: PluginManager): + self.pm = plugin_manager + + def create_sync(self, name: str, project_id: str, server_url: str, + username: str, password: str, remote_path: str = "/insightflow", + sync_mode: str = "bidirectional", + sync_interval: int = 3600) -> WebDAVSync: + """创建 WebDAV 同步配置""" + sync_id = str(uuid.uuid4())[:8] + now = datetime.now().isoformat() + + conn = self.pm.db.get_conn() + conn.execute( + """INSERT INTO webdav_syncs + (id, name, project_id, server_url, username, password, remote_path, + sync_mode, sync_interval, last_sync_status, is_active, created_at, updated_at, sync_count) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (sync_id, name, project_id, server_url, username, password, remote_path, + sync_mode, sync_interval, 'pending', True, now, now, 0) + ) + conn.commit() + conn.close() + + return WebDAVSync( + id=sync_id, + name=name, + project_id=project_id, + server_url=server_url, + username=username, + password=password, + remote_path=remote_path, + sync_mode=sync_mode, + sync_interval=sync_interval, + last_sync_status='pending', + is_active=True, + created_at=now, + updated_at=now 
+ ) + + def get_sync(self, sync_id: str) -> Optional[WebDAVSync]: + """获取同步配置""" + conn = self.pm.db.get_conn() + row = conn.execute( + "SELECT * FROM webdav_syncs WHERE id = ?", + (sync_id,) + ).fetchone() + conn.close() + + if row: + return self._row_to_sync(row) + return None + + def list_syncs(self, project_id: str = None) -> List[WebDAVSync]: + """列出同步配置""" + conn = self.pm.db.get_conn() + + if project_id: + rows = conn.execute( + "SELECT * FROM webdav_syncs WHERE project_id = ? ORDER BY created_at DESC", + (project_id,) + ).fetchall() + else: + rows = conn.execute( + "SELECT * FROM webdav_syncs ORDER BY created_at DESC" + ).fetchall() + + conn.close() + + return [self._row_to_sync(row) for row in rows] + + def update_sync(self, sync_id: str, **kwargs) -> Optional[WebDAVSync]: + """更新同步配置""" + conn = self.pm.db.get_conn() + + allowed_fields = ['name', 'server_url', 'username', 'password', + 'remote_path', 'sync_mode', 'sync_interval', 'is_active'] + updates = [] + values = [] + + for field in allowed_fields: + if field in kwargs: + updates.append(f"{field} = ?") + values.append(kwargs[field]) + + if not updates: + conn.close() + return self.get_sync(sync_id) + + updates.append("updated_at = ?") + values.append(datetime.now().isoformat()) + values.append(sync_id) + + query = f"UPDATE webdav_syncs SET {', '.join(updates)} WHERE id = ?" 
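The `update_plugin`, `update_session`, `update_endpoint`, and `update_sync` methods all share one pattern: whitelist the updatable columns, emit a `SET` clause only for the keys actually supplied, then append `updated_at`. A generic sketch of that pattern (table and column names here are illustrative):

```python
from datetime import datetime


def build_update(table, allowed, changes, row_id):
    """Return (sql, params) for a whitelisted partial UPDATE,
    or None if nothing in `changes` is an updatable column."""
    updates, values = [], []
    for col in allowed:
        if col in changes:
            updates.append(f"{col} = ?")
            values.append(changes[col])
    if not updates:
        return None
    # Always refresh the modification timestamp.
    updates.append("updated_at = ?")
    values.append(datetime.now().isoformat())
    values.append(row_id)
    return f"UPDATE {table} SET {', '.join(updates)} WHERE id = ?", values
```

Because column names are interpolated only from the hard-coded whitelist and all values go through `?` placeholders, the built SQL stays safe against injection from caller-supplied data.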
+        conn.execute(query, values)
+        conn.commit()
+        conn.close()
+
+        return self.get_sync(sync_id)
+
+    def delete_sync(self, sync_id: str) -> bool:
+        """Delete a sync configuration"""
+        conn = self.pm.db.get_conn()
+        cursor = conn.execute(
+            "DELETE FROM webdav_syncs WHERE id = ?",
+            (sync_id,)
+        )
+        conn.commit()
+        conn.close()
+
+        return cursor.rowcount > 0
+
+    def _row_to_sync(self, row: sqlite3.Row) -> WebDAVSync:
+        """Convert a database row to a WebDAVSync object"""
+        return WebDAVSync(
+            id=row['id'],
+            name=row['name'],
+            project_id=row['project_id'],
+            server_url=row['server_url'],
+            username=row['username'],
+            password=row['password'],
+            remote_path=row['remote_path'],
+            sync_mode=row['sync_mode'],
+            sync_interval=row['sync_interval'],
+            last_sync_at=row['last_sync_at'],
+            last_sync_status=row['last_sync_status'],
+            last_sync_error=row['last_sync_error'] or "",
+            is_active=bool(row['is_active']),
+            created_at=row['created_at'],
+            updated_at=row['updated_at'],
+            sync_count=row['sync_count']
+        )
+
+    async def test_connection(self, sync: WebDAVSync) -> Dict:
+        """Test the WebDAV connection"""
+        if not WEBDAV_AVAILABLE:
+            return {"success": False, "error": "WebDAV library not available"}
+
+        try:
+            client = webdav_client.Client(
+                sync.server_url,
+                auth=(sync.username, sync.password)
+            )
+
+            # Try listing the root directory (webdav4 uses ls, not list)
+            client.ls("/")
+
+            return {
+                "success": True,
+                "message": "Connection successful"
+            }
+
+        except Exception as e:
+            return {
+                "success": False,
+                "error": str(e)
+            }
+
+    async def sync_project(self, sync: WebDAVSync) -> Dict:
+        """Sync a project to WebDAV"""
+        if not WEBDAV_AVAILABLE:
+            return {"success": False, "error": "WebDAV library not available"}
+
+        if not sync.is_active:
+            return {"success": False, "error": "Sync is not active"}
+
+        try:
+            client = webdav_client.Client(
+                sync.server_url,
+                auth=(sync.username, sync.password)
+            )
+
+            # Ensure the remote directory exists
+            remote_project_path = f"{sync.remote_path}/{sync.project_id}"
+            try:
+                client.mkdir(remote_project_path)
+            except Exception:
+                pass  # the directory may already exist
+
+            # Fetch the project data
+            project =
self.pm.db.get_project(sync.project_id) + if not project: + return {"success": False, "error": "Project not found"} + + # 导出项目数据为 JSON + entities = self.pm.db.list_project_entities(sync.project_id) + relations = self.pm.db.list_project_relations(sync.project_id) + transcripts = self.pm.db.list_project_transcripts(sync.project_id) + + export_data = { + "project": { + "id": project.id, + "name": project.name, + "description": project.description + }, + "entities": [{"id": e.id, "name": e.name, "type": e.type} for e in entities], + "relations": relations, + "transcripts": [{"id": t['id'], "filename": t['filename']} for t in transcripts], + "exported_at": datetime.now().isoformat() + } + + # 上传 JSON 文件 + json_content = json.dumps(export_data, ensure_ascii=False, indent=2) + json_path = f"{remote_project_path}/project_export.json" + + # 使用临时文件上传 + import tempfile + with tempfile.NamedTemporaryFile(mode='w', suffix='.json', delete=False) as f: + f.write(json_content) + temp_path = f.name + + client.upload_file(temp_path, json_path) + os.unlink(temp_path) + + # 更新同步状态 + now = datetime.now().isoformat() + conn = self.pm.db.get_conn() + conn.execute( + """UPDATE webdav_syncs + SET last_sync_at = ?, last_sync_status = ?, sync_count = sync_count + 1 + WHERE id = ?""", + (now, 'success', sync.id) + ) + conn.commit() + conn.close() + + return { + "success": True, + "message": "Project synced successfully", + "entities_count": len(entities), + "relations_count": len(relations), + "remote_path": json_path + } + + except Exception as e: + # 更新失败状态 + conn = self.pm.db.get_conn() + conn.execute( + """UPDATE webdav_syncs + SET last_sync_status = ?, last_sync_error = ? 
+ WHERE id = ?""",
+ ('failed', str(e), sync.id)
+ )
+ conn.commit()
+ conn.close()
+
+ return {
+ "success": False,
+ "error": str(e)
+ }
+
+
+# Singleton instance
+_plugin_manager = None
+
+def get_plugin_manager(db_manager=None):
+ """获取 PluginManager 单例"""
+ global _plugin_manager
+ if _plugin_manager is None:
+ _plugin_manager = PluginManager(db_manager)
+ return _plugin_manager
diff --git a/backend/requirements.txt b/backend/requirements.txt
index 5d0c34f..89e7b1e 100644
--- a/backend/requirements.txt
+++ b/backend/requirements.txt
@@ -36,3 +36,17 @@ fastapi-offline-swagger==0.1.0
 
 # Phase 7: Workflow Automation
 apscheduler==3.10.4
+
+# Phase 7: Multimodal Support
+ffmpeg-python==0.2.0
+pillow==10.2.0
+opencv-python==4.9.0.80
+pytesseract==0.3.10
+
+# Phase 7: Plugin & Integration
+beautifulsoup4==4.12.3
+webdavclient3==3.14.6
+webdav4==0.9.8
+urllib3==2.2.0
diff --git a/backend/schema.sql b/backend/schema.sql
index 68fa6ad..3cacc91 100644
--- a/backend/schema.sql
+++ b/backend/schema.sql
@@ -222,3 +222,320 @@ CREATE INDEX IF NOT EXISTS idx_workflow_logs_workflow ON workflow_logs(workflow_
 CREATE INDEX IF NOT EXISTS idx_workflow_logs_task ON workflow_logs(task_id);
 CREATE INDEX IF NOT EXISTS idx_workflow_logs_status ON workflow_logs(status);
 CREATE INDEX IF NOT EXISTS idx_workflow_logs_created ON workflow_logs(created_at);
+
+-- Phase 7: 多模态支持相关表
+
+-- 视频表
+CREATE TABLE IF NOT EXISTS videos (
+ id TEXT PRIMARY KEY,
+ project_id TEXT NOT NULL,
+ filename TEXT NOT NULL,
+ duration REAL, -- 视频时长(秒)
+ fps REAL, -- 帧率
+ resolution TEXT, -- JSON: {"width": int, "height": int}
+ audio_transcript_id TEXT, -- 关联的音频转录ID
+ full_ocr_text TEXT, -- 所有帧OCR文本合并
+ extracted_entities TEXT, -- JSON: 提取的实体列表
+ extracted_relations TEXT, -- JSON: 提取的关系列表
+ status TEXT DEFAULT 'processing', -- processing, completed, failed
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+ FOREIGN KEY 
(project_id) REFERENCES projects(id), + FOREIGN KEY (audio_transcript_id) REFERENCES transcripts(id) +); + +-- 视频关键帧表 +CREATE TABLE IF NOT EXISTS video_frames ( + id TEXT PRIMARY KEY, + video_id TEXT NOT NULL, + frame_number INTEGER, + timestamp REAL, -- 时间戳(秒) + image_data BLOB, -- 帧图片数据(可选,可存储在OSS) + image_url TEXT, -- 图片URL(如果存储在OSS) + ocr_text TEXT, -- OCR识别文本 + extracted_entities TEXT, -- JSON: 该帧提取的实体 + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + FOREIGN KEY (video_id) REFERENCES videos(id) ON DELETE CASCADE +); + +-- 图片表 +CREATE TABLE IF NOT EXISTS images ( + id TEXT PRIMARY KEY, + project_id TEXT NOT NULL, + filename TEXT NOT NULL, + image_data BLOB, -- 图片数据(可选) + image_url TEXT, -- 图片URL + ocr_text TEXT, -- OCR识别文本 + description TEXT, -- 图片描述(LLM生成) + extracted_entities TEXT, -- JSON: 提取的实体列表 + extracted_relations TEXT, -- JSON: 提取的关系列表 + status TEXT DEFAULT 'processing', + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + FOREIGN KEY (project_id) REFERENCES projects(id) +); + +-- 多模态实体提及表 +CREATE TABLE IF NOT EXISTS multimodal_mentions ( + id TEXT PRIMARY KEY, + project_id TEXT NOT NULL, + entity_id TEXT NOT NULL, + modality TEXT NOT NULL, -- audio, video, image, document + source_id TEXT NOT NULL, -- transcript_id, video_id, image_id + source_type TEXT NOT NULL, -- 来源类型 + position TEXT, -- JSON: 位置信息 + text_snippet TEXT, -- 提及的文本片段 + confidence REAL DEFAULT 1.0, + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + FOREIGN KEY (project_id) REFERENCES projects(id), + FOREIGN KEY (entity_id) REFERENCES entities(id) ON DELETE CASCADE +); + +-- 多模态实体关联表 +CREATE TABLE IF NOT EXISTS multimodal_entity_links ( + id TEXT PRIMARY KEY, + entity_id TEXT NOT NULL, + linked_entity_id TEXT NOT NULL, -- 关联的实体ID + link_type TEXT NOT NULL, -- same_as, related_to, part_of + confidence REAL DEFAULT 1.0, + evidence TEXT, -- 关联证据 + modalities TEXT, -- JSON: 涉及的模态列表 + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + 
FOREIGN KEY (entity_id) REFERENCES entities(id) ON DELETE CASCADE, + FOREIGN KEY (linked_entity_id) REFERENCES entities(id) ON DELETE CASCADE +); + +-- 多模态相关索引 +CREATE INDEX IF NOT EXISTS idx_videos_project ON videos(project_id); +CREATE INDEX IF NOT EXISTS idx_videos_status ON videos(status); +CREATE INDEX IF NOT EXISTS idx_video_frames_video ON video_frames(video_id); +CREATE INDEX IF NOT EXISTS idx_images_project ON images(project_id); +CREATE INDEX IF NOT EXISTS idx_images_status ON images(status); +CREATE INDEX IF NOT EXISTS idx_multimodal_mentions_project ON multimodal_mentions(project_id); +CREATE INDEX IF NOT EXISTS idx_multimodal_mentions_entity ON multimodal_mentions(entity_id); +CREATE INDEX IF NOT EXISTS idx_multimodal_mentions_modality ON multimodal_mentions(modality); +CREATE INDEX IF NOT EXISTS idx_multimodal_mentions_source ON multimodal_mentions(source_id); +CREATE INDEX IF NOT EXISTS idx_multimodal_links_entity ON multimodal_entity_links(entity_id); +CREATE INDEX IF NOT EXISTS idx_multimodal_links_linked ON multimodal_entity_links(linked_entity_id); + +-- Phase 7 Task 7: 插件与集成相关表 + +-- 插件配置表 +CREATE TABLE IF NOT EXISTS plugins ( + id TEXT PRIMARY KEY, + name TEXT NOT NULL, + plugin_type TEXT NOT NULL, -- chrome_extension, feishu_bot, dingtalk_bot, zapier, make, webdav, custom + project_id TEXT, + status TEXT DEFAULT 'active', -- active, inactive, error, pending + config TEXT, -- JSON: plugin specific configuration + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + last_used_at TIMESTAMP, + use_count INTEGER DEFAULT 0, + FOREIGN KEY (project_id) REFERENCES projects(id) +); + +-- 插件详细配置表 +CREATE TABLE IF NOT EXISTS plugin_configs ( + id TEXT PRIMARY KEY, + plugin_id TEXT NOT NULL, + config_key TEXT NOT NULL, + config_value TEXT, + is_encrypted BOOLEAN DEFAULT 0, + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + FOREIGN KEY (plugin_id) REFERENCES 
plugins(id) ON DELETE CASCADE, + UNIQUE(plugin_id, config_key) +); + +-- 机器人会话表 +CREATE TABLE IF NOT EXISTS bot_sessions ( + id TEXT PRIMARY KEY, + bot_type TEXT NOT NULL, -- feishu, dingtalk + session_id TEXT NOT NULL, -- 群ID或会话ID + session_name TEXT NOT NULL, + project_id TEXT, + webhook_url TEXT, + secret TEXT, -- 签名密钥 + is_active BOOLEAN DEFAULT 1, + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + last_message_at TIMESTAMP, + message_count INTEGER DEFAULT 0, + FOREIGN KEY (project_id) REFERENCES projects(id) +); + +-- Webhook 端点表(Zapier/Make集成) +CREATE TABLE IF NOT EXISTS webhook_endpoints ( + id TEXT PRIMARY KEY, + name TEXT NOT NULL, + endpoint_type TEXT NOT NULL, -- zapier, make, custom + endpoint_url TEXT NOT NULL, + project_id TEXT, + auth_type TEXT DEFAULT 'none', -- none, api_key, oauth, custom + auth_config TEXT, -- JSON: authentication configuration + trigger_events TEXT, -- JSON array: events that trigger this webhook + is_active BOOLEAN DEFAULT 1, + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + last_triggered_at TIMESTAMP, + trigger_count INTEGER DEFAULT 0, + FOREIGN KEY (project_id) REFERENCES projects(id) +); + +-- WebDAV 同步配置表 +CREATE TABLE IF NOT EXISTS webdav_syncs ( + id TEXT PRIMARY KEY, + name TEXT NOT NULL, + project_id TEXT NOT NULL, + server_url TEXT NOT NULL, + username TEXT NOT NULL, + password TEXT NOT NULL, -- 建议加密存储 + remote_path TEXT DEFAULT '/insightflow', + sync_mode TEXT DEFAULT 'bidirectional', -- bidirectional, upload_only, download_only + sync_interval INTEGER DEFAULT 3600, -- 秒 + last_sync_at TIMESTAMP, + last_sync_status TEXT DEFAULT 'pending', -- pending, success, failed + last_sync_error TEXT, + is_active BOOLEAN DEFAULT 1, + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + sync_count INTEGER DEFAULT 0, + FOREIGN KEY (project_id) REFERENCES projects(id) +); + +-- Chrome 
扩展令牌表
+CREATE TABLE IF NOT EXISTS chrome_extension_tokens (
+ id TEXT PRIMARY KEY,
+ token_hash TEXT NOT NULL UNIQUE, -- SHA256 hash of the token
+ user_id TEXT,
+ project_id TEXT,
+ name TEXT,
+ permissions TEXT, -- JSON array: read, write, delete
+ expires_at TIMESTAMP,
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+ last_used_at TIMESTAMP,
+ use_count INTEGER DEFAULT 0,
+ is_revoked BOOLEAN DEFAULT 0,
+ FOREIGN KEY (project_id) REFERENCES projects(id)
+);
+
+-- 插件活动日志表
+CREATE TABLE IF NOT EXISTS plugin_activity_logs (
+ id TEXT PRIMARY KEY,
+ plugin_id TEXT NOT NULL,
+ activity_type TEXT NOT NULL, -- message, webhook, sync, error
+ source TEXT NOT NULL, -- 来源标识
+ details TEXT, -- JSON: 详细信息
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+ FOREIGN KEY (plugin_id) REFERENCES plugins(id) ON DELETE CASCADE
+);
+
+-- 插件相关索引
+CREATE INDEX IF NOT EXISTS idx_plugins_project ON plugins(project_id);
+CREATE INDEX IF NOT EXISTS idx_plugins_type ON plugins(plugin_type);
+CREATE INDEX IF NOT EXISTS idx_plugins_status ON plugins(status);
+CREATE INDEX IF NOT EXISTS idx_plugin_configs_plugin ON plugin_configs(plugin_id);
+CREATE INDEX IF NOT EXISTS idx_bot_sessions_project ON bot_sessions(project_id);
+CREATE INDEX IF NOT EXISTS idx_bot_sessions_type ON bot_sessions(bot_type);
+CREATE INDEX IF NOT EXISTS idx_webhook_endpoints_project ON webhook_endpoints(project_id);
+CREATE INDEX IF NOT EXISTS idx_webhook_endpoints_type ON webhook_endpoints(endpoint_type);
+CREATE INDEX IF NOT EXISTS idx_webdav_syncs_project ON webdav_syncs(project_id);
+CREATE INDEX IF NOT EXISTS idx_chrome_tokens_project ON chrome_extension_tokens(project_id);
+CREATE INDEX IF NOT EXISTS idx_chrome_tokens_hash ON chrome_extension_tokens(token_hash);
+CREATE INDEX IF NOT EXISTS idx_plugin_logs_plugin ON plugin_activity_logs(plugin_id);
+CREATE INDEX IF NOT EXISTS idx_plugin_logs_type ON plugin_activity_logs(activity_type);
+CREATE INDEX IF NOT EXISTS idx_plugin_logs_created ON plugin_activity_logs(created_at);
diff --git a/backend/schema_multimodal.sql b/backend/schema_multimodal.sql
new file mode 100644
index 0000000..796edfc
--- /dev/null
+++ b/backend/schema_multimodal.sql
@@ -0,0 +1,104 @@
+-- Phase 7: 多模态支持相关表
+
+-- 视频表
+CREATE TABLE IF NOT EXISTS videos (
+ id TEXT PRIMARY KEY,
+ project_id TEXT NOT NULL,
+ filename TEXT NOT NULL,
+ file_path TEXT,
+ duration REAL, -- 视频时长(秒)
+ width INTEGER, -- 视频宽度
+ height INTEGER, -- 视频高度
+ fps REAL, -- 帧率
+ audio_extracted INTEGER DEFAULT 0, -- 是否已提取音频
+ audio_path 
TEXT, -- 提取的音频文件路径 + transcript_id TEXT, -- 关联的转录记录ID + status TEXT DEFAULT 'pending', -- pending, processing, completed, failed + error_message TEXT, + metadata TEXT, -- JSON: 其他元数据 + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + FOREIGN KEY (project_id) REFERENCES projects(id), + FOREIGN KEY (transcript_id) REFERENCES transcripts(id) +); + +-- 视频关键帧表 +CREATE TABLE IF NOT EXISTS video_frames ( + id TEXT PRIMARY KEY, + video_id TEXT NOT NULL, + frame_number INTEGER NOT NULL, + timestamp REAL NOT NULL, -- 帧时间戳(秒) + frame_path TEXT NOT NULL, -- 帧图片路径 + ocr_text TEXT, -- OCR识别的文字 + ocr_confidence REAL, -- OCR置信度 + entities_detected TEXT, -- JSON: 检测到的实体 + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + FOREIGN KEY (video_id) REFERENCES videos(id) ON DELETE CASCADE +); + +-- 图片表 +CREATE TABLE IF NOT EXISTS images ( + id TEXT PRIMARY KEY, + project_id TEXT NOT NULL, + filename TEXT NOT NULL, + file_path TEXT, + image_type TEXT, -- whiteboard, ppt, handwritten, screenshot, other + width INTEGER, + height INTEGER, + ocr_text TEXT, -- OCR识别的文字 + description TEXT, -- 图片描述(LLM生成) + entities_detected TEXT, -- JSON: 检测到的实体 + relations_detected TEXT, -- JSON: 检测到的关系 + transcript_id TEXT, -- 关联的转录记录ID(可选) + status TEXT DEFAULT 'pending', -- pending, processing, completed, failed + error_message TEXT, + metadata TEXT, -- JSON: 其他元数据 + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + FOREIGN KEY (project_id) REFERENCES projects(id), + FOREIGN KEY (transcript_id) REFERENCES transcripts(id) +); + +-- 多模态实体关联表 +CREATE TABLE IF NOT EXISTS multimodal_entities ( + id TEXT PRIMARY KEY, + project_id TEXT NOT NULL, + entity_id TEXT NOT NULL, -- 关联的实体ID + source_type TEXT NOT NULL, -- audio, video, image, document + source_id TEXT NOT NULL, -- 来源ID(transcript_id, video_id, image_id) + mention_context TEXT, -- 提及上下文 + confidence REAL DEFAULT 1.0, + created_at TIMESTAMP DEFAULT 
CURRENT_TIMESTAMP, + FOREIGN KEY (project_id) REFERENCES projects(id), + FOREIGN KEY (entity_id) REFERENCES entities(id), + UNIQUE(entity_id, source_type, source_id) +); + +-- 多模态实体对齐表(跨模态实体关联) +CREATE TABLE IF NOT EXISTS multimodal_entity_links ( + id TEXT PRIMARY KEY, + project_id TEXT NOT NULL, + source_entity_id TEXT NOT NULL, -- 源实体ID + target_entity_id TEXT NOT NULL, -- 目标实体ID + link_type TEXT NOT NULL, -- same_as, related_to, part_of + source_modality TEXT NOT NULL, -- audio, video, image, document + target_modality TEXT NOT NULL, -- audio, video, image, document + confidence REAL DEFAULT 1.0, + evidence TEXT, -- 关联证据 + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + FOREIGN KEY (project_id) REFERENCES projects(id), + FOREIGN KEY (source_entity_id) REFERENCES entities(id), + FOREIGN KEY (target_entity_id) REFERENCES entities(id) +); + +-- 创建索引 +CREATE INDEX IF NOT EXISTS idx_videos_project ON videos(project_id); +CREATE INDEX IF NOT EXISTS idx_videos_status ON videos(status); +CREATE INDEX IF NOT EXISTS idx_video_frames_video ON video_frames(video_id); +CREATE INDEX IF NOT EXISTS idx_video_frames_timestamp ON video_frames(timestamp); +CREATE INDEX IF NOT EXISTS idx_images_project ON images(project_id); +CREATE INDEX IF NOT EXISTS idx_images_type ON images(image_type); +CREATE INDEX IF NOT EXISTS idx_images_status ON images(status); +CREATE INDEX IF NOT EXISTS idx_multimodal_entities_project ON multimodal_entities(project_id); +CREATE INDEX IF NOT EXISTS idx_multimodal_entities_entity ON multimodal_entities(entity_id); +CREATE INDEX IF NOT EXISTS idx_multimodal_entity_links_project ON multimodal_entity_links(project_id); diff --git a/backend/test_multimodal.py b/backend/test_multimodal.py new file mode 100644 index 0000000..68789cf --- /dev/null +++ b/backend/test_multimodal.py @@ -0,0 +1,157 @@ +#!/usr/bin/env python3 +""" +InsightFlow Multimodal Module Test Script +测试多模态支持模块 +""" + +import sys +import os + +# 添加 backend 目录到路径 +sys.path.insert(0, 
os.path.dirname(os.path.abspath(__file__))) + +print("=" * 60) +print("InsightFlow 多模态模块测试") +print("=" * 60) + +# 测试导入 +print("\n1. 测试模块导入...") + +try: + from multimodal_processor import ( + get_multimodal_processor, MultimodalProcessor, + VideoProcessingResult, VideoFrame + ) + print(" ✓ multimodal_processor 导入成功") +except ImportError as e: + print(f" ✗ multimodal_processor 导入失败: {e}") + +try: + from image_processor import ( + get_image_processor, ImageProcessor, + ImageProcessingResult, ImageEntity, ImageRelation + ) + print(" ✓ image_processor 导入成功") +except ImportError as e: + print(f" ✗ image_processor 导入失败: {e}") + +try: + from multimodal_entity_linker import ( + get_multimodal_entity_linker, MultimodalEntityLinker, + MultimodalEntity, EntityLink, AlignmentResult, FusionResult + ) + print(" ✓ multimodal_entity_linker 导入成功") +except ImportError as e: + print(f" ✗ multimodal_entity_linker 导入失败: {e}") + +# 测试初始化 +print("\n2. 测试模块初始化...") + +try: + processor = get_multimodal_processor() + print(f" ✓ MultimodalProcessor 初始化成功") + print(f" - 临时目录: {processor.temp_dir}") + print(f" - 帧提取间隔: {processor.frame_interval}秒") +except Exception as e: + print(f" ✗ MultimodalProcessor 初始化失败: {e}") + +try: + img_processor = get_image_processor() + print(f" ✓ ImageProcessor 初始化成功") + print(f" - 临时目录: {img_processor.temp_dir}") +except Exception as e: + print(f" ✗ ImageProcessor 初始化失败: {e}") + +try: + linker = get_multimodal_entity_linker() + print(f" ✓ MultimodalEntityLinker 初始化成功") + print(f" - 相似度阈值: {linker.similarity_threshold}") +except Exception as e: + print(f" ✗ MultimodalEntityLinker 初始化失败: {e}") + +# 测试实体关联功能 +print("\n3. 
测试实体关联功能...") + +try: + linker = get_multimodal_entity_linker() + + # 测试字符串相似度 + sim = linker.calculate_string_similarity("Project Alpha", "Project Alpha") + assert sim == 1.0, "完全匹配应该返回1.0" + print(f" ✓ 字符串相似度计算正常 (完全匹配: {sim})") + + sim = linker.calculate_string_similarity("K8s", "Kubernetes") + print(f" ✓ 字符串相似度计算正常 (不同字符串: {sim:.2f})") + + # 测试实体相似度 + entity1 = {"name": "Project Alpha", "type": "PROJECT", "definition": "核心项目"} + entity2 = {"name": "Project Alpha", "type": "PROJECT", "definition": "主要项目"} + sim, match_type = linker.calculate_entity_similarity(entity1, entity2) + print(f" ✓ 实体相似度计算正常 (相似度: {sim:.2f}, 类型: {match_type})") + +except Exception as e: + print(f" ✗ 实体关联功能测试失败: {e}") + +# 测试图片处理功能(不需要实际图片) +print("\n4. 测试图片处理器功能...") + +try: + processor = get_image_processor() + + # 测试图片类型检测(使用模拟数据) + print(f" ✓ 支持的图片类型: {list(processor.IMAGE_TYPES.keys())}") + print(f" ✓ 图片类型描述: {processor.IMAGE_TYPES}") + +except Exception as e: + print(f" ✗ 图片处理器功能测试失败: {e}") + +# 测试视频处理配置 +print("\n5. 测试视频处理器配置...") + +try: + processor = get_multimodal_processor() + + print(f" ✓ 视频目录: {processor.video_dir}") + print(f" ✓ 帧目录: {processor.frames_dir}") + print(f" ✓ 音频目录: {processor.audio_dir}") + + # 检查目录是否存在 + for dir_name, dir_path in [ + ("视频", processor.video_dir), + ("帧", processor.frames_dir), + ("音频", processor.audio_dir) + ]: + if os.path.exists(dir_path): + print(f" ✓ {dir_name}目录存在: {dir_path}") + else: + print(f" ✗ {dir_name}目录不存在: {dir_path}") + +except Exception as e: + print(f" ✗ 视频处理器配置测试失败: {e}") + +# 测试数据库方法(如果数据库可用) +print("\n6. 
测试数据库多模态方法...")
+
+try:
+ from db_manager import get_db_manager
+ db = get_db_manager()
+
+ # 检查多模态表是否存在
+ conn = db.get_conn()
+ tables = ['videos', 'video_frames', 'images', 'multimodal_mentions', 'multimodal_entity_links']
+
+ for table in tables:
+ try:
+ conn.execute(f"SELECT 1 FROM {table} LIMIT 1")
+ print(f" ✓ 表 '{table}' 存在")
+ except Exception as e:
+ print(f" ✗ 表 '{table}' 不存在或无法访问: {e}")
+
+ conn.close()
+
+except Exception as e:
+ print(f" ✗ 数据库多模态方法测试失败: {e}")
+
+print("\n" + "=" * 60)
+print("测试完成")
+print("=" * 60)
diff --git a/chrome-extension/background.js b/chrome-extension/background.js
new file mode 100644
index 0000000..24e1174
--- /dev/null
+++ b/chrome-extension/background.js
@@ -0,0 +1,217 @@
+// InsightFlow Chrome Extension - Background Script
+// 处理后台任务、右键菜单、消息传递
+
+// 默认配置
+const DEFAULT_CONFIG = {
+ serverUrl: 'http://122.51.127.111:18000',
+ apiKey: '',
+ defaultProjectId: ''
+};
+
+// 初始化
+chrome.runtime.onInstalled.addListener(() => {
+ // 创建右键菜单
+ chrome.contextMenus.create({
+ id: 'clipSelection',
+ title: '保存到 InsightFlow',
+ contexts: ['selection', 'page']
+ });
+
+ // 初始化存储
+ chrome.storage.sync.get(['insightflowConfig'], (result) => {
+ if (!result.insightflowConfig) {
+ chrome.storage.sync.set({ insightflowConfig: DEFAULT_CONFIG });
+ }
+ });
+});
+
+// 处理右键菜单点击
+chrome.contextMenus.onClicked.addListener((info, tab) => {
+ if (info.menuItemId === 'clipSelection') {
+ clipPage(tab, info.selectionText);
+ }
+});
+
+// 处理扩展图标点击
+// 注意:manifest 中配置了 default_popup 时 action.onClicked 不会触发,
+// 剪藏入口由 popup.html 提供;此监听仅作为未配置 popup 时的回退
+chrome.action.onClicked.addListener((tab) => {
+ clipPage(tab);
+});
+
+// 监听来自内容脚本的消息
+chrome.runtime.onMessage.addListener((request, sender, sendResponse) => {
+ if (request.action === 'clipPage') {
+ clipPage(sender.tab, request.selectionText);
+ sendResponse({ success: true });
+ } else if (request.action === 'getConfig') {
+ chrome.storage.sync.get(['insightflowConfig'], (result) => {
+ sendResponse(result.insightflowConfig || DEFAULT_CONFIG);
+ });
+ return true; // 保持消息通道开放
+ } else if 
(request.action === 'saveConfig') { + chrome.storage.sync.set({ insightflowConfig: request.config }, () => { + sendResponse({ success: true }); + }); + return true; + } else if (request.action === 'fetchProjects') { + fetchProjects().then(projects => { + sendResponse({ success: true, projects }); + }).catch(error => { + sendResponse({ success: false, error: error.message }); + }); + return true; + } +}); + +// 剪藏页面 +async function clipPage(tab, selectionText = null) { + try { + // 获取配置 + const config = await getConfig(); + + if (!config.apiKey) { + showNotification('请先配置 API Key', '点击扩展图标打开设置'); + chrome.runtime.openOptionsPage(); + return; + } + + // 获取页面内容 + const [{ result }] = await chrome.scripting.executeScript({ + target: { tabId: tab.id }, + func: extractPageContent, + args: [selectionText] + }); + + // 发送到 InsightFlow + const response = await sendToInsightFlow(config, result); + + if (response.success) { + showNotification('保存成功', '内容已导入 InsightFlow'); + } else { + showNotification('保存失败', response.error || '未知错误'); + } + } catch (error) { + console.error('Clip error:', error); + showNotification('保存失败', error.message); + } +} + +// 提取页面内容 +function extractPageContent(selectionText) { + const data = { + url: window.location.href, + title: document.title, + selection: selectionText, + timestamp: new Date().toISOString() + }; + + if (selectionText) { + // 只保存选中的文本 + data.content = selectionText; + data.contentType = 'selection'; + } else { + // 保存整个页面 + // 获取主要内容 + const article = document.querySelector('article') || + document.querySelector('main') || + document.querySelector('.content') || + document.querySelector('#content'); + + if (article) { + data.content = article.innerText; + data.contentType = 'article'; + } else { + // 获取 body 文本,但移除脚本和样式 + const bodyClone = document.body.cloneNode(true); + const scripts = bodyClone.querySelectorAll('script, style, nav, header, footer, aside'); + scripts.forEach(el => el.remove()); + data.content = 
bodyClone.innerText; + data.contentType = 'page'; + } + + // 限制内容长度 + if (data.content.length > 50000) { + data.content = data.content.substring(0, 50000) + '...'; + data.truncated = true; + } + } + + // 获取元数据 + data.meta = { + description: document.querySelector('meta[name="description"]')?.content || '', + keywords: document.querySelector('meta[name="keywords"]')?.content || '', + author: document.querySelector('meta[name="author"]')?.content || '' + }; + + return data; +} + +// 发送到 InsightFlow +async function sendToInsightFlow(config, data) { + const url = `${config.serverUrl}/api/v1/plugins/chrome/clip`; + + const payload = { + url: data.url, + title: data.title, + content: data.content, + content_type: data.contentType, + meta: data.meta, + project_id: config.defaultProjectId || null + }; + + const response = await fetch(url, { + method: 'POST', + headers: { + 'Content-Type': 'application/json', + 'X-API-Key': config.apiKey + }, + body: JSON.stringify(payload) + }); + + if (!response.ok) { + const error = await response.text(); + throw new Error(error); + } + + return await response.json(); +} + +// 获取配置 +function getConfig() { + return new Promise((resolve) => { + chrome.storage.sync.get(['insightflowConfig'], (result) => { + resolve(result.insightflowConfig || DEFAULT_CONFIG); + }); + }); +} + +// 获取项目列表 +async function fetchProjects() { + const config = await getConfig(); + + if (!config.apiKey) { + throw new Error('请先配置 API Key'); + } + + const response = await fetch(`${config.serverUrl}/api/v1/projects`, { + headers: { + 'X-API-Key': config.apiKey + } + }); + + if (!response.ok) { + throw new Error('获取项目列表失败'); + } + + const data = await response.json(); + return data.projects || []; +} + +// 显示通知 +function showNotification(title, message) { + chrome.notifications.create({ + type: 'basic', + iconUrl: 'icons/icon128.png', + title, + message + }); +} \ No newline at end of file diff --git a/chrome-extension/content.css b/chrome-extension/content.css new 
file mode 100644 index 0000000..218164b --- /dev/null +++ b/chrome-extension/content.css @@ -0,0 +1,141 @@ +/* InsightFlow Chrome Extension - Content Styles */ + +.insightflow-float-btn { + position: absolute; + width: 36px; + height: 36px; + background: #4f46e5; + border-radius: 50%; + display: none; + align-items: center; + justify-content: center; + cursor: pointer; + box-shadow: 0 2px 8px rgba(0, 0, 0, 0.15); + z-index: 999999; + transition: transform 0.2s, box-shadow 0.2s; +} + +.insightflow-float-btn:hover { + transform: scale(1.1); + box-shadow: 0 4px 12px rgba(0, 0, 0, 0.2); +} + +.insightflow-float-btn svg { + color: white; +} + +.insightflow-popup { + position: absolute; + width: 300px; + background: white; + border-radius: 8px; + box-shadow: 0 4px 20px rgba(0, 0, 0, 0.15); + z-index: 999999; + display: none; + font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; + font-size: 14px; +} + +.insightflow-popup-header { + display: flex; + justify-content: space-between; + align-items: center; + padding: 12px 16px; + border-bottom: 1px solid #e5e7eb; + font-weight: 600; + color: #111827; +} + +.insightflow-close-btn { + background: none; + border: none; + font-size: 20px; + color: #6b7280; + cursor: pointer; + padding: 0; + width: 24px; + height: 24px; + display: flex; + align-items: center; + justify-content: center; +} + +.insightflow-close-btn:hover { + color: #111827; +} + +.insightflow-popup-content { + padding: 16px; +} + +.insightflow-text-preview { + background: #f3f4f6; + padding: 12px; + border-radius: 6px; + font-size: 13px; + color: #4b5563; + line-height: 1.5; + max-height: 120px; + overflow-y: auto; + margin-bottom: 12px; +} + +.insightflow-actions { + display: flex; + gap: 8px; +} + +.insightflow-btn { + flex: 1; + padding: 8px 12px; + border: 1px solid #d1d5db; + border-radius: 6px; + background: white; + color: #374151; + font-size: 13px; + cursor: pointer; + transition: all 0.2s; +} + +.insightflow-btn:hover { + 
background: #f9fafb; + border-color: #9ca3af; +} + +.insightflow-btn-primary { + background: #4f46e5; + border-color: #4f46e5; + color: white; +} + +.insightflow-btn-primary:hover { + background: #4338ca; + border-color: #4338ca; +} + +.insightflow-project-list { + max-height: 200px; + overflow-y: auto; +} + +.insightflow-project-item { + padding: 12px; + border-radius: 6px; + cursor: pointer; + transition: background 0.2s; +} + +.insightflow-project-item:hover { + background: #f3f4f6; +} + +.insightflow-project-name { + font-weight: 500; + color: #111827; + margin-bottom: 4px; +} + +.insightflow-project-desc { + font-size: 12px; + color: #6b7280; +} \ No newline at end of file diff --git a/chrome-extension/content.js b/chrome-extension/content.js new file mode 100644 index 0000000..c95e4a6 --- /dev/null +++ b/chrome-extension/content.js @@ -0,0 +1,204 @@ +// InsightFlow Chrome Extension - Content Script +// 在页面中注入,处理页面交互 + +(function() { + 'use strict'; + + // 防止重复注入 + if (window.insightflowInjected) return; + window.insightflowInjected = true; + + // 创建浮动按钮 + let floatingButton = null; + let selectionPopup = null; + + // 监听选中文本 + document.addEventListener('mouseup', handleSelection); + document.addEventListener('keyup', handleSelection); + + function handleSelection(e) { + const selection = window.getSelection(); + const text = selection.toString().trim(); + + if (text.length > 0) { + showFloatingButton(selection); + } else { + hideFloatingButton(); + hideSelectionPopup(); + } + } + + // 显示浮动按钮 + function showFloatingButton(selection) { + if (!floatingButton) { + floatingButton = document.createElement('div'); + floatingButton.className = 'insightflow-float-btn'; + floatingButton.innerHTML = ` + + + + `; + floatingButton.title = '保存到 InsightFlow'; + document.body.appendChild(floatingButton); + + floatingButton.addEventListener('click', () => { + const text = window.getSelection().toString().trim(); + if (text) { + showSelectionPopup(text); + } + }); + } + + // 
定位按钮 + const range = selection.getRangeAt(0); + const rect = range.getBoundingClientRect(); + + floatingButton.style.left = `${rect.right + window.scrollX - 40}px`; + floatingButton.style.top = `${rect.top + window.scrollY - 45}px`; + floatingButton.style.display = 'flex'; + } + + // 隐藏浮动按钮 + function hideFloatingButton() { + if (floatingButton) { + floatingButton.style.display = 'none'; + } + } + + // 显示选择弹窗 + function showSelectionPopup(text) { + hideFloatingButton(); + + if (!selectionPopup) { + selectionPopup = document.createElement('div'); + selectionPopup.className = 'insightflow-popup'; + document.body.appendChild(selectionPopup); + } + + selectionPopup.innerHTML = ` +
+ <div class="insightflow-popup-header">
+ <span>保存到 InsightFlow</span>
+ <button class="insightflow-close-btn">&times;</button>
+ </div>
+ <div class="insightflow-popup-content">
+ <div class="insightflow-text-preview">${escapeHtml(text.substring(0, 200))}${text.length > 200 ? '...' : ''}</div>
+ <div class="insightflow-actions">
+ <button class="insightflow-btn insightflow-btn-primary" id="if-save-quick">快速保存</button>
+ <button class="insightflow-btn" id="if-save-select">选择项目</button>
+ </div>
+ </div>
+ `; + + selectionPopup.style.display = 'block'; + + // 定位弹窗 + const selection = window.getSelection(); + const range = selection.getRangeAt(0); + const rect = range.getBoundingClientRect(); + + selectionPopup.style.left = `${Math.min(rect.left + window.scrollX, window.innerWidth - 320)}px`; + selectionPopup.style.top = `${rect.bottom + window.scrollY + 10}px`; + + // 绑定事件 + selectionPopup.querySelector('.insightflow-close-btn').addEventListener('click', hideSelectionPopup); + selectionPopup.querySelector('#if-save-quick').addEventListener('click', () => saveQuick(text)); + selectionPopup.querySelector('#if-save-select').addEventListener('click', () => saveWithProject(text)); + } + + // 隐藏选择弹窗 + function hideSelectionPopup() { + if (selectionPopup) { + selectionPopup.style.display = 'none'; + } + } + + // 快速保存 + async function saveQuick(text) { + hideSelectionPopup(); + + chrome.runtime.sendMessage({ + action: 'clipPage', + selectionText: text + }); + } + + // 选择项目保存 + async function saveWithProject(text) { + // 获取项目列表 + chrome.runtime.sendMessage({ action: 'fetchProjects' }, (response) => { + if (response.success && response.projects.length > 0) { + showProjectSelector(text, response.projects); + } else { + saveQuick(text); // 失败时快速保存 + } + }); + } + + // 显示项目选择器 + function showProjectSelector(text, projects) { + selectionPopup.innerHTML = ` +
+      <!-- markup reconstructed from surviving text; original tags were lost in extraction -->
+      <div class="insightflow-popup-header">
+        <span>选择项目</span>
+        <button class="insightflow-close-btn">&times;</button>
+      </div>
+      <div class="insightflow-popup-body">
+        ${projects.map(p => `
+          <div class="insightflow-project-item" data-id="${p.id}">
+            <div class="insightflow-project-name">${escapeHtml(p.name)}</div>
+            <div class="insightflow-project-desc">${escapeHtml(p.description || '').substring(0, 50)}</div>
+          </div>
+        `).join('')}
+      </div>
+    `;
+
+    selectionPopup.querySelector('.insightflow-close-btn').addEventListener('click', hideSelectionPopup);
+
+    // Wire up project-selection events
+    selectionPopup.querySelectorAll('.insightflow-project-item').forEach(item => {
+      item.addEventListener('click', () => {
+        const projectId = item.dataset.id;
+        saveToProject(text, projectId);
+      });
+    });
+  }
+
+  // Save to a specific project
+  async function saveToProject(text, projectId) {
+    hideSelectionPopup();
+
+    chrome.runtime.sendMessage({
+      action: 'getConfig'
+    }, (config) => {
+      // Temporarily set the default project
+      config.defaultProjectId = projectId;
+      chrome.runtime.sendMessage({
+        action: 'saveConfig',
+        config: config
+      }, () => {
+        chrome.runtime.sendMessage({
+          action: 'clipPage',
+          selectionText: text
+        });
+      });
+    });
+  }
+
+  // HTML escaping
+  function escapeHtml(text) {
+    const div = document.createElement('div');
+    div.textContent = text;
+    return div.innerHTML;
+  }
+
+  // Close the popups when clicking elsewhere on the page
+  document.addEventListener('click', (e) => {
+    if (selectionPopup && !selectionPopup.contains(e.target) &&
+        floatingButton && !floatingButton.contains(e.target)) {
+      hideSelectionPopup();
+      hideFloatingButton();
+    }
+  });
+
+})();
\ No newline at end of file
diff --git a/chrome-extension/manifest.json b/chrome-extension/manifest.json
new file mode 100644
index 0000000..b89bffc
--- /dev/null
+++ b/chrome-extension/manifest.json
@@ -0,0 +1,46 @@
+{
+  "manifest_version": 3,
+  "name": "InsightFlow Clipper",
+  "version": "1.0.0",
+  "description": "将网页内容一键导入 InsightFlow 知识库",
+  "permissions": [
+    "activeTab",
+    "storage",
+    "contextMenus",
+    "scripting"
+  ],
+  "host_permissions": [
+    "http://*/*",
+    "https://*/*"
+  ],
+  "action": {
+    "default_popup": "popup.html",
+    "default_icon": {
+      "16": "icons/icon16.png",
+      "48": "icons/icon48.png",
+      "128": "icons/icon128.png"
+    }
+  },
+  "icons": {
+    "16": "icons/icon16.png",
+    "48": "icons/icon48.png",
+    "128": "icons/icon128.png"
+  },
+  "background": {
+    "service_worker": "background.js"
+  },
+  "content_scripts": [
+    {
+      "matches": ["<all_urls>"],
+      "js": ["content.js"],
+      "css": ["content.css"]
+    }
+  ],
+  "options_page": "options.html",
+  "web_accessible_resources": [
+    {
+      "resources": ["icons/*.png"],
+      "matches": ["<all_urls>"]
+    }
+  ]
+}
\ No newline at end of file
diff --git a/chrome-extension/options.html b/chrome-extension/options.html
new file mode 100644
index 0000000..406a118
--- /dev/null
+++ b/chrome-extension/options.html
@@ -0,0 +1,349 @@
+[options.html head markup was stripped during extraction; page title: "InsightFlow Clipper 设置"]
+[options.html body markup was stripped during extraction; recoverable page text, in order:]
+  ⚙️ InsightFlow Clipper 设置 / 配置您的知识库连接
+  服务器连接
+    如何获取 API Key: 1. 登录 InsightFlow 控制台 2. 进入「插件管理」页面 3. 创建 Chrome 插件并复制 API Key
+    InsightFlow 服务器的 URL 地址
+    从 InsightFlow 控制台获取的插件 API Key
+  默认设置
+    保存内容时默认导入的项目
+  使用说明
+    • 保存当前页面:点击扩展图标
+    • 保存选中文本:右键 → 保存到 InsightFlow
+    • 快速保存选中内容:选中文本后点击浮动按钮
+    • 选择项目保存:选中文本后点击"选择项目"
+
+
\ No newline at end of file
diff --git a/chrome-extension/options.js b/chrome-extension/options.js
new file mode 100644
index 0000000..a5aa67b
--- /dev/null
+++ b/chrome-extension/options.js
@@ -0,0 +1,175 @@
+// InsightFlow Chrome Extension - Options Script
+
+document.addEventListener('DOMContentLoaded', () => {
+  const serverUrlInput = document.getElementById('serverUrl');
+  const apiKeyInput = document.getElementById('apiKey');
+  const defaultProjectSelect = document.getElementById('defaultProject');
+  const testBtn = document.getElementById('testBtn');
+  const testResult = document.getElementById('testResult');
+  const saveBtn = document.getElementById('saveBtn');
+  const resetBtn = document.getElementById('resetBtn');
+  const openConsole = document.getElementById('openConsole');
+  const helpLink = document.getElementById('helpLink');
+
+  // Load saved configuration
+  loadConfig();
+
+  // Test the server connection
+  testBtn.addEventListener('click', async () => {
+    testBtn.disabled = true;
+    testBtn.textContent = '测试中...';
+    testResult.className = '';
+    testResult.style.display = 'none';
+
+    const serverUrl = serverUrlInput.value.trim();
+    const apiKey = apiKeyInput.value.trim();
+
+    if (!serverUrl || !apiKey) {
+      showTestResult('请填写服务器地址和 API Key', 'error');
+      testBtn.disabled = false;
+      testBtn.textContent = '测试连接';
+      return;
+    }
+
+    try {
+      const response = await fetch(`${serverUrl}/api/v1/projects`, {
+        headers: { 'X-API-Key': apiKey }
+      });
+
+      if (response.ok) {
+        const data = await response.json();
+        showTestResult(`连接成功!找到 ${data.projects?.length || 0} 个项目`, 'success');
+
+        // Refresh the project dropdown
+        updateProjectList(data.projects || []);
+      } else if (response.status === 401) {
+        showTestResult('API Key 无效,请检查', 'error');
+      } else {
+        showTestResult(`连接失败: HTTP ${response.status}`, 'error');
+      }
+    } catch (error) {
+      showTestResult(`连接错误: ${error.message}`, 'error');
+    }
+
+    testBtn.disabled = false;
+    testBtn.textContent = '测试连接';
+  });
+
+  // Save settings
+  saveBtn.addEventListener('click', async () => {
+    const config = {
+      serverUrl: serverUrlInput.value.trim(),
+      apiKey: apiKeyInput.value.trim(),
+      defaultProjectId: defaultProjectSelect.value
+    };
+
+    if (!config.serverUrl) {
+      alert('请填写服务器地址');
+      return;
+    }
+
+    await chrome.storage.sync.set({ insightflowConfig: config });
+
+    // Show a save confirmation on the button
+    saveBtn.textContent = '已保存 ✓';
+    saveBtn.classList.add('btn-success');
+
+    setTimeout(() => {
+      saveBtn.textContent = '保存设置';
+      saveBtn.classList.remove('btn-success');
+    }, 2000);
+  });
+
+  // Reset settings to defaults
+  resetBtn.addEventListener('click', () => {
+    if (confirm('确定要重置所有设置吗?')) {
+      const defaultConfig = {
+        serverUrl: 'http://122.51.127.111:18000',
+        apiKey: '',
+        defaultProjectId: ''
+      };
+
+      chrome.storage.sync.set({ insightflowConfig: defaultConfig }, () => {
+        loadConfig();
+        showTestResult('设置已重置', 'success');
+      });
+    }
+  });
+
+  // Open the InsightFlow console
+  openConsole.addEventListener('click', (e) => {
+    e.preventDefault();
+    const serverUrl = serverUrlInput.value.trim();
+    if (serverUrl) {
+      chrome.tabs.create({ url: serverUrl });
+    }
+  });
+
+  // Help link opens the API docs
+  helpLink.addEventListener('click', (e) => {
+    e.preventDefault();
+    const serverUrl = serverUrlInput.value.trim();
+    if (serverUrl) {
+      chrome.tabs.create({ url: `${serverUrl}/docs` });
+    }
+  });
+
+  // Load configuration into the form
+  async function loadConfig() {
+    const result = await chrome.storage.sync.get(['insightflowConfig']);
+    const config = result.insightflowConfig || {
+      serverUrl: 'http://122.51.127.111:18000',
+      apiKey: '',
+      defaultProjectId: ''
+    };
+
+    serverUrlInput.value = config.serverUrl;
+    apiKeyInput.value = config.apiKey;
+
+    // If an API Key is configured, load the project list
+    if (config.apiKey) {
+      loadProjects(config);
+    }
+  }
+
+  // Load the project list
+  async function loadProjects(config) {
+    try {
+      const response = await fetch(`${config.serverUrl}/api/v1/projects`, {
+        headers: { 'X-API-Key': config.apiKey }
+      });
+
+      if (response.ok) {
+        const data = await response.json();
+        updateProjectList(data.projects || [], config.defaultProjectId);
+      }
+    } catch (error) {
console.error('Failed to load projects:', error);
+    }
+  }
+
+  // Rebuild the project dropdown (the original <option> markup was lost in
+  // extraction; reconstructed from the `selected`/`escapeHtml` usage below)
+  function updateProjectList(projects, selectedId = '') {
+    let html = '';
+
+    projects.forEach(project => {
+      const selected = project.id === selectedId ? 'selected' : '';
+      html += `<option value="${project.id}" ${selected}>${escapeHtml(project.name)}</option>`;
+    });
+
+    defaultProjectSelect.innerHTML = html;
+  }
+
+  // Show the connection-test result
+  function showTestResult(message, type) {
+    testResult.textContent = message;
+    testResult.className = type;
+  }
+
+  // HTML escaping
+  function escapeHtml(text) {
+    const div = document.createElement('div');
+    div.textContent = text;
+    return div.innerHTML;
+  }
+});
\ No newline at end of file
diff --git a/chrome-extension/popup.html b/chrome-extension/popup.html
new file mode 100644
index 0000000..39d5c12
--- /dev/null
+++ b/chrome-extension/popup.html
@@ -0,0 +1,258 @@
+[popup.html head markup was stripped during extraction; page title: "InsightFlow Clipper"]
+[popup.html body markup was stripped during extraction; recoverable page text, in order:]
+  🧠 InsightFlow / 一键保存网页到知识库
+  连接中...
+  0 已保存 | 0 项目数 | 0 今日
+
+
\ No newline at end of file
diff --git a/chrome-extension/popup.js b/chrome-extension/popup.js
new file mode 100644
index 0000000..6376a42
--- /dev/null
+++ b/chrome-extension/popup.js
@@ -0,0 +1,195 @@
+// InsightFlow Chrome Extension - Popup Script
+
+document.addEventListener('DOMContentLoaded', async () => {
+  const clipBtn = document.getElementById('clipBtn');
+  const settingsBtn = document.getElementById('settingsBtn');
+  const projectSelect = document.getElementById('projectSelect');
+  const statusDot = document.getElementById('statusDot');
+  const statusText = document.getElementById('statusText');
+  const messageEl = document.getElementById('message');
+  const openDashboard = document.getElementById('openDashboard');
+
+  // Load configuration and the project list
+  await loadConfig();
+
+  // "Save current page" button
+  clipBtn.addEventListener('click', async () => {
+    const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
+
+    // Update the button state
+    clipBtn.disabled = true;
+    clipBtn.innerHTML = ' 保存中...';
+
+    // Persist the selected project
+    const projectId = projectSelect.value;
+    if (projectId) {
+      const config = await getConfig();
+      config.defaultProjectId = projectId;
+      await saveConfig(config);
+    }
+
+    // Send the clip request to the background script
+    chrome.runtime.sendMessage({
+      action: 'clipPage'
+    }, (response) => {
+      clipBtn.disabled = false;
+      clipBtn.innerHTML = `
+        保存当前页面
+      `;
+
+      if (response && response.success) {
+        showMessage('保存成功!', 'success');
+        updateStats();
+      } else {
+        showMessage(response?.error || '保存失败', 'error');
+      }
+    });
+  });
+
+  // Settings button
+  settingsBtn.addEventListener('click', () => {
+    chrome.runtime.openOptionsPage();
+  });
+
+  // Open the dashboard
+  openDashboard.addEventListener('click', async (e) => {
+    e.preventDefault();
+    const config = await getConfig();
+    chrome.tabs.create({ url: config.serverUrl });
+  });
+});
+
+// Load configuration
+async function loadConfig() {
+  const config = await getConfig();
+
+  // Check connection status
+  checkConnection(config);
+
+  // Load the project list
+  loadProjects(config);
+
+  // Update statistics
+  updateStats();
+}
+
+// Check connection status
+async function checkConnection(config) {
+  const statusDot = document.getElementById('statusDot');
+  const statusText = document.getElementById('statusText');
+
+  if (!config.apiKey) {
+    statusDot.classList.add('error');
+    statusText.textContent = '未配置 API Key';
+    return;
+  }
+
+  try {
+    const response = await fetch(`${config.serverUrl}/api/v1/projects`, {
+      headers: { 'X-API-Key': config.apiKey }
+    });
+
+    if (response.ok) {
+      statusText.textContent = '已连接';
+    } else {
+      statusDot.classList.add('error');
+      statusText.textContent = '连接失败';
+    }
+  } catch (error) {
+    statusDot.classList.add('error');
+    statusText.textContent = '连接错误';
+  }
+}
+
+// Load the project list
+async function loadProjects(config) {
+  const projectSelect = document.getElementById('projectSelect');
+
+  if (!config.apiKey) {
+    projectSelect.innerHTML = '';
+    return;
+  }
+
+  try {
+    const response = await fetch(`${config.serverUrl}/api/v1/projects`, {
+      headers: { 'X-API-Key': config.apiKey }
+    });
+
+    if (response.ok) {
+      const data = await response.json();
+      const projects = data.projects || [];
+
+      // Update the project-count stat
+      document.getElementById('projectCount').textContent = projects.length;
+
+      // Fill the dropdown
+      let html = '';
+      projects.forEach(project => {
+        const selected = project.id === config.defaultProjectId ?
'selected' : '';
+        html += `<option value="${project.id}" ${selected}>${escapeHtml(project.name)}</option>`;
+      });
+      projectSelect.innerHTML = html;
+    }
+  } catch (error) {
+    console.error('Failed to load projects:', error);
+  }
+}
+
+// Update statistics
+async function updateStats() {
+  // Read clip stats from local storage
+  const result = await chrome.storage.local.get(['clipStats']);
+  const stats = result.clipStats || { total: 0, today: 0, lastDate: null };
+
+  // Reset the daily counter when the date changes
+  const today = new Date().toDateString();
+  if (stats.lastDate !== today) {
+    stats.today = 0;
+    stats.lastDate = today;
+    await chrome.storage.local.set({ clipStats: stats });
+  }
+
+  document.getElementById('clipCount').textContent = stats.total;
+  document.getElementById('todayCount').textContent = stats.today;
+}
+
+// Show a transient message
+function showMessage(text, type) {
+  const messageEl = document.getElementById('message');
+  messageEl.textContent = text;
+  messageEl.className = `message ${type}`;
+
+  setTimeout(() => {
+    messageEl.className = 'message';
+  }, 3000);
+}
+
+// Get configuration
+function getConfig() {
+  return new Promise((resolve) => {
+    chrome.storage.sync.get(['insightflowConfig'], (result) => {
+      resolve(result.insightflowConfig || {
+        serverUrl: 'http://122.51.127.111:18000',
+        apiKey: '',
+        defaultProjectId: ''
+      });
+    });
+  });
+}
+
+// Save configuration
+function saveConfig(config) {
+  return new Promise((resolve) => {
+    chrome.storage.sync.set({ insightflowConfig: config }, resolve);
+  });
+}
+
+// HTML escaping
+function escapeHtml(text) {
+  const div = document.createElement('div');
+  div.textContent = text;
+  return div.innerHTML;
+}
\ No newline at end of file
diff --git a/docs/PHASE7_TASK2_SUMMARY.md b/docs/PHASE7_TASK2_SUMMARY.md
new file mode 100644
index 0000000..d4ddbb2
--- /dev/null
+++ b/docs/PHASE7_TASK2_SUMMARY.md
@@ -0,0 +1,95 @@
+# InsightFlow Phase 7 Task 2 Development Summary
+
+## Completed Work
+
+### 1.
Multimodal Processing Module (multimodal_processor.py)
+
+#### VideoProcessor class
+- **Video file handling**: supports the MP4, AVI, MOV, MKV, WebM, and FLV formats
+- **Audio extraction**: uses ffmpeg to extract the audio track (WAV format, 16 kHz sample rate)
+- **Keyframe extraction**: uses OpenCV to sample keyframes at a fixed interval (every 5 seconds by default)
+- **OCR**: recognizes text in keyframes with PaddleOCR/EasyOCR/Tesseract
+- **Aggregation**: merges the OCR text of all frames and feeds it to entity extraction
+
+#### ImageProcessor class
+- **Image handling**: supports the JPG, PNG, GIF, BMP, and WebP formats
+- **OCR**: recognizes text in images (whiteboards, slides, handwritten notes)
+- **Image captioning**: interface reserved for a multimodal LLM (integration pending)
+- **Batch processing**: supports bulk image import
+
+#### MultimodalEntityExtractor class
+- Extracts entities and relations from video and image processing results
+- Integrates with the existing LLM client
+
+### 2. Multimodal Entity Linking Module (multimodal_entity_linker.py)
+
+#### MultimodalEntityLinker class
+- **Cross-modal entity alignment**: uses embedding similarity to find the same entity across modalities
+- **Multimodal entity profiles**: counts how often an entity is mentioned in each modality
+- **Cross-modal relation discovery**: finds entities that co-occur in the same video frame or image
+- **Multimodal timeline**: presents multimodal events in chronological order
+
+### 3. Database Updates (schema.sql)
+
+New tables:
+- `videos`: video metadata (duration, frame rate, resolution, OCR text)
+- `video_frames`: video keyframes (frame data, timestamp, OCR text)
+- `images`: image metadata (OCR text, caption, extracted entities)
+- `multimodal_mentions`: multimodal entity mentions
+- `multimodal_entity_links`: multimodal entity links
+
+### 4. API Endpoints (main.py)
+
+#### Video
+- `POST /api/v1/projects/{id}/upload-video` - upload a video
+- `GET /api/v1/projects/{id}/videos` - list videos
+- `GET /api/v1/videos/{id}` - video details
+
+#### Image
+- `POST /api/v1/projects/{id}/upload-image` - upload an image
+- `GET /api/v1/projects/{id}/images` - list images
+- `GET /api/v1/images/{id}` - image details
+
+#### Multimodal entity linking
+- `POST /api/v1/projects/{id}/multimodal/link-entities` - link entities across modalities
+- `GET /api/v1/entities/{id}/multimodal-profile` - multimodal profile of an entity
+- `GET /api/v1/projects/{id}/multimodal-timeline` - multimodal timeline
+- `GET /api/v1/entities/{id}/cross-modal-relations` - cross-modal relations
+
+### 5. Dependency Updates (requirements.txt)
+
+New dependencies:
+- `opencv-python==4.9.0.80` - video processing
+- `pillow==10.2.0` - image processing
+- `paddleocr==2.7.0.3` + `paddlepaddle==2.6.0` - OCR engine
+- `ffmpeg-python==0.2.0` - ffmpeg wrapper
+- `sentence-transformers==2.3.1` - cross-modal alignment
+
+## System Requirements
+
+- **ffmpeg**: must be installed; used for video and audio processing
+- **Python 3.8+**: required by the dependency set
+
+## Remaining Work
+
+1. **Multimodal LLM integration**: image captioning needs Kimi or another multimodal model API
+2. **Frontend**: video/image upload UI and multimodal display components still need to be built
+3. **Performance**: large video files may require an asynchronous task queue
+4. **OCR engine selection**: pick the OCR engine best suited to the deployment environment
+
+## Deployment Notes
+
+```bash
+# Install system dependencies
+apt-get update
+apt-get install -y ffmpeg
+
+# Install Python dependencies
+pip install -r requirements.txt
+
+# Apply database schema updates
+sqlite3 insightflow.db < schema.sql
+
+# Start the service
+python -m uvicorn main:app --reload --host 0.0.0.0 --port 8000
+```
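
The cross-modal entity alignment described in Section 2 reduces to a cosine-similarity test between embedding vectors of entity mentions from different modalities (e.g. a name from a transcript vs. the same name OCR'd from a slide). Below is a minimal, dependency-free sketch of that matching rule; `embed`, `link_entities`, and the `0.9` threshold are illustrative assumptions, not the actual `MultimodalEntityLinker` API, and the real module uses sentence-transformers embeddings rather than the toy vectors here:

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def link_entities(text_entities, image_entities, embed, threshold=0.9):
    """Link mentions from two modalities whose embeddings are close enough.

    `embed` maps a mention string to a vector (in the real pipeline this
    would be a sentence-transformers model's encode()).  Returns
    (text_mention, image_mention, score) triples above the threshold.
    """
    links = []
    for t in text_entities:
        for i in image_entities:
            score = cosine_sim(embed(t), embed(i))
            if score >= threshold:
                links.append((t, i, round(score, 3)))
    return links

# Toy embeddings standing in for a real model.
vectors = {
    "张三":      (0.98, 0.10, 0.05),   # person mentioned in a transcript
    "Zhang San": (0.95, 0.15, 0.08),   # same person, OCR'd from a slide
    "上海":      (0.05, 0.97, 0.20),   # a location; should not match
}
links = link_entities(["张三", "上海"], ["Zhang San"], vectors.get)
# links -> [("张三", "Zhang San", 0.998)]
```

Matched pairs would then be persisted, presumably into the `multimodal_entity_links` table described above, with the score kept so that weak links can be re-thresholded later without recomputing embeddings.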