Compare commits
4 Commits: 2a3081c151...7e192a9f0a

| Author | SHA1 | Date | |
|---|---|---|---|
| | 7e192a9f0a | | |
| | 5005a2df52 | | |
| | da8a4db985 | | |
| | 643fe46780 | | |
Dockerfile (34 changes)
@@ -1,29 +1,33 @@
+# InsightFlow - Audio to Knowledge Graph Platform
+# Phase 3: Memory & Growth
+
 FROM python:3.11-slim
 
 WORKDIR /app
 
-# Install uv
-COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
-
-# Install system deps
+# Install system dependencies
 RUN apt-get update && apt-get install -y \
-    ffmpeg \
-    git \
+    gcc \
+    libpq-dev \
     && rm -rf /var/lib/apt/lists/*
 
-# Copy project files
-COPY backend/pyproject.toml backend/uv.lock ./
+# Copy backend requirements
+COPY backend/requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
 
-# Install dependencies using uv sync
-RUN uv sync --frozen --no-install-project
-
-# Copy code
+# Copy application code
 COPY backend/ ./backend/
 COPY frontend/ ./frontend/
 
-# Install project
-RUN uv sync --frozen
+# Create data directory
+RUN mkdir -p /app/data
 
+# Set environment variables
+ENV PYTHONPATH=/app
+ENV DB_PATH=/app/data/insightflow.db
+
+# Expose port
 EXPOSE 8000
 
-CMD ["uv", "run", "python", "backend/main.py"]
+# Run the application
+CMD ["python", "-m", "uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
README.md (103 changes)
@@ -1,27 +1,88 @@
-# InsightFlow
+# InsightFlow - Audio to Knowledge Graph Platform
 
-A platform for building domain knowledge from audio and documents
+## Phase 3: Memory & Growth - Completed ✅
 
-## Product Positioning
-Turns meeting recordings and documents into a structured knowledge graph, and keeps that knowledge growing through a human-in-the-loop workflow.
+### New Features
 
-## Core Features
-- 🎙️ ASR speech recognition + hot-word injection
-- 🧠 LLM entity extraction and explanation
-- 🔗 Linked dual views (document view + graph view)
-- 📈 Knowledge growth (entity alignment across files)
+#### 1. Multi-file graph fusion ✅
+- Upload multiple audio files to the same project
+- The system aligns entities automatically and merges the graphs
+- Entity mentions are tracked across files
+- A file selector switches between transcripts
 
-## Tech Stack
-- Frontend: Next.js + Tailwind
-- Backend: Node.js / Python
-- Database: MySQL + Neo4j
-- ASR: Whisper
-- LLM: OpenAI / Kimi
+#### 2. Entity alignment algorithm improvements ✅
+- New `entity_aligner.py` module
+- Semantic similarity matching with Kimi API embeddings
+- Cosine similarity computation
+- Automatic alias suggestions
+- Batch entity alignment API
 
-## Development Phases
-- [ ] Phase 1: Skeleton and single-file analysis (MVP)
-- [ ] Phase 2: Interaction and correction workbench
-- [ ] Phase 3: Memory and growth
+#### 3. PDF/DOCX document import ✅
+- New `document_processor.py` module
+- Supports PDF, DOCX, TXT, and MD formats
+- Text extracted from documents feeds into entity extraction
+- Source type tag (audio/document)
 
-## Documentation
-- [PRD v2.0](docs/PRD-v2.0.md)
+#### 4. Project knowledge base panel ✅
+- A new knowledge base view
+- Statistics panel: entity, relation, file, and term counts
+- Entity grid (with mention counts)
+- Relation list
+- Glossary management (add/delete)
+- File list
+
+### Tech Stack
+- Backend: FastAPI + SQLite
+- Frontend: vanilla HTML/JS + D3.js
+- ASR: Alibaba Cloud Tingwu
+- LLM: Kimi API
+- Document processing: PyPDF2, python-docx
+
+### Deployment
+
+```bash
+# Build the Docker image
+docker build -t insightflow:phase3 .
+
+# Run the container
+docker run -d \
+  -p 18000:8000 \
+  -v /opt/data:/app/data \
+  -e KIMI_API_KEY=your_key \
+  -e ALIYUN_ACCESS_KEY_ID=your_key \
+  -e ALIYUN_ACCESS_KEY_SECRET=your_secret \
+  insightflow:phase3
+```
+
+### API Documentation
+
+#### New APIs
+
+**Document upload**
+```
+POST /api/v1/projects/{project_id}/upload-document
+Content-Type: multipart/form-data
+file: <file>
+```
+
+**Knowledge base query**
+```
+GET /api/v1/projects/{project_id}/knowledge-base
+```
+
+**Glossary management**
+```
+POST /api/v1/projects/{project_id}/glossary
+GET /api/v1/projects/{project_id}/glossary
+DELETE /api/v1/glossary/{term_id}
+```
+
+**Entity alignment**
+```
+POST /api/v1/projects/{project_id}/align-entities?threshold=0.85
+```
+
+### Database Schema Updates
+- `transcripts` table: new `type` column (audio/document)
+- `entities` table: new `embedding` column
+- New indexes to improve query performance
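The README lists cosine similarity and a `threshold=0.85` query parameter for entity alignment. As a rough illustration of how a threshold of that kind is typically applied, here is a minimal standard-library sketch. The function names `cosine_similarity` and `should_merge` are illustrative and not taken from the repository, which uses NumPy in `entity_aligner.py`.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_merge(a: Sequence[float], b: Sequence[float], threshold: float = 0.85) -> bool:
    """Hypothetical helper: treat two entities as the same when similarity reaches the threshold."""
    return cosine_similarity(a, b) >= threshold
```

A higher threshold merges fewer entities; 0.85 is simply the value shown in the endpoint example above.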
STATUS.md (102 changes)
@@ -4,11 +4,13 @@
 
 ## Current Phase
 
-Phase 1: Skeleton and single-file analysis (MVP) - **Completed ✅**
+Phase 3: Memory and growth - **Completed ✅**
 
 ## Completed
 
-### Backend (backend/)
+### Phase 1: Skeleton and single-file analysis (MVP) ✅
+
+#### Backend (backend/)
 - ✅ FastAPI project scaffolding
 - ✅ SQLite database design (schema.sql)
 - ✅ Database manager module (db_manager.py)
@@ -26,7 +28,7 @@ Phase 1: Skeleton and single-file analysis (MVP) - **Completed ✅**
 - ✅ Writes to the entity_mentions table
 - ✅ Writes to the entity_relations table
 
-### Frontend (frontend/)
+#### Frontend (frontend/)
 - ✅ Project management page (index.html)
 - ✅ Knowledge workbench page (workbench.html)
 - ✅ D3.js knowledge graph visualization
@@ -35,35 +37,97 @@ Phase 1: Skeleton and single-file analysis (MVP) - **Completed ✅**
 - ✅ Entity highlighting in transcript text
 - ✅ Graph/text linking (clicking an entity highlights it in both views)
 
-### Infrastructure
-- ✅ Dockerfile
-- ✅ docker-compose.yml
-- ✅ Git repository initialized
+### Phase 2: Interaction and correction workbench ✅
 
-## Phase 2 Plan (interaction and correction workbench) - **Starting soon**
+#### New backend APIs
+- ✅ Entity edit API (PUT /api/v1/entities/{id})
+- ✅ Entity delete API (DELETE /api/v1/entities/{id})
+- ✅ Entity merge API (POST /api/v1/entities/{id}/merge)
+- ✅ Manual entity creation API (POST /api/v1/projects/{id}/entities)
+- ✅ Relation create API (POST /api/v1/projects/{id}/relations)
+- ✅ Relation delete API (DELETE /api/v1/relations/{id})
+- ✅ Transcript edit API (PUT /api/v1/transcripts/{id})
 
-- Entity definition editing
-- Entity merging
-- Relation editing (add/delete)
-- Saving manual corrections
-- Text editor enhancements (editing transcript text)
+#### Frontend interactions
+- ✅ Entity editor modal (name, type, definition, aliases)
+- ✅ Context menu (edit entity, merge entities, mark as entity)
+- ✅ Entity merging
+- ✅ Relation management (add, delete)
+- ✅ Transcript editing mode
+- ✅ Create an entity from selected text
+- ✅ Two-way linking between text and graph
 
-## Phase 3 Plan (memory and growth)
+#### Database updates
+- ✅ update_entity() - updates entity fields
+- ✅ delete_entity() - deletes an entity and its related data
+- ✅ delete_relation() - deletes a relation
+- ✅ update_relation() - updates a relation
+- ✅ update_transcript() - updates transcript text
 
-- Multi-file graph fusion
-- Entity alignment algorithm improvements
-- PDF/DOCX document import
-- Project knowledge base panel
+### Phase 3: Memory and growth ✅
+
+#### Multi-file graph fusion
+- ✅ Upload multiple audio files to the same project
+- ✅ The system aligns entities automatically and merges the graphs
+- ✅ Entity mentions are tracked across files
+- ✅ A file selector switches between transcripts
+- ✅ Transcript list API returns the file type
 
+#### Entity alignment algorithm improvements
+- ✅ New `entity_aligner.py` module
+- ✅ Semantic similarity matching with Kimi API embeddings
+- ✅ Cosine similarity computation
+- ✅ Automatic alias suggestions
+- ✅ Batch entity alignment API
+- ✅ Entity alignment fallback (string matching)
+
+#### PDF/DOCX document import
+- ✅ New `document_processor.py` module
+- ✅ Supports PDF, DOCX, TXT, and MD formats
+- ✅ Text extracted from documents feeds into entity extraction
+- ✅ Document upload API (/api/v1/projects/{id}/upload-document)
+- ✅ Source type tag (audio/document)
+
+#### Project knowledge base panel
+- ✅ A new knowledge base view
+- ✅ Sidebar navigation toggle (workbench/knowledge base)
+- ✅ Statistics panel: entity, relation, file, and term counts
+- ✅ Entity grid (with mention counts)
+- ✅ Relation list
+- ✅ Glossary management (add/delete)
+- ✅ File list (distinguishes audio from documents)
+
+#### Glossary
+- ✅ Glossary database table (glossary)
+- ✅ Add-term API
+- ✅ List-terms API
+- ✅ Delete-term API
+- ✅ Frontend glossary management UI
+
+#### Database updates
+- ✅ transcripts table: new `type` column
+- ✅ entities table: new `embedding` column
+- ✅ New glossary table
+- ✅ New indexes to improve query performance
+
 ## Technical Debt
 
 - The Tingwu SDK fallback to the mock needs better error handling
-- Entity similarity matching is currently plain substring containment; an embedding-based approach is needed
 - The frontend needs state management (it currently uses global variables)
 - API documentation (OpenAPI/Swagger) still needs to be added
+- The embedding cache needs to be persisted
+- The entity alignment algorithm needs more tests
 
 ## Deployment Info
 
 - Server: 122.51.127.111
 - Project path: /opt/projects/insightflow
 - Port: 18000
+- Docker image: insightflow:phase3
+
+## Next Steps (Phase 4)
+
+- Knowledge reasoning and Q&A
+- Extended entity attributes
+- Timeline view
+- Export (PDF/image)
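STATUS.md lists persisting the embedding cache as open technical debt. One possible shape for that, sketched here under assumptions, is a small SQLite table keyed by a SHA-256 digest of the input text. The class name `EmbeddingCache` and the table name `embedding_cache` are hypothetical and do not appear in the repository.

```python
import hashlib
import json
import sqlite3
from typing import List, Optional


class EmbeddingCache:
    """Hypothetical SQLite-backed cache mapping a text to its embedding vector."""

    def __init__(self, db_path: str = ":memory:"):
        # Pass a file path instead of ":memory:" to keep entries across restarts.
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS embedding_cache ("
            "text_hash TEXT PRIMARY KEY, vector TEXT NOT NULL)"
        )
        self.conn.commit()

    @staticmethod
    def _key(text: str) -> str:
        # SHA-256 is stable across processes; the built-in hash() of a str
        # is randomized per interpreter run by default.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get(self, text: str) -> Optional[List[float]]:
        row = self.conn.execute(
            "SELECT vector FROM embedding_cache WHERE text_hash = ?",
            (self._key(text),),
        ).fetchone()
        return json.loads(row[0]) if row else None

    def put(self, text: str, vector: List[float]) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO embedding_cache (text_hash, vector) VALUES (?, ?)",
            (self._key(text), json.dumps(vector)),
        )
        self.conn.commit()
```

Storing the vector as JSON text keeps the sketch dependency-free; a binary column would be more compact for large vectors.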
@@ -1,7 +1,8 @@
 #!/usr/bin/env python3
 """
-InsightFlow Database Manager
+InsightFlow Database Manager - Phase 3
 Persists projects, entities, and relations
+Supports document types and multi-file fusion
 """
 
 import os
@@ -166,6 +167,18 @@ class DatabaseManager:
             (target_id, source_id)
         )
 
+        # Update relations where the merged (source) entity is the source_entity_id
+        conn.execute(
+            "UPDATE entity_relations SET source_entity_id = ? WHERE source_entity_id = ?",
+            (target_id, source_id)
+        )
+
+        # Update relations where the merged (source) entity is the target_entity_id
+        conn.execute(
+            "UPDATE entity_relations SET target_entity_id = ? WHERE target_entity_id = ?",
+            (target_id, source_id)
+        )
+
         # Delete the source entity
         conn.execute("DELETE FROM entities WHERE id = ?", (source_id,))
 
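The hunk above re-points relations to the surviving entity before the merged entity is deleted. A minimal in-memory sketch of that effect follows; the two-table schema is a simplified assumption rather than the project's actual `schema.sql`, and `merge_entities` is an illustrative name.

```python
import sqlite3

# Simplified, assumed schema: only the columns needed for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entities (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE entity_relations (
        id TEXT PRIMARY KEY,
        source_entity_id TEXT,
        target_entity_id TEXT,
        relation_type TEXT
    );
    INSERT INTO entities VALUES ('a', 'K8s'), ('b', 'Kubernetes'), ('c', 'Docker');
    INSERT INTO entity_relations VALUES
        ('r1', 'a', 'c', 'uses'),
        ('r2', 'c', 'a', 'runs_on');
    """
)


def merge_entities(conn: sqlite3.Connection, source_id: str, target_id: str) -> None:
    """Re-point both ends of every relation from source_id to target_id, then drop source_id."""
    conn.execute(
        "UPDATE entity_relations SET source_entity_id = ? WHERE source_entity_id = ?",
        (target_id, source_id),
    )
    conn.execute(
        "UPDATE entity_relations SET target_entity_id = ? WHERE target_entity_id = ?",
        (target_id, source_id),
    )
    conn.execute("DELETE FROM entities WHERE id = ?", (source_id,))
    conn.commit()


merge_entities(conn, "a", "b")
rows = conn.execute(
    "SELECT id, source_entity_id, target_entity_id FROM entity_relations ORDER BY id"
).fetchall()
```

Under this scheme, a relation that previously linked the two merged entities to each other would end up pointing from the surviving entity to itself; whether that needs extra handling depends on the real schema.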
@@ -222,13 +235,13 @@ class DatabaseManager:
         return [EntityMention(**dict(r)) for r in rows]
 
     # Transcript operations
-    def save_transcript(self, transcript_id: str, project_id: str, filename: str, full_text: str):
+    def save_transcript(self, transcript_id: str, project_id: str, filename: str, full_text: str, transcript_type: str = "audio"):
         """Save a transcript record"""
         conn = self.get_conn()
         now = datetime.now().isoformat()
         conn.execute(
-            "INSERT INTO transcripts (id, project_id, filename, full_text, created_at) VALUES (?, ?, ?, ?, ?)",
-            (transcript_id, project_id, filename, full_text, now)
+            "INSERT INTO transcripts (id, project_id, filename, full_text, type, created_at) VALUES (?, ?, ?, ?, ?, ?)",
+            (transcript_id, project_id, filename, full_text, transcript_type, now)
         )
         conn.commit()
         conn.close()
@@ -291,6 +304,156 @@ class DatabaseManager:
         conn.close()
         return [dict(r) for r in rows]
 
+    def update_entity(self, entity_id: str, **kwargs) -> Entity:
+        """Update entity fields"""
+        conn = self.get_conn()
+
+        # Build the list of fields to update
+        allowed_fields = ['name', 'type', 'definition', 'canonical_name']
+        updates = []
+        values = []
+
+        for field in allowed_fields:
+            if field in kwargs:
+                updates.append(f"{field} = ?")
+                values.append(kwargs[field])
+
+        # Handle aliases
+        if 'aliases' in kwargs:
+            updates.append("aliases = ?")
+            values.append(json.dumps(kwargs['aliases']))
+
+        if not updates:
+            conn.close()
+            return self.get_entity(entity_id)
+
+        updates.append("updated_at = ?")
+        values.append(datetime.now().isoformat())
+        values.append(entity_id)
+
+        query = f"UPDATE entities SET {', '.join(updates)} WHERE id = ?"
+        conn.execute(query, values)
+        conn.commit()
+        conn.close()
+
+        return self.get_entity(entity_id)
+
+    def delete_entity(self, entity_id: str):
+        """Delete an entity and its related data"""
+        conn = self.get_conn()
+
+        # Delete mention records
+        conn.execute("DELETE FROM entity_mentions WHERE entity_id = ?", (entity_id,))
+
+        # Delete relations
+        conn.execute("DELETE FROM entity_relations WHERE source_entity_id = ? OR target_entity_id = ?",
+                     (entity_id, entity_id))
+
+        # Delete the entity
+        conn.execute("DELETE FROM entities WHERE id = ?", (entity_id,))
+
+        conn.commit()
+        conn.close()
+
+    def delete_relation(self, relation_id: str):
+        """Delete a relation"""
+        conn = self.get_conn()
+        conn.execute("DELETE FROM entity_relations WHERE id = ?", (relation_id,))
+        conn.commit()
+        conn.close()
+
+    def update_relation(self, relation_id: str, **kwargs) -> dict:
+        """Update a relation"""
+        conn = self.get_conn()
+
+        allowed_fields = ['relation_type', 'evidence']
+        updates = []
+        values = []
+
+        for field in allowed_fields:
+            if field in kwargs:
+                updates.append(f"{field} = ?")
+                values.append(kwargs[field])
+
+        if updates:
+            query = f"UPDATE entity_relations SET {', '.join(updates)} WHERE id = ?"
+            values.append(relation_id)
+            conn.execute(query, values)
+            conn.commit()
+
+        row = conn.execute("SELECT * FROM entity_relations WHERE id = ?", (relation_id,)).fetchone()
+        conn.close()
+
+        return dict(row) if row else None
+
+    def update_transcript(self, transcript_id: str, full_text: str) -> dict:
+        """Update transcript text"""
+        conn = self.get_conn()
+        now = datetime.now().isoformat()
+
+        conn.execute(
+            "UPDATE transcripts SET full_text = ?, updated_at = ? WHERE id = ?",
+            (full_text, now, transcript_id)
+        )
+        conn.commit()
+
+        row = conn.execute("SELECT * FROM transcripts WHERE id = ?", (transcript_id,)).fetchone()
+        conn.close()
+
+        return dict(row) if row else None
+
+    # Phase 3: Glossary operations
+    def add_glossary_term(self, project_id: str, term: str, pronunciation: str = "") -> str:
+        """Add a term to the glossary"""
+        conn = self.get_conn()
+
+        # Check whether the term already exists
+        existing = conn.execute(
+            "SELECT * FROM glossary WHERE project_id = ? AND term = ?",
+            (project_id, term)
+        ).fetchone()
+
+        if existing:
+            # Bump the frequency
+            conn.execute(
+                "UPDATE glossary SET frequency = frequency + 1 WHERE id = ?",
+                (existing['id'],)
+            )
+            conn.commit()
+            conn.close()
+            return existing['id']
+
+        term_id = str(uuid.uuid4())[:8]
+        conn.execute(
+            "INSERT INTO glossary (id, project_id, term, pronunciation, frequency) VALUES (?, ?, ?, ?, ?)",
+            (term_id, project_id, term, pronunciation, 1)
+        )
+        conn.commit()
+        conn.close()
+        return term_id
+
+    def list_glossary(self, project_id: str) -> List[dict]:
+        """List the project's glossary"""
+        conn = self.get_conn()
+        rows = conn.execute(
+            "SELECT * FROM glossary WHERE project_id = ? ORDER BY frequency DESC",
+            (project_id,)
+        ).fetchall()
+        conn.close()
+        return [dict(r) for r in rows]
+
+    def delete_glossary_term(self, term_id: str):
+        """Delete a term"""
+        conn = self.get_conn()
+        conn.execute("DELETE FROM glossary WHERE id = ?", (term_id,))
+        conn.commit()
+        conn.close()
+
+    # Phase 3: Get all entities for embedding
+    def get_all_entities_for_embedding(self, project_id: str) -> List[Entity]:
+        """Get all entities for embedding computation"""
+        return self.list_project_entities(project_id)
+
 
 # Singleton instance
 _db_manager = None
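`update_entity()` and `update_relation()` above both build a dynamic `UPDATE` statement from an allow-list of column names while passing the values as bound parameters. The same pattern can be isolated into a pure helper, sketched below; `build_update` is a hypothetical name and not a function in `db_manager.py`.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


def build_update(
    table: str,
    allowed: Iterable[str],
    changes: Dict[str, Any],
    row_id: str,
) -> Optional[Tuple[str, List[Any]]]:
    """Return (sql, params) for an UPDATE touching only allow-listed columns, or None if nothing applies."""
    # Column names come only from the allow-list, never from caller-supplied keys,
    # so interpolating them into the SQL text does not expose the query to injection.
    columns = [c for c in allowed if c in changes]
    if not columns:
        return None
    assignments = ", ".join(f"{c} = ?" for c in columns)
    params = [changes[c] for c in columns] + [row_id]
    return f"UPDATE {table} SET {assignments} WHERE id = ?", params
```

For example, `build_update("entities", ["name", "type"], {"name": "K8s"}, "e1")` yields a statement that updates only `name`; keys outside the allow-list are silently ignored, which mirrors the behavior of the methods in the diff.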
backend/document_processor.py (180 changes)
Normal file
@@ -0,0 +1,180 @@
+#!/usr/bin/env python3
+"""
+Document Processor - Phase 3
+Supports importing PDF and DOCX documents
+"""
+
+import os
+import io
+from typing import Dict, Optional
+
+class DocumentProcessor:
+    """Document processor: extracts text from PDF/DOCX"""
+
+    def __init__(self):
+        self.supported_formats = {
+            '.pdf': self._extract_pdf,
+            '.docx': self._extract_docx,
+            '.doc': self._extract_docx,
+            '.txt': self._extract_txt,
+            '.md': self._extract_txt,
+        }
+
+    def process(self, content: bytes, filename: str) -> Dict[str, str]:
+        """
+        Process a document and extract its text
+
+        Args:
+            content: raw file bytes
+            filename: file name
+
+        Returns:
+            {"text": "extracted text", "format": "file format"}
+        """
+        ext = os.path.splitext(filename.lower())[1]
+
+        if ext not in self.supported_formats:
+            raise ValueError(f"Unsupported file format: {ext}. Supported: {list(self.supported_formats.keys())}")
+
+        extractor = self.supported_formats[ext]
+        text = extractor(content)
+
+        # Clean the text
+        text = self._clean_text(text)
+
+        return {
+            "text": text,
+            "format": ext,
+            "filename": filename
+        }
+
+    def _extract_pdf(self, content: bytes) -> str:
+        """Extract PDF text"""
+        try:
+            import PyPDF2
+            pdf_file = io.BytesIO(content)
+            reader = PyPDF2.PdfReader(pdf_file)
+
+            text_parts = []
+            for page in reader.pages:
+                page_text = page.extract_text()
+                if page_text:
+                    text_parts.append(page_text)
+
+            return "\n\n".join(text_parts)
+        except ImportError:
+            # Fallback: try pdfplumber
+            try:
+                import pdfplumber
+                text_parts = []
+                with pdfplumber.open(io.BytesIO(content)) as pdf:
+                    for page in pdf.pages:
+                        page_text = page.extract_text()
+                        if page_text:
+                            text_parts.append(page_text)
+                return "\n\n".join(text_parts)
+            except ImportError:
+                raise ImportError("PDF processing requires PyPDF2 or pdfplumber. Install with: pip install PyPDF2")
+        except Exception as e:
+            raise ValueError(f"PDF extraction failed: {str(e)}")
+
+    def _extract_docx(self, content: bytes) -> str:
+        """Extract DOCX text"""
+        try:
+            import docx
+            doc_file = io.BytesIO(content)
+            doc = docx.Document(doc_file)
+
+            text_parts = []
+            for para in doc.paragraphs:
+                if para.text.strip():
+                    text_parts.append(para.text)
+
+            # Extract text from tables
+            for table in doc.tables:
+                for row in table.rows:
+                    row_text = []
+                    for cell in row.cells:
+                        if cell.text.strip():
+                            row_text.append(cell.text.strip())
+                    if row_text:
+                        text_parts.append(" | ".join(row_text))
+
+            return "\n\n".join(text_parts)
+        except ImportError:
+            raise ImportError("DOCX processing requires python-docx. Install with: pip install python-docx")
+        except Exception as e:
+            raise ValueError(f"DOCX extraction failed: {str(e)}")
+
+    def _extract_txt(self, content: bytes) -> str:
+        """Extract plain text"""
+        # Try several encodings
+        encodings = ['utf-8', 'gbk', 'gb2312', 'latin-1']
+
+        for encoding in encodings:
+            try:
+                return content.decode(encoding)
+            except UnicodeDecodeError:
+                continue
+
+        # If all fail, use latin-1 and ignore errors
+        return content.decode('latin-1', errors='ignore')
+
+    def _clean_text(self, text: str) -> str:
+        """Clean the extracted text"""
+        if not text:
+            return ""
+
+        # Remove extra whitespace
+        lines = text.split('\n')
+        cleaned_lines = []
+
+        for line in lines:
+            line = line.strip()
+            # Drop empty lines but keep paragraph breaks
+            if line:
+                cleaned_lines.append(line)
+
+        # Join lines, keeping paragraph structure
+        text = '\n\n'.join(cleaned_lines)
+
+        # Remove extra spaces
+        text = ' '.join(text.split())
+
+        # Remove control characters
+        text = ''.join(char for char in text if ord(char) >= 32 or char in '\n\r\t')
+
+        return text.strip()
+
+    def is_supported(self, filename: str) -> bool:
+        """Check whether the file format is supported"""
+        ext = os.path.splitext(filename.lower())[1]
+        return ext in self.supported_formats
+
+
+# Simple text extractor (no external dependencies)
+class SimpleTextExtractor:
+    """Simple text extractor, for testing"""
+
+    def extract(self, content: bytes, filename: str) -> str:
+        """Try to extract text"""
+        encodings = ['utf-8', 'gbk', 'latin-1']
+
+        for encoding in encodings:
+            try:
+                return content.decode(encoding)
+            except UnicodeDecodeError:
+                continue
+
+        return content.decode('latin-1', errors='ignore')
+
+
+if __name__ == "__main__":
+    # Test
+    processor = DocumentProcessor()
+
+    # Test text extraction
+    test_text = "Hello World\n\nThis is a test document.\n\nMultiple paragraphs."
+    result = processor.process(test_text.encode('utf-8'), "test.txt")
+    print(f"Text extraction test: {len(result['text'])} chars")
+    print(result['text'][:100])
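In `_clean_text` above, the `' '.join(text.split())` step appears to collapse the `\n\n` separators inserted just before it, so the returned text ends up on a single line despite the comments about keeping paragraph structure. A minimal sketch of a variant that does keep paragraph breaks follows; `clean_text_keep_paragraphs` is a hypothetical name and this is one possible approach, not the project's fix.

```python
import re


def clean_text_keep_paragraphs(text: str) -> str:
    """Collapse whitespace inside paragraphs while keeping blank-line paragraph breaks."""
    if not text:
        return ""
    # Drop control characters except newline and tab (carriage returns are removed too).
    text = "".join(ch for ch in text if ord(ch) >= 32 or ch in "\n\t")
    paragraphs = []
    # A run of one or more blank lines separates paragraphs.
    for block in re.split(r"\n\s*\n", text):
        collapsed = " ".join(block.split())
        if collapsed:
            paragraphs.append(collapsed)
    return "\n\n".join(paragraphs)
```

Single newlines inside a paragraph are treated as soft wraps and joined with a space, which suits text extracted from PDFs where lines break at the page width.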
372
backend/entity_aligner.py
Normal file
372
backend/entity_aligner.py
Normal file
@@ -0,0 +1,372 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Entity Aligner - Phase 3
|
||||||
|
使用 embedding 进行实体对齐
|
||||||
|
"""
|
||||||
|
|
||||||
|
import os
|
||||||
|
import json
|
||||||
|
import httpx
|
||||||
|
import numpy as np
|
||||||
|
from typing import List, Optional, Dict
|
||||||
|
from dataclasses import dataclass
|
||||||
|
|
||||||
|
# API Keys
|
||||||
|
KIMI_API_KEY = os.getenv("KIMI_API_KEY", "")
|
||||||
|
KIMI_BASE_URL = os.getenv("KIMI_BASE_URL", "https://api.kimi.com/coding")
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class EntityEmbedding:
|
||||||
|
entity_id: str
|
||||||
|
name: str
|
||||||
|
definition: str
|
||||||
|
embedding: List[float]
|
||||||
|
|
||||||
|
class EntityAligner:
|
||||||
|
"""实体对齐器 - 使用 embedding 进行相似度匹配"""
|
||||||
|
|
||||||
|
def __init__(self, similarity_threshold: float = 0.85):
|
||||||
|
self.similarity_threshold = similarity_threshold
|
||||||
|
self.embedding_cache: Dict[str, List[float]] = {}
|
||||||
|
|
||||||
|
def get_embedding(self, text: str) -> Optional[List[float]]:
|
||||||
|
"""
|
||||||
|
使用 Kimi API 获取文本的 embedding
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: 输入文本
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
embedding 向量或 None
|
||||||
|
"""
|
||||||
|
if not KIMI_API_KEY:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# 检查缓存
|
||||||
|
cache_key = hash(text)
|
||||||
|
if cache_key in self.embedding_cache:
|
||||||
|
return self.embedding_cache[cache_key]
|
||||||
|
|
||||||
|
try:
|
||||||
|
response = httpx.post(
|
||||||
|
f"{KIMI_BASE_URL}/v1/embeddings",
|
||||||
|
headers={"Authorization": f"Bearer {KIMI_API_KEY}", "Content-Type": "application/json"},
|
||||||
|
json={
|
||||||
|
"model": "k2p5",
|
||||||
|
"input": text[:500] # 限制长度
|
||||||
|
},
|
||||||
|
timeout=30.0
|
||||||
|
)
|
||||||
|
response.raise_for_status()
|
||||||
|
result = response.json()
|
||||||
|
|
||||||
|
embedding = result["data"][0]["embedding"]
|
||||||
|
self.embedding_cache[cache_key] = embedding
|
||||||
|
return embedding
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
print(f"Embedding API failed: {e}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
def compute_similarity(self, embedding1: List[float], embedding2: List[float]) -> float:
|
||||||
|
"""
|
||||||
|
计算两个 embedding 的余弦相似度
|
||||||
|
|
||||||
|
Args:
|
||||||
|
embedding1: 第一个向量
|
||||||
|
embedding2: 第二个向量
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
相似度分数 (0-1)
|
||||||
|
"""
|
||||||
|
vec1 = np.array(embedding1)
|
||||||
|
vec2 = np.array(embedding2)
|
||||||
|
|
||||||
|
# 余弦相似度
|
||||||
|
dot_product = np.dot(vec1, vec2)
|
||||||
|
norm1 = np.linalg.norm(vec1)
|
||||||
|
norm2 = np.linalg.norm(vec2)
|
||||||
|
|
||||||
|
if norm1 == 0 or norm2 == 0:
|
||||||
|
return 0.0
|
||||||
|
|
||||||
|
return float(dot_product / (norm1 * norm2))
|
||||||
|
|
||||||
|
def get_entity_text(self, name: str, definition: str = "") -> str:
|
||||||
|
"""
|
||||||
|
构建用于 embedding 的实体文本
|
||||||
|
|
||||||
|
Args:
|
||||||
|
name: 实体名称
|
||||||
|
definition: 实体定义
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
组合文本
|
||||||
|
"""
|
||||||
|
if definition:
|
||||||
|
return f"{name}: {definition}"
|
||||||
|
return name
|
||||||
|
|
||||||
|
def find_similar_entity(
|
||||||
|
self,
|
||||||
|
project_id: str,
|
||||||
|
name: str,
|
||||||
|
definition: str = "",
|
||||||
|
exclude_id: Optional[str] = None,
|
||||||
|
threshold: Optional[float] = None
|
||||||
|
) -> Optional[object]:
|
||||||
|
"""
|
||||||
|
查找相似的实体
|
||||||
|
|
||||||
|
Args:
|
||||||
|
project_id: 项目 ID
|
||||||
|
name: 实体名称
|
||||||
|
definition: 实体定义
|
||||||
|
exclude_id: 要排除的实体 ID
|
||||||
|
threshold: 相似度阈值
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
相似的实体或 None
|
||||||
|
"""
|
||||||
|
if threshold is None:
|
||||||
|
threshold = self.similarity_threshold
|
||||||
|
|
||||||
|
try:
|
||||||
|
from db_manager import get_db_manager
|
||||||
|
db = get_db_manager()
|
||||||
|
except ImportError:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# 获取项目的所有实体
|
||||||
|
entities = db.get_all_entities_for_embedding(project_id)
|
||||||
|
|
||||||
|
if not entities:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# 获取查询实体的 embedding
|
||||||
|
query_text = self.get_entity_text(name, definition)
|
||||||
|
query_embedding = self.get_embedding(query_text)
|
||||||
|
|
||||||
|
if query_embedding is None:
|
||||||
|
# 如果 embedding API 失败,回退到简单匹配
|
||||||
|
return self._fallback_similarity_match(entities, name, exclude_id)
|
||||||
|
|
||||||
|
best_match = None
|
||||||
|
best_score = threshold
|
||||||
|
|
||||||
|
for entity in entities:
|
||||||
|
if exclude_id and entity.id == exclude_id:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# 获取实体的 embedding
|
||||||
|
entity_text = self.get_entity_text(entity.name, entity.definition)
|
||||||
|
entity_embedding = self.get_embedding(entity_text)
|
||||||
|
|
||||||
|
if entity_embedding is None:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# 计算相似度
|
||||||
|
similarity = self.compute_similarity(query_embedding, entity_embedding)
|
||||||
|
|
||||||
|
if similarity > best_score:
|
||||||
|
best_score = similarity
|
||||||
|
best_match = entity
|
||||||
|
|
||||||
|
return best_match
|
||||||
|
|
||||||
|
def _fallback_similarity_match(
|
||||||
|
self,
|
||||||
|
entities: List[object],
|
||||||
|
name: str,
|
||||||
|
exclude_id: Optional[str] = None
|
||||||
|
) -> Optional[object]:
|
||||||
|
"""
|
||||||
|
回退到简单的相似度匹配(不使用 embedding)
|
||||||
|
|
||||||
|
Args:
|
||||||
|
entities: 实体列表
|
||||||
|
name: 查询名称
|
||||||
|
exclude_id: 要排除的实体 ID
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
最相似的实体或 None
|
||||||
|
"""
|
||||||
|
name_lower = name.lower()
|
||||||
|
|
||||||
|
# 1. 精确匹配
|
||||||
|
for entity in entities:
|
||||||
|
if exclude_id and entity.id == exclude_id:
|
||||||
|
continue
|
||||||
|
if entity.name.lower() == name_lower:
|
||||||
|
return entity
|
||||||
|
if entity.aliases and name_lower in [a.lower() for a in entity.aliases]:
|
||||||
|
return entity
|
||||||
|
|
||||||
|
# 2. 包含匹配
|
||||||
|
for entity in entities:
|
||||||
|
if exclude_id and entity.id == exclude_id:
|
||||||
|
continue
|
||||||
|
if name_lower in entity.name.lower() or entity.name.lower() in name_lower:
|
||||||
|
return entity
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def batch_align_entities(
|
||||||
|
self,
|
||||||
|
project_id: str,
|
||||||
|
new_entities: List[Dict],
|
||||||
|
threshold: Optional[float] = None
|
||||||
|
) -> List[Dict]:
|
||||||
|
"""
|
||||||
|
批量对齐实体
|
||||||
|
|
||||||
|
Args:
|
||||||
|
project_id: 项目 ID
|
||||||
|
new_entities: 新实体列表 [{"name": "...", "definition": "..."}]
|
||||||
|
threshold: 相似度阈值
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
对齐结果列表 [{"new_entity": {...}, "matched_entity": {...}, "similarity": 0.9}]
|
||||||
|
"""
|
||||||
|
if threshold is None:
|
||||||
|
threshold = self.similarity_threshold
|
||||||
|
|
||||||
|
results = []
|
||||||
|
|
||||||
|
for new_ent in new_entities:
|
||||||
|
matched = self.find_similar_entity(
|
||||||
|
project_id,
|
||||||
|
new_ent["name"],
|
||||||
|
new_ent.get("definition", ""),
|
||||||
|
threshold=threshold
|
||||||
|
)
|
||||||
|
|
||||||
|
result = {
|
||||||
|
"new_entity": new_ent,
|
||||||
|
"matched_entity": None,
|
||||||
|
"similarity": 0.0,
|
||||||
|
"should_merge": False
|
||||||
|
}
|
||||||
|
|
||||||
|
if matched:
|
||||||
|
# 计算相似度
|
||||||
|
query_text = self.get_entity_text(new_ent["name"], new_ent.get("definition", ""))
|
||||||
|
matched_text = self.get_entity_text(matched.name, matched.definition)
|
||||||
|
|
||||||
|
query_emb = self.get_embedding(query_text)
|
||||||
|
matched_emb = self.get_embedding(matched_text)
|
||||||
|
|
||||||
|
if query_emb and matched_emb:
|
||||||
|
similarity = self.compute_similarity(query_emb, matched_emb)
|
||||||
|
result["matched_entity"] = {
|
||||||
|
"id": matched.id,
|
||||||
|
"name": matched.name,
|
||||||
|
"type": matched.type,
|
||||||
|
"definition": matched.definition
|
||||||
|
}
|
||||||
|
result["similarity"] = similarity
|
||||||
|
result["should_merge"] = similarity >= threshold
|
||||||
|
|
||||||
|
results.append(result)
|
||||||
|
|
||||||
|
return results
|
||||||
|
|
||||||
|
def suggest_entity_aliases(self, entity_name: str, entity_definition: str = "") -> List[str]:
|
||||||
|
"""
|
||||||
|
使用 LLM 建议实体的别名
|
||||||
|
|
||||||
|
Args:
|
||||||
|
entity_name: 实体名称
|
||||||
|
entity_definition: 实体定义
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
建议的别名列表
|
||||||
|
"""
|
||||||
|
if not KIMI_API_KEY:
|
||||||
|
return []
|
||||||
|
|
||||||
|
prompt = f"""为以下实体生成可能的别名或简称:
|
||||||
|
|
||||||
|
实体名称:{entity_name}
|
||||||
|
定义:{entity_definition}
|
||||||
|
|
||||||
|
请返回 JSON 格式的别名列表:
|
||||||
|
{{"aliases": ["别名1", "别名2", "别名3"]}}
|
||||||
|
|
||||||
|
只返回 JSON,不要其他内容。"""
|
||||||
|
|
||||||
|
try:
|
||||||
|
response = httpx.post(
|
||||||
|
f"{KIMI_BASE_URL}/v1/chat/completions",
|
||||||
|
headers={"Authorization": f"Bearer {KIMI_API_KEY}", "Content-Type": "application/json"},
|
||||||
|
json={
|
||||||
|
"model": "k2p5",
|
||||||
|
"messages": [{"role": "user", "content": prompt}],
|
||||||
|
"temperature": 0.3
|
||||||
|
},
|
||||||
|
timeout=30.0
|
||||||
|
)
|
||||||
|
response.raise_for_status()
|
||||||
|
result = response.json()
|
||||||
|
content = result["choices"][0]["message"]["content"]
|
||||||
|
|
||||||
|
import re
|
||||||
|
json_match = re.search(r'\{.*\}', content, re.DOTALL)
|
||||||
|
if json_match:
|
||||||
|
data = json.loads(json_match.group())
|
||||||
|
return data.get("aliases", [])
|
||||||
|
except Exception as e:
|
||||||
|
print(f"Alias suggestion failed: {e}")
|
||||||
|
|
||||||
|
return []
|
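The parsing step in `suggest_entity_aliases` has to pull one JSON object out of free-form model output. A minimal sketch of that step on its own, assuming the reply contains at most one top-level object; `extract_aliases` is a hypothetical helper name, not something defined in this diff.

```python
import json
from typing import List


def extract_aliases(content: str) -> List[str]:
    """Return the "aliases" list found in an LLM reply, or [] if none parses."""
    # Take the span from the first "{" to the last "}" so nested braces survive.
    start = content.find("{")
    end = content.rfind("}")
    if start == -1 or end <= start:
        return []
    try:
        data = json.loads(content[start:end + 1])
    except json.JSONDecodeError:
        return []
    aliases = data.get("aliases", []) if isinstance(data, dict) else []
    if not isinstance(aliases, list):
        return []
    # Keep only string entries so callers never see nulls or nested values.
    return [a for a in aliases if isinstance(a, str)]


print(extract_aliases('Here you go: {"aliases": ["k8s", "kube"]}'))
```

Slicing between the outermost braces avoids depending on a regular expression for brace matching; a reply that wraps the JSON in extra prose still parses, and anything malformed degrades to an empty list.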
||||||
|
|
||||||
|
|
||||||
|
# Simple string similarity (no embeddings)
|
||||||
|
def simple_similarity(str1: str, str2: str) -> float:
|
||||||
|
"""
|
||||||
|
Compute a simple similarity score between two strings
|
||||||
|
|
||||||
|
Args:
|
||||||
|
str1: first string
|
||||||
|
str2: second string
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Similarity score (0-1)
|
||||||
|
"""
|
||||||
|
if str1 == str2:
|
||||||
|
return 1.0
|
||||||
|
|
||||||
|
if not str1 or not str2:
|
||||||
|
return 0.0
|
||||||
|
|
||||||
|
# Convert to lowercase
|
||||||
|
s1 = str1.lower()
|
||||||
|
s2 = str2.lower()
|
||||||
|
|
||||||
|
# Substring containment
|
||||||
|
if s1 in s2 or s2 in s1:
|
||||||
|
return 0.8
|
||||||
|
|
||||||
|
# Fall back to difflib's SequenceMatcher ratio (a matching-blocks measure, not a true edit distance)
|
||||||
|
from difflib import SequenceMatcher
|
||||||
|
return SequenceMatcher(None, s1, s2).ratio()
|
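To make the branch order of `simple_similarity` concrete, here is a self-contained copy of its logic with a few sample calls. One consequence of the ordering is worth noticing: strings that differ only in case miss the exact-match check and fall into the containment branch, so they score 0.8 rather than 1.0.

```python
from difflib import SequenceMatcher


def simple_similarity(str1: str, str2: str) -> float:
    """Mirror of the function in the diff, reproduced so the example runs alone."""
    if str1 == str2:
        return 1.0
    if not str1 or not str2:
        return 0.0
    s1, s2 = str1.lower(), str2.lower()
    # Containment is checked on the lowercased strings, before the ratio fallback.
    if s1 in s2 or s2 in s1:
        return 0.8
    return SequenceMatcher(None, s1, s2).ratio()


print(simple_similarity("K8s", "K8s"))                # 1.0 (exact match)
print(simple_similarity("Kubernetes", "kubernetes"))  # 0.8 (case differs, containment branch)
print(simple_similarity("", "Redis"))                 # 0.0 (empty input)
print(round(simple_similarity("abc", "abd"), 4))      # ratio fallback
```

If case-insensitive equality should count as a perfect match, comparing the lowercased strings for equality before the containment check would be one way to get there.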
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
# Tests
|
||||||
|
aligner = EntityAligner()
|
||||||
|
|
||||||
|
# Test embedding
|
||||||
|
test_text = "Kubernetes 容器编排平台"
|
||||||
|
embedding = aligner.get_embedding(test_text)
|
||||||
|
if embedding:
|
||||||
|
print(f"Embedding dimension: {len(embedding)}")
|
||||||
|
print(f"First 5 values: {embedding[:5]}")
|
||||||
|
else:
|
||||||
|
print("Embedding API not available")
|
||||||
|
|
||||||
|
# Test similarity computation
|
||||||
|
emb1 = [1.0, 0.0, 0.0]
|
||||||
|
emb2 = [0.9, 0.1, 0.0]
|
||||||
|
sim = aligner.compute_similarity(emb1, emb2)
|
||||||
|
print(f"Similarity: {sim:.4f}")
|
||||||
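The body of `compute_similarity` is not visible in this part of the diff. The sketch below assumes it is a plain cosine similarity over two embedding vectors, which is the usual choice for this kind of threshold comparison; `cosine_similarity` and `should_merge` are illustrative names, not the module's actual API.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as "not similar".
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_merge(a: Sequence[float], b: Sequence[float], threshold: float = 0.85) -> bool:
    """Same decision rule as the batch check: merge when similarity >= threshold."""
    return cosine_similarity(a, b) >= threshold


# The same toy vectors used in the module's __main__ block.
sim = cosine_similarity([1.0, 0.0, 0.0], [0.9, 0.1, 0.0])
print(f"Similarity: {sim:.4f}")
```

Under the cosine assumption the two toy vectors are nearly parallel, so they would clear the 0.85 default threshold used by the alignment endpoint.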
613
backend/main.py
@@ -1,7 +1,7 @@
|
|||||||
#!/usr/bin/env python3
|
#!/usr/bin/env python3
|
||||||
"""
|
"""
|
||||||
InsightFlow Backend - Phase 3 (Production Ready)
|
InsightFlow Backend - Phase 3 (Memory & Growth)
|
||||||
Knowledge Growth: Multi-file fusion + Entity Alignment
|
Knowledge Growth: Multi-file fusion + Entity Alignment + Document Import
|
||||||
ASR: Alibaba Cloud Tingwu + OSS
|
ASR: Alibaba Cloud Tingwu + OSS
|
||||||
"""
|
"""
|
||||||
|
|
||||||
@@ -9,6 +9,7 @@ import os
|
|||||||
import json
|
import json
|
||||||
import httpx
|
import httpx
|
||||||
import uuid
|
import uuid
|
||||||
|
import re
|
||||||
from fastapi import FastAPI, File, UploadFile, HTTPException, Form
|
from fastapi import FastAPI, File, UploadFile, HTTPException, Form
|
||||||
from fastapi.middleware.cors import CORSMiddleware
|
from fastapi.middleware.cors import CORSMiddleware
|
||||||
from fastapi.staticfiles import StaticFiles
|
from fastapi.staticfiles import StaticFiles
|
||||||
@@ -35,6 +36,18 @@ try:
|
|||||||
except ImportError:
|
except ImportError:
|
||||||
DB_AVAILABLE = False
|
DB_AVAILABLE = False
|
||||||
|
|
||||||
|
try:
|
||||||
|
from document_processor import DocumentProcessor
|
||||||
|
DOC_PROCESSOR_AVAILABLE = True
|
||||||
|
except ImportError:
|
||||||
|
DOC_PROCESSOR_AVAILABLE = False
|
||||||
|
|
||||||
|
try:
|
||||||
|
from entity_aligner import EntityAligner
|
||||||
|
ALIGNER_AVAILABLE = True
|
||||||
|
except ImportError:
|
||||||
|
ALIGNER_AVAILABLE = False
|
||||||
|
|
||||||
app = FastAPI(title="InsightFlow", version="0.3.0")
|
app = FastAPI(title="InsightFlow", version="0.3.0")
|
||||||
|
|
||||||
app.add_middleware(
|
app.add_middleware(
|
||||||
@@ -71,9 +84,270 @@ class ProjectCreate(BaseModel):
|
|||||||
name: str
|
name: str
|
||||||
description: str = ""
|
description: str = ""
|
||||||
|
|
||||||
|
class EntityUpdate(BaseModel):
|
||||||
|
name: Optional[str] = None
|
||||||
|
type: Optional[str] = None
|
||||||
|
definition: Optional[str] = None
|
||||||
|
aliases: Optional[List[str]] = None
|
||||||
|
|
||||||
|
class RelationCreate(BaseModel):
|
||||||
|
source_entity_id: str
|
||||||
|
target_entity_id: str
|
||||||
|
relation_type: str
|
||||||
|
evidence: Optional[str] = ""
|
||||||
|
|
||||||
|
class TranscriptUpdate(BaseModel):
|
||||||
|
full_text: str
|
||||||
|
|
||||||
|
class EntityMergeRequest(BaseModel):
|
||||||
|
source_entity_id: str
|
||||||
|
target_entity_id: str
|
||||||
|
|
||||||
|
class GlossaryTermCreate(BaseModel):
|
||||||
|
term: str
|
||||||
|
pronunciation: Optional[str] = ""
|
||||||
|
|
||||||
# API Keys
|
# API Keys
|
||||||
KIMI_API_KEY = os.getenv("KIMI_API_KEY", "")
|
KIMI_API_KEY = os.getenv("KIMI_API_KEY", "")
|
||||||
KIMI_BASE_URL = "https://api.kimi.com/coding"
|
KIMI_BASE_URL = os.getenv("KIMI_BASE_URL", "https://api.kimi.com/coding")
|
||||||
|
|
||||||
|
# Phase 3: Entity Aligner singleton
|
||||||
|
_aligner = None
|
||||||
|
def get_aligner():
|
||||||
|
global _aligner
|
||||||
|
if _aligner is None and ALIGNER_AVAILABLE:
|
||||||
|
_aligner = EntityAligner()
|
||||||
|
return _aligner
|
||||||
|
|
||||||
|
# Phase 3: Document Processor singleton
|
||||||
|
_doc_processor = None
|
||||||
|
def get_doc_processor():
|
||||||
|
global _doc_processor
|
||||||
|
if _doc_processor is None and DOC_PROCESSOR_AVAILABLE:
|
||||||
|
_doc_processor = DocumentProcessor()
|
||||||
|
return _doc_processor
|
||||||
|
|
||||||
|
# Phase 2: Entity Edit API
|
||||||
|
@app.put("/api/v1/entities/{entity_id}")
|
||||||
|
async def update_entity(entity_id: str, update: EntityUpdate):
|
||||||
|
"""更新实体信息(名称、类型、定义、别名)"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
entity = db.get_entity(entity_id)
|
||||||
|
if not entity:
|
||||||
|
raise HTTPException(status_code=404, detail="Entity not found")
|
||||||
|
|
||||||
|
# Apply only the fields that were provided
|
||||||
|
update_data = {k: v for k, v in update.dict().items() if v is not None}
|
||||||
|
updated = db.update_entity(entity_id, **update_data)
|
||||||
|
|
||||||
|
return {
|
||||||
|
"id": updated.id,
|
||||||
|
"name": updated.name,
|
||||||
|
"type": updated.type,
|
||||||
|
"definition": updated.definition,
|
||||||
|
"aliases": updated.aliases
|
||||||
|
}
|
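The update handler above drops every field whose value is `None` before passing the rest to the database layer. A minimal sketch of that partial-update filter, using a dataclass as a dependency-free stand-in for the Pydantic model; `EntityPatch` and `apply_patch` are hypothetical names, and the stored record is a plain dict for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class EntityPatch:
    """Stand-in for EntityUpdate: every field is optional."""
    name: Optional[str] = None
    type: Optional[str] = None
    definition: Optional[str] = None
    aliases: Optional[List[str]] = None


def apply_patch(stored: Dict[str, Any], patch: EntityPatch) -> Dict[str, Any]:
    """Return a copy of stored with only the non-None patch fields applied."""
    changes = {k: v for k, v in asdict(patch).items() if v is not None}
    return {**stored, **changes}


stored = {"name": "K8s", "type": "TECH", "definition": "", "aliases": []}
print(apply_patch(stored, EntityPatch(definition="Container orchestration platform")))
```

With this filter an empty string or empty list still counts as a real value and overwrites the stored one, while `None` always means "leave unchanged"; a client therefore cannot use `None` to clear a field.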
||||||
|
|
||||||
|
@app.delete("/api/v1/entities/{entity_id}")
|
||||||
|
async def delete_entity(entity_id: str):
|
||||||
|
"""删除实体"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
entity = db.get_entity(entity_id)
|
||||||
|
if not entity:
|
||||||
|
raise HTTPException(status_code=404, detail="Entity not found")
|
||||||
|
|
||||||
|
db.delete_entity(entity_id)
|
||||||
|
return {"success": True, "message": f"Entity {entity_id} deleted"}
|
||||||
|
|
||||||
|
@app.post("/api/v1/entities/{entity_id}/merge")
|
||||||
|
async def merge_entities_endpoint(entity_id: str, merge_req: EntityMergeRequest):
|
||||||
|
"""合并两个实体"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
|
||||||
|
# Verify that both entities exist
|
||||||
|
source = db.get_entity(merge_req.source_entity_id)
|
||||||
|
target = db.get_entity(merge_req.target_entity_id)
|
||||||
|
|
||||||
|
if not source or not target:
|
||||||
|
raise HTTPException(status_code=404, detail="Entity not found")
|
||||||
|
|
||||||
|
result = db.merge_entities(merge_req.target_entity_id, merge_req.source_entity_id)
|
||||||
|
return {
|
||||||
|
"success": True,
|
||||||
|
"merged_entity": {
|
||||||
|
"id": result.id,
|
||||||
|
"name": result.name,
|
||||||
|
"type": result.type,
|
||||||
|
"definition": result.definition,
|
||||||
|
"aliases": result.aliases
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
# Phase 2: Relation Edit API
|
||||||
|
@app.post("/api/v1/projects/{project_id}/relations")
|
||||||
|
async def create_relation_endpoint(project_id: str, relation: RelationCreate):
|
||||||
|
"""创建新的实体关系"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
|
||||||
|
# Verify that the entities exist
|
||||||
|
source = db.get_entity(relation.source_entity_id)
|
||||||
|
target = db.get_entity(relation.target_entity_id)
|
||||||
|
|
||||||
|
if not source or not target:
|
||||||
|
raise HTTPException(status_code=404, detail="Source or target entity not found")
|
||||||
|
|
||||||
|
relation_id = db.create_relation(
|
||||||
|
project_id=project_id,
|
||||||
|
source_entity_id=relation.source_entity_id,
|
||||||
|
target_entity_id=relation.target_entity_id,
|
||||||
|
relation_type=relation.relation_type,
|
||||||
|
evidence=relation.evidence
|
||||||
|
)
|
||||||
|
|
||||||
|
return {
|
||||||
|
"id": relation_id,
|
||||||
|
"source_id": relation.source_entity_id,
|
||||||
|
"target_id": relation.target_entity_id,
|
||||||
|
"type": relation.relation_type,
|
||||||
|
"success": True
|
||||||
|
}
|
||||||
|
|
||||||
|
@app.delete("/api/v1/relations/{relation_id}")
|
||||||
|
async def delete_relation(relation_id: str):
|
||||||
|
"""删除关系"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
db.delete_relation(relation_id)
|
||||||
|
return {"success": True, "message": f"Relation {relation_id} deleted"}
|
||||||
|
|
||||||
|
@app.put("/api/v1/relations/{relation_id}")
|
||||||
|
async def update_relation(relation_id: str, relation: RelationCreate):
|
||||||
|
"""更新关系"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
updated = db.update_relation(
|
||||||
|
relation_id=relation_id,
|
||||||
|
relation_type=relation.relation_type,
|
||||||
|
evidence=relation.evidence
|
||||||
|
)
|
||||||
|
|
||||||
|
return {
|
||||||
|
"id": relation_id,
|
||||||
|
"type": updated["relation_type"],
|
||||||
|
"evidence": updated["evidence"],
|
||||||
|
"success": True
|
||||||
|
}
|
||||||
|
|
||||||
|
# Phase 2: Transcript Edit API
|
||||||
|
@app.get("/api/v1/transcripts/{transcript_id}")
|
||||||
|
async def get_transcript(transcript_id: str):
|
||||||
|
"""获取转录详情"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
transcript = db.get_transcript(transcript_id)
|
||||||
|
|
||||||
|
if not transcript:
|
||||||
|
raise HTTPException(status_code=404, detail="Transcript not found")
|
||||||
|
|
||||||
|
return transcript
|
||||||
|
|
||||||
|
@app.put("/api/v1/transcripts/{transcript_id}")
|
||||||
|
async def update_transcript(transcript_id: str, update: TranscriptUpdate):
|
||||||
|
"""更新转录文本(人工修正)"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
transcript = db.get_transcript(transcript_id)
|
||||||
|
|
||||||
|
if not transcript:
|
||||||
|
raise HTTPException(status_code=404, detail="Transcript not found")
|
||||||
|
|
||||||
|
updated = db.update_transcript(transcript_id, update.full_text)
|
||||||
|
return {
|
||||||
|
"id": transcript_id,
|
||||||
|
"full_text": updated["full_text"],
|
||||||
|
"updated_at": updated["updated_at"],
|
||||||
|
"success": True
|
||||||
|
}
|
||||||
|
|
||||||
|
# Phase 2: Manual Entity Creation
|
||||||
|
class ManualEntityCreate(BaseModel):
|
||||||
|
name: str
|
||||||
|
type: str = "OTHER"
|
||||||
|
definition: str = ""
|
||||||
|
transcript_id: Optional[str] = None
|
||||||
|
start_pos: Optional[int] = None
|
||||||
|
end_pos: Optional[int] = None
|
||||||
|
|
||||||
|
@app.post("/api/v1/projects/{project_id}/entities")
|
||||||
|
async def create_manual_entity(project_id: str, entity: ManualEntityCreate):
|
||||||
|
"""手动创建实体(划词新建)"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
|
||||||
|
# Check whether it already exists
|
||||||
|
existing = db.get_entity_by_name(project_id, entity.name)
|
||||||
|
if existing:
|
||||||
|
return {
|
||||||
|
"id": existing.id,
|
||||||
|
"name": existing.name,
|
||||||
|
"type": existing.type,
|
||||||
|
"existed": True
|
||||||
|
}
|
||||||
|
|
||||||
|
entity_id = str(uuid.uuid4())[:8]
|
||||||
|
new_entity = db.create_entity(Entity(
|
||||||
|
id=entity_id,
|
||||||
|
project_id=project_id,
|
||||||
|
name=entity.name,
|
||||||
|
type=entity.type,
|
||||||
|
definition=entity.definition
|
||||||
|
))
|
||||||
|
|
||||||
|
# If mention position info was provided, save the mention
|
||||||
|
if entity.transcript_id and entity.start_pos is not None and entity.end_pos is not None:
|
||||||
|
transcript = db.get_transcript(entity.transcript_id)
|
||||||
|
if transcript:
|
||||||
|
text = transcript["full_text"]
|
||||||
|
mention = EntityMention(
|
||||||
|
id=str(uuid.uuid4())[:8],
|
||||||
|
entity_id=entity_id,
|
||||||
|
transcript_id=entity.transcript_id,
|
||||||
|
start_pos=entity.start_pos,
|
||||||
|
end_pos=entity.end_pos,
|
||||||
|
text_snippet=text[max(0, entity.start_pos-20):min(len(text), entity.end_pos+20)],
|
||||||
|
confidence=1.0
|
||||||
|
)
|
||||||
|
db.add_mention(mention)
|
||||||
|
|
||||||
|
return {
|
||||||
|
"id": new_entity.id,
|
||||||
|
"name": new_entity.name,
|
||||||
|
"type": new_entity.type,
|
||||||
|
"definition": new_entity.definition,
|
||||||
|
"success": True
|
||||||
|
}
|
||||||
|
|
||||||
def transcribe_audio(audio_data: bytes, filename: str) -> dict:
|
def transcribe_audio(audio_data: bytes, filename: str) -> dict:
|
||||||
"""转录音频:OSS上传 + 听悟转录"""
|
"""转录音频:OSS上传 + 听悟转录"""
|
||||||
@@ -165,12 +439,21 @@ def extract_entities_with_llm(text: str) -> tuple[List[dict], List[dict]]:
|
|||||||
|
|
||||||
return [], []
|
return [], []
|
||||||
|
|
||||||
def align_entity(project_id: str, name: str, db) -> Optional[Entity]:
|
def align_entity(project_id: str, name: str, db, definition: str = "") -> Optional[Entity]:
|
||||||
"""实体对齐"""
|
"""实体对齐 - Phase 3: 使用 embedding 对齐"""
|
||||||
|
# 1. Try an exact match first
|
||||||
existing = db.get_entity_by_name(project_id, name)
|
existing = db.get_entity_by_name(project_id, name)
|
||||||
if existing:
|
if existing:
|
||||||
return existing
|
return existing
|
||||||
|
|
||||||
|
# 2. Align using embeddings (if available)
|
||||||
|
aligner = get_aligner()
|
||||||
|
if aligner:
|
||||||
|
similar = aligner.find_similar_entity(project_id, name, definition)
|
||||||
|
if similar:
|
||||||
|
return similar
|
||||||
|
|
||||||
|
# 3. Fall back to simple similarity matching
|
||||||
similar = db.find_similar_entities(project_id, name)
|
similar = db.find_similar_entities(project_id, name)
|
||||||
if similar:
|
if similar:
|
||||||
return similar[0]
|
return similar[0]
|
||||||
@@ -202,7 +485,7 @@ async def list_projects():
|
|||||||
|
|
||||||
@app.post("/api/v1/projects/{project_id}/upload", response_model=AnalysisResult)
|
@app.post("/api/v1/projects/{project_id}/upload", response_model=AnalysisResult)
|
||||||
async def upload_audio(project_id: str, file: UploadFile = File(...)):
|
async def upload_audio(project_id: str, file: UploadFile = File(...)):
|
||||||
"""上传音频到指定项目"""
|
"""上传音频到指定项目 - Phase 3: 支持多文件融合"""
|
||||||
if not DB_AVAILABLE:
|
if not DB_AVAILABLE:
|
||||||
raise HTTPException(status_code=500, detail="Database not available")
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
@@ -230,12 +513,12 @@ async def upload_audio(project_id: str, file: UploadFile = File(...)):
|
|||||||
full_text=tw_result["full_text"]
|
full_text=tw_result["full_text"]
|
||||||
)
|
)
|
||||||
|
|
||||||
# Align entities and save
|
# Align entities and save - Phase 3: uses enhanced alignment
|
||||||
aligned_entities = []
|
aligned_entities = []
|
||||||
entity_name_to_id = {} # used for relation mapping
|
entity_name_to_id = {} # used for relation mapping
|
||||||
|
|
||||||
for raw_ent in raw_entities:
|
for raw_ent in raw_entities:
|
||||||
existing = align_entity(project_id, raw_ent["name"], db)
|
existing = align_entity(project_id, raw_ent["name"], db, raw_ent.get("definition", ""))
|
||||||
|
|
||||||
if existing:
|
if existing:
|
||||||
ent_model = EntityModel(
|
ent_model = EntityModel(
|
||||||
@@ -310,6 +593,302 @@ async def upload_audio(project_id: str, file: UploadFile = File(...)):
|
|||||||
created_at=datetime.now().isoformat()
|
created_at=datetime.now().isoformat()
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# Phase 3: Document Upload API
|
||||||
|
@app.post("/api/v1/projects/{project_id}/upload-document")
|
||||||
|
async def upload_document(project_id: str, file: UploadFile = File(...)):
|
||||||
|
"""上传 PDF/DOCX 文档到指定项目"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
if not DOC_PROCESSOR_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Document processor not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
project = db.get_project(project_id)
|
||||||
|
if not project:
|
||||||
|
raise HTTPException(status_code=404, detail="Project not found")
|
||||||
|
|
||||||
|
content = await file.read()
|
||||||
|
|
||||||
|
# Process the document
|
||||||
|
processor = get_doc_processor()
|
||||||
|
try:
|
||||||
|
result = processor.process(content, file.filename)
|
||||||
|
except Exception as e:
|
||||||
|
raise HTTPException(status_code=400, detail=f"Document processing failed: {str(e)}")
|
||||||
|
|
||||||
|
# Save the document as a transcript record
|
||||||
|
transcript_id = str(uuid.uuid4())[:8]
|
||||||
|
db.save_transcript(
|
||||||
|
transcript_id=transcript_id,
|
||||||
|
project_id=project_id,
|
||||||
|
filename=file.filename,
|
||||||
|
full_text=result["text"],
|
||||||
|
transcript_type="document"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Extract entities and relations
|
||||||
|
raw_entities, raw_relations = extract_entities_with_llm(result["text"])
|
||||||
|
|
||||||
|
# Align entities and save
|
||||||
|
aligned_entities = []
|
||||||
|
entity_name_to_id = {}
|
||||||
|
|
||||||
|
for raw_ent in raw_entities:
|
||||||
|
existing = align_entity(project_id, raw_ent["name"], db, raw_ent.get("definition", ""))
|
||||||
|
|
||||||
|
if existing:
|
||||||
|
entity_name_to_id[raw_ent["name"]] = existing.id
|
||||||
|
aligned_entities.append(EntityModel(
|
||||||
|
id=existing.id,
|
||||||
|
name=existing.name,
|
||||||
|
type=existing.type,
|
||||||
|
definition=existing.definition,
|
||||||
|
aliases=existing.aliases
|
||||||
|
))
|
||||||
|
else:
|
||||||
|
new_ent = db.create_entity(Entity(
|
||||||
|
id=str(uuid.uuid4())[:8],
|
||||||
|
project_id=project_id,
|
||||||
|
name=raw_ent["name"],
|
||||||
|
type=raw_ent.get("type", "OTHER"),
|
||||||
|
definition=raw_ent.get("definition", "")
|
||||||
|
))
|
||||||
|
entity_name_to_id[raw_ent["name"]] = new_ent.id
|
||||||
|
aligned_entities.append(EntityModel(
|
||||||
|
id=new_ent.id,
|
||||||
|
name=new_ent.name,
|
||||||
|
type=new_ent.type,
|
||||||
|
definition=new_ent.definition
|
||||||
|
))
|
||||||
|
|
||||||
|
# Save entity mention positions
|
||||||
|
full_text = result["text"]
|
||||||
|
name = raw_ent["name"]
|
||||||
|
start_pos = 0
|
||||||
|
while True:
|
||||||
|
pos = full_text.find(name, start_pos)
|
||||||
|
if pos == -1:
|
||||||
|
break
|
||||||
|
mention = EntityMention(
|
||||||
|
id=str(uuid.uuid4())[:8],
|
||||||
|
entity_id=entity_name_to_id[name],
|
||||||
|
transcript_id=transcript_id,
|
||||||
|
start_pos=pos,
|
||||||
|
end_pos=pos + len(name),
|
||||||
|
text_snippet=full_text[max(0, pos-20):min(len(full_text), pos+len(name)+20)],
|
||||||
|
confidence=1.0
|
||||||
|
)
|
||||||
|
db.add_mention(mention)
|
||||||
|
start_pos = pos + 1
|
||||||
|
|
||||||
|
# Save relations
|
||||||
|
for rel in raw_relations:
|
||||||
|
source_id = entity_name_to_id.get(rel.get("source", ""))
|
||||||
|
target_id = entity_name_to_id.get(rel.get("target", ""))
|
||||||
|
if source_id and target_id:
|
||||||
|
db.create_relation(
|
||||||
|
project_id=project_id,
|
||||||
|
source_entity_id=source_id,
|
||||||
|
target_entity_id=target_id,
|
||||||
|
relation_type=rel.get("type", "related"),
|
||||||
|
evidence=result["text"][:200],
|
||||||
|
transcript_id=transcript_id
|
||||||
|
)
|
||||||
|
|
||||||
|
return {
|
||||||
|
"transcript_id": transcript_id,
|
||||||
|
"project_id": project_id,
|
||||||
|
"filename": file.filename,
|
||||||
|
"text_length": len(result["text"]),
|
||||||
|
"entities": [e.dict() for e in aligned_entities],
|
||||||
|
"created_at": datetime.now().isoformat()
|
||||||
|
}
|
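The mention loop inside `upload_document` scans the extracted text for every occurrence of an entity name and records a span plus about 20 characters of context on each side. The sketch below isolates that scan as a generator. `find_mentions` is a hypothetical helper, and the empty-name guard is an addition that the diff itself does not have.

```python
from typing import Iterator, Tuple


def find_mentions(full_text: str, name: str, context: int = 20) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, snippet) for each case-sensitive occurrence of name."""
    if not name:
        # Assumption: an empty name should produce no mentions at all.
        return
    start = 0
    while True:
        pos = full_text.find(name, start)
        if pos == -1:
            break
        end = pos + len(name)
        snippet = full_text[max(0, pos - context):min(len(full_text), end + context)]
        yield pos, end, snippet
        # Advance by one character, as the diff does, so overlapping hits are kept.
        start = pos + 1


for start, end, snippet in find_mentions("Redis and redis and Redis", "Redis", context=2):
    print(start, end, repr(snippet))
```

Because the search is case-sensitive, the lowercase "redis" in the sample text is skipped; only occurrences that match the extracted name exactly become mentions.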
||||||
|
|
||||||
|
# Phase 3: Knowledge Base API
|
||||||
|
@app.get("/api/v1/projects/{project_id}/knowledge-base")
|
||||||
|
async def get_knowledge_base(project_id: str):
|
||||||
|
"""获取项目知识库 - 包含所有实体、关系、术语表"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
project = db.get_project(project_id)
|
||||||
|
if not project:
|
||||||
|
raise HTTPException(status_code=404, detail="Project not found")
|
||||||
|
|
||||||
|
# Fetch all entities
|
||||||
|
entities = db.list_project_entities(project_id)
|
||||||
|
|
||||||
|
# Fetch all relations
|
||||||
|
relations = db.list_project_relations(project_id)
|
||||||
|
|
||||||
|
# Fetch all transcripts
|
||||||
|
transcripts = db.list_project_transcripts(project_id)
|
||||||
|
|
||||||
|
# Fetch the glossary
|
||||||
|
glossary = db.list_glossary(project_id)
|
||||||
|
|
||||||
|
# Build per-entity statistics
|
||||||
|
entity_stats = {}
|
||||||
|
for ent in entities:
|
||||||
|
mentions = db.get_entity_mentions(ent.id)
|
||||||
|
entity_stats[ent.id] = {
|
||||||
|
"mention_count": len(mentions),
|
||||||
|
"transcript_ids": list(set([m.transcript_id for m in mentions]))
|
||||||
|
}
|
||||||
|
|
||||||
|
# Build the entity ID-to-name map
|
||||||
|
entity_map = {e.id: e.name for e in entities}
|
||||||
|
|
||||||
|
return {
|
||||||
|
"project": {
|
||||||
|
"id": project.id,
|
||||||
|
"name": project.name,
|
||||||
|
"description": project.description
|
||||||
|
},
|
||||||
|
"stats": {
|
||||||
|
"entity_count": len(entities),
|
||||||
|
"relation_count": len(relations),
|
||||||
|
"transcript_count": len(transcripts),
|
||||||
|
"glossary_count": len(glossary)
|
||||||
|
},
|
||||||
|
"entities": [
|
||||||
|
{
|
||||||
|
"id": e.id,
|
||||||
|
"name": e.name,
|
||||||
|
"type": e.type,
|
||||||
|
"definition": e.definition,
|
||||||
|
"aliases": e.aliases,
|
||||||
|
"mention_count": entity_stats.get(e.id, {}).get("mention_count", 0),
|
||||||
|
"appears_in": entity_stats.get(e.id, {}).get("transcript_ids", [])
|
||||||
|
}
|
||||||
|
for e in entities
|
||||||
|
],
|
||||||
|
"relations": [
|
||||||
|
{
|
||||||
|
"id": r["id"],
|
||||||
|
"source_id": r["source_entity_id"],
|
||||||
|
"source_name": entity_map.get(r["source_entity_id"], "Unknown"),
|
||||||
|
"target_id": r["target_entity_id"],
|
||||||
|
"target_name": entity_map.get(r["target_entity_id"], "Unknown"),
|
||||||
|
"type": r["relation_type"],
|
||||||
|
"evidence": r["evidence"]
|
||||||
|
}
|
||||||
|
for r in relations
|
||||||
|
],
|
||||||
|
"glossary": [
|
||||||
|
{
|
||||||
|
"id": g["id"],
|
||||||
|
"term": g["term"],
|
||||||
|
"pronunciation": g["pronunciation"],
|
||||||
|
"frequency": g["frequency"]
|
||||||
|
}
|
||||||
|
for g in glossary
|
||||||
|
],
|
||||||
|
"transcripts": [
|
||||||
|
{
|
||||||
|
"id": t["id"],
|
||||||
|
"filename": t["filename"],
|
||||||
|
"type": t.get("type", "audio"),
|
||||||
|
"created_at": t["created_at"]
|
||||||
|
}
|
||||||
|
for t in transcripts
|
||||||
|
]
|
||||||
|
}
|
||||||
|
|
||||||
|
# Phase 3: Glossary API
|
||||||
|
@app.post("/api/v1/projects/{project_id}/glossary")
|
||||||
|
async def add_glossary_term(project_id: str, term: GlossaryTermCreate):
|
||||||
|
"""添加术语到项目术语表"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
project = db.get_project(project_id)
|
||||||
|
if not project:
|
||||||
|
raise HTTPException(status_code=404, detail="Project not found")
|
||||||
|
|
||||||
|
term_id = db.add_glossary_term(
|
||||||
|
project_id=project_id,
|
||||||
|
term=term.term,
|
||||||
|
pronunciation=term.pronunciation
|
||||||
|
)
|
||||||
|
|
||||||
|
return {
|
||||||
|
"id": term_id,
|
||||||
|
"term": term.term,
|
||||||
|
"pronunciation": term.pronunciation,
|
||||||
|
"success": True
|
||||||
|
}
|
||||||
|
|
||||||
|
@app.get("/api/v1/projects/{project_id}/glossary")
|
||||||
|
async def get_glossary(project_id: str):
|
||||||
|
"""获取项目术语表"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
glossary = db.list_glossary(project_id)
|
||||||
|
return glossary
|
||||||
|
|
||||||
|
@app.delete("/api/v1/glossary/{term_id}")
|
||||||
|
async def delete_glossary_term(term_id: str):
|
||||||
|
"""删除术语"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
db.delete_glossary_term(term_id)
|
||||||
|
return {"success": True}
|
||||||
|
|
||||||
|
# Phase 3: Entity Alignment API
|
||||||
|
@app.post("/api/v1/projects/{project_id}/align-entities")
|
||||||
|
async def align_project_entities(project_id: str, threshold: float = 0.85):
|
||||||
|
"""运行实体对齐算法,合并相似实体"""
|
||||||
|
if not DB_AVAILABLE:
|
||||||
|
raise HTTPException(status_code=500, detail="Database not available")
|
||||||
|
|
||||||
|
aligner = get_aligner()
|
||||||
|
if not aligner:
|
||||||
|
raise HTTPException(status_code=500, detail="Entity aligner not available")
|
||||||
|
|
||||||
|
db = get_db_manager()
|
||||||
|
entities = db.list_project_entities(project_id)
|
||||||
|
|
||||||
|
merged_count = 0
|
||||||
|
merged_pairs = []
|
||||||
|
|
||||||
|
# Align using embeddings
|
||||||
|
for i, entity in enumerate(entities):
|
||||||
|
# Skip entities that have already been merged
|
||||||
|
existing = db.get_entity(entity.id)
|
||||||
|
if not existing:
|
||||||
|
continue
|
||||||
|
|
||||||
|
similar = aligner.find_similar_entity(
|
||||||
|
project_id,
|
||||||
|
entity.name,
|
||||||
|
entity.definition,
|
||||||
|
exclude_id=entity.id,
|
||||||
|
threshold=threshold
|
||||||
|
)
|
||||||
|
|
||||||
|
if similar:
|
||||||
|
# Merge the entities
|
||||||
|
db.merge_entities(similar.id, entity.id)
|
||||||
|
merged_count += 1
|
||||||
|
merged_pairs.append({
|
||||||
|
"source": entity.name,
|
||||||
|
"target": similar.name
|
||||||
|
})
|
||||||
|
|
||||||
|
return {
|
||||||
|
"success": True,
|
||||||
|
"merged_count": merged_count,
|
||||||
|
"merged_pairs": merged_pairs
|
||||||
|
}
|
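The alignment endpoint walks the entity list once, skips anything already merged away, and folds each entity into its best match when the score clears the threshold. The sketch below reproduces that greedy loop in memory. It swaps in a `difflib` string ratio for the embedding similarity so that it runs without an API key; `greedy_merge` and the dict-based store are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def greedy_merge(names: Dict[str, str], threshold: float = 0.85) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """names maps entity id -> name. Returns (surviving entities, merged (source, target) pairs)."""
    alive = dict(names)
    merged_pairs: List[Tuple[str, str]] = []
    for ent_id, name in names.items():
        if ent_id not in alive:
            continue  # already merged into another entity earlier in the pass
        best_id, best_score = None, 0.0
        for other_id, other_name in alive.items():
            if other_id == ent_id:
                continue  # plays the role of exclude_id
            # Stand-in metric: string ratio instead of embedding similarity.
            score = SequenceMatcher(None, name.lower(), other_name.lower()).ratio()
            if score > best_score:
                best_id, best_score = other_id, score
        if best_id is not None and best_score >= threshold:
            # The current entity is the merge source; the match is the target.
            del alive[ent_id]
            merged_pairs.append((name, alive[best_id]))
    return alive, merged_pairs


entities = {"e1": "Kubernetes", "e2": "kubernetes", "e3": "PostgreSQL"}
survivors, pairs = greedy_merge(entities)
print(survivors)
print(pairs)
```

A single greedy pass depends on iteration order: the entity visited first is the one that disappears, so which name survives a merge is decided by list position rather than by any quality measure.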
||||||
|
|
||||||
@app.get("/api/v1/projects/{project_id}/entities")
|
@app.get("/api/v1/projects/{project_id}/entities")
|
||||||
async def get_project_entities(project_id: str):
|
async def get_project_entities(project_id: str):
|
||||||
"""获取项目的全局实体列表"""
|
"""获取项目的全局实体列表"""
|
||||||
@@ -318,7 +897,7 @@ async def get_project_entities(project_id: str):
|
|||||||
|
|
||||||
db = get_db_manager()
|
db = get_db_manager()
|
||||||
entities = db.list_project_entities(project_id)
|
entities = db.list_project_entities(project_id)
|
||||||
return [{"id": e.id, "name": e.name, "type": e.type, "definition": e.definition} for e in entities]
|
return [{"id": e.id, "name": e.name, "type": e.type, "definition": e.definition, "aliases": e.aliases} for e in entities]
|
||||||
|
|
||||||
|
|
||||||
@app.get("/api/v1/projects/{project_id}/relations")
|
@app.get("/api/v1/projects/{project_id}/relations")
|
||||||
@@ -356,6 +935,7 @@ async def get_project_transcripts(project_id: str):
|
|||||||
return [{
|
return [{
|
||||||
"id": t["id"],
|
"id": t["id"],
|
||||||
"filename": t["filename"],
|
"filename": t["filename"],
|
||||||
|
"type": t.get("type", "audio"),
|
||||||
"created_at": t["created_at"],
|
"created_at": t["created_at"],
|
||||||
"preview": t["full_text"][:100] + "..." if len(t["full_text"]) > 100 else t["full_text"]
|
"preview": t["full_text"][:100] + "..." if len(t["full_text"]) > 100 else t["full_text"]
|
||||||
} for t in transcripts]
|
} for t in transcripts]
|
||||||
@@ -378,25 +958,18 @@ async def get_entity_mentions(entity_id: str):
|
|||||||
"confidence": m.confidence
|
"confidence": m.confidence
|
||||||
} for m in mentions]
|
} for m in mentions]
|
||||||
|
|
||||||
@app.post("/api/v1/entities/{entity_id}/merge")
|
|
||||||
async def merge_entities(entity_id: str, target_entity_id: str):
|
|
||||||
"""合并两个实体"""
|
|
||||||
if not DB_AVAILABLE:
|
|
||||||
raise HTTPException(status_code=500, detail="Database not available")
|
|
||||||
|
|
||||||
db = get_db_manager()
|
|
||||||
result = db.merge_entities(target_entity_id, entity_id)
|
|
||||||
return {"success": True, "merged_entity": {"id": result.id, "name": result.name}}
|
|
||||||
|
|
||||||
# Health check
|
# Health check
|
||||||
@app.get("/health")
|
@app.get("/health")
|
||||||
async def health_check():
|
async def health_check():
|
||||||
return {
|
return {
|
||||||
"status": "ok",
|
"status": "ok",
|
||||||
"version": "0.3.0",
|
"version": "0.3.0",
|
||||||
|
"phase": "Phase 3 - Memory & Growth",
|
||||||
"oss_available": OSS_AVAILABLE,
|
"oss_available": OSS_AVAILABLE,
|
||||||
"tingwu_available": TINGWU_AVAILABLE,
|
"tingwu_available": TINGWU_AVAILABLE,
|
||||||
"db_available": DB_AVAILABLE
|
"db_available": DB_AVAILABLE,
|
||||||
|
"doc_processor_available": DOC_PROCESSOR_AVAILABLE,
|
||||||
|
"aligner_available": ALIGNER_AVAILABLE
|
||||||
}
|
}
|
||||||
|
|
||||||
# Serve frontend
|
# Serve frontend
|
||||||
|
|||||||
24
backend/requirements.txt
Normal file
@@ -0,0 +1,24 @@
|
|||||||
|
# InsightFlow Backend Dependencies
|
||||||
|
|
||||||
|
# Web Framework
|
||||||
|
fastapi==0.109.0
|
||||||
|
uvicorn[standard]==0.27.0
|
||||||
|
python-multipart==0.0.6
|
||||||
|
|
||||||
|
# HTTP Client
|
||||||
|
httpx==0.26.0
|
||||||
|
|
||||||
|
# Document Processing
|
||||||
|
PyPDF2==3.0.1
|
||||||
|
python-docx==1.1.0
|
||||||
|
|
||||||
|
# Data Processing
|
||||||
|
numpy==1.26.3
|
||||||
|
|
||||||
|
# Aliyun SDK
|
||||||
|
aliyun-python-sdk-core==2.14.0
|
||||||
|
aliyun-python-sdk-oss==2.18.5
|
||||||
|
oss2==2.18.5
|
||||||
|
|
||||||
|
# Utilities
|
||||||
|
python-dotenv==1.0.0
|
||||||
@@ -16,7 +16,9 @@ CREATE TABLE IF NOT EXISTS transcripts (
|
|||||||
project_id TEXT NOT NULL,
|
project_id TEXT NOT NULL,
|
||||||
filename TEXT,
|
filename TEXT,
|
||||||
full_text TEXT,
|
full_text TEXT,
|
||||||
|
type TEXT DEFAULT 'audio', -- 'audio' or 'document'
|
||||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||||
|
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||||
FOREIGN KEY (project_id) REFERENCES projects(id)
|
FOREIGN KEY (project_id) REFERENCES projects(id)
|
||||||
);
|
);
|
||||||
|
|
||||||
@@ -29,6 +31,7 @@ CREATE TABLE IF NOT EXISTS entities (
|
|||||||
type TEXT,
|
type TEXT,
|
||||||
definition TEXT,
|
definition TEXT,
|
||||||
aliases TEXT, -- JSON array: ["alias1", "alias2"]
|
aliases TEXT, -- JSON array: ["alias1", "alias2"]
|
||||||
|
embedding TEXT, -- JSON array: embedding of the entity name + definition
|
||||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||||
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||||
FOREIGN KEY (project_id) REFERENCES projects(id)
|
FOREIGN KEY (project_id) REFERENCES projects(id)
|
||||||
@@ -71,3 +74,12 @@ CREATE TABLE IF NOT EXISTS glossary (
|
|||||||
frequency INTEGER DEFAULT 1,
|
frequency INTEGER DEFAULT 1,
|
||||||
FOREIGN KEY (project_id) REFERENCES projects(id)
|
FOREIGN KEY (project_id) REFERENCES projects(id)
|
||||||
);
|
);
|
||||||
|
|
||||||
|
-- Create indexes to improve query performance
|
||||||
|
CREATE INDEX IF NOT EXISTS idx_entities_project ON entities(project_id);
|
||||||
|
CREATE INDEX IF NOT EXISTS idx_entities_name ON entities(name);
|
||||||
|
CREATE INDEX IF NOT EXISTS idx_transcripts_project ON transcripts(project_id);
|
||||||
|
CREATE INDEX IF NOT EXISTS idx_mentions_entity ON entity_mentions(entity_id);
|
||||||
|
CREATE INDEX IF NOT EXISTS idx_mentions_transcript ON entity_mentions(transcript_id);
|
||||||
|
CREATE INDEX IF NOT EXISTS idx_relations_project ON entity_relations(project_id);
|
||||||
|
CREATE INDEX IF NOT EXISTS idx_glossary_project ON glossary(project_id);
|
||||||
|
|||||||
80
deploy.sh
Executable file
@@ -0,0 +1,80 @@
|
|||||||
|
#!/bin/bash
# InsightFlow Phase 3 deployment script

set -e

echo "🚀 InsightFlow Phase 3 deployment script"
echo "================================"

# Check prerequisites
if ! command -v docker &> /dev/null; then
    echo "❌ Docker is not installed; please install Docker first"
    exit 1
fi

if ! command -v git &> /dev/null; then
    echo "❌ Git is not installed; please install Git first"
    exit 1
fi

# Configuration
IMAGE_NAME="insightflow"
IMAGE_TAG="phase3"
CONTAINER_NAME="insightflow-app"
PORT="18000"
DATA_DIR="/opt/data/insightflow"

# Check environment variables
if [ -z "$KIMI_API_KEY" ]; then
    echo "⚠️  Warning: KIMI_API_KEY is not set"
fi

if [ -z "$ALIYUN_ACCESS_KEY_ID" ]; then
    echo "⚠️  Warning: ALIYUN_ACCESS_KEY_ID is not set"
fi

if [ -z "$ALIYUN_ACCESS_KEY_SECRET" ]; then
    echo "⚠️  Warning: ALIYUN_ACCESS_KEY_SECRET is not set"
fi

echo ""
echo "📦 Building Docker image..."
docker build -t "${IMAGE_NAME}:${IMAGE_TAG}" .

echo ""
echo "🛑 Stopping old container..."
docker stop "${CONTAINER_NAME}" 2>/dev/null || true
docker rm "${CONTAINER_NAME}" 2>/dev/null || true

echo ""
echo "📁 Creating data directory..."
mkdir -p "${DATA_DIR}"

echo ""
echo "🚀 Starting new container..."
docker run -d \
    --name "${CONTAINER_NAME}" \
    -p "${PORT}:8000" \
    -v "${DATA_DIR}:/app/data" \
    -e KIMI_API_KEY="${KIMI_API_KEY}" \
    -e KIMI_BASE_URL="${KIMI_BASE_URL:-https://api.kimi.com/coding}" \
    -e ALIYUN_ACCESS_KEY_ID="${ALIYUN_ACCESS_KEY_ID}" \
    -e ALIYUN_ACCESS_KEY_SECRET="${ALIYUN_ACCESS_KEY_SECRET}" \
    -e DB_PATH="/app/data/insightflow.db" \
    --restart unless-stopped \
    "${IMAGE_NAME}:${IMAGE_TAG}"

echo ""
echo "⏳ Waiting for the service to start..."
sleep 3

echo ""
echo "✅ Deployment complete!"
echo ""
echo "📊 Service status:"
docker ps --filter "name=${CONTAINER_NAME}" --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"

echo ""
echo "🔗 URL: http://localhost:${PORT}"
echo "📋 View logs: docker logs -f ${CONTAINER_NAME}"
echo ""
|
||||||
198
docs/PHASE3_FEATURES.md
Normal file
@@ -0,0 +1,198 @@
|
|||||||
|
# InsightFlow Phase 3 Features

## Overview

Phase 3 gives InsightFlow its "Memory & Growth" capability: multi-file knowledge fusion, document import, and project-level knowledge base management.

## Feature List

### 1. Multi-file Graph Fusion ✅

#### Description
- Multiple audio files can be uploaded to the same project
- Entities in a new file are automatically aligned with existing entities
- Knowledge graphs are merged while keeping entities consistent
- Entity mentions are tracked across files

#### Usage
1. In the workbench, click "+ Upload File" (`+ 上传文件`)
2. Select an audio file (MP3/WAV/M4A)
3. The system transcribes the file and extracts entities automatically
4. New entities are aligned with existing entities automatically
5. Use "📁 Select File" (`📁 选择文件`) to switch between transcripts

#### API
```
POST /api/v1/projects/{project_id}/upload
Content-Type: multipart/form-data
file: <audio file>
```
|
||||||
|
|
||||||
|
### 2. Improved Entity Alignment Algorithm ✅

#### Description
- Uses the Kimi API embedding service to compute semantic similarity
- Matches entities by cosine similarity
- Adjustable threshold (default 0.85)
- Automatic alias suggestions
- Falls back to string matching on failure

#### Implementation Module
- `backend/entity_aligner.py`

#### Core Algorithm
```python
import numpy as np

# Cosine similarity between two embedding vectors
def compute_similarity(embedding1, embedding2):
    vec1 = np.array(embedding1)
    vec2 = np.array(embedding2)
    dot_product = np.dot(vec1, vec2)
    norm1 = np.linalg.norm(vec1)
    norm2 = np.linalg.norm(vec2)
    return dot_product / (norm1 * norm2)
```

#### API
```
POST /api/v1/projects/{project_id}/align-entities?threshold=0.85
```
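
The threshold matching and string fallback described above can be sketched as follows. This is a minimal illustration, not the contents of `backend/entity_aligner.py`: the function `align_entity`, the dictionary layout of an entity, and the choice to fall back whenever no embedding match reaches the threshold are all assumptions made for the example. It uses only the standard library so it runs without NumPy.

```python
import math
from typing import Optional


def cosine_similarity(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def align_entity(candidate: dict, existing: list, threshold: float = 0.85) -> Optional[dict]:
    """Return the best-matching existing entity, or None if nothing matches.

    Hypothetical helper: entities are dicts with "name", optional "aliases",
    and optional "embedding" (a list of floats).
    """
    cand_vec = candidate.get("embedding")
    best, best_score = None, threshold
    if cand_vec:
        for ent in existing:
            vec = ent.get("embedding")
            if not vec or len(vec) != len(cand_vec):
                continue  # skip entities without a comparable embedding
            score = cosine_similarity(cand_vec, vec)
            if score >= best_score:
                best, best_score = ent, score
        if best is not None:
            return best

    # Fallback (assumed behavior): case-insensitive match on name or alias.
    name = candidate["name"].strip().lower()
    for ent in existing:
        known = [ent["name"], *ent.get("aliases", [])]
        if name in (n.strip().lower() for n in known):
            return ent
    return None
```

A stricter threshold reduces false merges at the cost of more duplicate entities, which is presumably why the endpoint exposes `threshold` as a query parameter.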
|
||||||
|
|
||||||
|
### 3. PDF/DOCX Document Import ✅

#### Description
- Supports PDF, DOCX, DOC, TXT, and MD formats
- Extracts document text automatically
- Extracted text feeds entity extraction and relation building
- Files are tagged by type so audio and documents can be told apart

#### Usage
1. In the workbench, click "+ Upload File" (`+ 上传文件`)
2. Switch to the "📄 Document" (`📄 文档`) tab
3. Select a document file
4. The system parses the file and extracts knowledge automatically

#### Implementation Module
- `backend/document_processor.py`

#### API
```
POST /api/v1/projects/{project_id}/upload-document
Content-Type: multipart/form-data
file: <document file>
```
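
Since audio and documents go to different endpoints, a client script has to pick one per file. The sketch below chooses by file extension, using the formats listed in this document. The helper name `upload_endpoint` is hypothetical, and routing by extension is an assumption for scripted uploads; the web UI selects the endpoint through its upload tabs instead.

```python
from pathlib import PurePosixPath

# Formats taken from the feature descriptions above.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a"}
DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".doc", ".txt", ".md"}


def upload_endpoint(project_id: str, filename: str) -> str:
    """Return the API path that should receive this file (hypothetical helper)."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext in AUDIO_EXTENSIONS:
        return f"/api/v1/projects/{project_id}/upload"
    if ext in DOCUMENT_EXTENSIONS:
        return f"/api/v1/projects/{project_id}/upload-document"
    raise ValueError(f"unsupported file type: {filename!r}")
```

Rejecting unknown extensions on the client side avoids sending a large file only to have the server refuse it.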
|
||||||
|
|
||||||
|
### 4. Project Knowledge Base Panel ✅

#### Description
- Project-wide knowledge base view
- Statistics panel (entity, relation, file, and term counts)
- Entity grid (with mention counts)
- Relation list
- Glossary management
- File list (audio and documents distinguished)

#### Usage
1. In the workbench, click the "📚" icon on the left
2. Review the project statistics overview
3. Switch sidebar tabs to browse different content
4. Click an entity to jump back to the workbench and view its details

#### API
```
GET /api/v1/projects/{project_id}/knowledge-base
```
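
As an illustration of consuming this endpoint, the sketch below ranks entities by mention count. The field names `entities`, `name`, `mention_count`, and `appears_in` are the ones the frontend reads when rendering the panel; the helper `top_entities` and the sample payload are hypothetical, and the full response schema is not specified here.

```python
def top_entities(knowledge_base: dict, limit: int = 3) -> list:
    """Summarize the most-mentioned entities from a knowledge-base response."""
    entities = sorted(
        knowledge_base.get("entities", []),
        key=lambda e: e.get("mention_count", 0),
        reverse=True,
    )
    return [
        f"{e['name']}: {e.get('mention_count', 0)} mentions "
        f"in {len(e.get('appears_in', []))} files"
        for e in entities[:limit]
    ]


# Made-up sample payload, shaped like the fields the frontend uses.
sample = {
    "entities": [
        {"name": "Kimi", "mention_count": 2, "appears_in": ["t1"]},
        {"name": "InsightFlow", "mention_count": 5, "appears_in": ["t1", "t2"]},
    ]
}
summary = top_entities(sample, limit=1)
```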
|
||||||
|
|
||||||
|
### 5. Glossary Management ✅

#### Description
- Project-level glossary
- Terms can be added with pronunciation hints
- Frequency statistics
- Used for ASR hot-word optimization

#### Usage
1. In the knowledge base panel, switch to "📖 Glossary" (`📖 术语表`)
2. Click "+ Add Term" (`+ 添加术语`)
3. Enter the term and a pronunciation hint
4. Delete terms that are no longer needed

#### API
```
POST /api/v1/projects/{project_id}/glossary
GET /api/v1/projects/{project_id}/glossary
DELETE /api/v1/glossary/{term_id}
```
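
A scripted equivalent of the "Add Term" action might look like the sketch below. The JSON body (`term`, `pronunciation`) mirrors what the frontend sends; the helper name `build_add_term_request` and the base URL `http://localhost:18000` (the port published by the deployment script) are assumptions. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_add_term_request(base_url: str, project_id: str,
                           term: str, pronunciation: str = "") -> urllib.request.Request:
    """Build (but do not send) the POST request that adds a glossary term."""
    payload = json.dumps({"term": term, "pronunciation": pronunciation}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/projects/{project_id}/glossary",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_add_term_request("http://localhost:18000", "p1", "InsightFlow")
```

Passing `req` to `urllib.request.urlopen` would send it against a running instance.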
|
||||||
|
|
||||||
|
## Database Schema Updates

### transcripts table
```sql
ALTER TABLE transcripts ADD COLUMN type TEXT DEFAULT 'audio';
-- 'audio' or 'document'
```

### entities table
```sql
ALTER TABLE entities ADD COLUMN embedding TEXT;
-- embedding vector stored as a JSON array
```
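
If the database is SQLite (which `DB_PATH=/app/data/insightflow.db` suggests, though that is an assumption), rerunning an `ALTER TABLE ... ADD COLUMN` on an upgraded database fails because the column already exists. The sketch below applies both column additions idempotently and round-trips an embedding through the JSON text column. The helper `add_column_if_missing` and the simplified table definitions are hypothetical.

```python
import json
import sqlite3


def add_column_if_missing(conn, table: str, column: str, ddl: str) -> None:
    """Add a column only if PRAGMA table_info does not already list it."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column not in existing:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")


conn = sqlite3.connect(":memory:")
# Simplified pre-Phase-3 tables, for illustration only.
conn.execute("CREATE TABLE transcripts (id TEXT PRIMARY KEY, full_text TEXT)")
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, name TEXT)")

for _ in range(2):  # the second pass is a no-op, showing the migration is repeatable
    add_column_if_missing(conn, "transcripts", "type", "TEXT DEFAULT 'audio'")
    add_column_if_missing(conn, "entities", "embedding", "TEXT")

# Store an embedding as a JSON array and read it back.
conn.execute(
    "INSERT INTO entities (id, name, embedding) VALUES (?, ?, ?)",
    ("e1", "InsightFlow", json.dumps([0.1, 0.2, 0.3])),
)
stored = conn.execute("SELECT embedding FROM entities WHERE id = ?", ("e1",)).fetchone()[0]
vector = json.loads(stored)
```

Storing vectors as JSON text keeps the schema portable, at the cost of decoding every row before similarity can be computed.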
|
||||||
|
|
||||||
|
### glossary table (new)
```sql
CREATE TABLE glossary (
    id TEXT PRIMARY KEY,
    project_id TEXT NOT NULL,
    term TEXT NOT NULL,
    pronunciation TEXT,
    frequency INTEGER DEFAULT 1
);
```
|
||||||
|
|
||||||
|
## Frontend Updates

### New Components
1. **Sidebar navigation** - switches between the workbench and knowledge base views
2. **File selector** - switches between transcript files
3. **Upload tabs** - separate audio and document uploads
4. **Knowledge base panel** - statistics cards, entity grid, relation list, glossary

### Updated Files
- `frontend/workbench.html` - adds the knowledge base UI
- `frontend/app.js` - adds knowledge base logic and multi-file support
|
||||||
|
|
||||||
|
## Deployment

### Environment Variables
```bash
KIMI_API_KEY=your_kimi_api_key
KIMI_BASE_URL=https://api.kimi.com/coding
ALIYUN_ACCESS_KEY_ID=your_aliyun_key
ALIYUN_ACCESS_KEY_SECRET=your_aliyun_secret
```

### Deployment Commands
```bash
# Use the deployment script
./deploy.sh

# Or deploy manually
docker build -t insightflow:phase3 .
docker run -d \
  -p 18000:8000 \
  -v /opt/data:/app/data \
  -e KIMI_API_KEY=$KIMI_API_KEY \
  insightflow:phase3
```
|
||||||
|
|
||||||
|
## Test Checklist

- [ ] Upload multiple audio files to the same project
- [ ] Check that entities are aligned correctly
- [ ] Upload a PDF document
- [ ] Upload a DOCX document
- [ ] Switch between transcript files
- [ ] View the knowledge base panel statistics
- [ ] Add a term to the glossary
- [ ] Delete a term
- [ ] Merge entities
- [ ] Create and delete relations
|
||||||
780
frontend/app.js
@@ -1,4 +1,5 @@
|
|||||||
// InsightFlow Frontend - Production Version
|
// InsightFlow Frontend - Phase 3 (Memory & Growth)
|
||||||
|
// Knowledge Growth: Multi-file fusion + Entity Alignment + Document Import
|
||||||
const API_BASE = '/api/v1';
|
const API_BASE = '/api/v1';
|
||||||
|
|
||||||
let currentProject = null;
|
let currentProject = null;
|
||||||
@@ -6,6 +7,12 @@ let currentData = null;
|
|||||||
let selectedEntity = null;
|
let selectedEntity = null;
|
||||||
let projectRelations = [];
|
let projectRelations = [];
|
||||||
let projectEntities = [];
|
let projectEntities = [];
|
||||||
|
let currentTranscript = null;
|
||||||
|
let projectTranscripts = [];
|
||||||
|
let editMode = false;
|
||||||
|
let contextMenuTarget = null;
|
||||||
|
let currentUploadTab = 'audio';
|
||||||
|
let knowledgeBaseData = null;
|
||||||
|
|
||||||
// Init
|
// Init
|
||||||
document.addEventListener('DOMContentLoaded', () => {
|
document.addEventListener('DOMContentLoaded', () => {
|
||||||
@@ -37,6 +44,8 @@ async function initWorkbench() {
|
|||||||
if (nameEl) nameEl.textContent = currentProject.name;
|
if (nameEl) nameEl.textContent = currentProject.name;
|
||||||
|
|
||||||
initUpload();
|
initUpload();
|
||||||
|
initContextMenu();
|
||||||
|
initTextSelection();
|
||||||
await loadProjectData();
|
await loadProjectData();
|
||||||
|
|
||||||
} catch (err) {
|
} catch (err) {
|
||||||
@@ -65,12 +74,131 @@ async function uploadAudio(file) {
|
|||||||
return await res.json();
|
return await res.json();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Phase 3: Document Upload API
|
||||||
|
async function uploadDocument(file) {
|
||||||
|
const formData = new FormData();
|
||||||
|
formData.append('file', file);
|
||||||
|
|
||||||
|
const res = await fetch(`${API_BASE}/projects/${currentProject.id}/upload-document`, {
|
||||||
|
method: 'POST',
|
||||||
|
body: formData
|
||||||
|
});
|
||||||
|
|
||||||
|
if (!res.ok) {
|
||||||
|
const error = await res.json();
|
||||||
|
throw new Error(error.detail || 'Document upload failed');
|
||||||
|
}
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
// Phase 3: Knowledge Base API
|
||||||
|
async function fetchKnowledgeBase() {
|
||||||
|
const res = await fetch(`${API_BASE}/projects/${currentProject.id}/knowledge-base`);
|
||||||
|
if (!res.ok) throw new Error('Failed to fetch knowledge base');
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
// Phase 3: Glossary API
|
||||||
|
async function addGlossaryTerm(term, pronunciation = '') {
|
||||||
|
const res = await fetch(`${API_BASE}/projects/${currentProject.id}/glossary`, {
|
||||||
|
method: 'POST',
|
||||||
|
headers: { 'Content-Type': 'application/json' },
|
||||||
|
body: JSON.stringify({ term, pronunciation })
|
||||||
|
});
|
||||||
|
if (!res.ok) throw new Error('Failed to add glossary term');
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
async function deleteGlossaryTerm(termId) {
|
||||||
|
const res = await fetch(`${API_BASE}/glossary/${termId}`, {
|
||||||
|
method: 'DELETE'
|
||||||
|
});
|
||||||
|
if (!res.ok) throw new Error('Failed to delete glossary term');
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
// Phase 2: Entity Edit API
|
||||||
|
async function updateEntity(entityId, data) {
|
||||||
|
const res = await fetch(`${API_BASE}/entities/${entityId}`, {
|
||||||
|
method: 'PUT',
|
||||||
|
headers: { 'Content-Type': 'application/json' },
|
||||||
|
body: JSON.stringify(data)
|
||||||
|
});
|
||||||
|
if (!res.ok) throw new Error('Failed to update entity');
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
async function deleteEntityApi(entityId) {
|
||||||
|
const res = await fetch(`${API_BASE}/entities/${entityId}`, {
|
||||||
|
method: 'DELETE'
|
||||||
|
});
|
||||||
|
if (!res.ok) throw new Error('Failed to delete entity');
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
async function mergeEntitiesApi(sourceId, targetId) {
|
||||||
|
const res = await fetch(`${API_BASE}/entities/${sourceId}/merge`, {
|
||||||
|
method: 'POST',
|
||||||
|
headers: { 'Content-Type': 'application/json' },
|
||||||
|
body: JSON.stringify({ source_entity_id: sourceId, target_entity_id: targetId })
|
||||||
|
});
|
||||||
|
if (!res.ok) throw new Error('Failed to merge entities');
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
async function createEntityApi(data) {
|
||||||
|
const res = await fetch(`${API_BASE}/projects/${currentProject.id}/entities`, {
|
||||||
|
method: 'POST',
|
||||||
|
headers: { 'Content-Type': 'application/json' },
|
||||||
|
body: JSON.stringify(data)
|
||||||
|
});
|
||||||
|
if (!res.ok) throw new Error('Failed to create entity');
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
// Phase 2: Relation API
|
||||||
|
async function createRelationApi(data) {
|
||||||
|
const res = await fetch(`${API_BASE}/projects/${currentProject.id}/relations`, {
|
||||||
|
method: 'POST',
|
||||||
|
headers: { 'Content-Type': 'application/json' },
|
||||||
|
body: JSON.stringify(data)
|
||||||
|
});
|
||||||
|
if (!res.ok) throw new Error('Failed to create relation');
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
async function deleteRelationApi(relationId) {
|
||||||
|
const res = await fetch(`${API_BASE}/relations/${relationId}`, {
|
||||||
|
method: 'DELETE'
|
||||||
|
});
|
||||||
|
if (!res.ok) throw new Error('Failed to delete relation');
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
// Phase 2: Transcript API
|
||||||
|
async function getTranscript(transcriptId) {
|
||||||
|
const res = await fetch(`${API_BASE}/transcripts/${transcriptId}`);
|
||||||
|
if (!res.ok) throw new Error('Failed to get transcript');
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
async function updateTranscript(transcriptId, fullText) {
|
||||||
|
const res = await fetch(`${API_BASE}/transcripts/${transcriptId}`, {
|
||||||
|
method: 'PUT',
|
||||||
|
headers: { 'Content-Type': 'application/json' },
|
||||||
|
body: JSON.stringify({ full_text: fullText })
|
||||||
|
});
|
||||||
|
if (!res.ok) throw new Error('Failed to update transcript');
|
||||||
|
return await res.json();
|
||||||
|
}
|
||||||
|
|
||||||
async function loadProjectData() {
|
async function loadProjectData() {
|
||||||
try {
|
try {
|
||||||
// Load entities and relations in parallel
|
// Load entities, relations, and the transcript list in parallel
|
||||||
const [entitiesRes, relationsRes] = await Promise.all([
|
const [entitiesRes, relationsRes, transcriptsRes] = await Promise.all([
|
||||||
fetch(`${API_BASE}/projects/${currentProject.id}/entities`),
|
fetch(`${API_BASE}/projects/${currentProject.id}/entities`),
|
||||||
fetch(`${API_BASE}/projects/${currentProject.id}/relations`)
|
fetch(`${API_BASE}/projects/${currentProject.id}/relations`),
|
||||||
|
fetch(`${API_BASE}/projects/${currentProject.id}/transcripts`)
|
||||||
]);
|
]);
|
||||||
|
|
||||||
if (entitiesRes.ok) {
|
if (entitiesRes.ok) {
|
||||||
@@ -79,39 +207,228 @@ async function loadProjectData() {
|
|||||||
if (relationsRes.ok) {
|
if (relationsRes.ok) {
|
||||||
projectRelations = await relationsRes.json();
|
projectRelations = await relationsRes.json();
|
||||||
}
|
}
|
||||||
|
if (transcriptsRes.ok) {
|
||||||
|
projectTranscripts = await transcriptsRes.json();
|
||||||
|
}
|
||||||
|
|
||||||
|
// Load the most recent transcript
|
||||||
|
if (projectTranscripts.length > 0) {
|
||||||
|
currentTranscript = await getTranscript(projectTranscripts[0].id);
|
||||||
currentData = {
|
currentData = {
|
||||||
transcript_id: 'project_view',
|
transcript_id: currentTranscript.id,
|
||||||
project_id: currentProject.id,
|
project_id: currentProject.id,
|
||||||
segments: [],
|
segments: [{ speaker: '全文', text: currentTranscript.full_text }],
|
||||||
entities: projectEntities,
|
entities: projectEntities,
|
||||||
full_text: '',
|
full_text: currentTranscript.full_text,
|
||||||
created_at: new Date().toISOString()
|
created_at: currentTranscript.created_at
|
||||||
};
|
};
|
||||||
|
renderTranscript();
|
||||||
|
}
|
||||||
|
|
||||||
renderGraph();
|
renderGraph();
|
||||||
renderEntityList();
|
renderEntityList();
|
||||||
|
renderTranscriptDropdown();
|
||||||
|
|
||||||
} catch (err) {
|
} catch (err) {
|
||||||
console.error('Load project data failed:', err);
|
console.error('Load project data failed:', err);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Phase 3: View Switching
|
||||||
|
window.switchView = function(viewName) {
|
||||||
|
// Update sidebar buttons
|
||||||
|
document.querySelectorAll('.sidebar-btn').forEach(btn => {
|
||||||
|
btn.classList.remove('active');
|
||||||
|
});
|
||||||
|
event.target.classList.add('active');
|
||||||
|
|
||||||
|
if (viewName === 'workbench') {
|
||||||
|
document.getElementById('workbenchView').style.display = 'flex';
|
||||||
|
document.getElementById('knowledgeBaseView').classList.remove('show');
|
||||||
|
} else if (viewName === 'knowledge-base') {
|
||||||
|
document.getElementById('workbenchView').style.display = 'none';
|
||||||
|
document.getElementById('knowledgeBaseView').classList.add('show');
|
||||||
|
loadKnowledgeBase();
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Phase 3: Load Knowledge Base
|
||||||
|
async function loadKnowledgeBase() {
|
||||||
|
try {
|
||||||
|
knowledgeBaseData = await fetchKnowledgeBase();
|
||||||
|
renderKnowledgeBase();
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Load knowledge base failed:', err);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Phase 3: Render Knowledge Base
|
||||||
|
function renderKnowledgeBase() {
|
||||||
|
if (!knowledgeBaseData) return;
|
||||||
|
|
||||||
|
// Update stats
|
||||||
|
document.getElementById('kbEntityCount').textContent = knowledgeBaseData.stats.entity_count;
|
||||||
|
document.getElementById('kbRelationCount').textContent = knowledgeBaseData.stats.relation_count;
|
||||||
|
document.getElementById('kbTranscriptCount').textContent = knowledgeBaseData.stats.transcript_count;
|
||||||
|
document.getElementById('kbGlossaryCount').textContent = knowledgeBaseData.stats.glossary_count;
|
||||||
|
|
||||||
|
// Render entities
|
||||||
|
const entityGrid = document.getElementById('kbEntityGrid');
|
||||||
|
entityGrid.innerHTML = knowledgeBaseData.entities.map(e => `
|
||||||
|
<div class="kb-entity-card" onclick="selectEntity('${e.id}'); switchView('workbench');">
|
||||||
|
<span class="entity-type-badge type-${e.type}">${e.type}</span>
|
||||||
|
<div class="kb-entity-name">${e.name}</div>
|
||||||
|
<div class="kb-entity-def">${e.definition || '暂无定义'}</div>
|
||||||
|
<div class="kb-entity-meta">提及 ${e.mention_count} 次 | 出现在 ${e.appears_in.length} 个文件中</div>
|
||||||
|
</div>
|
||||||
|
`).join('');
|
||||||
|
|
||||||
|
// Render relations
|
||||||
|
const relationsList = document.getElementById('kbRelationsList');
|
||||||
|
relationsList.innerHTML = knowledgeBaseData.relations.map(r => `
|
||||||
|
<div class="kb-glossary-item">
|
||||||
|
<div>
|
||||||
|
<strong>${r.source_name}</strong>
|
||||||
|
<span style="color:#666;">→ ${r.type} →</span>
|
||||||
|
<strong>${r.target_name}</strong>
|
||||||
|
<div style="font-size:0.8rem;color:#666;margin-top:4px;">${r.evidence || '无证据'}</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
`).join('');
|
||||||
|
|
||||||
|
// Render glossary
|
||||||
|
const glossaryList = document.getElementById('kbGlossaryList');
|
||||||
|
glossaryList.innerHTML = knowledgeBaseData.glossary.map(g => `
|
||||||
|
<div class="kb-glossary-item">
|
||||||
|
<div>
|
||||||
|
<strong>${g.term}</strong>
|
||||||
|
${g.pronunciation ? `<span style="color:#666;font-size:0.85rem;"> (${g.pronunciation})</span>` : ''}
|
||||||
|
<span style="color:#00d4ff;font-size:0.8rem;margin-left:8px;">出现 ${g.frequency} 次</span>
|
||||||
|
</div>
|
||||||
|
<button class="btn-icon" onclick="deleteGlossaryTerm('${g.id}').then(loadKnowledgeBase)">删除</button>
|
||||||
|
</div>
|
||||||
|
`).join('');
|
||||||
|
|
||||||
|
// Render transcripts
|
||||||
|
const transcriptsList = document.getElementById('kbTranscriptsList');
|
||||||
|
transcriptsList.innerHTML = knowledgeBaseData.transcripts.map(t => `
|
||||||
|
<div class="kb-transcript-item">
|
||||||
|
<div>
|
||||||
|
<span class="file-type-icon type-${t.type}">${t.type === 'audio' ? '🎵' : '📄'}</span>
|
||||||
|
<span style="margin-left:8px;">${t.filename}</span>
|
||||||
|
</div>
|
||||||
|
<span style="color:#666;font-size:0.8rem;">${new Date(t.created_at).toLocaleDateString()}</span>
|
||||||
|
</div>
|
||||||
|
`).join('');
|
||||||
|
}
|
||||||
|
|
||||||
|
// Phase 3: KB Tab Switching
|
||||||
|
window.switchKBTab = function(tabName) {
|
||||||
|
document.querySelectorAll('.kb-nav-item').forEach(item => {
|
||||||
|
item.classList.remove('active');
|
||||||
|
});
|
||||||
|
event.target.classList.add('active');
|
||||||
|
|
||||||
|
document.querySelectorAll('.kb-section').forEach(section => {
|
||||||
|
section.classList.remove('active');
|
||||||
|
});
|
||||||
|
document.getElementById(`kb${tabName.charAt(0).toUpperCase() + tabName.slice(1)}Section`).classList.add('active');
|
||||||
|
};
|
||||||
|
|
||||||
|
// Phase 3: Transcript Dropdown
|
||||||
|
window.toggleTranscriptDropdown = function() {
|
||||||
|
const dropdown = document.getElementById('transcriptDropdown');
|
||||||
|
dropdown.classList.toggle('show');
|
||||||
|
};
|
||||||
|
|
||||||
|
function renderTranscriptDropdown() {
|
||||||
|
const dropdown = document.getElementById('transcriptDropdown');
|
||||||
|
if (!dropdown || projectTranscripts.length === 0) return;
|
||||||
|
|
||||||
|
dropdown.innerHTML = projectTranscripts.map(t => `
|
||||||
|
<div class="transcript-option ${currentTranscript && currentTranscript.id === t.id ? 'active' : ''}"
|
||||||
|
onclick="switchTranscript('${t.id}')">
|
||||||
|
<span class="file-type-icon type-${t.type || 'audio'}">${(t.type || 'audio') === 'audio' ? '🎵' : '📄'}</span>
|
||||||
|
<span style="margin-left:4px;">${t.filename}</span>
|
||||||
|
</div>
|
||||||
|
`).join('');
|
||||||
|
}
|
||||||
|
|
||||||
|
window.switchTranscript = async function(transcriptId) {
|
||||||
|
try {
|
||||||
|
currentTranscript = await getTranscript(transcriptId);
|
||||||
|
currentData = {
|
||||||
|
transcript_id: currentTranscript.id,
|
||||||
|
project_id: currentProject.id,
|
||||||
|
segments: [{ speaker: '全文', text: currentTranscript.full_text }],
|
||||||
|
entities: projectEntities,
|
||||||
|
full_text: currentTranscript.full_text,
|
||||||
|
created_at: currentTranscript.created_at
|
||||||
|
};
|
||||||
|
renderTranscript();
|
||||||
|
renderTranscriptDropdown();
|
||||||
|
document.getElementById('transcriptDropdown').classList.remove('show');
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Switch transcript failed:', err);
|
||||||
|
alert('切换文件失败');
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Phase 2: Transcript Edit Mode
|
||||||
|
window.toggleEditMode = function() {
|
||||||
|
editMode = !editMode;
|
||||||
|
const editBtn = document.getElementById('editBtn');
|
||||||
|
const saveBtn = document.getElementById('saveBtn');
|
||||||
|
const content = document.getElementById('transcriptContent');
|
||||||
|
|
||||||
|
if (editMode) {
|
||||||
|
editBtn.style.display = 'none';
|
||||||
|
saveBtn.style.display = 'inline-block';
|
||||||
|
content.contentEditable = 'true';
|
||||||
|
content.style.background = '#0f0f0f';
|
||||||
|
content.style.border = '1px solid #00d4ff';
|
||||||
|
content.focus();
|
||||||
|
} else {
|
||||||
|
editBtn.style.display = 'inline-block';
|
||||||
|
saveBtn.style.display = 'none';
|
||||||
|
content.contentEditable = 'false';
|
||||||
|
content.style.background = '';
|
||||||
|
content.style.border = '';
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
window.saveTranscript = async function() {
|
||||||
|
if (!currentTranscript) return;
|
||||||
|
|
||||||
|
const content = document.getElementById('transcriptContent');
|
||||||
|
const fullText = content.innerText;
|
||||||
|
|
||||||
|
try {
|
||||||
|
await updateTranscript(currentTranscript.id, fullText);
|
||||||
|
currentTranscript.full_text = fullText;
|
||||||
|
toggleEditMode();
|
||||||
|
alert('转录文本已保存');
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Save failed:', err);
|
||||||
|
alert('保存失败: ' + err.message);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
// Render transcript with entity highlighting
|
// Render transcript with entity highlighting
|
||||||
function renderTranscript() {
|
function renderTranscript() {
|
||||||
const container = document.getElementById('transcriptContent');
|
const container = document.getElementById('transcriptContent');
|
||||||
if (!container || !currentData || !currentData.segments) return;
|
if (!container || !currentData) return;
|
||||||
|
|
||||||
container.innerHTML = '';
|
container.innerHTML = '';
|
||||||
|
|
||||||
currentData.segments.forEach((seg, idx) => {
|
if (editMode) {
|
||||||
const div = document.createElement('div');
|
container.innerText = currentData.full_text || '';
|
||||||
div.className = 'segment';
|
return;
|
||||||
div.dataset.index = idx;
|
}
|
||||||
|
|
||||||
// Highlight entities
|
// Highlight entities
|
||||||
let text = seg.text;
|
let text = currentData.full_text || '';
|
||||||
const entities = findEntitiesInText(seg.text);
|
const entities = findEntitiesInText(text);
|
||||||
|
|
||||||
// Replace from the last match backward so earlier offsets stay valid
|
// Replace from the last match backward so earlier offsets stay valid
|
||||||
entities.sort((a, b) => b.start - a.start);
|
entities.sort((a, b) => b.start - a.start);
|
||||||
@@ -123,13 +440,14 @@ function renderTranscript() {
|
|||||||
text = before + `<span class="entity" data-id="${ent.id}" onclick="window.selectEntity('${ent.id}')">${name}</span>` + after;
|
text = before + `<span class="entity" data-id="${ent.id}" onclick="window.selectEntity('${ent.id}')">${name}</span>` + after;
|
||||||
});
|
});
|
||||||
|
|
||||||
|
const div = document.createElement('div');
|
||||||
|
div.className = 'segment';
|
||||||
div.innerHTML = `
|
div.innerHTML = `
|
||||||
<div class="speaker">${seg.speaker}</div>
|
<div class="speaker">${currentTranscript.filename || '转录文本'}</div>
|
||||||
<div class="segment-text">${text}</div>
|
<div class="segment-text">${text}</div>
|
||||||
`;
|
`;
|
||||||
|
|
||||||
container.appendChild(div);
|
container.appendChild(div);
|
||||||
});
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// Find entity positions in the text
|
// Find entity positions in the text
|
||||||
@@ -181,7 +499,7 @@ function renderGraph() {
|
|||||||
.attr('y', '50%')
|
.attr('y', '50%')
|
||||||
.attr('text-anchor', 'middle')
|
.attr('text-anchor', 'middle')
|
||||||
.attr('fill', '#666')
|
.attr('fill', '#666')
|
||||||
.text('暂无实体数据,请上传音频');
|
.text('暂无实体数据,请上传音频或文档');
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -201,6 +519,7 @@ function renderGraph() {
|
|||||||
|
|
||||||
// Use the relations stored in the database
|
// Use the relations stored in the database
|
||||||
const links = projectRelations.map(r => ({
|
const links = projectRelations.map(r => ({
|
||||||
|
id: r.id,
|
||||||
source: r.source_id,
|
source: r.source_id,
|
||||||
target: r.target_id,
|
target: r.target_id,
|
||||||
type: r.type
|
type: r.type
|
||||||
@@ -256,7 +575,11 @@ function renderGraph() {
|
|||||||
.on('start', dragstarted)
|
.on('start', dragstarted)
|
||||||
.on('drag', dragged)
|
.on('drag', dragged)
|
||||||
.on('end', dragended))
|
.on('end', dragended))
|
||||||
.on('click', (e, d) => window.selectEntity(d.id));
|
.on('click', (e, d) => window.selectEntity(d.id))
|
||||||
|
.on('contextmenu', (e, d) => {
|
||||||
|
e.preventDefault();
|
||||||
|
showContextMenu(e, d.id);
|
||||||
|
});
|
||||||
|
|
||||||
// Node circle
|
// Node circle
|
||||||
node.append('circle')
|
node.append('circle')
|
||||||
@@ -323,7 +646,7 @@ function renderEntityList() {
|
|||||||
container.innerHTML = '<h3 style="margin-bottom:12px;color:#888;font-size:0.9rem;">项目实体</h3>';
|
container.innerHTML = '<h3 style="margin-bottom:12px;color:#888;font-size:0.9rem;">项目实体</h3>';
|
||||||
|
|
||||||
if (!projectEntities || projectEntities.length === 0) {
|
if (!projectEntities || projectEntities.length === 0) {
|
||||||
container.innerHTML += '<p style="color:#666;font-size:0.85rem;">暂无实体,请上传音频文件</p>';
|
container.innerHTML += '<p style="color:#666;font-size:0.85rem;">暂无实体,请上传音频或文档文件</p>';
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -332,9 +655,13 @@ function renderEntityList() {
|
|||||||
div.className = 'entity-item';
|
div.className = 'entity-item';
|
||||||
div.dataset.id = ent.id;
|
div.dataset.id = ent.id;
|
||||||
div.onclick = () => window.selectEntity(ent.id);
|
div.onclick = () => window.selectEntity(ent.id);
|
||||||
|
div.oncontextmenu = (e) => {
|
||||||
|
e.preventDefault();
|
||||||
|
showContextMenu(e, ent.id);
|
||||||
|
};
|
||||||
|
|
||||||
div.innerHTML = `
|
div.innerHTML = `
|
||||||
<span class="entity-type-badge type-${ent.type.toLowerCase()}">${ent.type}</span>
|
<span class="entity-type-badge type-${ent.type}">${ent.type}</span>
|
||||||
<div>
|
<div>
|
||||||
<div style="font-weight:500;">${ent.name}</div>
|
<div style="font-weight:500;">${ent.name}</div>
|
||||||
<div style="font-size:0.8rem;color:#666;">${ent.definition || '暂无定义'}</div>
|
<div style="font-size:0.8rem;color:#666;">${ent.definition || '暂无定义'}</div>
|
||||||
@@ -354,11 +681,9 @@ window.selectEntity = function(entityId) {
|
|||||||
// Highlight the entity in the transcript text
|
// Highlight the entity in the transcript text
|
||||||
document.querySelectorAll('.entity').forEach(el => {
|
document.querySelectorAll('.entity').forEach(el => {
|
||||||
if (el.dataset.id === entityId) {
|
if (el.dataset.id === entityId) {
|
||||||
el.style.background = '#ff6b6b';
|
el.classList.add('selected');
|
||||||
el.style.color = '#fff';
|
|
||||||
} else {
|
} else {
|
||||||
el.style.background = '';
|
el.classList.remove('selected');
|
||||||
el.style.color = '';
|
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|
||||||
@@ -371,17 +696,308 @@ window.selectEntity = function(entityId) {
|
|||||||
// Highlight the entity in the entity list
|
// Highlight the entity in the entity list
|
||||||
document.querySelectorAll('.entity-item').forEach(el => {
|
document.querySelectorAll('.entity-item').forEach(el => {
|
||||||
if (el.dataset.id === entityId) {
|
if (el.dataset.id === entityId) {
|
||||||
el.style.background = '#2a2a2a';
|
el.classList.add('selected');
|
||||||
el.style.borderLeft = '3px solid #ff6b6b';
|
|
||||||
} else {
|
} else {
|
||||||
el.style.background = '';
|
el.classList.remove('selected');
|
||||||
el.style.borderLeft = '';
|
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|
||||||
console.log('Selected:', entity.name, entity.definition);
|
console.log('Selected:', entity.name, entity.definition);
|
||||||
};
|
};
|
||||||
|
|
||||||
|
// Phase 2: Context Menu
|
||||||
|
function initContextMenu() {
|
||||||
|
document.addEventListener('click', () => {
|
||||||
|
hideContextMenu();
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
function showContextMenu(e, entityId) {
|
||||||
|
contextMenuTarget = entityId;
|
||||||
|
const menu = document.getElementById('contextMenu');
|
||||||
|
menu.style.left = e.pageX + 'px';
|
||||||
|
menu.style.top = e.pageY + 'px';
|
||||||
|
menu.classList.add('show');
|
||||||
|
}
|
||||||
|
|
||||||
|
function hideContextMenu() {
|
||||||
|
const menu = document.getElementById('contextMenu');
|
||||||
|
menu.classList.remove('show');
|
||||||
|
contextMenuTarget = null;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Phase 2: Entity Editor Modal
|
||||||
|
window.editEntity = function() {
|
||||||
|
// Read the target before hideContextMenu() resets contextMenuTarget to null.
const entityId = contextMenuTarget || selectedEntity;
hideContextMenu();
if (!entityId) return;
|
||||||
|
const entity = projectEntities.find(e => e.id === entityId);
|
||||||
|
if (!entity) return;
|
||||||
|
|
||||||
|
document.getElementById('entityName').value = entity.name;
|
||||||
|
document.getElementById('entityType').value = entity.type;
|
||||||
|
document.getElementById('entityDefinition').value = entity.definition || '';
|
||||||
|
document.getElementById('entityAliases').value = (entity.aliases || []).join(', ');
|
||||||
|
|
||||||
|
// Show the relation editor
|
||||||
|
document.getElementById('relationEditor').style.display = 'block';
|
||||||
|
renderRelationList(entityId);
|
||||||
|
|
||||||
|
document.getElementById('entityModal').dataset.entityId = entityId;
|
||||||
|
document.getElementById('entityModal').classList.add('show');
|
||||||
|
};
|
||||||
|
|
||||||
|
function renderRelationList(entityId) {
|
||||||
|
const container = document.getElementById('relationList');
|
||||||
|
const entityRelations = projectRelations.filter(r =>
|
||||||
|
r.source_id === entityId || r.target_id === entityId
|
||||||
|
);
|
||||||
|
|
||||||
|
if (entityRelations.length === 0) {
|
||||||
|
container.innerHTML = '<p style="color:#666;font-size:0.8rem;">暂无关系</p>';
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
container.innerHTML = entityRelations.map(r => {
|
||||||
|
const isSource = r.source_id === entityId;
|
||||||
|
const otherId = isSource ? r.target_id : r.source_id;
|
||||||
|
const other = projectEntities.find(e => e.id === otherId);
|
||||||
|
const otherName = other ? other.name : 'Unknown';
|
||||||
|
const arrow = isSource ? '→' : '←';
|
||||||
|
|
||||||
|
return `
|
||||||
|
<div class="relation-item">
|
||||||
|
<span>${arrow} ${otherName} (${r.type})</span>
|
||||||
|
<button onclick="deleteRelation('${r.id}')">删除</button>
|
||||||
|
</div>
|
||||||
|
`;
|
||||||
|
}).join('');
|
||||||
|
}
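Review note: `renderRelationList` interpolates `otherName` and `r.type` straight into `innerHTML`, and both come from user-editable entity data. A minimal sketch of escaping before interpolation is shown below. `escapeHtml` and `relationLabelHtml` are hypothetical helpers, not functions from this diff.

```javascript
// Hypothetical helper, not part of this diff: escape the five characters
// that are significant in HTML text and attribute contexts.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Builds the same <span> that renderRelationList emits, with escaped values.
function relationLabelHtml(arrow, otherName, type) {
  return `<span>${arrow} ${escapeHtml(otherName)} (${escapeHtml(type)})</span>`;
}
```

The same helper would apply to the `<option>` markup built in `showMergeModal` and `showAddRelation`, and to `${file.name}` in the upload overlay.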
|
||||||
|
|
||||||
|
window.hideEntityModal = function() {
|
||||||
|
document.getElementById('entityModal').classList.remove('show');
|
||||||
|
};
|
||||||
|
|
||||||
|
window.saveEntity = async function() {
|
||||||
|
const entityId = document.getElementById('entityModal').dataset.entityId;
|
||||||
|
if (!entityId) return;
|
||||||
|
|
||||||
|
const data = {
|
||||||
|
name: document.getElementById('entityName').value,
|
||||||
|
type: document.getElementById('entityType').value,
|
||||||
|
definition: document.getElementById('entityDefinition').value,
|
||||||
|
aliases: document.getElementById('entityAliases').value.split(',').map(s => s.trim()).filter(s => s)
|
||||||
|
};
|
||||||
|
|
||||||
|
try {
|
||||||
|
await updateEntity(entityId, data);
|
||||||
|
await loadProjectData();
|
||||||
|
hideEntityModal();
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Save failed:', err);
|
||||||
|
alert('保存失败: ' + err.message);
|
||||||
|
}
|
||||||
|
};
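Review note: the alias parsing in `saveEntity` is a small split/trim/filter pipeline. Pulling it into a named function makes the behaviour easy to check in isolation. `parseAliases` is a hypothetical name; the body mirrors the expression used in the diff.

```javascript
// Hypothetical helper, not part of this diff: turn the comma-separated
// alias field into a clean array, dropping empty entries.
function parseAliases(raw) {
  return raw
    .split(',')
    .map(s => s.trim())
    .filter(s => s);
}
```

An empty input field yields an empty array, so the payload never carries blank aliases.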
|
||||||
|
|
||||||
|
window.deleteEntity = async function() {
|
||||||
|
const entityId = document.getElementById('entityModal').dataset.entityId;
|
||||||
|
if (!entityId) return;
|
||||||
|
|
||||||
|
if (!confirm('确定要删除这个实体吗?相关的提及和关系也会被删除。')) return;
|
||||||
|
|
||||||
|
try {
|
||||||
|
await deleteEntityApi(entityId);
|
||||||
|
await loadProjectData();
|
||||||
|
hideEntityModal();
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Delete failed:', err);
|
||||||
|
alert('删除失败: ' + err.message);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Phase 2: Merge Modal
|
||||||
|
window.showMergeModal = function() {
|
||||||
|
// Read the target before hideContextMenu() resets contextMenuTarget to null.
const sourceId = contextMenuTarget || selectedEntity;
hideContextMenu();
if (!sourceId) return;
|
||||||
|
const source = projectEntities.find(e => e.id === sourceId);
|
||||||
|
if (!source) return;
|
||||||
|
|
||||||
|
document.getElementById('mergeSource').value = source.name;
|
||||||
|
document.getElementById('mergeModal').dataset.sourceId = sourceId;
|
||||||
|
|
||||||
|
// Populate the target entity options (excluding the source itself)
|
||||||
|
const select = document.getElementById('mergeTarget');
|
||||||
|
select.innerHTML = projectEntities
|
||||||
|
.filter(e => e.id !== sourceId)
|
||||||
|
.map(e => `<option value="${e.id}">${e.name} (${e.type})</option>`)
|
||||||
|
.join('');
|
||||||
|
|
||||||
|
document.getElementById('mergeModal').classList.add('show');
|
||||||
|
};
|
||||||
|
|
||||||
|
window.hideMergeModal = function() {
|
||||||
|
document.getElementById('mergeModal').classList.remove('show');
|
||||||
|
};
|
||||||
|
|
||||||
|
window.confirmMerge = async function() {
|
||||||
|
const sourceId = document.getElementById('mergeModal').dataset.sourceId;
|
||||||
|
const targetId = document.getElementById('mergeTarget').value;
|
||||||
|
|
||||||
|
if (!sourceId || !targetId) return;
|
||||||
|
|
||||||
|
try {
|
||||||
|
await mergeEntitiesApi(sourceId, targetId);
|
||||||
|
await loadProjectData();
|
||||||
|
hideMergeModal();
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Merge failed:', err);
|
||||||
|
alert('合并失败: ' + err.message);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Phase 2: Relation Modal
|
||||||
|
window.showAddRelation = function() {
|
||||||
|
const entityId = document.getElementById('entityModal').dataset.entityId;
|
||||||
|
if (!entityId) return;
|
||||||
|
|
||||||
|
const entity = projectEntities.find(e => e.id === entityId);
|
||||||
|
document.getElementById('relationModal').dataset.sourceId = entityId;
|
||||||
|
|
||||||
|
// Populate the target options
|
||||||
|
const select = document.getElementById('relationTarget');
|
||||||
|
select.innerHTML = projectEntities
|
||||||
|
.filter(e => e.id !== entityId)
|
||||||
|
.map(e => `<option value="${e.id}">${e.name}</option>`)
|
||||||
|
.join('');
|
||||||
|
|
||||||
|
document.getElementById('relationModal').classList.add('show');
|
||||||
|
};
|
||||||
|
|
||||||
|
window.hideRelationModal = function() {
|
||||||
|
document.getElementById('relationModal').classList.remove('show');
|
||||||
|
};
|
||||||
|
|
||||||
|
window.saveRelation = async function() {
|
||||||
|
const sourceId = document.getElementById('relationModal').dataset.sourceId;
|
||||||
|
const targetId = document.getElementById('relationTarget').value;
|
||||||
|
const type = document.getElementById('relationType').value;
|
||||||
|
const evidence = document.getElementById('relationEvidence').value;
|
||||||
|
|
||||||
|
if (!sourceId || !targetId) return;
|
||||||
|
|
||||||
|
try {
|
||||||
|
await createRelationApi({
|
||||||
|
source_entity_id: sourceId,
|
||||||
|
target_entity_id: targetId,
|
||||||
|
relation_type: type,
|
||||||
|
evidence: evidence
|
||||||
|
});
|
||||||
|
await loadProjectData();
|
||||||
|
renderRelationList(sourceId);
|
||||||
|
hideRelationModal();
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Create relation failed:', err);
|
||||||
|
alert('创建关系失败: ' + err.message);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
window.deleteRelation = async function(relationId) {
|
||||||
|
if (!confirm('确定要删除这个关系吗?')) return;
|
||||||
|
|
||||||
|
try {
|
||||||
|
await deleteRelationApi(relationId);
|
||||||
|
await loadProjectData();
|
||||||
|
const entityId = document.getElementById('entityModal').dataset.entityId;
|
||||||
|
if (entityId) renderRelationList(entityId);
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Delete relation failed:', err);
|
||||||
|
alert('删除关系失败: ' + err.message);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Phase 2: Text Selection - Create Entity
|
||||||
|
function initTextSelection() {
|
||||||
|
document.addEventListener('selectionchange', () => {
|
||||||
|
const selection = window.getSelection();
|
||||||
|
const text = selection.toString().trim();
|
||||||
|
|
||||||
|
if (text.length > 0 && text.length < 50) {
|
||||||
|
showSelectionToolbar();
|
||||||
|
} else {
|
||||||
|
hideSelectionToolbar();
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
function showSelectionToolbar() {
|
||||||
|
document.getElementById('selectionToolbar').classList.add('show');
|
||||||
|
}
|
||||||
|
|
||||||
|
window.hideSelectionToolbar = function() {
|
||||||
|
document.getElementById('selectionToolbar').classList.remove('show');
|
||||||
|
window.getSelection().removeAllRanges();
|
||||||
|
};
|
||||||
|
|
||||||
|
window.createEntityFromSelection = async function() {
|
||||||
|
const selection = window.getSelection();
|
||||||
|
const text = selection.toString().trim();
|
||||||
|
|
||||||
|
if (!text) return;
|
||||||
|
|
||||||
|
// Locate the selected text within the full transcript
|
||||||
|
const container = document.getElementById('transcriptContent');
|
||||||
|
const fullText = currentTranscript ? currentTranscript.full_text : '';
|
||||||
|
const startPos = fullText.indexOf(text);
|
||||||
|
|
||||||
|
try {
|
||||||
|
const result = await createEntityApi({
|
||||||
|
name: text,
|
||||||
|
type: 'OTHER',
|
||||||
|
definition: '',
|
||||||
|
transcript_id: currentTranscript ? currentTranscript.id : null,
|
||||||
|
start_pos: startPos >= 0 ? startPos : null,
|
||||||
|
end_pos: startPos >= 0 ? startPos + text.length : null
|
||||||
|
});
|
||||||
|
|
||||||
|
hideSelectionToolbar();
|
||||||
|
await loadProjectData();
|
||||||
|
|
||||||
|
if (!result.existed) {
|
||||||
|
alert(`已创建实体: ${text}`);
|
||||||
|
} else {
|
||||||
|
alert(`实体 "${text}" 已存在`);
|
||||||
|
}
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Create entity failed:', err);
|
||||||
|
alert('创建实体失败: ' + err.message);
|
||||||
|
}
|
||||||
|
};
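Review note: `fullText.indexOf(text)` returns only the first match, so selecting a later repeat of the same phrase records the position of the first one. A minimal sketch that lists every non-overlapping occurrence is shown below. `findOccurrences` is a hypothetical helper, not part of this diff, and choosing among the matches (for example from the DOM selection range) is left open.

```javascript
// Hypothetical helper, not part of this diff: return the start/end offsets
// of every non-overlapping occurrence of `text` inside `fullText`.
function findOccurrences(fullText, text) {
  const positions = [];
  if (!text) return positions; // an empty needle would otherwise loop forever
  let from = 0;
  while (true) {
    const idx = fullText.indexOf(text, from);
    if (idx === -1) break;
    positions.push({ start: idx, end: idx + text.length });
    from = idx + text.length; // continue searching after this match
  }
  return positions;
}
```

When the result has exactly one entry, the current behaviour is already correct; more than one entry signals that `start_pos` is ambiguous.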
|
||||||
|
|
||||||
|
// Phase 3: Upload Tab Switching
|
||||||
|
window.switchUploadTab = function(tab) {
|
||||||
|
currentUploadTab = tab;
|
||||||
|
document.querySelectorAll('.upload-tab').forEach(t => t.classList.remove('active'));
|
||||||
|
event.target.classList.add('active');
|
||||||
|
|
||||||
|
const hint = document.getElementById('uploadHint');
|
||||||
|
if (tab === 'audio') {
|
||||||
|
hint.textContent = '支持 MP3, WAV, M4A (最大 500MB)';
|
||||||
|
} else {
|
||||||
|
hint.textContent = '支持 PDF, DOCX, DOC, TXT, MD';
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
window.triggerFileSelect = function() {
|
||||||
|
if (currentUploadTab === 'audio') {
|
||||||
|
document.getElementById('fileInput').click();
|
||||||
|
} else {
|
||||||
|
document.getElementById('docInput').click();
|
||||||
|
}
|
||||||
|
};
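Review note: `switchUploadTab` and `triggerFileSelect` both branch on `tab === 'audio'`. One possible way to keep the two in sync is a single lookup table. `UPLOAD_TABS` and `uploadTabConfig` are hypothetical names, not part of this diff; the element ids and hint strings are copied from it.

```javascript
// Hypothetical lookup table, not part of this diff. Ids and hints are the
// ones used by switchUploadTab and triggerFileSelect.
const UPLOAD_TABS = {
  audio: { inputId: 'fileInput', hint: '支持 MP3, WAV, M4A (最大 500MB)' },
  document: { inputId: 'docInput', hint: '支持 PDF, DOCX, DOC, TXT, MD' },
};

// Any tab other than 'audio' falls back to the document settings,
// matching the else-branches in the diff.
function uploadTabConfig(tab) {
  return Object.prototype.hasOwnProperty.call(UPLOAD_TABS, tab)
    ? UPLOAD_TABS[tab]
    : UPLOAD_TABS.document;
}
```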
|
||||||
|
|
||||||
// Show/hide upload
|
// Show/hide upload
|
||||||
window.showUpload = function() {
|
window.showUpload = function() {
|
||||||
const el = document.getElementById('uploadOverlay');
|
const el = document.getElementById('uploadOverlay');
|
||||||
@@ -393,46 +1009,105 @@ window.hideUpload = function() {
|
|||||||
if (el) el.classList.remove('show');
|
if (el) el.classList.remove('show');
|
||||||
};
|
};
|
||||||
|
|
||||||
|
// Phase 3: Glossary Modal
|
||||||
|
window.showAddTermModal = function() {
|
||||||
|
document.getElementById('glossaryModal').classList.add('show');
|
||||||
|
};
|
||||||
|
|
||||||
|
window.hideGlossaryModal = function() {
|
||||||
|
document.getElementById('glossaryModal').classList.remove('show');
|
||||||
|
document.getElementById('glossaryTerm').value = '';
|
||||||
|
document.getElementById('glossaryPronunciation').value = '';
|
||||||
|
};
|
||||||
|
|
||||||
|
window.saveGlossaryTerm = async function() {
|
||||||
|
const term = document.getElementById('glossaryTerm').value.trim();
|
||||||
|
const pronunciation = document.getElementById('glossaryPronunciation').value.trim();
|
||||||
|
|
||||||
|
if (!term) {
|
||||||
|
alert('请输入术语');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
try {
|
||||||
|
await addGlossaryTerm(term, pronunciation);
|
||||||
|
hideGlossaryModal();
|
||||||
|
await loadKnowledgeBase();
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Add term failed:', err);
|
||||||
|
alert('添加术语失败: ' + err.message);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
// Upload handling
|
// Upload handling
|
||||||
function initUpload() {
|
function initUpload() {
|
||||||
const input = document.getElementById('fileInput');
|
// Audio upload
|
||||||
|
const audioInput = document.getElementById('fileInput');
|
||||||
|
if (audioInput) {
|
||||||
|
audioInput.addEventListener('change', async (e) => {
|
||||||
|
if (!e.target.files.length) return;
|
||||||
|
await handleFileUpload(e.target.files[0], 'audio');
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
// Document upload
|
||||||
|
const docInput = document.getElementById('docInput');
|
||||||
|
if (docInput) {
|
||||||
|
docInput.addEventListener('change', async (e) => {
|
||||||
|
if (!e.target.files.length) return;
|
||||||
|
await handleFileUpload(e.target.files[0], 'document');
|
||||||
|
});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
async function handleFileUpload(file, type) {
|
||||||
const overlay = document.getElementById('uploadOverlay');
|
const overlay = document.getElementById('uploadOverlay');
|
||||||
|
|
||||||
if (!input) return;
|
|
||||||
|
|
||||||
input.addEventListener('change', async (e) => {
|
|
||||||
if (!e.target.files.length) return;
|
|
||||||
|
|
||||||
const file = e.target.files[0];
|
|
||||||
if (overlay) {
|
|
||||||
overlay.innerHTML = `
|
overlay.innerHTML = `
|
||||||
<div style="text-align:center;">
|
<div style="text-align:center;">
|
||||||
<h2>正在分析...</h2>
|
<h2>正在分析...</h2>
|
||||||
<p style="color:#666;margin-top:10px;">${file.name}</p>
|
<p style="color:#666;margin-top:10px;">${file.name}</p>
|
||||||
<p style="color:#888;margin-top:20px;font-size:0.9rem;">ASR转录 + 实体提取中</p>
|
<p style="color:#888;margin-top:20px;font-size:0.9rem;">${type === 'audio' ? 'ASR转录 + 实体提取中' : '文档解析 + 实体提取中'}</p>
|
||||||
</div>
|
</div>
|
||||||
`;
|
`;
|
||||||
}
|
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = await uploadAudio(file);
|
let result;
|
||||||
|
if (type === 'audio') {
|
||||||
|
result = await uploadAudio(file);
|
||||||
|
} else {
|
||||||
|
result = await uploadDocument(file);
|
||||||
|
}
|
||||||
|
|
||||||
// Update the current data
|
// Update the current data
|
||||||
currentData = result;
|
currentData = result;
|
||||||
|
|
||||||
// Reload project data (including new entities and relations)
|
// Reload project data
|
||||||
await loadProjectData();
|
await loadProjectData();
|
||||||
|
|
||||||
// Render the transcript text
|
// Reset the upload UI
|
||||||
if (result.segments && result.segments.length > 0) {
|
overlay.innerHTML = `
|
||||||
renderTranscript();
|
<div class="upload-box">
|
||||||
}
|
<h2 style="margin-bottom:10px;">上传文件</h2>
|
||||||
|
<div class="upload-tabs">
|
||||||
|
<div class="upload-tab active" onclick="switchUploadTab('audio')">🎵 音频</div>
|
||||||
|
<div class="upload-tab" onclick="switchUploadTab('document')">📄 文档</div>
|
||||||
|
</div>
|
||||||
|
<p style="color:#666;" id="uploadHint">支持 MP3, WAV, M4A (最大 500MB)</p>
|
||||||
|
<input type="file" id="fileInput" accept="audio/*" hidden>
|
||||||
|
<input type="file" id="docInput" accept=".pdf,.docx,.doc,.txt,.md" hidden>
|
||||||
|
<button class="btn" onclick="triggerFileSelect()">选择文件</button>
|
||||||
|
<br><br>
|
||||||
|
<button class="btn btn-secondary" onclick="hideUpload()">取消</button>
|
||||||
|
</div>
|
||||||
|
`;
|
||||||
|
|
||||||
if (overlay) overlay.classList.remove('show');
|
// Rebind event handlers
|
||||||
|
initUpload();
|
||||||
|
overlay.classList.remove('show');
|
||||||
|
|
||||||
} catch (err) {
|
} catch (err) {
|
||||||
console.error('Upload failed:', err);
|
console.error('Upload failed:', err);
|
||||||
if (overlay) {
|
|
||||||
overlay.innerHTML = `
|
overlay.innerHTML = `
|
||||||
<div style="text-align:center;">
|
<div style="text-align:center;">
|
||||||
<h2 style="color:#ff6b6b;">分析失败</h2>
|
<h2 style="color:#ff6b6b;">分析失败</h2>
|
||||||
@@ -441,6 +1116,13 @@ function initUpload() {
|
|||||||
</div>
|
</div>
|
||||||
`;
|
`;
|
||||||
}
|
}
|
||||||
}
|
|
||||||
});
|
|
||||||
}
|
}
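Review note: the `accept=".pdf,.docx,.doc,.txt,.md"` attribute only filters the file picker, and users can still choose other files. A minimal sketch of a client-side check before calling `uploadDocument` is shown below. `DOC_EXTENSIONS` and `isAcceptedDocument` are hypothetical names, not part of this diff; the extension list is taken from the `accept` attribute.

```javascript
// Hypothetical helper, not part of this diff: extensions copied from the
// accept attribute of the #docInput element.
const DOC_EXTENSIONS = ['.pdf', '.docx', '.doc', '.txt', '.md'];

// Returns true when the file name ends in one of the accepted extensions
// (case-insensitive). Names without a dot are rejected.
function isAcceptedDocument(fileName) {
  const dot = fileName.lastIndexOf('.');
  if (dot < 0) return false;
  return DOC_EXTENSIONS.includes(fileName.slice(dot).toLowerCase());
}
```

This is only a convenience for early feedback; the server still has to validate the upload.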
|
||||||
|
|
||||||
|
// Close dropdown when clicking outside
|
||||||
|
document.addEventListener('click', (e) => {
|
||||||
|
const dropdown = document.getElementById('transcriptDropdown');
|
||||||
|
const selector = document.querySelector('.transcript-selector');
|
||||||
|
if (dropdown && selector && !selector.contains(e.target)) {
|
||||||
|
dropdown.classList.remove('show');
|
||||||
|
}
|
||||||
|
});
|
||||||
|
|||||||
@@ -3,7 +3,7 @@
|
|||||||
<head>
|
<head>
|
||||||
<meta charset="UTF-8">
|
<meta charset="UTF-8">
|
||||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||||
<title>InsightFlow - 知识工作台</title>
|
<title>InsightFlow - 知识工作台 (Phase 3)</title>
|
||||||
<script src="https://d3js.org/d3.v7.min.js"></script>
|
<script src="https://d3js.org/d3.v7.min.js"></script>
|
||||||
<style>
|
<style>
|
||||||
* { margin: 0; padding: 0; box-sizing: border-box; }
|
* { margin: 0; padding: 0; box-sizing: border-box; }
|
||||||
@@ -46,10 +46,44 @@
|
|||||||
color: #888;
|
color: #888;
|
||||||
font-size: 0.9rem;
|
font-size: 0.9rem;
|
||||||
}
|
}
|
||||||
|
.header-actions {
|
||||||
|
display: flex;
|
||||||
|
gap: 10px;
|
||||||
|
}
|
||||||
.main {
|
.main {
|
||||||
display: flex;
|
display: flex;
|
||||||
height: calc(100vh - 50px);
|
height: calc(100vh - 50px);
|
||||||
}
|
}
|
||||||
|
.sidebar {
|
||||||
|
width: 60px;
|
||||||
|
background: #111;
|
||||||
|
border-right: 1px solid #222;
|
||||||
|
display: flex;
|
||||||
|
flex-direction: column;
|
||||||
|
align-items: center;
|
||||||
|
padding: 10px 0;
|
||||||
|
}
|
||||||
|
.sidebar-btn {
|
||||||
|
width: 44px;
|
||||||
|
height: 44px;
|
||||||
|
background: transparent;
|
||||||
|
border: none;
|
||||||
|
color: #666;
|
||||||
|
font-size: 1.2rem;
|
||||||
|
cursor: pointer;
|
||||||
|
border-radius: 8px;
|
||||||
|
margin-bottom: 8px;
|
||||||
|
transition: all 0.2s;
|
||||||
|
}
|
||||||
|
.sidebar-btn:hover, .sidebar-btn.active {
|
||||||
|
background: #1a1a1a;
|
||||||
|
color: #00d4ff;
|
||||||
|
}
|
||||||
|
.content-area {
|
||||||
|
flex: 1;
|
||||||
|
display: flex;
|
||||||
|
overflow: hidden;
|
||||||
|
}
|
||||||
.editor-panel {
|
.editor-panel {
|
||||||
width: 50%;
|
width: 50%;
|
||||||
border-right: 1px solid #222;
|
border-right: 1px solid #222;
|
||||||
@@ -66,6 +100,23 @@
|
|||||||
justify-content: space-between;
|
justify-content: space-between;
|
||||||
align-items: center;
|
align-items: center;
|
||||||
}
|
}
|
||||||
|
.panel-actions {
|
||||||
|
display: flex;
|
||||||
|
gap: 8px;
|
||||||
|
}
|
||||||
|
.btn-icon {
|
||||||
|
background: transparent;
|
||||||
|
border: 1px solid #333;
|
||||||
|
color: #888;
|
||||||
|
padding: 4px 10px;
|
||||||
|
border-radius: 4px;
|
||||||
|
cursor: pointer;
|
||||||
|
font-size: 0.8rem;
|
||||||
|
}
|
||||||
|
.btn-icon:hover {
|
||||||
|
border-color: #00d4ff;
|
||||||
|
color: #00d4ff;
|
||||||
|
}
|
||||||
.transcript-content {
|
.transcript-content {
|
||||||
flex: 1;
|
flex: 1;
|
||||||
padding: 20px;
|
padding: 20px;
|
||||||
@@ -92,6 +143,13 @@
|
|||||||
}
|
}
|
||||||
.segment-text {
|
.segment-text {
|
||||||
color: #e0e0e0;
|
color: #e0e0e0;
|
||||||
|
outline: none;
|
||||||
|
}
|
||||||
|
.segment-text[contenteditable="true"] {
|
||||||
|
background: #1a1a1a;
|
||||||
|
padding: 8px;
|
||||||
|
border-radius: 4px;
|
||||||
|
border: 1px solid #00d4ff;
|
||||||
}
|
}
|
||||||
.entity {
|
.entity {
|
||||||
background: rgba(123, 44, 191, 0.3);
|
background: rgba(123, 44, 191, 0.3);
|
||||||
@@ -99,10 +157,16 @@
|
|||||||
padding: 0 4px;
|
padding: 0 4px;
|
||||||
border-radius: 3px;
|
border-radius: 3px;
|
||||||
cursor: pointer;
|
cursor: pointer;
|
||||||
|
position: relative;
|
||||||
}
|
}
|
||||||
.entity:hover {
|
.entity:hover {
|
||||||
background: rgba(123, 44, 191, 0.5);
|
background: rgba(123, 44, 191, 0.5);
|
||||||
}
|
}
|
||||||
|
.entity.selected {
|
||||||
|
background: #ff6b6b;
|
||||||
|
border-color: #ff6b6b;
|
||||||
|
color: #fff;
|
||||||
|
}
|
||||||
.graph-panel {
|
.graph-panel {
|
||||||
width: 50%;
|
width: 50%;
|
||||||
display: flex;
|
display: flex;
|
||||||
@@ -127,10 +191,15 @@
|
|||||||
border-radius: 6px;
|
border-radius: 6px;
|
||||||
margin-bottom: 8px;
|
margin-bottom: 8px;
|
||||||
cursor: pointer;
|
cursor: pointer;
|
||||||
|
transition: all 0.2s;
|
||||||
}
|
}
|
||||||
.entity-item:hover {
|
.entity-item:hover {
|
||||||
background: #222;
|
background: #222;
|
||||||
}
|
}
|
||||||
|
.entity-item.selected {
|
||||||
|
background: #2a2a2a;
|
||||||
|
border-left: 3px solid #ff6b6b;
|
||||||
|
}
|
||||||
.entity-type-badge {
|
.entity-type-badge {
|
||||||
padding: 2px 8px;
|
padding: 2px 8px;
|
||||||
border-radius: 4px;
|
border-radius: 4px;
|
||||||
@@ -139,11 +208,11 @@
|
|||||||
margin-right: 12px;
|
margin-right: 12px;
|
||||||
text-transform: uppercase;
|
text-transform: uppercase;
|
||||||
}
|
}
|
||||||
.type-project { background: #7b2cbf; }
|
.type-PROJECT { background: #7b2cbf; }
|
||||||
.type-tech { background: #00d4ff; color: #000; }
|
.type-TECH { background: #00d4ff; color: #000; }
|
||||||
.type-person { background: #ff6b6b; }
|
.type-PERSON { background: #ff6b6b; }
|
||||||
.type-org { background: #4ecdc4; color: #000; }
|
.type-ORG { background: #4ecdc4; color: #000; }
|
||||||
.type-other { background: #666; }
|
.type-OTHER { background: #666; }
|
||||||
.upload-overlay {
|
.upload-overlay {
|
||||||
position: fixed;
|
position: fixed;
|
||||||
top: 0;
|
top: 0;
|
||||||
@@ -164,10 +233,29 @@
|
|||||||
border-radius: 16px;
|
border-radius: 16px;
|
||||||
padding: 60px;
|
padding: 60px;
|
||||||
text-align: center;
|
text-align: center;
|
||||||
|
max-width: 500px;
|
||||||
}
|
}
|
||||||
.upload-box:hover {
|
.upload-box:hover {
|
||||||
border-color: #00d4ff;
|
border-color: #00d4ff;
|
||||||
}
|
}
|
||||||
|
.upload-tabs {
|
||||||
|
display: flex;
|
||||||
|
gap: 10px;
|
||||||
|
margin-bottom: 20px;
|
||||||
|
justify-content: center;
|
||||||
|
}
|
||||||
|
.upload-tab {
|
||||||
|
padding: 8px 16px;
|
||||||
|
background: #1a1a1a;
|
||||||
|
border: 1px solid #333;
|
||||||
|
border-radius: 6px;
|
||||||
|
cursor: pointer;
|
||||||
|
color: #888;
|
||||||
|
}
|
||||||
|
.upload-tab.active {
|
||||||
|
border-color: #00d4ff;
|
||||||
|
color: #00d4ff;
|
||||||
|
}
|
||||||
.btn {
|
.btn {
|
||||||
background: linear-gradient(90deg, #00d4ff, #7b2cbf);
|
background: linear-gradient(90deg, #00d4ff, #7b2cbf);
|
||||||
color: white;
|
color: white;
|
||||||
@@ -186,10 +274,316 @@
|
|||||||
font-size: 0.85rem;
|
font-size: 0.85rem;
|
||||||
margin-top: 0;
|
margin-top: 0;
|
||||||
}
|
}
|
||||||
|
.btn-danger {
|
||||||
|
background: #ff6b6b;
|
||||||
|
}
|
||||||
|
.btn-secondary {
|
||||||
|
background: #333;
|
||||||
|
}
|
||||||
.empty-state {
|
.empty-state {
|
||||||
text-align: center;
|
text-align: center;
|
||||||
padding: 60px 20px;
|
padding: 60px 20px;
|
||||||
}
|
}
|
||||||
|
/* Phase 2: Entity Editor Modal */
|
||||||
|
.modal-overlay {
|
||||||
|
position: fixed;
|
||||||
|
top: 0;
|
||||||
|
left: 0;
|
||||||
|
right: 0;
|
||||||
|
bottom: 0;
|
||||||
|
background: rgba(0,0,0,0.8);
|
||||||
|
display: none;
|
||||||
|
align-items: center;
|
||||||
|
justify-content: center;
|
||||||
|
z-index: 3000;
|
||||||
|
}
|
||||||
|
.modal-overlay.show {
|
||||||
|
display: flex;
|
||||||
|
}
|
||||||
|
.modal {
|
||||||
|
background: #1a1a1a;
|
||||||
|
border-radius: 12px;
|
||||||
|
padding: 24px;
|
||||||
|
width: 90%;
|
||||||
|
max-width: 500px;
|
||||||
|
max-height: 80vh;
|
||||||
|
overflow-y: auto;
|
||||||
|
}
|
||||||
|
.modal-header {
|
||||||
|
font-size: 1.2rem;
|
||||||
|
margin-bottom: 20px;
|
||||||
|
color: #fff;
|
||||||
|
}
|
||||||
|
.form-group {
|
||||||
|
margin-bottom: 16px;
|
||||||
|
}
|
||||||
|
.form-group label {
|
||||||
|
display: block;
|
||||||
|
margin-bottom: 6px;
|
||||||
|
color: #888;
|
||||||
|
font-size: 0.85rem;
|
||||||
|
}
|
||||||
|
.form-group input,
|
||||||
|
.form-group select,
|
||||||
|
.form-group textarea {
|
||||||
|
width: 100%;
|
||||||
|
padding: 10px 12px;
|
||||||
|
background: #0a0a0a;
|
||||||
|
border: 1px solid #333;
|
||||||
|
border-radius: 6px;
|
||||||
|
color: #e0e0e0;
|
||||||
|
font-size: 0.95rem;
|
||||||
|
}
|
||||||
|
.form-group input:focus,
|
||||||
|
.form-group select:focus,
|
||||||
|
.form-group textarea:focus {
|
||||||
|
outline: none;
|
||||||
|
border-color: #00d4ff;
|
||||||
|
}
|
||||||
|
.form-group textarea {
|
||||||
|
min-height: 80px;
|
||||||
|
resize: vertical;
|
||||||
|
}
|
||||||
|
.modal-actions {
|
||||||
|
display: flex;
|
||||||
|
gap: 10px;
|
||||||
|
justify-content: flex-end;
|
||||||
|
margin-top: 20px;
|
||||||
|
}
|
||||||
|
/* Phase 2: Context Menu */
|
||||||
|
.context-menu {
|
||||||
|
position: absolute;
|
||||||
|
background: #1a1a1a;
|
||||||
|
border: 1px solid #333;
|
||||||
|
border-radius: 6px;
|
||||||
|
padding: 6px 0;
|
||||||
|
z-index: 4000;
|
||||||
|
display: none;
|
||||||
|
min-width: 160px;
|
||||||
|
}
|
||||||
|
.context-menu.show {
|
||||||
|
display: block;
|
||||||
|
}
|
||||||
|
.context-menu-item {
|
||||||
|
padding: 8px 16px;
|
||||||
|
cursor: pointer;
|
||||||
|
font-size: 0.9rem;
|
||||||
|
color: #e0e0e0;
|
||||||
|
}
|
||||||
|
.context-menu-item:hover {
|
||||||
|
background: #2a2a2a;
|
||||||
|
}
|
||||||
|
.context-menu-divider {
|
||||||
|
height: 1px;
|
||||||
|
background: #333;
|
||||||
|
margin: 6px 0;
|
||||||
|
}
|
||||||
|
/* Phase 2: Relation Editor */
|
||||||
|
.relation-editor {
|
||||||
|
margin-top: 16px;
|
||||||
|
padding-top: 16px;
|
||||||
|
border-top: 1px solid #333;
|
||||||
|
}
|
||||||
|
.relation-item {
|
||||||
|
display: flex;
|
||||||
|
align-items: center;
|
||||||
|
gap: 8px;
|
||||||
|
padding: 8px;
|
||||||
|
background: #0a0a0a;
|
||||||
|
border-radius: 4px;
|
||||||
|
margin-bottom: 8px;
|
||||||
|
font-size: 0.85rem;
|
||||||
|
}
|
||||||
|
.relation-item button {
|
||||||
|
background: transparent;
|
||||||
|
border: none;
|
||||||
|
color: #ff6b6b;
|
||||||
|
cursor: pointer;
|
||||||
|
font-size: 0.8rem;
|
||||||
|
}
|
||||||
|
/* Phase 2: Selection toolbar */
|
||||||
|
.selection-toolbar {
|
||||||
|
position: fixed;
|
||||||
|
bottom: 20px;
|
||||||
|
left: 50%;
|
||||||
|
transform: translateX(-50%);
|
||||||
|
background: #1a1a1a;
|
||||||
|
border: 1px solid #333;
|
||||||
|
border-radius: 8px;
|
||||||
|
padding: 10px 20px;
|
||||||
|
display: none;
|
||||||
|
gap: 10px;
|
||||||
|
z-index: 3500;
|
||||||
|
box-shadow: 0 4px 20px rgba(0,0,0,0.5);
|
||||||
|
}
|
||||||
|
.selection-toolbar.show {
|
||||||
|
display: flex;
|
||||||
|
}
|
||||||
|
/* Graph node styles */
|
||||||
|
.node-circle {
|
||||||
|
cursor: pointer;
|
||||||
|
}
|
||||||
|
.node-label {
|
||||||
|
pointer-events: none;
|
||||||
|
}
|
||||||
|
/* Phase 3: Knowledge Base Panel */
|
||||||
|
.kb-panel {
|
||||||
|
width: 100%;
|
||||||
|
height: 100%;
|
||||||
|
display: none;
|
||||||
|
flex-direction: column;
|
||||||
|
background: #0a0a0a;
|
||||||
|
}
|
||||||
|
.kb-panel.show {
|
||||||
|
display: flex;
|
||||||
|
}
|
||||||
|
.kb-header {
|
||||||
|
padding: 16px 20px;
|
||||||
|
background: #141414;
|
||||||
|
border-bottom: 1px solid #222;
|
||||||
|
display: flex;
|
||||||
|
justify-content: space-between;
|
||||||
|
align-items: center;
|
||||||
|
}
|
||||||
|
.kb-stats {
|
||||||
|
display: flex;
|
||||||
|
gap: 24px;
|
||||||
|
}
|
||||||
|
.kb-stat {
|
||||||
|
text-align: center;
|
||||||
|
}
|
||||||
|
.kb-stat-value {
|
||||||
|
font-size: 1.5rem;
|
||||||
|
font-weight: 600;
|
||||||
|
color: #00d4ff;
|
||||||
|
}
|
||||||
|
.kb-stat-label {
|
||||||
|
font-size: 0.75rem;
|
||||||
|
color: #666;
|
||||||
|
}
|
||||||
|
.kb-content {
|
||||||
|
flex: 1;
|
||||||
|
display: flex;
|
||||||
|
overflow: hidden;
|
||||||
|
}
|
||||||
|
.kb-sidebar {
|
||||||
|
width: 200px;
|
||||||
|
background: #111;
|
||||||
|
border-right: 1px solid #222;
|
||||||
|
padding: 16px 0;
|
||||||
|
}
|
||||||
|
.kb-nav-item {
|
||||||
|
padding: 12px 20px;
|
||||||
|
cursor: pointer;
|
||||||
|
color: #888;
|
||||||
|
border-left: 3px solid transparent;
|
||||||
|
}
|
||||||
|
.kb-nav-item:hover {
|
||||||
|
background: #1a1a1a;
|
||||||
|
color: #e0e0e0;
|
||||||
|
}
|
||||||
|
.kb-nav-item.active {
|
||||||
|
background: #1a1a1a;
|
||||||
|
color: #00d4ff;
|
||||||
|
border-left-color: #00d4ff;
|
||||||
|
}
|
||||||
|
.kb-main {
|
||||||
|
flex: 1;
|
||||||
|
padding: 20px;
|
||||||
|
overflow-y: auto;
|
||||||
|
}
|
||||||
|
.kb-section {
|
||||||
|
display: none;
|
||||||
|
}
|
||||||
|
.kb-section.active {
|
||||||
|
display: block;
|
||||||
|
}
|
||||||
|
.kb-entity-grid {
|
||||||
|
display: grid;
|
||||||
|
grid-template-columns: repeat(auto-fill, minmax(280px, 1fr));
|
||||||
|
gap: 16px;
|
||||||
|
}
|
||||||
|
.kb-entity-card {
|
||||||
|
background: #141414;
|
||||||
|
border: 1px solid #222;
|
||||||
|
border-radius: 8px;
|
||||||
|
padding: 16px;
|
||||||
|
cursor: pointer;
|
||||||
|
transition: all 0.2s;
|
||||||
|
}
|
||||||
|
.kb-entity-card:hover {
|
||||||
|
border-color: #00d4ff;
|
||||||
|
}
|
||||||
|
.kb-entity-name {
|
||||||
|
font-weight: 600;
|
||||||
|
margin-bottom: 4px;
|
||||||
|
}
|
||||||
|
.kb-entity-def {
|
||||||
|
font-size: 0.85rem;
|
||||||
|
color: #888;
|
||||||
|
margin-bottom: 8px;
|
||||||
|
}
|
||||||
|
.kb-entity-meta {
|
||||||
|
font-size: 0.75rem;
|
||||||
|
color: #666;
|
||||||
|
}
|
||||||
|
.kb-glossary-item {
|
||||||
|
display: flex;
|
||||||
|
justify-content: space-between;
|
||||||
|
align-items: center;
|
||||||
|
padding: 12px 16px;
|
||||||
|
background: #141414;
|
||||||
|
border-radius: 6px;
|
||||||
|
margin-bottom: 8px;
|
||||||
|
}
|
||||||
|
.kb-transcript-item {
|
||||||
|
padding: 12px 16px;
|
||||||
|
background: #141414;
|
||||||
|
border-radius: 6px;
|
||||||
|
margin-bottom: 8px;
|
||||||
|
display: flex;
|
||||||
|
justify-content: space-between;
|
||||||
|
align-items: center;
|
||||||
|
}
|
||||||
|
.file-type-icon {
|
||||||
|
padding: 4px 8px;
|
||||||
|
border-radius: 4px;
|
||||||
|
font-size: 0.7rem;
|
||||||
|
font-weight: 600;
|
||||||
|
}
|
||||||
|
.type-audio { background: #7b2cbf; }
|
||||||
|
.type-document { background: #00d4ff; color: #000; }
|
||||||
|
/* Transcript selector */
|
||||||
|
.transcript-selector {
|
||||||
|
position: relative;
|
||||||
|
}
|
||||||
|
.transcript-dropdown {
|
||||||
|
position: absolute;
|
||||||
|
top: 100%;
|
||||||
|
right: 0;
|
||||||
|
background: #1a1a1a;
|
||||||
|
border: 1px solid #333;
|
||||||
|
border-radius: 8px;
|
||||||
|
min-width: 200px;
|
||||||
|
max-height: 300px;
|
||||||
|
overflow-y: auto;
|
||||||
|
display: none;
|
||||||
|
z-index: 100;
|
||||||
|
}
|
||||||
|
.transcript-dropdown.show {
|
||||||
|
display: block;
|
||||||
|
}
|
||||||
|
.transcript-option {
|
||||||
|
padding: 10px 16px;
|
||||||
|
cursor: pointer;
|
||||||
|
border-bottom: 1px solid #222;
|
||||||
|
}
|
||||||
|
.transcript-option:hover {
|
||||||
|
background: #2a2a2a;
|
||||||
|
}
|
||||||
|
.transcript-option.active {
|
||||||
|
background: #00d4ff22;
|
||||||
|
}
|
||||||
</style>
|
</style>
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
@@ -198,19 +592,40 @@
|
|||||||
<a href="/" class="back-link">← 返回项目列表</a>
|
<a href="/" class="back-link">← 返回项目列表</a>
|
||||||
<span class="project-name" id="projectName">加载中...</span>
|
<span class="project-name" id="projectName">加载中...</span>
|
||||||
</div>
|
</div>
|
||||||
<button class="btn btn-small" onclick="showUpload()">+ 上传音频</button>
|
<div class="header-actions">
|
||||||
|
<button class="btn btn-small" onclick="showUpload()">+ 上传文件</button>
|
||||||
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<div class="main">
|
<div class="main">
|
||||||
|
<!-- Sidebar -->
|
||||||
|
<div class="sidebar">
|
||||||
|
<button class="sidebar-btn active" onclick="switchView('workbench')" title="工作台">📝</button>
|
||||||
|
<button class="sidebar-btn" onclick="switchView('knowledge-base')" title="知识库">📚</button>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Content Area -->
|
||||||
|
<div class="content-area">
|
||||||
|
<!-- Workbench View -->
|
||||||
|
<div id="workbenchView" class="workbench-view" style="display: flex; width: 100%;">
|
||||||
<div class="editor-panel">
|
<div class="editor-panel">
|
||||||
<div class="panel-header">
|
<div class="panel-header">
|
||||||
|
<div style="display: flex; align-items: center; gap: 12px;">
|
||||||
<span>📄 Transcript</span>
|
<span>📄 Transcript</span>
|
||||||
<span style="font-size:0.8rem;color:#666;">Click an entity to highlight it</span>
|
<div class="transcript-selector">
|
||||||
|
<button class="btn-icon" onclick="toggleTranscriptDropdown()">📁 Select file</button>
|
||||||
|
<div class="transcript-dropdown" id="transcriptDropdown"></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="panel-actions">
|
||||||
|
<button class="btn-icon" onclick="toggleEditMode()" id="editBtn">✏️ Edit</button>
|
||||||
|
<button class="btn-icon" onclick="saveTranscript()" id="saveBtn" style="display:none;">💾 Save</button>
|
||||||
|
</div>
|
||||||
</div>
|
</div>
|
||||||
<div class="transcript-content" id="transcriptContent">
|
<div class="transcript-content" id="transcriptContent">
|
||||||
<div class="empty-state">
|
<div class="empty-state">
|
||||||
<p style="color:#666;">No transcript yet</p>
|
<p style="color:#666;">No transcript yet</p>
|
||||||
<button class="btn" onclick="showUpload()">Upload audio</button>
|
<button class="btn" onclick="showUpload()">Upload audio or document</button>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
@@ -218,7 +633,7 @@
|
|||||||
<div class="graph-panel">
|
<div class="graph-panel">
|
||||||
<div class="panel-header">
|
<div class="panel-header">
|
||||||
<span>🔗 Knowledge graph</span>
|
<span>🔗 Knowledge graph</span>
|
||||||
<span style="font-size:0.8rem;color:#666;">Drag nodes to explore relations</span>
|
<span style="font-size:0.8rem;color:#666;">Right-click a node to edit | Drag to create a relation</span>
|
||||||
</div>
|
</div>
|
||||||
<svg id="graph-svg"></svg>
|
<svg id="graph-svg"></svg>
|
||||||
<div class="entity-list" id="entityList">
|
<div class="entity-list" id="entityList">
|
||||||
@@ -228,15 +643,203 @@
|
|||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
|
<!-- Knowledge Base View -->
|
||||||
|
<div id="knowledgeBaseView" class="kb-panel">
|
||||||
|
<div class="kb-header">
|
||||||
|
<h2>📚 Project knowledge base</h2>
|
||||||
|
<div class="kb-stats">
|
||||||
|
<div class="kb-stat">
|
||||||
|
<div class="kb-stat-value" id="kbEntityCount">0</div>
|
||||||
|
<div class="kb-stat-label">Entities</div>
|
||||||
|
</div>
|
||||||
|
<div class="kb-stat">
|
||||||
|
<div class="kb-stat-value" id="kbRelationCount">0</div>
|
||||||
|
<div class="kb-stat-label">Relations</div>
|
||||||
|
</div>
|
||||||
|
<div class="kb-stat">
|
||||||
|
<div class="kb-stat-value" id="kbTranscriptCount">0</div>
|
||||||
|
<div class="kb-stat-label">Files</div>
|
||||||
|
</div>
|
||||||
|
<div class="kb-stat">
|
||||||
|
<div class="kb-stat-value" id="kbGlossaryCount">0</div>
|
||||||
|
<div class="kb-stat-label">Terms</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="kb-content">
|
||||||
|
<div class="kb-sidebar">
|
||||||
|
<div class="kb-nav-item active" onclick="switchKBTab('entities')">🏷️ Entities</div>
|
||||||
|
<div class="kb-nav-item" onclick="switchKBTab('relations')">🔗 Relations</div>
|
||||||
|
<div class="kb-nav-item" onclick="switchKBTab('glossary')">📖 Glossary</div>
|
||||||
|
<div class="kb-nav-item" onclick="switchKBTab('transcripts')">📁 Files</div>
|
||||||
|
</div>
|
||||||
|
<div class="kb-main">
|
||||||
|
<!-- Entities Section -->
|
||||||
|
<div class="kb-section active" id="kbEntitiesSection">
|
||||||
|
<h3 style="margin-bottom:16px;">All entities</h3>
|
||||||
|
<div class="kb-entity-grid" id="kbEntityGrid"></div>
|
||||||
|
</div>
|
||||||
|
<!-- Relations Section -->
|
||||||
|
<div class="kb-section" id="kbRelationsSection">
|
||||||
|
<h3 style="margin-bottom:16px;">All relations</h3>
|
||||||
|
<div id="kbRelationsList"></div>
|
||||||
|
</div>
|
||||||
|
<!-- Glossary Section -->
|
||||||
|
<div class="kb-section" id="kbGlossarySection">
|
||||||
|
<div style="display:flex;justify-content:space-between;align-items:center;margin-bottom:16px;">
|
||||||
|
<h3>Glossary</h3>
|
||||||
|
<button class="btn btn-small" onclick="showAddTermModal()">+ Add term</button>
|
||||||
|
</div>
|
||||||
|
<div id="kbGlossaryList"></div>
|
||||||
|
</div>
|
||||||
|
<!-- Transcripts Section -->
|
||||||
|
<div class="kb-section" id="kbTranscriptsSection">
|
||||||
|
<h3 style="margin-bottom:16px;">All files</h3>
|
||||||
|
<div id="kbTranscriptsList"></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Upload Modal -->
|
||||||
<div class="upload-overlay" id="uploadOverlay">
|
<div class="upload-overlay" id="uploadOverlay">
|
||||||
<div class="upload-box">
|
<div class="upload-box">
|
||||||
<h2 style="margin-bottom:10px;">Upload audio for analysis</h2>
|
<h2 style="margin-bottom:10px;">Upload file</h2>
|
||||||
<p style="color:#666;">Supports MP3, WAV, M4A (max 500MB)</p>
|
<div class="upload-tabs">
|
||||||
<input type="file" id="fileInput" accept="audio/*" hidden>
|
<div class="upload-tab active" onclick="switchUploadTab('audio')">🎵 Audio</div>
|
||||||
<button class="btn" onclick="document.getElementById('fileInput').click()">Select file</button>
|
<div class="upload-tab" onclick="switchUploadTab('document')">📄 Document</div>
|
||||||
<br><br>
|
|
||||||
<button class="btn" style="background:#333;" onclick="hideUpload()">Cancel</button>
|
|
||||||
</div>
|
</div>
|
||||||
|
<p style="color:#666;" id="uploadHint">Supports MP3, WAV, M4A (max 500MB)</p>
|
||||||
|
<input type="file" id="fileInput" accept="audio/*" hidden>
|
||||||
|
<input type="file" id="docInput" accept=".pdf,.docx,.doc,.txt,.md" hidden>
|
||||||
|
<button class="btn" onclick="triggerFileSelect()">Select file</button>
|
||||||
|
<br><br>
|
||||||
|
<button class="btn btn-secondary" onclick="hideUpload()">Cancel</button>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Entity Editor Modal -->
|
||||||
|
<div class="modal-overlay" id="entityModal">
|
||||||
|
<div class="modal">
|
||||||
|
<h3 class="modal-header">Edit entity</h3>
|
||||||
|
<div class="form-group">
|
||||||
|
<label>Entity name</label>
|
||||||
|
<input type="text" id="entityName" placeholder="Entity name">
|
||||||
|
</div>
|
||||||
|
<div class="form-group">
|
||||||
|
<label>Entity type</label>
|
||||||
|
<select id="entityType">
|
||||||
|
<option value="PROJECT">Project (PROJECT)</option>
|
||||||
|
<option value="TECH">Technology (TECH)</option>
|
||||||
|
<option value="PERSON">Person (PERSON)</option>
|
||||||
|
<option value="ORG">Organization (ORG)</option>
|
||||||
|
<option value="OTHER">Other (OTHER)</option>
|
||||||
|
</select>
|
||||||
|
</div>
|
||||||
|
<div class="form-group">
|
||||||
|
<label>Definition</label>
|
||||||
|
<textarea id="entityDefinition" placeholder="Describe this entity in one sentence..."></textarea>
|
||||||
|
</div>
|
||||||
|
<div class="form-group">
|
||||||
|
<label>Aliases (comma-separated)</label>
|
||||||
|
<input type="text" id="entityAliases" placeholder="Alias 1, Alias 2, Alias 3">
|
||||||
|
</div>
|
||||||
|
<div class="relation-editor" id="relationEditor" style="display:none;">
|
||||||
|
<h4 style="color:#888;font-size:0.9rem;margin-bottom:12px;">Entity relations</h4>
|
||||||
|
<div id="relationList"></div>
|
||||||
|
<button class="btn-icon" onclick="showAddRelation()" style="margin-top:8px;">+ Add relation</button>
|
||||||
|
</div>
|
||||||
|
<div class="modal-actions">
|
||||||
|
<button class="btn btn-danger" onclick="deleteEntity()" id="deleteEntityBtn">Delete</button>
|
||||||
|
<button class="btn btn-secondary" onclick="hideEntityModal()">Cancel</button>
|
||||||
|
<button class="btn" onclick="saveEntity()">Save</button>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Add Relation Modal -->
|
||||||
|
<div class="modal-overlay" id="relationModal">
|
||||||
|
<div class="modal">
|
||||||
|
<h3 class="modal-header">Add relation</h3>
|
||||||
|
<div class="form-group">
|
||||||
|
<label>Target entity</label>
|
||||||
|
<select id="relationTarget"></select>
|
||||||
|
</div>
|
||||||
|
<div class="form-group">
|
||||||
|
<label>Relation type</label>
|
||||||
|
<select id="relationType">
|
||||||
|
<option value="belongs_to">Belongs to (belongs_to)</option>
|
||||||
|
<option value="works_with">Works with (works_with)</option>
|
||||||
|
<option value="depends_on">Depends on (depends_on)</option>
|
||||||
|
<option value="mentions">Mentions (mentions)</option>
|
||||||
|
<option value="related">Related (related)</option>
|
||||||
|
</select>
|
||||||
|
</div>
|
||||||
|
<div class="form-group">
|
||||||
|
<label>Evidence / notes</label>
|
||||||
|
<textarea id="relationEvidence" placeholder="Describe the evidence for this relation..."></textarea>
|
||||||
|
</div>
|
||||||
|
<div class="modal-actions">
|
||||||
|
<button class="btn btn-secondary" onclick="hideRelationModal()">Cancel</button>
|
||||||
|
<button class="btn" onclick="saveRelation()">Add</button>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Merge Entities Modal -->
|
||||||
|
<div class="modal-overlay" id="mergeModal">
|
||||||
|
<div class="modal">
|
||||||
|
<h3 class="modal-header">Merge entities</h3>
|
||||||
|
<p style="color:#888;margin-bottom:16px;font-size:0.9rem;">Merge the selected entity into the target entity</p>
|
||||||
|
<div class="form-group">
|
||||||
|
<label>Source entity</label>
|
||||||
|
<input type="text" id="mergeSource" disabled>
|
||||||
|
</div>
|
||||||
|
<div class="form-group">
|
||||||
|
<label>Target entity (kept)</label>
|
||||||
|
<select id="mergeTarget"></select>
|
||||||
|
</div>
|
||||||
|
<div class="modal-actions">
|
||||||
|
<button class="btn btn-secondary" onclick="hideMergeModal()">Cancel</button>
|
||||||
|
<button class="btn" onclick="confirmMerge()">Merge</button>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Add Glossary Term Modal -->
|
||||||
|
<div class="modal-overlay" id="glossaryModal">
|
||||||
|
<div class="modal">
|
||||||
|
<h3 class="modal-header">Add term</h3>
|
||||||
|
<div class="form-group">
|
||||||
|
<label>Term</label>
|
||||||
|
<input type="text" id="glossaryTerm" placeholder="Term name">
|
||||||
|
</div>
|
||||||
|
<div class="form-group">
|
||||||
|
<label>Pronunciation hint (optional)</label>
|
||||||
|
<input type="text" id="glossaryPronunciation" placeholder="e.g. K8s is pronounced Kubernetes">
|
||||||
|
</div>
|
||||||
|
<div class="modal-actions">
|
||||||
|
<button class="btn btn-secondary" onclick="hideGlossaryModal()">Cancel</button>
|
||||||
|
<button class="btn" onclick="saveGlossaryTerm()">Add</button>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Context Menu -->
|
||||||
|
<div class="context-menu" id="contextMenu">
|
||||||
|
<div class="context-menu-item" onclick="editEntity()">✏️ Edit entity</div>
|
||||||
|
<div class="context-menu-item" onclick="showMergeModal()">🔄 Merge entities</div>
|
||||||
|
<div class="context-menu-divider"></div>
|
||||||
|
<div class="context-menu-item" onclick="createEntityFromSelection()">➕ Mark as entity</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Selection Toolbar -->
|
||||||
|
<div class="selection-toolbar" id="selectionToolbar">
|
||||||
|
<span style="color:#888;font-size:0.85rem;">Selected text:</span>
|
||||||
|
<button class="btn-icon" onclick="createEntityFromSelection()">Mark as entity</button>
|
||||||
|
<button class="btn-icon" onclick="hideSelectionToolbar()">Cancel</button>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<script src="app.js"></script>
|
<script src="app.js"></script>
|
||||||
|
|||||||