Phase 3: Memory & Growth - Multi-file fusion, Entity alignment with embedding, Document import, Knowledge base panel

OpenClaw Bot
2026-02-18 12:12:39 +08:00
parent 643fe46780
commit da8a4db985
11 changed files with 1842 additions and 167 deletions


@@ -1,29 +1,33 @@
# InsightFlow - Audio to Knowledge Graph Platform
# Phase 3: Memory & Growth
FROM python:3.11-slim FROM python:3.11-slim
WORKDIR /app WORKDIR /app
# Install uv # Install system dependencies
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
# Install system deps
RUN apt-get update && apt-get install -y \ RUN apt-get update && apt-get install -y \
ffmpeg \ gcc \
git \ libpq-dev \
&& rm -rf /var/lib/apt/lists/* && rm -rf /var/lib/apt/lists/*
# Copy project files # Copy backend requirements
COPY backend/pyproject.toml backend/uv.lock ./ COPY backend/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Install dependencies using uv sync # Copy application code
RUN uv sync --frozen --no-install-project
# Copy code
COPY backend/ ./backend/ COPY backend/ ./backend/
COPY frontend/ ./frontend/ COPY frontend/ ./frontend/
# Install project # Create data directory
RUN uv sync --frozen RUN mkdir -p /app/data
# Set environment variables
ENV PYTHONPATH=/app
ENV DB_PATH=/app/data/insightflow.db
# Expose port
EXPOSE 8000 EXPOSE 8000
CMD ["uv", "run", "python", "backend/main.py"] # Run the application
CMD ["python", "-m", "uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]

README.md

@@ -1,27 +1,88 @@
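# InsightFlow # InsightFlow - Audio to Knowledge Graph Platform
A domain knowledge building platform for audio and documents ## Phase 3: Memory & Growth - Completed ✅
## Product Positioning ### New Features
Turns meeting recordings and documents into a structured knowledge graph, with knowledge growing continuously through a human-in-the-loop workflow.
## Core Features #### 1. Multi-file Graph Fusion ✅
- 🎙️ ASR speech recognition + hotword injection - Upload multiple audio files to the same project
- 🧠 LLM entity extraction and explanation - Entities are aligned automatically and graphs are merged
- 🔗 Linked dual views (document view + graph view) - Entity mentions tracked across files
- 📈 Knowledge growth (multi-file entity alignment) - File selector for switching between transcripts
## Tech Stack #### 2. Entity Alignment Algorithm Improvements ✅
- Frontend: Next.js + Tailwind - New `entity_aligner.py` module
- Backend: Node.js / Python - Semantic similarity matching using Kimi API embeddings
- Database: MySQL + Neo4j - Cosine similarity computation
- ASR: Whisper - Automatic alias suggestions
- LLM: OpenAI / Kimi - Batch entity alignment API
## Development Phases #### 3. PDF/DOCX Document Import ✅
- [ ] Phase 1: Skeleton and single-file analysis (MVP) - New `document_processor.py` module
- [ ] Phase 2: Interaction and correction workbench - Supports PDF, DOCX, TXT, and MD formats
- [ ] Phase 3: Memory and growth - Document text is extracted and fed into entity extraction
- Document type tagging (audio/document)
## Documentation #### 4. Project Knowledge Base Panel ✅
- [PRD v2.0](docs/PRD-v2.0.md) - Brand-new knowledge base view
- Stats panel: entity, relation, file, and term counts
- Entity grid (with mention counts)
- Relation list
- Glossary management (add/delete)
- File list
### Tech Stack
- Backend: FastAPI + SQLite
- Frontend: vanilla HTML/JS + D3.js
- ASR: Alibaba Cloud Tingwu
- LLM: Kimi API
- Document processing: PyPDF2, python-docx
### Deployment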
```bash
# Build the Docker image
docker build -t insightflow:phase3 .
# Run the container
docker run -d \
-p 18000:8000 \
-v /opt/data:/app/data \
-e KIMI_API_KEY=your_key \
-e ALIYUN_ACCESS_KEY_ID=your_key \
-e ALIYUN_ACCESS_KEY_SECRET=your_secret \
insightflow:phase3
```
### API Documentation
#### New APIs
**Document upload**
```
POST /api/v1/projects/{project_id}/upload-document
Content-Type: multipart/form-data
file: <file>
```
**Knowledge base query**
```
GET /api/v1/projects/{project_id}/knowledge-base
```
**Glossary management**
```
POST /api/v1/projects/{project_id}/glossary
GET /api/v1/projects/{project_id}/glossary
DELETE /api/v1/glossary/{term_id}
```
**Entity alignment**
```
POST /api/v1/projects/{project_id}/align-entities?threshold=0.85
```
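As a hedged illustration of the endpoints listed above, the sketch below builds (but does not send) requests for the glossary and entity-alignment routes using only Python's standard library. The base URL `http://localhost:18000` is an assumption taken from the port mapping in the deployment example, and the helper names `glossary_request` and `align_request` are hypothetical, not part of the project.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the container is reachable on the host port used in the docker run example.
BASE_URL = "http://localhost:18000"


def glossary_request(project_id: str, term: str, pronunciation: str = "") -> urllib.request.Request:
    """Build a POST request that adds a term to a project's glossary."""
    body = json.dumps({"term": term, "pronunciation": pronunciation}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/projects/{urllib.parse.quote(project_id)}/glossary",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def align_request(project_id: str, threshold: float = 0.85) -> urllib.request.Request:
    """Build a POST request that triggers entity alignment with a similarity threshold."""
    query = urllib.parse.urlencode({"threshold": threshold})
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/projects/{urllib.parse.quote(project_id)}/align-entities?{query}",
        method="POST",
    )
```

Passing either object to `urllib.request.urlopen` would send it. The JSON body mirrors the `term` and `pronunciation` fields of the `GlossaryTermCreate` model in this commit.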
### Database Schema Updates
- New `type` column on the `transcripts` table (audio/document)
- New `embedding` column on the `entities` table
- New indexes to speed up queries
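For a database created before this phase, the schema updates listed above would have to be applied to existing tables. The following is a minimal sketch of one way to do that idempotently with `sqlite3`. The helper names are hypothetical, and the `TEXT` type for `embedding` is an assumption, since the commit does not show that column's definition.

```python
import sqlite3


def add_column_if_missing(conn: sqlite3.Connection, table: str, column: str, ddl: str) -> bool:
    """Add a column unless it already exists; return True if the table was altered."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")
    return True


def migrate_phase3(conn: sqlite3.Connection) -> None:
    """Apply the Phase 3 column additions; safe to run more than once."""
    add_column_if_missing(conn, "transcripts", "type", "TEXT DEFAULT 'audio'")
    # Assumption: embeddings are stored as serialized text (for example JSON).
    add_column_if_missing(conn, "entities", "embedding", "TEXT")
    conn.commit()
```

Existing transcript rows pick up the `'audio'` default, which matches the default used by `save_transcript` in this commit.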


@@ -4,7 +4,7 @@
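## Current Phase ## Current Phase
Phase 2: Interaction and Correction Workbench - **Completed ✅** Phase 3: Memory and Growth - **Completed ✅**
## Completed ## Completed
@@ -64,22 +64,70 @@ Phase 2: Interaction and Correction Workbench - **Completed ✅**
- ✅ update_relation() - update a relation - ✅ update_relation() - update a relation
- ✅ update_transcript() - update transcript text - ✅ update_transcript() - update transcript text
## Phase 3 Plan (Memory and Growth) - **Starting Soon** ### Phase 3: Memory and Growth ✅
- Multi-file graph fusion #### Multi-file Graph Fusion
- Entity alignment algorithm improvements - ✅ Upload multiple audio files to the same project
- PDF/DOCX document import - ✅ Entities are aligned automatically and graphs are merged
- Project knowledge base panel - ✅ Entity mentions tracked across files
- ✅ File selector for switching between transcripts
- ✅ Transcript list API returns the file type
#### Entity Alignment Algorithm Improvements
- ✅ New `entity_aligner.py` module
- ✅ Semantic similarity matching using Kimi API embeddings
- ✅ Cosine similarity computation
- ✅ Automatic alias suggestions
- ✅ Batch entity alignment API
- ✅ Fallback for entity alignment (string matching)
#### PDF/DOCX Document Import
- ✅ New `document_processor.py` module
- ✅ Supports PDF, DOCX, TXT, and MD formats
- ✅ Document text is extracted and fed into entity extraction
- ✅ Document upload API (/api/v1/projects/{id}/upload-document)
- ✅ Document type tagging (audio/document)
#### Project Knowledge Base Panel
- ✅ Brand-new knowledge base view
- ✅ Sidebar navigation toggle (workbench / knowledge base)
- ✅ Stats panel: entity, relation, file, and term counts
- ✅ Entity grid (with mention counts)
- ✅ Relation list
- ✅ Glossary management (add/delete)
- ✅ File list (audio and documents distinguished)
#### Glossary
- ✅ Glossary database table (glossary)
- ✅ Add-term API
- ✅ List-terms API
- ✅ Delete-term API
- ✅ Frontend glossary management UI
#### Database Updates
- ✅ New `type` column on the transcripts table
- ✅ New `embedding` column on the entities table
- ✅ New glossary table
- ✅ New indexes to speed up queries
## Technical Debt ## Technical Debt
- Tingwu SDK fallback to mock needs better error handling - Tingwu SDK fallback to mock needs better error handling
- Entity similarity matching is currently plain substring containment and needs an embedding-based approach
- Frontend needs state management (currently uses global variables) - Frontend needs state management (currently uses global variables)
- API documentation is needed (OpenAPI/Swagger) - API documentation is needed (OpenAPI/Swagger)
- Embedding cache needs to be persisted
- Entity alignment algorithm needs more tests
## Deployment ## Deployment
- Server: 122.51.127.111 - Server: 122.51.127.111
- Project path: /opt/projects/insightflow - Project path: /opt/projects/insightflow
- Port: 18000 - Port: 18000
- Docker image: insightflow:phase3
## Next Steps (Phase 4)
- Knowledge reasoning and Q&A
- Entity attribute extension
- Timeline view
- Export (PDF/image)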
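The technical-debt list above notes that the embedding cache still needs to be persisted. A minimal sketch of one possible approach follows, assuming a small SQLite table keyed by a SHA-256 digest of the input text. The class name `PersistentEmbeddingCache` and the table layout are hypothetical and do not appear in this commit.

```python
import hashlib
import json
import sqlite3
from typing import List, Optional


class PersistentEmbeddingCache:
    """Store embedding vectors in SQLite so they survive process restarts."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS embedding_cache ("
            "text_hash TEXT PRIMARY KEY, vector TEXT NOT NULL)"
        )
        self.conn.commit()

    @staticmethod
    def _key(text: str) -> str:
        # A stable digest; Python's built-in hash() changes between runs for strings.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get(self, text: str) -> Optional[List[float]]:
        """Return the cached vector for the text, or None if absent."""
        row = self.conn.execute(
            "SELECT vector FROM embedding_cache WHERE text_hash = ?", (self._key(text),)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def put(self, text: str, vector: List[float]) -> None:
        """Insert or overwrite the vector for the text."""
        self.conn.execute(
            "INSERT OR REPLACE INTO embedding_cache (text_hash, vector) VALUES (?, ?)",
            (self._key(text), json.dumps(vector)),
        )
        self.conn.commit()
```

Pointing `path` at a file under the mounted `/app/data` volume would keep the cache across container restarts, though that placement is only a suggestion.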


@@ -1,7 +1,8 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
""" """
InsightFlow Database Manager InsightFlow Database Manager - Phase 3
Persists projects, entities, and relations Persists projects, entities, and relations
Supports document types and multi-file fusion
""" """
import os import os
@@ -166,6 +167,18 @@ class DatabaseManager:
(target_id, source_id) (target_id, source_id)
) )
# Repoint relations in which the merged (source) entity is the source_entity_id
conn.execute(
"UPDATE entity_relations SET source_entity_id = ? WHERE source_entity_id = ?",
(target_id, source_id)
)
# Repoint relations in which the merged (source) entity is the target_entity_id
conn.execute(
"UPDATE entity_relations SET target_entity_id = ? WHERE target_entity_id = ?",
(target_id, source_id)
)
# 删除源实体 # 删除源实体
conn.execute("DELETE FROM entities WHERE id = ?", (source_id,)) conn.execute("DELETE FROM entities WHERE id = ?", (source_id,))
@@ -222,13 +235,13 @@ class DatabaseManager:
return [EntityMention(**dict(r)) for r in rows] return [EntityMention(**dict(r)) for r in rows]
# Transcript operations # Transcript operations
def save_transcript(self, transcript_id: str, project_id: str, filename: str, full_text: str): def save_transcript(self, transcript_id: str, project_id: str, filename: str, full_text: str, transcript_type: str = "audio"):
"""保存转录记录""" """保存转录记录"""
conn = self.get_conn() conn = self.get_conn()
now = datetime.now().isoformat() now = datetime.now().isoformat()
conn.execute( conn.execute(
"INSERT INTO transcripts (id, project_id, filename, full_text, created_at) VALUES (?, ?, ?, ?, ?)", "INSERT INTO transcripts (id, project_id, filename, full_text, type, created_at) VALUES (?, ?, ?, ?, ?, ?)",
(transcript_id, project_id, filename, full_text, now) (transcript_id, project_id, filename, full_text, transcript_type, now)
) )
conn.commit() conn.commit()
conn.close() conn.close()
@@ -389,6 +402,58 @@ class DatabaseManager:
return dict(row) if row else None return dict(row) if row else None
# Phase 3: Glossary operations
def add_glossary_term(self, project_id: str, term: str, pronunciation: str = "") -> str:
"""添加术语到术语表"""
conn = self.get_conn()
# 检查是否已存在
existing = conn.execute(
"SELECT * FROM glossary WHERE project_id = ? AND term = ?",
(project_id, term)
).fetchone()
if existing:
# 更新频率
conn.execute(
"UPDATE glossary SET frequency = frequency + 1 WHERE id = ?",
(existing['id'],)
)
conn.commit()
conn.close()
return existing['id']
term_id = str(uuid.uuid4())[:8]
conn.execute(
"INSERT INTO glossary (id, project_id, term, pronunciation, frequency) VALUES (?, ?, ?, ?, ?)",
(term_id, project_id, term, pronunciation, 1)
)
conn.commit()
conn.close()
return term_id
def list_glossary(self, project_id: str) -> List[dict]:
"""列出项目术语表"""
conn = self.get_conn()
rows = conn.execute(
"SELECT * FROM glossary WHERE project_id = ? ORDER BY frequency DESC",
(project_id,)
).fetchall()
conn.close()
return [dict(r) for r in rows]
def delete_glossary_term(self, term_id: str):
"""删除术语"""
conn = self.get_conn()
conn.execute("DELETE FROM glossary WHERE id = ?", (term_id,))
conn.commit()
conn.close()
# Phase 3: Get all entities for embedding
def get_all_entities_for_embedding(self, project_id: str) -> List[Entity]:
"""获取所有实体用于 embedding 计算"""
return self.list_project_entities(project_id)
# Singleton instance # Singleton instance
_db_manager = None _db_manager = None
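The glossary insert above performs a select followed by an update or insert. As a hedged alternative, the sketch below expresses the same "insert or bump frequency" behaviour as a single SQLite upsert. It assumes a `UNIQUE (project_id, term)` constraint, which this commit does not show, and SQLite 3.24 or newer. The table definition and the function name `upsert_term` are illustrative only.

```python
import sqlite3
import uuid

# Assumed schema: the columns match the INSERT in add_glossary_term,
# while the UNIQUE constraint is an addition required by ON CONFLICT.
SCHEMA = """
CREATE TABLE IF NOT EXISTS glossary (
    id TEXT PRIMARY KEY,
    project_id TEXT NOT NULL,
    term TEXT NOT NULL,
    pronunciation TEXT DEFAULT '',
    frequency INTEGER DEFAULT 1,
    UNIQUE (project_id, term)
)
"""


def upsert_term(conn: sqlite3.Connection, project_id: str, term: str, pronunciation: str = "") -> str:
    """Insert a term, or increment its frequency if it already exists; return its id."""
    conn.execute(
        "INSERT INTO glossary (id, project_id, term, pronunciation, frequency) "
        "VALUES (?, ?, ?, ?, 1) "
        "ON CONFLICT(project_id, term) DO UPDATE SET frequency = frequency + 1",
        (str(uuid.uuid4())[:8], project_id, term, pronunciation),
    )
    conn.commit()
    row = conn.execute(
        "SELECT id FROM glossary WHERE project_id = ? AND term = ?", (project_id, term)
    ).fetchone()
    return row[0]
```

Doing the check and the write in one statement avoids a race between two concurrent requests that add the same term.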


@@ -0,0 +1,180 @@
#!/usr/bin/env python3
"""
Document Processor - Phase 3
Imports PDF and DOCX documents
"""
import os
import io
from typing import Dict, Optional
class DocumentProcessor:
"""文档处理器 - 提取 PDF/DOCX 文本"""
def __init__(self):
self.supported_formats = {
'.pdf': self._extract_pdf,
'.docx': self._extract_docx,
'.doc': self._extract_docx,  # caution: python-docx reads only .docx; legacy binary .doc files will fail in _extract_docx
'.txt': self._extract_txt,
'.md': self._extract_txt,
}
def process(self, content: bytes, filename: str) -> Dict[str, str]:
"""
处理文档并提取文本
Args:
content: 文件二进制内容
filename: 文件名
Returns:
{"text": "提取的文本内容", "format": "文件格式"}
"""
ext = os.path.splitext(filename.lower())[1]
if ext not in self.supported_formats:
raise ValueError(f"Unsupported file format: {ext}. Supported: {list(self.supported_formats.keys())}")
extractor = self.supported_formats[ext]
text = extractor(content)
# 清理文本
text = self._clean_text(text)
return {
"text": text,
"format": ext,
"filename": filename
}
def _extract_pdf(self, content: bytes) -> str:
"""提取 PDF 文本"""
try:
import PyPDF2
pdf_file = io.BytesIO(content)
reader = PyPDF2.PdfReader(pdf_file)
text_parts = []
for page in reader.pages:
page_text = page.extract_text()
if page_text:
text_parts.append(page_text)
return "\n\n".join(text_parts)
except ImportError:
# Fallback: 尝试使用 pdfplumber
try:
import pdfplumber
text_parts = []
with pdfplumber.open(io.BytesIO(content)) as pdf:
for page in pdf.pages:
page_text = page.extract_text()
if page_text:
text_parts.append(page_text)
return "\n\n".join(text_parts)
except ImportError:
raise ImportError("PDF processing requires PyPDF2 or pdfplumber. Install with: pip install PyPDF2")
except Exception as e:
raise ValueError(f"PDF extraction failed: {str(e)}")
def _extract_docx(self, content: bytes) -> str:
"""提取 DOCX 文本"""
try:
import docx
doc_file = io.BytesIO(content)
doc = docx.Document(doc_file)
text_parts = []
for para in doc.paragraphs:
if para.text.strip():
text_parts.append(para.text)
# 提取表格中的文本
for table in doc.tables:
for row in table.rows:
row_text = []
for cell in row.cells:
if cell.text.strip():
row_text.append(cell.text.strip())
if row_text:
text_parts.append(" | ".join(row_text))
return "\n\n".join(text_parts)
except ImportError:
raise ImportError("DOCX processing requires python-docx. Install with: pip install python-docx")
except Exception as e:
raise ValueError(f"DOCX extraction failed: {str(e)}")
def _extract_txt(self, content: bytes) -> str:
"""提取纯文本"""
# 尝试多种编码
encodings = ['utf-8', 'gbk', 'gb2312', 'latin-1']
for encoding in encodings:
try:
return content.decode(encoding)
except UnicodeDecodeError:
continue
# 如果都失败了,使用 latin-1 并忽略错误
return content.decode('latin-1', errors='ignore')
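    def _clean_text(self, text: str) -> str:
        """Clean extracted text while keeping paragraph breaks."""
        if not text:
            return ""
        # Normalise each line and drop the empty ones
        lines = text.split('\n')
        cleaned_lines = []
        for line in lines:
            # Collapse runs of spaces and tabs inside the line only
            line = ' '.join(line.split())
            if line:
                cleaned_lines.append(line)
        # Join with blank lines so the paragraph separators survive
        # (collapsing whitespace over the whole text would erase them)
        text = '\n\n'.join(cleaned_lines)
        # Remove control characters, keeping newlines and tabs
        text = ''.join(char for char in text if ord(char) >= 32 or char in '\n\r\t')
        return text.strip()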
def is_supported(self, filename: str) -> bool:
"""检查文件格式是否支持"""
ext = os.path.splitext(filename.lower())[1]
return ext in self.supported_formats
# 简单的文本提取器(不需要外部依赖)
class SimpleTextExtractor:
"""简单的文本提取器,用于测试"""
def extract(self, content: bytes, filename: str) -> str:
"""尝试提取文本"""
encodings = ['utf-8', 'gbk', 'latin-1']
for encoding in encodings:
try:
return content.decode(encoding)
except UnicodeDecodeError:
continue
return content.decode('latin-1', errors='ignore')
if __name__ == "__main__":
# 测试
processor = DocumentProcessor()
# 测试文本提取
test_text = "Hello World\n\nThis is a test document.\n\nMultiple paragraphs."
result = processor.process(test_text.encode('utf-8'), "test.txt")
print(f"Text extraction test: {len(result['text'])} chars")
print(result['text'][:100])
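The cleaning step in this module is described as preserving paragraph structure. The standalone sketch below shows one way to do that: split on blank lines, collapse whitespace inside each paragraph, and rejoin with a single blank line. The function name `clean_text` and the choice to treat blank lines as the only paragraph boundary are assumptions for illustration.

```python
import re


def clean_text(text: str) -> str:
    """Collapse whitespace within paragraphs while keeping blank-line paragraph breaks."""
    if not text:
        return ""
    # Drop control characters (including carriage returns) but keep newlines and tabs.
    text = "".join(ch for ch in text if ord(ch) >= 32 or ch in "\n\t")
    paragraphs = []
    # A paragraph boundary is one or more blank (or whitespace-only) lines.
    for block in re.split(r"\n\s*\n", text):
        collapsed = " ".join(block.split())
        if collapsed:
            paragraphs.append(collapsed)
    return "\n\n".join(paragraphs)
```

Single newlines inside a paragraph, which PDF extraction often produces at visual line wraps, are folded into spaces under this scheme.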

backend/entity_aligner.py Normal file

@@ -0,0 +1,372 @@
#!/usr/bin/env python3
"""
Entity Aligner - Phase 3
Entity alignment using embeddings
"""
import os
import json
import httpx
import numpy as np
from typing import List, Optional, Dict
from dataclasses import dataclass
# API Keys
KIMI_API_KEY = os.getenv("KIMI_API_KEY", "")
KIMI_BASE_URL = os.getenv("KIMI_BASE_URL", "https://api.kimi.com/coding")
@dataclass
class EntityEmbedding:
entity_id: str
name: str
definition: str
embedding: List[float]
class EntityAligner:
"""实体对齐器 - 使用 embedding 进行相似度匹配"""
def __init__(self, similarity_threshold: float = 0.85):
self.similarity_threshold = similarity_threshold
self.embedding_cache: Dict[str, List[float]] = {}
    def get_embedding(self, text: str) -> Optional[List[float]]:
        """
        Get the embedding for a text via the Kimi API.

        Args:
            text: input text
        Returns:
            the embedding vector, or None if it could not be obtained
        """
        if not KIMI_API_KEY:
            return None
        # Check the cache. The text itself is the key: it matches the
        # Dict[str, List[float]] annotation and, unlike hash(text), cannot collide.
        cache_key = text
        if cache_key in self.embedding_cache:
            return self.embedding_cache[cache_key]
        try:
            response = httpx.post(
                f"{KIMI_BASE_URL}/v1/embeddings",
                headers={"Authorization": f"Bearer {KIMI_API_KEY}", "Content-Type": "application/json"},
                json={
                    "model": "k2p5",
                    "input": text[:500]  # cap the input length
                },
                timeout=30.0
            )
            response.raise_for_status()
            result = response.json()
            embedding = result["data"][0]["embedding"]
            self.embedding_cache[cache_key] = embedding
            return embedding
        except Exception as e:
            print(f"Embedding API failed: {e}")
            return None
def compute_similarity(self, embedding1: List[float], embedding2: List[float]) -> float:
"""
计算两个 embedding 的余弦相似度
Args:
embedding1: 第一个向量
embedding2: 第二个向量
Returns:
相似度分数 (0-1)
"""
vec1 = np.array(embedding1)
vec2 = np.array(embedding2)
# 余弦相似度
dot_product = np.dot(vec1, vec2)
norm1 = np.linalg.norm(vec1)
norm2 = np.linalg.norm(vec2)
if norm1 == 0 or norm2 == 0:
return 0.0
return float(dot_product / (norm1 * norm2))
def get_entity_text(self, name: str, definition: str = "") -> str:
"""
构建用于 embedding 的实体文本
Args:
name: 实体名称
definition: 实体定义
Returns:
组合文本
"""
if definition:
return f"{name}: {definition}"
return name
def find_similar_entity(
self,
project_id: str,
name: str,
definition: str = "",
exclude_id: Optional[str] = None,
threshold: Optional[float] = None
) -> Optional[object]:
"""
查找相似的实体
Args:
project_id: 项目 ID
name: 实体名称
definition: 实体定义
exclude_id: 要排除的实体 ID
threshold: 相似度阈值
Returns:
相似的实体或 None
"""
if threshold is None:
threshold = self.similarity_threshold
try:
from db_manager import get_db_manager
db = get_db_manager()
except ImportError:
return None
# 获取项目的所有实体
entities = db.get_all_entities_for_embedding(project_id)
if not entities:
return None
# 获取查询实体的 embedding
query_text = self.get_entity_text(name, definition)
query_embedding = self.get_embedding(query_text)
if query_embedding is None:
# 如果 embedding API 失败,回退到简单匹配
return self._fallback_similarity_match(entities, name, exclude_id)
best_match = None
best_score = threshold
for entity in entities:
if exclude_id and entity.id == exclude_id:
continue
# 获取实体的 embedding
entity_text = self.get_entity_text(entity.name, entity.definition)
entity_embedding = self.get_embedding(entity_text)
if entity_embedding is None:
continue
# 计算相似度
similarity = self.compute_similarity(query_embedding, entity_embedding)
if similarity > best_score:
best_score = similarity
best_match = entity
return best_match
def _fallback_similarity_match(
self,
entities: List[object],
name: str,
exclude_id: Optional[str] = None
) -> Optional[object]:
"""
Fall back to simple similarity matching (no embedding)
Args:
entities: 实体列表
name: 查询名称
exclude_id: 要排除的实体 ID
Returns:
最相似的实体或 None
"""
name_lower = name.lower()
# 1. 精确匹配
for entity in entities:
if exclude_id and entity.id == exclude_id:
continue
if entity.name.lower() == name_lower:
return entity
if entity.aliases and name_lower in [a.lower() for a in entity.aliases]:
return entity
# 2. 包含匹配
for entity in entities:
if exclude_id and entity.id == exclude_id:
continue
if name_lower in entity.name.lower() or entity.name.lower() in name_lower:
return entity
return None
def batch_align_entities(
self,
project_id: str,
new_entities: List[Dict],
threshold: Optional[float] = None
) -> List[Dict]:
"""
批量对齐实体
Args:
project_id: 项目 ID
new_entities: 新实体列表 [{"name": "...", "definition": "..."}]
threshold: 相似度阈值
Returns:
对齐结果列表 [{"new_entity": {...}, "matched_entity": {...}, "similarity": 0.9}]
"""
if threshold is None:
threshold = self.similarity_threshold
results = []
for new_ent in new_entities:
matched = self.find_similar_entity(
project_id,
new_ent["name"],
new_ent.get("definition", ""),
threshold=threshold
)
result = {
"new_entity": new_ent,
"matched_entity": None,
"similarity": 0.0,
"should_merge": False
}
if matched:
# 计算相似度
query_text = self.get_entity_text(new_ent["name"], new_ent.get("definition", ""))
matched_text = self.get_entity_text(matched.name, matched.definition)
query_emb = self.get_embedding(query_text)
matched_emb = self.get_embedding(matched_text)
if query_emb and matched_emb:
similarity = self.compute_similarity(query_emb, matched_emb)
result["matched_entity"] = {
"id": matched.id,
"name": matched.name,
"type": matched.type,
"definition": matched.definition
}
result["similarity"] = similarity
result["should_merge"] = similarity >= threshold
results.append(result)
return results
    def suggest_entity_aliases(self, entity_name: str, entity_definition: str = "") -> List[str]:
        """
        Ask the LLM to suggest aliases for an entity.

        Args:
            entity_name: entity name
            entity_definition: entity definition
        Returns:
            list of suggested aliases (empty on failure)
        """
        if not KIMI_API_KEY:
            return []
        prompt = f"""为以下实体生成可能的别名或简称:
实体名称:{entity_name}
定义:{entity_definition}
请返回 JSON 格式的别名列表:
{{"aliases": ["别名1", "别名2", "别名3"]}}
只返回 JSON,不要其他内容。"""
        try:
            response = httpx.post(
                f"{KIMI_BASE_URL}/v1/chat/completions",
                headers={"Authorization": f"Bearer {KIMI_API_KEY}", "Content-Type": "application/json"},
                json={
                    "model": "k2p5",
                    "messages": [{"role": "user", "content": prompt}],
                    "temperature": 0.3
                },
                timeout=30.0
            )
            response.raise_for_status()
            result = response.json()
            content = result["choices"][0]["message"]["content"]
            import re
            # Pull the JSON object out of the reply. This is a plain raw string, not an
            # f-string, so braces are escaped once; the doubled form only matched "{{...}}".
            json_match = re.search(r'\{.*\}', content, re.DOTALL)
            if json_match:
                data = json.loads(json_match.group())
                return data.get("aliases", [])
        except Exception as e:
            print(f"Alias suggestion failed: {e}")
        return []
# Simple string similarity (no embedding)
def simple_similarity(str1: str, str2: str) -> float:
"""
计算两个字符串的简单相似度
Args:
str1: 第一个字符串
str2: 第二个字符串
Returns:
相似度分数 (0-1)
"""
if str1 == str2:
return 1.0
if not str1 or not str2:
return 0.0
# 转换为小写
s1 = str1.lower()
s2 = str2.lower()
# 包含关系
if s1 in s2 or s2 in s1:
return 0.8
# Otherwise use difflib's SequenceMatcher ratio (a matching-subsequence ratio, not a true edit distance)
from difflib import SequenceMatcher
return SequenceMatcher(None, s1, s2).ratio()
if __name__ == "__main__":
# 测试
aligner = EntityAligner()
# 测试 embedding
test_text = "Kubernetes 容器编排平台"
embedding = aligner.get_embedding(test_text)
if embedding:
print(f"Embedding dimension: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")
else:
print("Embedding API not available")
# 测试相似度计算
emb1 = [1.0, 0.0, 0.0]
emb2 = [0.9, 0.1, 0.0]
sim = aligner.compute_similarity(emb1, emb2)
print(f"Similarity: {sim:.4f}")
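The matching logic in this module reduces to two steps: compute cosine similarity between a query vector and each candidate, then keep the best candidate that clears the threshold. The sketch below restates that idea in pure Python without NumPy or any API call. The names `cosine_similarity` and `best_match` are hypothetical, and returning the score together with the match is a design choice of this sketch, not of the module.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(
    query: List[float],
    candidates: Dict[str, List[float]],
    threshold: float = 0.85,
) -> Optional[Tuple[str, float]]:
    """Return (entity_id, score) for the most similar candidate at or above the threshold."""
    best: Optional[Tuple[str, float]] = None
    for entity_id, vector in candidates.items():
        score = cosine_similarity(query, vector)
        if score >= threshold and (best is None or score > best[1]):
            best = (entity_id, score)
    return best
```

Returning the score alongside the match would let a caller such as a batch aligner report similarity without recomputing embeddings for the matched pair.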


@@ -1,7 +1,7 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
""" """
InsightFlow Backend - Phase 3 (Production Ready) InsightFlow Backend - Phase 3 (Memory & Growth)
Knowledge Growth: Multi-file fusion + Entity Alignment Knowledge Growth: Multi-file fusion + Entity Alignment + Document Import
ASR: 阿里云听悟 + OSS ASR: 阿里云听悟 + OSS
""" """
@@ -9,6 +9,7 @@ import os
import json import json
import httpx import httpx
import uuid import uuid
import re
from fastapi import FastAPI, File, UploadFile, HTTPException, Form from fastapi import FastAPI, File, UploadFile, HTTPException, Form
from fastapi.middleware.cors import CORSMiddleware from fastapi.middleware.cors import CORSMiddleware
from fastapi.staticfiles import StaticFiles from fastapi.staticfiles import StaticFiles
@@ -35,6 +36,18 @@ try:
except ImportError: except ImportError:
DB_AVAILABLE = False DB_AVAILABLE = False
try:
from document_processor import DocumentProcessor
DOC_PROCESSOR_AVAILABLE = True
except ImportError:
DOC_PROCESSOR_AVAILABLE = False
try:
from entity_aligner import EntityAligner
ALIGNER_AVAILABLE = True
except ImportError:
ALIGNER_AVAILABLE = False
app = FastAPI(title="InsightFlow", version="0.3.0") app = FastAPI(title="InsightFlow", version="0.3.0")
app.add_middleware( app.add_middleware(
@@ -90,9 +103,29 @@ class EntityMergeRequest(BaseModel):
source_entity_id: str source_entity_id: str
target_entity_id: str target_entity_id: str
class GlossaryTermCreate(BaseModel):
term: str
pronunciation: Optional[str] = ""
# API Keys # API Keys
KIMI_API_KEY = os.getenv("KIMI_API_KEY", "") KIMI_API_KEY = os.getenv("KIMI_API_KEY", "")
KIMI_BASE_URL = "https://api.kimi.com/coding" KIMI_BASE_URL = os.getenv("KIMI_BASE_URL", "https://api.kimi.com/coding")
# Phase 3: Entity Aligner singleton
_aligner = None
def get_aligner():
global _aligner
if _aligner is None and ALIGNER_AVAILABLE:
_aligner = EntityAligner()
return _aligner
# Phase 3: Document Processor singleton
_doc_processor = None
def get_doc_processor():
global _doc_processor
if _doc_processor is None and DOC_PROCESSOR_AVAILABLE:
_doc_processor = DocumentProcessor()
return _doc_processor
# Phase 2: Entity Edit API # Phase 2: Entity Edit API
@app.put("/api/v1/entities/{entity_id}") @app.put("/api/v1/entities/{entity_id}")
@@ -406,12 +439,21 @@ def extract_entities_with_llm(text: str) -> tuple[List[dict], List[dict]]:
return [], [] return [], []
def align_entity(project_id: str, name: str, db) -> Optional[Entity]: def align_entity(project_id: str, name: str, db, definition: str = "") -> Optional[Entity]:
"""实体对齐""" """实体对齐 - Phase 3: 使用 embedding 对齐"""
# 1. 首先尝试精确匹配
existing = db.get_entity_by_name(project_id, name) existing = db.get_entity_by_name(project_id, name)
if existing: if existing:
return existing return existing
# 2. 使用 embedding 对齐(如果可用)
aligner = get_aligner()
if aligner:
similar = aligner.find_similar_entity(project_id, name, definition)
if similar:
return similar
# 3. 回退到简单相似度匹配
similar = db.find_similar_entities(project_id, name) similar = db.find_similar_entities(project_id, name)
if similar: if similar:
return similar[0] return similar[0]
@@ -443,7 +485,7 @@ async def list_projects():
@app.post("/api/v1/projects/{project_id}/upload", response_model=AnalysisResult) @app.post("/api/v1/projects/{project_id}/upload", response_model=AnalysisResult)
async def upload_audio(project_id: str, file: UploadFile = File(...)): async def upload_audio(project_id: str, file: UploadFile = File(...)):
"""上传音频到指定项目""" """上传音频到指定项目 - Phase 3: 支持多文件融合"""
if not DB_AVAILABLE: if not DB_AVAILABLE:
raise HTTPException(status_code=500, detail="Database not available") raise HTTPException(status_code=500, detail="Database not available")
@@ -471,12 +513,12 @@ async def upload_audio(project_id: str, file: UploadFile = File(...)):
full_text=tw_result["full_text"] full_text=tw_result["full_text"]
) )
# 实体对齐并保存 # 实体对齐并保存 - Phase 3: 使用增强对齐
aligned_entities = [] aligned_entities = []
entity_name_to_id = {} # 用于关系映射 entity_name_to_id = {} # 用于关系映射
for raw_ent in raw_entities: for raw_ent in raw_entities:
existing = align_entity(project_id, raw_ent["name"], db) existing = align_entity(project_id, raw_ent["name"], db, raw_ent.get("definition", ""))
if existing: if existing:
ent_model = EntityModel( ent_model = EntityModel(
@@ -551,6 +593,302 @@ async def upload_audio(project_id: str, file: UploadFile = File(...)):
created_at=datetime.now().isoformat() created_at=datetime.now().isoformat()
) )
# Phase 3: Document Upload API
@app.post("/api/v1/projects/{project_id}/upload-document")
async def upload_document(project_id: str, file: UploadFile = File(...)):
"""上传 PDF/DOCX 文档到指定项目"""
if not DB_AVAILABLE:
raise HTTPException(status_code=500, detail="Database not available")
if not DOC_PROCESSOR_AVAILABLE:
raise HTTPException(status_code=500, detail="Document processor not available")
db = get_db_manager()
project = db.get_project(project_id)
if not project:
raise HTTPException(status_code=404, detail="Project not found")
content = await file.read()
# 处理文档
processor = get_doc_processor()
try:
result = processor.process(content, file.filename)
except Exception as e:
raise HTTPException(status_code=400, detail=f"Document processing failed: {str(e)}")
# 保存文档转录记录
transcript_id = str(uuid.uuid4())[:8]
db.save_transcript(
transcript_id=transcript_id,
project_id=project_id,
filename=file.filename,
full_text=result["text"],
transcript_type="document"
)
# 提取实体和关系
raw_entities, raw_relations = extract_entities_with_llm(result["text"])
# 实体对齐并保存
aligned_entities = []
entity_name_to_id = {}
for raw_ent in raw_entities:
existing = align_entity(project_id, raw_ent["name"], db, raw_ent.get("definition", ""))
if existing:
entity_name_to_id[raw_ent["name"]] = existing.id
aligned_entities.append(EntityModel(
id=existing.id,
name=existing.name,
type=existing.type,
definition=existing.definition,
aliases=existing.aliases
))
else:
new_ent = db.create_entity(Entity(
id=str(uuid.uuid4())[:8],
project_id=project_id,
name=raw_ent["name"],
type=raw_ent.get("type", "OTHER"),
definition=raw_ent.get("definition", "")
))
entity_name_to_id[raw_ent["name"]] = new_ent.id
aligned_entities.append(EntityModel(
id=new_ent.id,
name=new_ent.name,
type=new_ent.type,
definition=new_ent.definition
))
# 保存实体提及位置
full_text = result["text"]
name = raw_ent["name"]
start_pos = 0
while True:
pos = full_text.find(name, start_pos)
if pos == -1:
break
mention = EntityMention(
id=str(uuid.uuid4())[:8],
entity_id=entity_name_to_id[name],
transcript_id=transcript_id,
start_pos=pos,
end_pos=pos + len(name),
text_snippet=full_text[max(0, pos-20):min(len(full_text), pos+len(name)+20)],
confidence=1.0
)
db.add_mention(mention)
start_pos = pos + 1
# 保存关系
for rel in raw_relations:
source_id = entity_name_to_id.get(rel.get("source", ""))
target_id = entity_name_to_id.get(rel.get("target", ""))
if source_id and target_id:
db.create_relation(
project_id=project_id,
source_entity_id=source_id,
target_entity_id=target_id,
relation_type=rel.get("type", "related"),
evidence=result["text"][:200],
transcript_id=transcript_id
)
return {
"transcript_id": transcript_id,
"project_id": project_id,
"filename": file.filename,
"text_length": len(result["text"]),
"entities": [e.dict() for e in aligned_entities],
"created_at": datetime.now().isoformat()
}
# Phase 3: Knowledge Base API
@app.get("/api/v1/projects/{project_id}/knowledge-base")
async def get_knowledge_base(project_id: str):
"""获取项目知识库 - 包含所有实体、关系、术语表"""
if not DB_AVAILABLE:
raise HTTPException(status_code=500, detail="Database not available")
db = get_db_manager()
project = db.get_project(project_id)
if not project:
raise HTTPException(status_code=404, detail="Project not found")
# 获取所有实体
entities = db.list_project_entities(project_id)
# 获取所有关系
relations = db.list_project_relations(project_id)
# 获取所有转录
transcripts = db.list_project_transcripts(project_id)
# 获取术语表
glossary = db.list_glossary(project_id)
# 构建实体统计
entity_stats = {}
for ent in entities:
mentions = db.get_entity_mentions(ent.id)
entity_stats[ent.id] = {
"mention_count": len(mentions),
"transcript_ids": list(set([m.transcript_id for m in mentions]))
}
# 构建实体名称映射
entity_map = {e.id: e.name for e in entities}
return {
"project": {
"id": project.id,
"name": project.name,
"description": project.description
},
"stats": {
"entity_count": len(entities),
"relation_count": len(relations),
"transcript_count": len(transcripts),
"glossary_count": len(glossary)
},
"entities": [
{
"id": e.id,
"name": e.name,
"type": e.type,
"definition": e.definition,
"aliases": e.aliases,
"mention_count": entity_stats.get(e.id, {}).get("mention_count", 0),
"appears_in": entity_stats.get(e.id, {}).get("transcript_ids", [])
}
for e in entities
],
"relations": [
{
"id": r["id"],
"source_id": r["source_entity_id"],
"source_name": entity_map.get(r["source_entity_id"], "Unknown"),
"target_id": r["target_entity_id"],
"target_name": entity_map.get(r["target_entity_id"], "Unknown"),
"type": r["relation_type"],
"evidence": r["evidence"]
}
for r in relations
],
"glossary": [
{
"id": g["id"],
"term": g["term"],
"pronunciation": g["pronunciation"],
"frequency": g["frequency"]
}
for g in glossary
],
"transcripts": [
{
"id": t["id"],
"filename": t["filename"],
"type": t.get("type", "audio"),
"created_at": t["created_at"]
}
for t in transcripts
]
}
# Phase 3: Glossary API
@app.post("/api/v1/projects/{project_id}/glossary")
async def add_glossary_term(project_id: str, term: GlossaryTermCreate):
"""添加术语到项目术语表"""
if not DB_AVAILABLE:
raise HTTPException(status_code=500, detail="Database not available")
db = get_db_manager()
project = db.get_project(project_id)
if not project:
raise HTTPException(status_code=404, detail="Project not found")
term_id = db.add_glossary_term(
project_id=project_id,
term=term.term,
pronunciation=term.pronunciation
)
return {
"id": term_id,
"term": term.term,
"pronunciation": term.pronunciation,
"success": True
}
@app.get("/api/v1/projects/{project_id}/glossary")
async def get_glossary(project_id: str):
"""获取项目术语表"""
if not DB_AVAILABLE:
raise HTTPException(status_code=500, detail="Database not available")
db = get_db_manager()
glossary = db.list_glossary(project_id)
return glossary
@app.delete("/api/v1/glossary/{term_id}")
async def delete_glossary_term(term_id: str):
"""删除术语"""
if not DB_AVAILABLE:
raise HTTPException(status_code=500, detail="Database not available")
db = get_db_manager()
db.delete_glossary_term(term_id)
return {"success": True}
# Phase 3: Entity Alignment API
@app.post("/api/v1/projects/{project_id}/align-entities")
async def align_project_entities(project_id: str, threshold: float = 0.85):
"""运行实体对齐算法,合并相似实体"""
if not DB_AVAILABLE:
raise HTTPException(status_code=500, detail="Database not available")
aligner = get_aligner()
if not aligner:
raise HTTPException(status_code=500, detail="Entity aligner not available")
db = get_db_manager()
entities = db.list_project_entities(project_id)
merged_count = 0
merged_pairs = []
# 使用 embedding 对齐
for i, entity in enumerate(entities):
# 跳过已合并的实体
existing = db.get_entity(entity.id)
if not existing:
continue
similar = aligner.find_similar_entity(
project_id,
entity.name,
entity.definition,
exclude_id=entity.id,
threshold=threshold
)
if similar:
# 合并实体
db.merge_entities(similar.id, entity.id)
merged_count += 1
merged_pairs.append({
"source": entity.name,
"target": similar.name
})
return {
"success": True,
"merged_count": merged_count,
"merged_pairs": merged_pairs
}
@app.get("/api/v1/projects/{project_id}/entities") @app.get("/api/v1/projects/{project_id}/entities")
async def get_project_entities(project_id: str): async def get_project_entities(project_id: str):
"""获取项目的全局实体列表""" """获取项目的全局实体列表"""
@@ -559,7 +897,7 @@ async def get_project_entities(project_id: str):
db = get_db_manager() db = get_db_manager()
entities = db.list_project_entities(project_id) entities = db.list_project_entities(project_id)
return [{"id": e.id, "name": e.name, "type": e.type, "definition": e.definition} for e in entities] return [{"id": e.id, "name": e.name, "type": e.type, "definition": e.definition, "aliases": e.aliases} for e in entities]
@app.get("/api/v1/projects/{project_id}/relations") @app.get("/api/v1/projects/{project_id}/relations")
@@ -597,6 +935,7 @@ async def get_project_transcripts(project_id: str):
return [{ return [{
"id": t["id"], "id": t["id"],
"filename": t["filename"], "filename": t["filename"],
"type": t.get("type", "audio"),
"created_at": t["created_at"], "created_at": t["created_at"],
"preview": t["full_text"][:100] + "..." if len(t["full_text"]) > 100 else t["full_text"] "preview": t["full_text"][:100] + "..." if len(t["full_text"]) > 100 else t["full_text"]
} for t in transcripts] } for t in transcripts]
@@ -619,42 +958,18 @@ async def get_entity_mentions(entity_id: str):
"confidence": m.confidence "confidence": m.confidence
} for m in mentions] } for m in mentions]
@app.post("/api/v1/entities/{entity_id}/merge")
async def merge_entities_endpoint(entity_id: str, merge_req: EntityMergeRequest):
    """Merge two entities"""
    if not DB_AVAILABLE:
        raise HTTPException(status_code=500, detail="Database not available")
    db = get_db_manager()
    # Verify that both entities exist
    source = db.get_entity(merge_req.source_entity_id)
    target = db.get_entity(merge_req.target_entity_id)
    if not source or not target:
        raise HTTPException(status_code=404, detail="Entity not found")
    result = db.merge_entities(merge_req.target_entity_id, merge_req.source_entity_id)
    return {
        "success": True,
        "merged_entity": {
            "id": result.id,
            "name": result.name,
            "type": result.type,
            "definition": result.definition,
            "aliases": result.aliases
        }
    }
# Health check # Health check
@app.get("/health") @app.get("/health")
async def health_check(): async def health_check():
return { return {
"status": "ok", "status": "ok",
"version": "0.3.0", "version": "0.3.0",
"phase": "Phase 3 - Memory & Growth",
"oss_available": OSS_AVAILABLE, "oss_available": OSS_AVAILABLE,
"tingwu_available": TINGWU_AVAILABLE, "tingwu_available": TINGWU_AVAILABLE,
"db_available": DB_AVAILABLE "db_available": DB_AVAILABLE,
"doc_processor_available": DOC_PROCESSOR_AVAILABLE,
"aligner_available": ALIGNER_AVAILABLE
} }
# Serve frontend # Serve frontend
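The health check reports flags such as `DOC_PROCESSOR_AVAILABLE` and `ALIGNER_AVAILABLE`, but how they are set lies outside this hunk. One common way to derive such flags is sketched below, assuming each optional feature corresponds to an importable module. The `module_available` helper is hypothetical, and the module names are stand-ins rather than the project's own.

```python
import importlib.util


def module_available(name: str) -> bool:
    """True if a module named `name` can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised for a missing parent package or a module with no spec.
        return False


# Stand-in module names; a real service would probe its own optional modules.
flags = {
    "json_available": module_available("json"),
    "missing_available": module_available("surely_not_a_real_module_xyz"),
}
```

Probing with `find_spec` avoids executing the optional module at startup. A plain `try: import ... except ImportError` block is an equally valid choice when the module is needed immediately afterwards.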

24
backend/requirements.txt Normal file

@@ -0,0 +1,24 @@
# InsightFlow Backend Dependencies
# Web Framework
fastapi==0.109.0
uvicorn[standard]==0.27.0
python-multipart==0.0.6
# HTTP Client
httpx==0.26.0
# Document Processing
PyPDF2==3.0.1
python-docx==1.1.0
# Data Processing
numpy==1.26.3
# Aliyun SDK
aliyun-python-sdk-core==2.14.0
aliyun-python-sdk-oss==2.18.5
oss2==2.18.5
# Utilities
python-dotenv==1.0.0


@@ -16,7 +16,9 @@ CREATE TABLE IF NOT EXISTS transcripts (
project_id TEXT NOT NULL, project_id TEXT NOT NULL,
filename TEXT, filename TEXT,
full_text TEXT, full_text TEXT,
type TEXT DEFAULT 'audio', -- 'audio' or 'document'
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (project_id) REFERENCES projects(id) FOREIGN KEY (project_id) REFERENCES projects(id)
); );
@@ -29,6 +31,7 @@ CREATE TABLE IF NOT EXISTS entities (
type TEXT, type TEXT,
definition TEXT, definition TEXT,
aliases TEXT, -- JSON array: ["alias1", "alias2"] aliases TEXT, -- JSON array: ["alias1", "alias2"]
embedding TEXT, -- JSON array: embedding of the entity name + definition
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (project_id) REFERENCES projects(id) FOREIGN KEY (project_id) REFERENCES projects(id)
@@ -71,3 +74,12 @@ CREATE TABLE IF NOT EXISTS glossary (
frequency INTEGER DEFAULT 1, frequency INTEGER DEFAULT 1,
FOREIGN KEY (project_id) REFERENCES projects(id) FOREIGN KEY (project_id) REFERENCES projects(id)
); );
-- Create indexes to improve query performance
CREATE INDEX IF NOT EXISTS idx_entities_project ON entities(project_id);
CREATE INDEX IF NOT EXISTS idx_entities_name ON entities(name);
CREATE INDEX IF NOT EXISTS idx_transcripts_project ON transcripts(project_id);
CREATE INDEX IF NOT EXISTS idx_mentions_entity ON entity_mentions(entity_id);
CREATE INDEX IF NOT EXISTS idx_mentions_transcript ON entity_mentions(transcript_id);
CREATE INDEX IF NOT EXISTS idx_relations_project ON entity_relations(project_id);
CREATE INDEX IF NOT EXISTS idx_glossary_project ON glossary(project_id);
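The schema stores each entity's `embedding` as a JSON array in a `TEXT` column. The round trip could look roughly like the sketch below, which assumes Python's built-in `sqlite3` and `json` modules and uses a deliberately reduced `entities` table. The sample row and vector are made up for illustration.

```python
import json
import sqlite3

# In-memory database with a simplified version of the entities table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE entities (
        id TEXT PRIMARY KEY,
        project_id TEXT NOT NULL,
        name TEXT,
        embedding TEXT  -- JSON array of floats
    )
    """
)
conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_entities_project ON entities(project_id)"
)

# Serialize the vector to JSON text before inserting (hypothetical values).
vector = [0.12, -0.5, 0.33]
conn.execute(
    "INSERT INTO entities (id, project_id, name, embedding) VALUES (?, ?, ?, ?)",
    ("e1", "p1", "Kubernetes", json.dumps(vector)),
)

# Read it back and decode; a NULL embedding is treated as "not computed yet".
row = conn.execute(
    "SELECT embedding FROM entities WHERE project_id = ? AND name = ?",
    ("p1", "Kubernetes"),
).fetchone()
loaded = json.loads(row[0]) if row and row[0] else None
```

Storing vectors as JSON text keeps the schema portable and easy to inspect, at the cost of decoding every candidate row on each similarity search. That trade-off seems reasonable for small projects but may need revisiting as the entity count grows.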


@@ -1,4 +1,5 @@
// InsightFlow Frontend - Phase 2 (Interactive Workbench) // InsightFlow Frontend - Phase 3 (Memory & Growth)
// Knowledge Growth: Multi-file fusion + Entity Alignment + Document Import
const API_BASE = '/api/v1'; const API_BASE = '/api/v1';
let currentProject = null; let currentProject = null;
@@ -7,8 +8,11 @@ let selectedEntity = null;
let projectRelations = []; let projectRelations = [];
let projectEntities = []; let projectEntities = [];
let currentTranscript = null; let currentTranscript = null;
let projectTranscripts = [];
let editMode = false; let editMode = false;
let contextMenuTarget = null; let contextMenuTarget = null;
let currentUploadTab = 'audio';
let knowledgeBaseData = null;
// Init // Init
document.addEventListener('DOMContentLoaded', () => { document.addEventListener('DOMContentLoaded', () => {
@@ -70,6 +74,49 @@ async function uploadAudio(file) {
return await res.json(); return await res.json();
} }
// Phase 3: Document Upload API
async function uploadDocument(file) {
const formData = new FormData();
formData.append('file', file);
const res = await fetch(`${API_BASE}/projects/${currentProject.id}/upload-document`, {
method: 'POST',
body: formData
});
if (!res.ok) {
const error = await res.json();
throw new Error(error.detail || 'Document upload failed');
}
return await res.json();
}
// Phase 3: Knowledge Base API
async function fetchKnowledgeBase() {
const res = await fetch(`${API_BASE}/projects/${currentProject.id}/knowledge-base`);
if (!res.ok) throw new Error('Failed to fetch knowledge base');
return await res.json();
}
// Phase 3: Glossary API
async function addGlossaryTerm(term, pronunciation = '') {
const res = await fetch(`${API_BASE}/projects/${currentProject.id}/glossary`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ term, pronunciation })
});
if (!res.ok) throw new Error('Failed to add glossary term');
return await res.json();
}
async function deleteGlossaryTerm(termId) {
const res = await fetch(`${API_BASE}/glossary/${termId}`, {
method: 'DELETE'
});
if (!res.ok) throw new Error('Failed to delete glossary term');
return await res.json();
}
// Phase 2: Entity Edit API // Phase 2: Entity Edit API
async function updateEntity(entityId, data) { async function updateEntity(entityId, data) {
const res = await fetch(`${API_BASE}/entities/${entityId}`, { const res = await fetch(`${API_BASE}/entities/${entityId}`, {
@@ -147,7 +194,7 @@ async function updateTranscript(transcriptId, fullText) {
async function loadProjectData() { async function loadProjectData() {
try { try {
// Load entities and relations in parallel // Load entities, relations, and the transcript list in parallel
const [entitiesRes, relationsRes, transcriptsRes] = await Promise.all([ const [entitiesRes, relationsRes, transcriptsRes] = await Promise.all([
fetch(`${API_BASE}/projects/${currentProject.id}/entities`), fetch(`${API_BASE}/projects/${currentProject.id}/entities`),
fetch(`${API_BASE}/projects/${currentProject.id}/relations`), fetch(`${API_BASE}/projects/${currentProject.id}/relations`),
@@ -160,32 +207,173 @@ async function loadProjectData() {
if (relationsRes.ok) { if (relationsRes.ok) {
projectRelations = await relationsRes.json(); projectRelations = await relationsRes.json();
} }
if (transcriptsRes.ok) {
projectTranscripts = await transcriptsRes.json();
}
// Load the latest transcript // Load the latest transcript
if (transcriptsRes.ok) { if (projectTranscripts.length > 0) {
const transcripts = await transcriptsRes.json(); currentTranscript = await getTranscript(projectTranscripts[0].id);
if (transcripts.length > 0) { currentData = {
currentTranscript = await getTranscript(transcripts[0].id); transcript_id: currentTranscript.id,
currentData = { project_id: currentProject.id,
transcript_id: currentTranscript.id, segments: [{ speaker: '全文', text: currentTranscript.full_text }],
project_id: currentProject.id, entities: projectEntities,
segments: [{ speaker: '全文', text: currentTranscript.full_text }], full_text: currentTranscript.full_text,
entities: projectEntities, created_at: currentTranscript.created_at
full_text: currentTranscript.full_text, };
created_at: currentTranscript.created_at renderTranscript();
};
renderTranscript();
}
} }
renderGraph(); renderGraph();
renderEntityList(); renderEntityList();
renderTranscriptDropdown();
} catch (err) { } catch (err) {
console.error('Load project data failed:', err); console.error('Load project data failed:', err);
} }
} }
// Phase 3: View Switching
window.switchView = function(viewName) {
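// Update sidebar buttons: activate the button bound to this view instead of
// relying on the implicit global `event`, whose target is a KB entity card
// when switchView is called from a card's onclick
document.querySelectorAll('.sidebar-btn').forEach(btn => {
const handler = btn.getAttribute('onclick') || '';
btn.classList.toggle('active', handler.includes(`'${viewName}'`));
});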
if (viewName === 'workbench') {
document.getElementById('workbenchView').style.display = 'flex';
document.getElementById('knowledgeBaseView').classList.remove('show');
} else if (viewName === 'knowledge-base') {
document.getElementById('workbenchView').style.display = 'none';
document.getElementById('knowledgeBaseView').classList.add('show');
loadKnowledgeBase();
}
};
// Phase 3: Load Knowledge Base
async function loadKnowledgeBase() {
try {
knowledgeBaseData = await fetchKnowledgeBase();
renderKnowledgeBase();
} catch (err) {
console.error('Load knowledge base failed:', err);
}
}
// Phase 3: Render Knowledge Base
function renderKnowledgeBase() {
if (!knowledgeBaseData) return;
// Update stats
document.getElementById('kbEntityCount').textContent = knowledgeBaseData.stats.entity_count;
document.getElementById('kbRelationCount').textContent = knowledgeBaseData.stats.relation_count;
document.getElementById('kbTranscriptCount').textContent = knowledgeBaseData.stats.transcript_count;
document.getElementById('kbGlossaryCount').textContent = knowledgeBaseData.stats.glossary_count;
// Render entities
const entityGrid = document.getElementById('kbEntityGrid');
entityGrid.innerHTML = knowledgeBaseData.entities.map(e => `
<div class="kb-entity-card" onclick="selectEntity('${e.id}'); switchView('workbench');">
<span class="entity-type-badge type-${e.type}">${e.type}</span>
<div class="kb-entity-name">${e.name}</div>
<div class="kb-entity-def">${e.definition || '暂无定义'}</div>
<div class="kb-entity-meta">提及 ${e.mention_count} 次 | 出现在 ${e.appears_in.length} 个文件中</div>
</div>
`).join('');
// Render relations
const relationsList = document.getElementById('kbRelationsList');
relationsList.innerHTML = knowledgeBaseData.relations.map(r => `
<div class="kb-glossary-item">
<div>
<strong>${r.source_name}</strong>
<span style="color:#666;">→ ${r.type} →</span>
<strong>${r.target_name}</strong>
<div style="font-size:0.8rem;color:#666;margin-top:4px;">${r.evidence || '无证据'}</div>
</div>
</div>
`).join('');
// Render glossary
const glossaryList = document.getElementById('kbGlossaryList');
glossaryList.innerHTML = knowledgeBaseData.glossary.map(g => `
<div class="kb-glossary-item">
<div>
<strong>${g.term}</strong>
${g.pronunciation ? `<span style="color:#666;font-size:0.85rem;"> (${g.pronunciation})</span>` : ''}
<span style="color:#00d4ff;font-size:0.8rem;margin-left:8px;">出现 ${g.frequency} 次</span>
</div>
<button class="btn-icon" onclick="deleteGlossaryTerm('${g.id}').then(loadKnowledgeBase)">删除</button>
</div>
`).join('');
// Render transcripts
const transcriptsList = document.getElementById('kbTranscriptsList');
transcriptsList.innerHTML = knowledgeBaseData.transcripts.map(t => `
<div class="kb-transcript-item">
<div>
<span class="file-type-icon type-${t.type}">${t.type === 'audio' ? '🎵' : '📄'}</span>
<span style="margin-left:8px;">${t.filename}</span>
</div>
<span style="color:#666;font-size:0.8rem;">${new Date(t.created_at).toLocaleDateString()}</span>
</div>
`).join('');
}
// Phase 3: KB Tab Switching
window.switchKBTab = function(tabName) {
document.querySelectorAll('.kb-nav-item').forEach(item => {
item.classList.remove('active');
});
event.target.classList.add('active');
document.querySelectorAll('.kb-section').forEach(section => {
section.classList.remove('active');
});
document.getElementById(`kb${tabName.charAt(0).toUpperCase() + tabName.slice(1)}Section`).classList.add('active');
};
// Phase 3: Transcript Dropdown
window.toggleTranscriptDropdown = function() {
const dropdown = document.getElementById('transcriptDropdown');
dropdown.classList.toggle('show');
};
function renderTranscriptDropdown() {
const dropdown = document.getElementById('transcriptDropdown');
if (!dropdown || projectTranscripts.length === 0) return;
dropdown.innerHTML = projectTranscripts.map(t => `
<div class="transcript-option ${currentTranscript && currentTranscript.id === t.id ? 'active' : ''}"
onclick="switchTranscript('${t.id}')">
<span class="file-type-icon type-${t.type || 'audio'}">${(t.type || 'audio') === 'audio' ? '🎵' : '📄'}</span>
<span style="margin-left:4px;">${t.filename}</span>
</div>
`).join('');
}
window.switchTranscript = async function(transcriptId) {
try {
currentTranscript = await getTranscript(transcriptId);
currentData = {
transcript_id: currentTranscript.id,
project_id: currentProject.id,
segments: [{ speaker: '全文', text: currentTranscript.full_text }],
entities: projectEntities,
full_text: currentTranscript.full_text,
created_at: currentTranscript.created_at
};
renderTranscript();
renderTranscriptDropdown();
document.getElementById('transcriptDropdown').classList.remove('show');
} catch (err) {
console.error('Switch transcript failed:', err);
alert('切换文件失败');
}
};
// Phase 2: Transcript Edit Mode // Phase 2: Transcript Edit Mode
window.toggleEditMode = function() { window.toggleEditMode = function() {
editMode = !editMode; editMode = !editMode;
@@ -255,7 +443,7 @@ function renderTranscript() {
const div = document.createElement('div'); const div = document.createElement('div');
div.className = 'segment'; div.className = 'segment';
div.innerHTML = ` div.innerHTML = `
<div class="speaker">转录文本</div> <div class="speaker">${currentTranscript.filename || '转录文本'}</div>
<div class="segment-text">${text}</div> <div class="segment-text">${text}</div>
`; `;
@@ -311,7 +499,7 @@ function renderGraph() {
.attr('y', '50%') .attr('y', '50%')
.attr('text-anchor', 'middle') .attr('text-anchor', 'middle')
.attr('fill', '#666') .attr('fill', '#666')
.text('暂无实体数据,请上传音频'); .text('暂无实体数据,请上传音频或文档');
return; return;
} }
@@ -458,7 +646,7 @@ function renderEntityList() {
container.innerHTML = '<h3 style="margin-bottom:12px;color:#888;font-size:0.9rem;">项目实体</h3>'; container.innerHTML = '<h3 style="margin-bottom:12px;color:#888;font-size:0.9rem;">项目实体</h3>';
if (!projectEntities || projectEntities.length === 0) { if (!projectEntities || projectEntities.length === 0) {
container.innerHTML += '<p style="color:#666;font-size:0.85rem;">暂无实体,请上传音频文件</p>'; container.innerHTML += '<p style="color:#666;font-size:0.85rem;">暂无实体,请上传音频或文档文件</p>';
return; return;
} }
@@ -788,6 +976,28 @@ window.createEntityFromSelection = async function() {
} }
}; };
// Phase 3: Upload Tab Switching
window.switchUploadTab = function(tab) {
currentUploadTab = tab;
document.querySelectorAll('.upload-tab').forEach(t => t.classList.remove('active'));
event.target.classList.add('active');
const hint = document.getElementById('uploadHint');
if (tab === 'audio') {
hint.textContent = '支持 MP3, WAV, M4A (最大 500MB)';
} else {
hint.textContent = '支持 PDF, DOCX, DOC, TXT, MD';
}
};
window.triggerFileSelect = function() {
if (currentUploadTab === 'audio') {
document.getElementById('fileInput').click();
} else {
document.getElementById('docInput').click();
}
};
// Show/hide upload // Show/hide upload
window.showUpload = function() { window.showUpload = function() {
const el = document.getElementById('uploadOverlay'); const el = document.getElementById('uploadOverlay');
@@ -799,49 +1009,120 @@ window.hideUpload = function() {
if (el) el.classList.remove('show'); if (el) el.classList.remove('show');
}; };
// Phase 3: Glossary Modal
window.showAddTermModal = function() {
document.getElementById('glossaryModal').classList.add('show');
};
window.hideGlossaryModal = function() {
document.getElementById('glossaryModal').classList.remove('show');
document.getElementById('glossaryTerm').value = '';
document.getElementById('glossaryPronunciation').value = '';
};
window.saveGlossaryTerm = async function() {
const term = document.getElementById('glossaryTerm').value.trim();
const pronunciation = document.getElementById('glossaryPronunciation').value.trim();
if (!term) {
alert('请输入术语');
return;
}
try {
await addGlossaryTerm(term, pronunciation);
hideGlossaryModal();
loadKnowledgeBase();
} catch (err) {
console.error('Add term failed:', err);
alert('添加术语失败: ' + err.message);
}
};
// Upload handling // Upload handling
function initUpload() { function initUpload() {
const input = document.getElementById('fileInput'); // Audio upload
const audioInput = document.getElementById('fileInput');
if (audioInput) {
audioInput.addEventListener('change', async (e) => {
if (!e.target.files.length) return;
await handleFileUpload(e.target.files[0], 'audio');
});
}
// Document upload
const docInput = document.getElementById('docInput');
if (docInput) {
docInput.addEventListener('change', async (e) => {
if (!e.target.files.length) return;
await handleFileUpload(e.target.files[0], 'document');
});
}
}
async function handleFileUpload(file, type) {
const overlay = document.getElementById('uploadOverlay'); const overlay = document.getElementById('uploadOverlay');
if (!input) return; overlay.innerHTML = `
<div style="text-align:center;">
<h2>正在分析...</h2>
<p style="color:#666;margin-top:10px;">${file.name}</p>
<p style="color:#888;margin-top:20px;font-size:0.9rem;">${type === 'audio' ? 'ASR转录 + 实体提取中' : '文档解析 + 实体提取中'}</p>
</div>
`;
input.addEventListener('change', async (e) => { try {
if (!e.target.files.length) return; let result;
if (type === 'audio') {
result = await uploadAudio(file);
} else {
result = await uploadDocument(file);
}
const file = e.target.files[0]; // Update the current data
if (overlay) { currentData = result;
overlay.innerHTML = `
<div style="text-align:center;"> // Reload project data
<h2>正在分析...</h2> await loadProjectData();
<p style="color:#666;margin-top:10px;">${file.name}</p>
<p style="color:#888;margin-top:20px;font-size:0.9rem;">ASR转录 + 实体提取中</p> // Reset the upload UI
overlay.innerHTML = `
<div class="upload-box">
<h2 style="margin-bottom:10px;">上传文件</h2>
<div class="upload-tabs">
<div class="upload-tab active" onclick="switchUploadTab('audio')">🎵 音频</div>
<div class="upload-tab" onclick="switchUploadTab('document')">📄 文档</div>
</div> </div>
`; <p style="color:#666;" id="uploadHint">支持 MP3, WAV, M4A (最大 500MB)</p>
} <input type="file" id="fileInput" accept="audio/*" hidden>
<input type="file" id="docInput" accept=".pdf,.docx,.doc,.txt,.md" hidden>
<button class="btn" onclick="triggerFileSelect()">选择文件</button>
<br><br>
<button class="btn btn-secondary" onclick="hideUpload()">取消</button>
</div>
`;
try { // Rebind event handlers
const result = await uploadAudio(file); initUpload();
overlay.classList.remove('show');
// Update the current data } catch (err) {
currentData = result; console.error('Upload failed:', err);
overlay.innerHTML = `
// Reload project data <div style="text-align:center;">
await loadProjectData(); <h2 style="color:#ff6b6b;">分析失败</h2>
<p style="color:#666;margin-top:10px;">${err.message}</p>
if (overlay) overlay.classList.remove('show'); <button class="btn" onclick="location.reload()" style="margin-top:20px;">重试</button>
</div>
} catch (err) { `;
console.error('Upload failed:', err); }
if (overlay) {
overlay.innerHTML = `
<div style="text-align:center;">
<h2 style="color:#ff6b6b;">分析失败</h2>
<p style="color:#666;margin-top:10px;">${err.message}</p>
<button class="btn" onclick="location.reload()" style="margin-top:20px;">重试</button>
</div>
`;
}
}
});
} }
// Close dropdown when clicking outside
document.addEventListener('click', (e) => {
const dropdown = document.getElementById('transcriptDropdown');
const selector = document.querySelector('.transcript-selector');
if (dropdown && selector && !selector.contains(e.target)) {
dropdown.classList.remove('show');
}
});


@@ -3,7 +3,7 @@
<head> <head>
<meta charset="UTF-8"> <meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>InsightFlow - 知识工作台 (Phase 2)</title> <title>InsightFlow - 知识工作台 (Phase 3)</title>
<script src="https://d3js.org/d3.v7.min.js"></script> <script src="https://d3js.org/d3.v7.min.js"></script>
<style> <style>
* { margin: 0; padding: 0; box-sizing: border-box; } * { margin: 0; padding: 0; box-sizing: border-box; }
@@ -46,10 +46,44 @@
color: #888; color: #888;
font-size: 0.9rem; font-size: 0.9rem;
} }
.header-actions {
display: flex;
gap: 10px;
}
.main { .main {
display: flex; display: flex;
height: calc(100vh - 50px); height: calc(100vh - 50px);
} }
.sidebar {
width: 60px;
background: #111;
border-right: 1px solid #222;
display: flex;
flex-direction: column;
align-items: center;
padding: 10px 0;
}
.sidebar-btn {
width: 44px;
height: 44px;
background: transparent;
border: none;
color: #666;
font-size: 1.2rem;
cursor: pointer;
border-radius: 8px;
margin-bottom: 8px;
transition: all 0.2s;
}
.sidebar-btn:hover, .sidebar-btn.active {
background: #1a1a1a;
color: #00d4ff;
}
.content-area {
flex: 1;
display: flex;
overflow: hidden;
}
.editor-panel { .editor-panel {
width: 50%; width: 50%;
border-right: 1px solid #222; border-right: 1px solid #222;
@@ -199,10 +233,29 @@
border-radius: 16px; border-radius: 16px;
padding: 60px; padding: 60px;
text-align: center; text-align: center;
max-width: 500px;
} }
.upload-box:hover { .upload-box:hover {
border-color: #00d4ff; border-color: #00d4ff;
} }
.upload-tabs {
display: flex;
gap: 10px;
margin-bottom: 20px;
justify-content: center;
}
.upload-tab {
padding: 8px 16px;
background: #1a1a1a;
border: 1px solid #333;
border-radius: 6px;
cursor: pointer;
color: #888;
}
.upload-tab.active {
border-color: #00d4ff;
color: #00d4ff;
}
.btn { .btn {
background: linear-gradient(90deg, #00d4ff, #7b2cbf); background: linear-gradient(90deg, #00d4ff, #7b2cbf);
color: white; color: white;
@@ -373,6 +426,164 @@
.node-label { .node-label {
pointer-events: none; pointer-events: none;
} }
/* Phase 3: Knowledge Base Panel */
.kb-panel {
width: 100%;
height: 100%;
display: none;
flex-direction: column;
background: #0a0a0a;
}
.kb-panel.show {
display: flex;
}
.kb-header {
padding: 16px 20px;
background: #141414;
border-bottom: 1px solid #222;
display: flex;
justify-content: space-between;
align-items: center;
}
.kb-stats {
display: flex;
gap: 24px;
}
.kb-stat {
text-align: center;
}
.kb-stat-value {
font-size: 1.5rem;
font-weight: 600;
color: #00d4ff;
}
.kb-stat-label {
font-size: 0.75rem;
color: #666;
}
.kb-content {
flex: 1;
display: flex;
overflow: hidden;
}
.kb-sidebar {
width: 200px;
background: #111;
border-right: 1px solid #222;
padding: 16px 0;
}
.kb-nav-item {
padding: 12px 20px;
cursor: pointer;
color: #888;
border-left: 3px solid transparent;
}
.kb-nav-item:hover {
background: #1a1a1a;
color: #e0e0e0;
}
.kb-nav-item.active {
background: #1a1a1a;
color: #00d4ff;
border-left-color: #00d4ff;
}
.kb-main {
flex: 1;
padding: 20px;
overflow-y: auto;
}
.kb-section {
display: none;
}
.kb-section.active {
display: block;
}
.kb-entity-grid {
display: grid;
grid-template-columns: repeat(auto-fill, minmax(280px, 1fr));
gap: 16px;
}
.kb-entity-card {
background: #141414;
border: 1px solid #222;
border-radius: 8px;
padding: 16px;
cursor: pointer;
transition: all 0.2s;
}
.kb-entity-card:hover {
border-color: #00d4ff;
}
.kb-entity-name {
font-weight: 600;
margin-bottom: 4px;
}
.kb-entity-def {
font-size: 0.85rem;
color: #888;
margin-bottom: 8px;
}
.kb-entity-meta {
font-size: 0.75rem;
color: #666;
}
.kb-glossary-item {
display: flex;
justify-content: space-between;
align-items: center;
padding: 12px 16px;
background: #141414;
border-radius: 6px;
margin-bottom: 8px;
}
.kb-transcript-item {
padding: 12px 16px;
background: #141414;
border-radius: 6px;
margin-bottom: 8px;
display: flex;
justify-content: space-between;
align-items: center;
}
.file-type-icon {
padding: 4px 8px;
border-radius: 4px;
font-size: 0.7rem;
font-weight: 600;
}
.type-audio { background: #7b2cbf; }
.type-document { background: #00d4ff; color: #000; }
/* Transcript selector */
.transcript-selector {
position: relative;
}
.transcript-dropdown {
position: absolute;
top: 100%;
right: 0;
background: #1a1a1a;
border: 1px solid #333;
border-radius: 8px;
min-width: 200px;
max-height: 300px;
overflow-y: auto;
display: none;
z-index: 100;
}
.transcript-dropdown.show {
display: block;
}
.transcript-option {
padding: 10px 16px;
cursor: pointer;
border-bottom: 1px solid #222;
}
.transcript-option:hover {
background: #2a2a2a;
}
.transcript-option.active {
background: #00d4ff22;
}
</style> </style>
</head> </head>
<body> <body>
@@ -381,35 +592,113 @@
<a href="/" class="back-link">← 返回项目列表</a> <a href="/" class="back-link">← 返回项目列表</a>
<span class="project-name" id="projectName">加载中...</span> <span class="project-name" id="projectName">加载中...</span>
</div> </div>
<button class="btn btn-small" onclick="showUpload()">+ 上传音频</button> <div class="header-actions">
<button class="btn btn-small" onclick="showUpload()">+ 上传文件</button>
</div>
</div> </div>
<div class="main"> <div class="main">
<div class="editor-panel"> <!-- Sidebar -->
<div class="panel-header"> <div class="sidebar">
<span>📄 转录文本</span> <button class="sidebar-btn active" onclick="switchView('workbench')" title="工作台">📝</button>
<div class="panel-actions"> <button class="sidebar-btn" onclick="switchView('knowledge-base')" title="知识库">📚</button>
<button class="btn-icon" onclick="toggleEditMode()" id="editBtn">✏️ 编辑</button>
<button class="btn-icon" onclick="saveTranscript()" id="saveBtn" style="display:none;">💾 保存</button>
</div>
</div>
<div class="transcript-content" id="transcriptContent">
<div class="empty-state">
<p style="color:#666;">暂无转录内容</p>
<button class="btn" onclick="showUpload()">上传音频</button>
</div>
</div>
</div> </div>
<div class="graph-panel"> <!-- Content Area -->
<div class="panel-header"> <div class="content-area">
<span>🔗 知识图谱</span> <!-- Workbench View -->
<span style="font-size:0.8rem;color:#666;">右键节点编辑 | 拖拽建立关系</span> <div id="workbenchView" class="workbench-view" style="display: flex; width: 100%;">
<div class="editor-panel">
<div class="panel-header">
<div style="display: flex; align-items: center; gap: 12px;">
<span>📄 转录文本</span>
<div class="transcript-selector">
<button class="btn-icon" onclick="toggleTranscriptDropdown()">📁 选择文件</button>
<div class="transcript-dropdown" id="transcriptDropdown"></div>
</div>
</div>
<div class="panel-actions">
<button class="btn-icon" onclick="toggleEditMode()" id="editBtn">✏️ 编辑</button>
<button class="btn-icon" onclick="saveTranscript()" id="saveBtn" style="display:none;">💾 保存</button>
</div>
</div>
<div class="transcript-content" id="transcriptContent">
<div class="empty-state">
<p style="color:#666;">暂无转录内容</p>
<button class="btn" onclick="showUpload()">上传音频或文档</button>
</div>
</div>
</div>
<div class="graph-panel">
<div class="panel-header">
<span>🔗 知识图谱</span>
<span style="font-size:0.8rem;color:#666;">右键节点编辑 | 拖拽建立关系</span>
</div>
<svg id="graph-svg"></svg>
<div class="entity-list" id="entityList">
<h3 style="margin-bottom:12px;color:#888;font-size:0.9rem;">项目实体</h3>
<p style="color:#666;font-size:0.85rem;">暂无实体数据</p>
</div>
</div>
</div> </div>
<svg id="graph-svg"></svg>
<div class="entity-list" id="entityList"> <!-- Knowledge Base View -->
<h3 style="margin-bottom:12px;color:#888;font-size:0.9rem;">项目实体</h3> <div id="knowledgeBaseView" class="kb-panel">
<p style="color:#666;font-size:0.85rem;">暂无实体数据</p> <div class="kb-header">
<h2>📚 项目知识库</h2>
<div class="kb-stats">
<div class="kb-stat">
<div class="kb-stat-value" id="kbEntityCount">0</div>
<div class="kb-stat-label">实体</div>
</div>
<div class="kb-stat">
<div class="kb-stat-value" id="kbRelationCount">0</div>
<div class="kb-stat-label">关系</div>
</div>
<div class="kb-stat">
<div class="kb-stat-value" id="kbTranscriptCount">0</div>
<div class="kb-stat-label">文件</div>
</div>
<div class="kb-stat">
<div class="kb-stat-value" id="kbGlossaryCount">0</div>
<div class="kb-stat-label">术语</div>
</div>
</div>
</div>
<div class="kb-content">
<div class="kb-sidebar">
<div class="kb-nav-item active" onclick="switchKBTab('entities')">🏷️ 实体</div>
<div class="kb-nav-item" onclick="switchKBTab('relations')">🔗 关系</div>
<div class="kb-nav-item" onclick="switchKBTab('glossary')">📖 术语表</div>
<div class="kb-nav-item" onclick="switchKBTab('transcripts')">📁 文件</div>
</div>
<div class="kb-main">
<!-- Entities Section -->
<div class="kb-section active" id="kbEntitiesSection">
<h3 style="margin-bottom:16px;">所有实体</h3>
<div class="kb-entity-grid" id="kbEntityGrid"></div>
</div>
<!-- Relations Section -->
<div class="kb-section" id="kbRelationsSection">
<h3 style="margin-bottom:16px;">所有关系</h3>
<div id="kbRelationsList"></div>
</div>
<!-- Glossary Section -->
<div class="kb-section" id="kbGlossarySection">
<div style="display:flex;justify-content:space-between;align-items:center;margin-bottom:16px;">
<h3>术语表</h3>
<button class="btn btn-small" onclick="showAddTermModal()">+ 添加术语</button>
</div>
<div id="kbGlossaryList"></div>
</div>
<!-- Transcripts Section -->
<div class="kb-section" id="kbTranscriptsSection">
<h3 style="margin-bottom:16px;">所有文件</h3>
<div id="kbTranscriptsList"></div>
</div>
</div>
</div>
</div> </div>
</div> </div>
</div> </div>
@@ -417,10 +706,15 @@
<!-- Upload Modal --> <!-- Upload Modal -->
<div class="upload-overlay" id="uploadOverlay"> <div class="upload-overlay" id="uploadOverlay">
<div class="upload-box"> <div class="upload-box">
<h2 style="margin-bottom:10px;">上传音频分析</h2> <h2 style="margin-bottom:10px;">上传文件</h2>
<p style="color:#666;">支持 MP3, WAV, M4A (最大 500MB)</p> <div class="upload-tabs">
<div class="upload-tab active" onclick="switchUploadTab('audio')">🎵 音频</div>
<div class="upload-tab" onclick="switchUploadTab('document')">📄 文档</div>
</div>
<p style="color:#666;" id="uploadHint">支持 MP3, WAV, M4A (最大 500MB)</p>
<input type="file" id="fileInput" accept="audio/*" hidden> <input type="file" id="fileInput" accept="audio/*" hidden>
<button class="btn" onclick="document.getElementById('fileInput').click()">选择文件</button> <input type="file" id="docInput" accept=".pdf,.docx,.doc,.txt,.md" hidden>
<button class="btn" onclick="triggerFileSelect()">选择文件</button>
<br><br> <br><br>
<button class="btn btn-secondary" onclick="hideUpload()">取消</button> <button class="btn btn-secondary" onclick="hideUpload()">取消</button>
</div> </div>
@@ -514,6 +808,25 @@
</div> </div>
</div> </div>
<!-- Add Glossary Term Modal -->
<div class="modal-overlay" id="glossaryModal">
<div class="modal">
<h3 class="modal-header">添加术语</h3>
<div class="form-group">
<label>术语</label>
<input type="text" id="glossaryTerm" placeholder="术语名称">
</div>
<div class="form-group">
<label>发音提示 (可选)</label>
<input type="text" id="glossaryPronunciation" placeholder="如: K8s 发音为 Kubernetes">
</div>
<div class="modal-actions">
<button class="btn btn-secondary" onclick="hideGlossaryModal()">取消</button>
<button class="btn" onclick="saveGlossaryTerm()">添加</button>
</div>
</div>
</div>
<!-- Context Menu --> <!-- Context Menu -->
<div class="context-menu" id="contextMenu"> <div class="context-menu" id="contextMenu">
<div class="context-menu-item" onclick="editEntity()">✏️ 编辑实体</div> <div class="context-menu-item" onclick="editEntity()">✏️ 编辑实体</div>