错题本生成系统

基于 PaddleOCR + LangChain Agent 的智能错题本生成系统。上传试卷 PDF 或图片，自动识别文档结构、智能分割题目，导出为 Markdown 错题本。

项目结构

├── backend/                         # Flask 后端（API + 静态资源托管）
│   ├── config.py                    # 集中配置（路径、目录）
│   ├── llm.py                       # LLM 初始化（DeepSeek / 文心）
│   ├── web_app.py                   # Flask 主应用（路由 + 会话状态）
│   ├── src/                         # 核心模块（LangGraph workflow、OCR 客户端、工具函数）
│   ├── error_correction_agent/      # LangChain Agent（题目分割、OCR 纠错、科目识别）
│   ├── solve_agent/                 # 解题智能体
│   ├── benchmark/                   # 模型评测（数据采集、评测运行、指标计算）
│   ├── db/                          # SQLite 数据库（ORM 模型、CRUD）
│   └── tests/                       # 后端测试（13 个测试模块）
├── frontend/                        # Vue 3 + Vite + Tailwind CSS 前端
│   ├── index.html                   # 介绍落地页
│   ├── app.html                     # Vue 工作台入口
│   └── src/
│       ├── App.vue                  # 根组件（侧边栏布局 + 视图路由）
│       ├── components/              # 9 个功能组件
│       └── __tests__/               # 前端测试（Vitest）
├── example_uploads/                 # 示例测试文件（图片 + PDF）
├── .env.example                     # 环境变量模板
└── requirements.txt                 # Python 依赖

环境部署

1. 安装依赖

需要 Python 3.11+、Node.js 18+。

# 后端依赖
pip install -r requirements.txt

# 前端依赖
cd frontend && npm install

2. 配置环境变量

copy .env.example .env

编辑 .env，填写以下必需项：

# LLM API（Agent 智能分割题目，二选一即可）
# DeepSeek
DEEPSEEK_API_KEY=your_key
DEEPSEEK_BASE_URL=https://api.deepseek.com

# 文心一言（百度 AIStudio OpenAI 兼容接口）
ERNIE_API_KEY=your_key
ERNIE_BASE_URL=https://aistudio.baidu.com/llm/lmapi/v3
ERNIE_MODEL_NAME=ernie-4.5-turbo-128k-preview

# PaddleOCR API（文档结构解析，V2 异步任务接口）
PADDLEOCR_API_URL=https://paddleocr.aistudio-app.com/api/v2/ocr/jobs
PADDLEOCR_API_TOKEN=your_token
PADDLEOCR_MODEL=PaddleOCR-VL-1.5

DeepSeek 和文心二选一配置即可，前端会自动检测已配置的模型供选择。轻量模型、LangSmith 等可选配置见 .env.example。

3. 启动

开发模式（前后端分离，支持热更新）：

# 终端 1：启动后端
cd backend && python web_app.py

# 终端 2：启动前端开发服务器
cd frontend && npm run dev

前端开发服务器会自动将 /api、/images、/download 请求代理到后端 localhost:5001。

生产模式（Flask 托管前端静态资源）：

# 构建前端
cd frontend && npm run build

# 启动后端（自动托管构建产物）
cd backend && python web_app.py

访问 http://localhost:5001 即可使用。

支持的文件格式

PDF(.pdf)、图片(.jpg .jpeg .png .bmp .tiff .webp)，单次上传限制 50 MB。

测试

# 后端测试
cd backend && python -m pytest tests/ -v

# 前端测试
cd frontend && npm test

详见 backend/tests/README.md 和 frontend/src/tests/README.md。

许可证

本项目基于 AGPL-3.0 开源。任何修改或基于本项目的衍生作品（包括网络服务部署）均须以相同许可证开源。

Name		Name	Last commit message	Last commit date
Latest commit History 290 Commits
backend		backend
data-processing		data-processing
docs		docs
example_uploads		example_uploads
frontend		frontend
.env.example		.env.example
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
WEB_APP_GUIDE.md		WEB_APP_GUIDE.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

错题本生成系统

项目结构

环境部署

1. 安装依赖

2. 配置环境变量

3. 启动

支持的文件格式

测试

许可证

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

错题本生成系统

项目结构

环境部署

1. 安装依赖

2. 配置环境变量

3. 启动

支持的文件格式

测试

许可证

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages