🏥 Medical RAG with Asset-Aware MCP - Precise PDF asset retrieval (tables, figures, sections) and Knowledge Graph for AI Agents.
🌐 Traditional Chinese (繁體中文)
A common misconception is that an AI model can read image files directly from your computer. It cannot: a file path alone gives the model no access to the image content.
| Method | Can AI analyze image content? | Description |
|---|---|---|
| ❌ Provide PNG path | No | AI cannot access the local file system |
| ✅ Asset-Aware MCP | Yes | Retrieves Base64 via MCP, allowing AI vision to understand directly |
# After retrieving the image via MCP, the AI can analyze it directly:
User: What is this figure about?
AI: This is the architecture diagram for Scaled Dot-Product Attention:
1. Inputs: Q (Query), K (Key), V (Value)
2. MatMul of Q and K
3. Scale (1/√dₖ)
4. Optional Mask (for decoder)
5. SoftMax normalization
6. Final MatMul with V to get the output
This is the value of Asset-Aware MCP - enabling AI Agents to truly "see" and understand charts and tables in your PDF literature.
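As a rough illustration of the transport step, the sketch below wraps raw PNG bytes in a Base64 payload and decodes it again. The helper names are hypothetical, and the dict shape (`type`, `data`, `mimeType`) is modelled on MCP image content as an assumption; it is not taken from this project's code.

```python
import base64

# Every PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def to_image_payload(png_bytes: bytes) -> dict:
    """Wrap raw PNG bytes in a Base64 payload (hypothetical helper)."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    return {
        "type": "image",
        "data": base64.b64encode(png_bytes).decode("ascii"),
        "mimeType": "image/png",
    }


def from_image_payload(payload: dict) -> bytes:
    """Decode the Base64 payload back into the original bytes."""
    return base64.b64decode(payload["data"])


fake_png = PNG_SIGNATURE + b"figure-bytes"  # stand-in for a real extracted figure
payload = to_image_payload(fake_png)
```

Because the payload is plain text, it can travel inside a JSON response, which is what lets a vision-capable model receive the figure without any file system access.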
- 📄 Asset-Aware ETL - PDF → Markdown with dual-engine PDF parsing:
  - PyMuPDF (default) - Fast extraction (~50MB)
  - Marker (optional, use_marker=True) - High-precision structured parsing with blocks.json (bbox/coordinates)
- 🧭 Section Navigation - Dynamic hierarchy section tree with 5 tools: browse, search, detail, content reading, and block extraction for any depth of headings.
- 🔄 Async Job Pipeline - Supports asynchronous task processing and progress tracking for large documents.
- 🗺️ Document Manifest - Provides a structured "map" of the document for precise data access by Agents.
- 🧠 LightRAG Integration - Knowledge Graph + Vector Index, supporting cross-document comparison and reasoning.
- 📝 Docx Editing (DFM) - Edit .docx files in Markdown via Docx-Flavored Markdown format. Supports legacy .doc files (auto-converts via LibreOffice). 13 tools: ingest, read, save, list, delete, export, strict round-trip validation, DOCX→PDF, DOCX→DOC, and Docx ↔ A2T bridges.
- 🛡️ DFM Integrity Checker - Automatic validation and auto-repair at every pipeline stage (post-ingest, pre-save, post-save). Catches orphan markers, column mismatches, and format inconsistencies.
- 📊 A2T (Anything to Table) - 7 operation-based tools for building professional tables from any source (PDF assets, Knowledge Graph, URLs, user input). Features: Citations (AssetRef), Audit Trail, Schema Evolution, Templates, Drafting, and Token-efficient resumption.
- 🖥️ VS Code Management Extension - Graphical interface for monitoring server status, ingested documents, and A2T tables/drafts with one-click Excel export.
- 🔌 MCP Server - Exposes tools and resources to Copilot/Claude via FastMCP.
- 🏥 Medical Research Focus - Optimized for medical literature, supporting Base64 image transmission for Vision AI analysis.
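To make the column-mismatch check mentioned under the DFM Integrity Checker more concrete, here is a minimal sketch that flags rows of a pipe-delimited Markdown table whose cell count differs from the header. The function names are hypothetical and the real checker may work quite differently; this version also ignores escaped pipes inside cells.

```python
def split_row(line: str) -> list[str]:
    """Split a pipe-delimited Markdown table row into trimmed cells."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def find_column_mismatches(table_lines: list[str]) -> list[int]:
    """Return 0-based indexes of rows whose cell count differs from the header."""
    expected = len(split_row(table_lines[0]))
    return [
        index
        for index, line in enumerate(table_lines)
        if len(split_row(line)) != expected
    ]


table = [
    "| Drug | Dose | Outcome |",
    "|---|---|---|",
    "| A | 10 mg | improved |",
    "| B | 20 mg |",  # one cell short
]
mismatches = find_column_mismatches(table)
```

A repair step could then pad the flagged rows with empty cells or report them back to the agent before saving.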
┌─────────────────────────────────────────────────────────┐
│                   AI Agent (Copilot)                    │
└─────────────────────┬───────────────────────────────────┘
                      │ MCP Protocol (Tools & Resources)
┌─────────────────────▼───────────────────────────────────┐
│            MCP Server (Modular Presentation)            │
│  ┌───────────────────────────────────────────────────┐  │
│  │ tools/: 43 tools in 7 modules                     │  │
│  │ document (8) │ docx (13) │ section (5)            │  │
│  │ job (3) │ knowledge (2) │ table (7) │ profile (5) │  │
│  └───────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────┐  │
│  │ resources/: 12 resources in 2 modules             │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────┬───────────────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────────────┐
│                   ETL Pipeline (DDD)                    │
│       ┌──────────┐   ┌──────────┐   ┌──────────┐        │
│       │ PyMuPDF  │   │  Asset   │   │ LightRAG │        │
│       │ Adapter  │ → │  Parser  │ → │  Index   │        │
│       └──────────┘   └──────────┘   └──────────┘        │
└─────────────────────┬───────────────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────────────┐
│                      Local Storage                      │
│  ./data/                                                │
│  ├── doc_{id}/     # Document Assets                    │
│  ├── docx_{id}/    # Docx IR + DFM + Assets             │
│  ├── tables/       # A2T Tables (JSON/MD/XLSX)          │
│  │   └── drafts/   # Table Drafts (Persistence)         │
│  └── lightrag_db/  # Knowledge Graph                    │
└─────────────────────────────────────────────────────────┘
asset-aware-mcp/
├── src/
│ ├── domain/ # 🔵 Domain: Entities, Value Objects, Interfaces
│ ├── application/ # 🟢 Application: Doc Service, Table Service (A2T), Asset Service
│ ├── infrastructure/ # 🟠 Infrastructure: PyMuPDF, LightRAG, Excel Renderer
│ └── presentation/ # 🔴 Presentation: MCP Server (FastMCP)
├── data/ # Document and Asset Storage
├── docs/
│ └── spec.md # Technical Specification
├── tests/ # Unit and Integration Tests
├── vscode-extension/ # VS Code Management Extension
└── pyproject.toml # uv Project Config
# Install dependencies (using uv) — default install skips Marker/torch
uv sync
# Optional: install Marker backend only if you need structured parsing
uv sync --extra marker
# Run MCP Server
uv run python -m src.presentation.server
# Or use the VS Code extension for graphical management
Runtime note:
The VS Code extension prefers a managed Python 3.11 runtime when launching the MCP server via uv or uvx. This avoids native package builds on end-user machines, especially macOS systems without Xcode Command Line Tools, while keeping the project itself compatible with newer Python versions.
Marker note:
marker-pdf is now an optional dependency because it may pull in torch, surya, and platform-specific ML wheels. Default installs use the PyMuPDF backend only. Enable Marker only when you need use_marker=True or parse_pdf_structure.
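One way to handle an optional backend like this is to probe for the package before using it and fall back to the default otherwise. The sketch below is an assumption about how such a guard could look: `select_pdf_backend` is a hypothetical helper, and it assumes the marker-pdf package is importable under the module name `marker`.

```python
import importlib.util


def select_pdf_backend(use_marker: bool, module_name: str = "marker") -> str:
    """Pick a backend name, falling back to PyMuPDF if Marker is not installed."""
    # find_spec returns None for a top-level module that cannot be found,
    # so nothing heavy (torch, surya) is imported just to run this check.
    if use_marker and importlib.util.find_spec(module_name) is not None:
        return "marker"
    return "pymupdf"


backend = select_pdf_backend(use_marker=False)
```

Probing with `find_spec` keeps the default install lightweight while still letting `use_marker=True` take effect once the extra is installed.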
| Tool | Purpose |
|---|---|
| ingest_documents | Process PDF files with optional Marker backend (use_marker=True for blocks.json) |
| list_documents | List all ingested documents and their asset counts |
| delete_document | Delete an ingested PDF and its local artifacts |
| convert_pdf_to_docx | Reconstruct a readable DOCX from extracted PDF content |
| inspect_document_manifest | Inspect document structure before fetching specific assets |
| fetch_document_asset | Precisely retrieve tables (MD) / figures (B64) / sections |
| parse_pdf_structure | Run high-precision Marker parsing and emit structured blocks |
| search_source_location | Search exact source locations with page + bbox for verification |
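The intended workflow is to inspect the manifest first and then fetch one specific asset. The sketch below models that two-step pattern with in-memory data; the `Asset` and `Manifest` classes, their fields, and the sample IDs are all hypothetical and do not reflect the server's actual manifest schema.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    asset_id: str
    kind: str     # "table", "figure", or "section"
    page: int
    content: str  # Markdown for tables and sections, Base64 for figures


@dataclass
class Manifest:
    doc_id: str
    assets: list[Asset] = field(default_factory=list)

    def summary(self) -> dict[str, int]:
        """Count assets per kind, the sort of overview an agent reads first."""
        counts: dict[str, int] = {}
        for asset in self.assets:
            counts[asset.kind] = counts.get(asset.kind, 0) + 1
        return counts

    def fetch(self, kind: str, asset_id: str) -> Asset:
        """Return exactly one asset instead of the whole document."""
        for asset in self.assets:
            if asset.kind == kind and asset.asset_id == asset_id:
                return asset
        raise KeyError(f"{kind} {asset_id!r} not found in {self.doc_id}")


manifest = Manifest("doc_demo", [
    Asset("tab_1", "table", 3, "| Group | N |\n|---|---|\n| A | 12 |"),
    Asset("fig_1", "figure", 4, "iVBORw0KGgo="),  # placeholder Base64 data
    Asset("sec_intro", "section", 1, "Introduction text"),
])
```

Fetching by kind and ID keeps the agent's context small: it reads the summary, decides what it needs, and pulls only that table or figure.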
| Tool | Purpose |
|---|---|
| get_job_status | Get async ingestion job progress and final result |
| list_jobs | List active or historical ETL jobs |
| cancel_job | Cancel a running ETL job |
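The status, list, and cancel operations above map naturally onto asyncio tasks. The following is a minimal sketch under that assumption; `JobRegistry` and its methods are hypothetical names, and the per-page loop is a stand-in for real parsing work.

```python
import asyncio
import uuid


class JobRegistry:
    """In-memory job tracker (illustrative only, not the project's implementation)."""

    def __init__(self) -> None:
        self.status: dict[str, dict] = {}
        self.tasks: dict[str, asyncio.Task] = {}

    def submit(self, pages: int) -> str:
        job_id = uuid.uuid4().hex[:8]
        self.status[job_id] = {"state": "running", "progress": 0.0}
        self.tasks[job_id] = asyncio.create_task(self._run(job_id, pages))
        return job_id

    async def _run(self, job_id: str, pages: int) -> None:
        try:
            for done in range(1, pages + 1):
                await asyncio.sleep(0)  # stand-in for parsing one page
                self.status[job_id]["progress"] = done / pages
            self.status[job_id]["state"] = "completed"
        except asyncio.CancelledError:
            self.status[job_id]["state"] = "cancelled"
            raise

    def cancel(self, job_id: str) -> None:
        self.tasks[job_id].cancel()


async def demo() -> tuple[dict, dict]:
    registry = JobRegistry()
    finished = registry.submit(pages=4)
    await registry.tasks[finished]
    stopped = registry.submit(pages=1000)
    await asyncio.sleep(0)  # let the second job start before cancelling it
    registry.cancel(stopped)
    await asyncio.gather(registry.tasks[stopped], return_exceptions=True)
    return registry.status[finished], registry.status[stopped]
```

Running `asyncio.run(demo())` returns the status records for one completed job and one cancelled job, which is roughly the information a status tool would report.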
| Tool | Purpose |
|---|---|
| consult_knowledge_graph | Knowledge graph query, cross-document comparison |
| export_knowledge_graph | Export graph summary / JSON / Mermaid for inspection |
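For a sense of what a Mermaid export involves, the sketch below renders a few relation triples as Mermaid flowchart text. The `to_mermaid` function and the sample triples are illustrative assumptions; the actual export format produced by the server may differ.

```python
def to_mermaid(edges: list[tuple[str, str, str]]) -> str:
    """Render (source, relation, target) triples as a Mermaid flowchart."""
    ids: dict[str, str] = {}
    lines = ["graph LR"]
    for source, relation, target in edges:
        for name in (source, target):
            if name not in ids:
                # Short node IDs avoid problems with spaces in entity names.
                ids[name] = f"n{len(ids)}"
                lines.append(f'    {ids[name]}["{name}"]')
        lines.append(f"    {ids[source]} -->|{relation}| {ids[target]}")
    return "\n".join(lines)


edges = [
    ("Metformin", "treats", "Type 2 diabetes"),
    ("Metformin", "may cause", "Lactic acidosis"),
]
diagram = to_mermaid(edges)
```

The resulting text can be pasted into any Mermaid renderer to inspect a small slice of the graph visually.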
| Tool | Purpose |
|---|---|
| list_section_tree | Display complete section hierarchy tree (supports any depth) |
| get_section_detail | Get detailed info for a specific section |
| get_section_blocks | Extract all blocks from a section with page + bbox |
| search_sections | Search section titles |
| get_section_content | Read section content via asset service |
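A section tree of arbitrary depth can be built from a flat list of headings with a simple stack. The sketch below shows that idea together with a title search; `Section`, `build_tree`, and `search` are hypothetical names, and the real tools may store far more per section (pages, bounding boxes, block IDs).

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    title: str
    level: int
    children: list["Section"] = field(default_factory=list)


def build_tree(headings: list[tuple[int, str]]) -> Section:
    """Nest (level, title) pairs under a synthetic root using a stack."""
    root = Section("root", 0)
    stack = [root]
    for level, title in headings:
        while stack[-1].level >= level:  # climb back up to the parent
            stack.pop()
        node = Section(title, level)
        stack[-1].children.append(node)
        stack.append(node)
    return root


def search(node: Section, keyword: str) -> list[str]:
    """Case-insensitive, depth-first search over section titles."""
    matched = node.level > 0 and keyword.lower() in node.title.lower()
    hits = [node.title] if matched else []
    for child in node.children:
        hits.extend(search(child, keyword))
    return hits


tree = build_tree([
    (1, "Methods"), (2, "Study Design"), (3, "Randomization"),
    (2, "Statistical Methods"), (1, "Results"),
])
```

Because the stack only compares heading levels, the same code handles two levels or ten without modification.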
Edit .docx files as Markdown. Formatting, tables, and media are preserved on round-trip.
| Tool | Purpose |
|---|---|
| ingest_docx | Import .docx and decompose into DFM blocks |
| get_docx_content | Read DFM content of specific blocks |
| save_docx | Write DFM edits back to .docx |
| list_docx_blocks | List document block structure |
| list_docx_documents | List all ingested DOCX/DFM documents |
| delete_docx | Delete an ingested DOCX/DFM document and its local artifacts |
| convert_docx_to_pdf | Export the current DOCX/DFM state to PDF in fidelity mode |
| convert_docx_to_doc | Export the current DOCX/DFM state to DOC in fidelity mode |
| docx_validate_roundtrip | 6-dimension round-trip fidelity validation + file-level comparison (SHA-256, ZIP diff) |
| docx_table_to_context | Bridge: Docx table → A2T context |
| docx_table_from_context | Bridge: A2T table → Docx table |
| docx_chart_data | Extract chart data from Docx |
| export_markdown | Export Markdown to .docx/.pdf/.doc |
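The file-level comparison mentioned for round-trip validation (SHA-256 plus a ZIP diff) can be approximated with the standard library, since a .docx file is a ZIP archive. The sketch below is only an assumption about the general approach; the function names are hypothetical and the tiny in-memory archives stand in for real documents.

```python
import hashlib
import io
import zipfile


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def zip_member_hashes(archive: bytes) -> dict[str, str]:
    """Map each ZIP member name to the SHA-256 of its contents."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return {name: sha256_hex(zf.read(name)) for name in zf.namelist()}


def zip_diff(before: bytes, after: bytes) -> dict[str, list[str]]:
    """List members added, removed, or changed between two archives."""
    old, new = zip_member_hashes(before), zip_member_hashes(after)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
    }


def make_zip(members: dict[str, bytes]) -> bytes:
    """Build a small in-memory archive standing in for a .docx file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buffer.getvalue()


original = make_zip({"word/document.xml": b"<w:p>Hello</w:p>", "word/styles.xml": b"<s/>"})
edited = make_zip({"word/document.xml": b"<w:p>Hello!</w:p>", "word/styles.xml": b"<s/>"})
report = zip_diff(original, edited)
```

Comparing per-member hashes is more informative than a single whole-file hash, because ZIP metadata such as timestamps can change the file hash even when every member is identical.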
Agent-friendly design: each tool handles multiple operations via the operation parameter. Tables accept any source: PDF assets, KG entities, external URLs, or user input.
| Tool | Operations | Purpose |
|---|---|---|
| plan_table | schema / templates / from_template | Schema planning, browse 4 built-in templates, create from template |
| table_manage | create / delete / list / preview / resume / render / add_column / remove_column / rename_column | Table lifecycle + Schema evolution |
| table_data | add_rows / get_row / update_row / delete_row / get_cell / update_cell / clear_cell | Row & cell CRUD |
| table_cite | add / get / remove / cell_history | Citation management with AssetRef (7 source types) |
| table_history | changes / tokens | Audit trail & token estimation |
| table_draft | create / update / add_rows / resume / commit / list / delete | Draft workflow with persistence |
| discover_sources | (none) | Cross-document source discovery (sections, tables, figures, KG) |
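The operation-based design can be pictured as a single entry point that routes to a handler chosen by the operation name. The sketch below illustrates the pattern with three of the table_data operations; the handler signatures, parameter names, and in-memory storage are assumptions made for the example, not the server's real interface.

```python
from typing import Any, Callable

rows: list[dict[str, Any]] = []  # in-memory stand-in for a stored table


def _add_rows(new_rows: list[dict[str, Any]]) -> int:
    rows.extend(new_rows)
    return len(rows)


def _get_row(index: int) -> dict[str, Any]:
    return rows[index]


def _update_cell(index: int, column: str, value: Any) -> dict[str, Any]:
    rows[index][column] = value
    return rows[index]


OPERATIONS: dict[str, Callable[..., Any]] = {
    "add_rows": _add_rows,
    "get_row": _get_row,
    "update_cell": _update_cell,
}


def table_data(operation: str, **params: Any) -> Any:
    """Route one tool call to the handler named by `operation`."""
    handler = OPERATIONS.get(operation)
    if handler is None:
        raise ValueError(
            f"unknown operation {operation!r}; expected one of {sorted(OPERATIONS)}"
        )
    return handler(**params)


table_data("add_rows", new_rows=[{"drug": "A", "dose": "10 mg"}])
table_data("update_cell", index=0, column="dose", value="20 mg")
```

Grouping operations this way keeps the tool list short for the agent while still exposing fine-grained actions, and the error message tells the agent which operations are valid.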
Different journals/formats need different extraction settings. Use these tools to switch profiles.
| Tool | Purpose |
|---|---|
| list_etl_profiles | List all available profiles (default, arxiv, nature, ieee, elsevier) |
| get_etl_profile | Get detailed configuration of a specific profile |
| get_current_etl_profile | Show currently active profile |
| set_etl_profile | Switch profile for subsequent document ingestion |
| load_etl_profile_from_json | Load custom profile from JSON file |
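Loading a custom profile from JSON usually means parsing the file, validating it, and filling gaps from defaults. The sketch below shows one way to do that; the profile keys (`min_image_size`, `detect_tables`) and the `load_profile` helper are invented for illustration, since the actual profile schema is not documented here.

```python
import json

# Hypothetical default settings; the real profile schema may differ.
DEFAULT_PROFILE = {"name": "default", "min_image_size": 100, "detect_tables": True}


def load_profile(json_text: str) -> dict:
    """Parse a custom profile and fill missing settings from the defaults."""
    custom = json.loads(json_text)
    if not isinstance(custom, dict) or "name" not in custom:
        raise ValueError("profile must be a JSON object with a 'name' field")
    unknown = set(custom) - set(DEFAULT_PROFILE)
    if unknown:
        raise ValueError(f"unknown profile keys: {sorted(unknown)}")
    return {**DEFAULT_PROFILE, **custom}


profile = load_profile('{"name": "lancet", "min_image_size": 200}')
```

Rejecting unknown keys early makes typos in a hand-written profile visible instead of silently ignoring them.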
| Category | Technology |
|---|---|
| Language | Python 3.10+ |
| Package Manager | uv (all pip/setup-python removed) |
| ETL | PyMuPDF (fitz) + Marker (optional, high-precision) |
| RAG | LightRAG (lightrag-hku) |
| MCP | FastMCP |
| Storage | Local filesystem (JSON/Markdown/PNG) |
Installation guidance:
- Default install: uv sync
- Install Marker backend only when needed: uv sync --extra marker
- Safer extension Marker setup: enable Marker backend in settings and keep torchBackend=cpu unless you explicitly need GPU wheels

Documentation:
- Technical Spec - Detailed technical specification
- Architecture - System architecture
- Constitution - Project principles
- Competitive Analysis - MCP + DOCX ecosystem landscape