Skip to content

vedantparmar12/Document-Automation

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

21 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

๐Ÿ“š Document Automation MCP Server

Python MCP License Code Style

A powerful Model Context Protocol (MCP) server that provides intelligent codebase analysis, comprehensive documentation generation, and multi-format export capabilities. Built for seamless integration with Claude and other AI assistants.


โœจ Features

๐Ÿ” Intelligent Codebase Analysis

  • AST Parsing: Deep analysis of Python and JavaScript code structures
  • Framework Detection: Automatically identifies frameworks (React, Flask, Django, FastAPI, etc.)
  • Database Schema Analysis: Extracts and visualizes database relationships
  • API Endpoint Discovery: Maps REST endpoints with methods and parameters
  • Security Scanning: Identifies potential vulnerabilities and best practices
  • Dependency Tracking: Analyzes project dependencies and versions

๐Ÿ“ Professional Documentation Generation

  • Multiple Formats: Markdown, HTML, PDF, DOCX, Confluence, Notion, JSON, EPUB
  • Interactive Documentation: Search, navigation, syntax highlighting
  • Mermaid Diagrams: Architecture, database ER, API flows, dependency graphs
  • Multi-language Support: Built-in internationalization
  • Custom Themes: Modern, minimal, dark, corporate, and more
  • Accessibility Compliant: WCAG 2.1 AA standards

๐Ÿš€ Advanced Capabilities

  • Pagination Support: Handles large repositories efficiently
  • Background Processing: Async analysis for better performance
  • Smart Chunking: Token-aware content splitting
  • Concurrent Analysis: Parallel file processing
  • Export Automation: Batch export to multiple formats
  • Archive Generation: ZIP, TAR, TAR.GZ support

๐Ÿ“‹ Table of Contents


๐Ÿš€ Installation

Prerequisites

  • Python 3.8 or higher
  • Git
  • Claude Desktop or MCP-compatible client

Using pip

# Clone the repository
git clone https://github.com/vedantparmar12/Document-Automation.git
cd Document-Automation

# Install dependencies
pip install -r requirements.txt

Using uv (Recommended)

# Install uv (if not already installed)
pip install uv

# Install dependencies with uv
uv pip install -r requirements.txt

๐ŸŽฏ Quick Start

1. Configure Claude Desktop

Add the following to your Claude Desktop configuration file:

Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "document-automation": {
      "command": "python",
      "args": [
        "C:\\path\\to\\Document-Automation\\run_server.py"
      ]
    }
  }
}

2. Start the Server

# Using Python directly
python run_server.py

# Or using uv
uv run python run_server.py

3. Use in Claude

Once configured, Claude can analyze codebases and generate documentation:

Analyze this repository: https://github.com/username/repo
Generate comprehensive documentation in HTML format

โš™๏ธ Configuration

Environment Variables

Create a .env file in the project root:

# Analysis Settings
MAX_FILES=1000
MAX_TOKENS_PER_CHUNK=4000
PAGINATION_STRATEGY=auto

# Security Settings
ENABLE_SECURITY_SCAN=true
CONTENT_FILTERING=true

# Export Settings
DEFAULT_OUTPUT_DIR=./docs
DEFAULT_THEME=modern
ENABLE_DIAGRAMS=true

MCP Server Configuration

The server can be configured via wrangler.toml for Cloudflare Workers deployment:

name = "document-automation"
compatibility_date = "2024-01-01"

[env.production]
vars = { ENVIRONMENT = "production" }

๐Ÿ“– Usage Examples

Analyze a Local Codebase

from src.tools.consolidated_documentation_tools import analyze_codebase

result = analyze_codebase(
    path="/path/to/project",
    source_type="local",
    include_ast_analysis=True,
    include_mermaid_diagrams=True,
    include_security_analysis=True
)

print(f"Analysis ID: {result['analysis_id']}")
print(f"Total Files: {result['metrics']['total_files']}")
print(f"Frameworks: {result['technology_stack']['primary_technologies']}")

Analyze a GitHub Repository

result = analyze_codebase(
    path="https://github.com/username/repo.git",
    source_type="github",
    include_api_endpoints=True,
    include_database_analysis=True
)

Generate Documentation

from src.tools.consolidated_documentation_tools import generate_documentation

docs = generate_documentation(
    analysis_id="analysis_20241212_123456_789",
    format="interactive",
    theme="modern",
    include_toc=True,
    include_search=True,
    auto_export_formats=["pdf", "html", "markdown"]
)

print(f"Documentation generated: {docs['generated_files']}")

Export to Multiple Formats

from src.tools.consolidated_documentation_tools import export_documentation

exports = export_documentation(
    analysis_id="analysis_20241212_123456_789",
    formats=["html", "pdf", "docx", "markdown"],
    theme="corporate",
    include_diagrams=True,
    accessibility_compliance=True,
    archive_formats=["zip"]
)

print(f"Exported {len(exports['exported_files'])} files")

๐Ÿ”Œ API Reference

Core Tools

analyze_codebase

Comprehensive codebase analysis with built-in features.

Parameters:

  • path (str): Local path or GitHub URL
  • source_type (str): "local" or "github"
  • include_ast_analysis (bool): Enable AST parsing
  • include_framework_detection (bool): Detect frameworks
  • include_database_analysis (bool): Analyze database schemas
  • include_mermaid_diagrams (bool): Generate diagrams
  • include_api_endpoints (bool): Extract API routes
  • include_security_analysis (bool): Security scanning
  • max_files (int): Maximum files to analyze (default: 1000)
  • pagination_strategy (str): "auto", "file_by_file", "chunk_by_chunk", "smart"

Returns:

{
    "success": bool,
    "analysis_id": str,
    "comprehensive_analysis": {
        "project_structure": {...},
        "dependencies": [...],
        "api_endpoints": [...],
        "mermaid_diagrams": {...},
        "security_analysis": {...}
    },
    "metrics": {...}
}

generate_documentation

Generate professional documentation with all features built-in.

Parameters:

  • analysis_id (str): ID from previous analysis
  • format (str): "markdown", "html", "pdf", "interactive", etc.
  • theme (str): "modern", "minimal", "dark", "corporate", etc.
  • include_toc (bool): Table of contents
  • include_search (bool): Search functionality
  • include_navigation (bool): Navigation sidebar
  • auto_export_formats (list): Additional export formats

Returns:

{
    "success": bool,
    "generated_files": [...],
    "documentation_stats": {...}
}

export_documentation

Export to multiple formats with advanced features.

Parameters:

  • analysis_id (str): ID from previous analysis
  • formats (list): Export formats
  • theme (str): Documentation theme
  • accessibility_compliance (bool): WCAG 2.1 AA compliance
  • archive_formats (list): Archive types ["zip", "tar", "tar.gz"]

๐Ÿ“ Project Structure

Document-Automation/
โ”œโ”€โ”€ src/
โ”‚   โ”œโ”€โ”€ analyzers/           # Codebase analysis modules
โ”‚   โ”‚   โ”œโ”€โ”€ base_analyzer.py
โ”‚   โ”‚   โ”œโ”€โ”€ codebase_analyzer.py
โ”‚   โ”‚   โ”œโ”€โ”€ database_analyzer.py
โ”‚   โ”‚   โ”œโ”€โ”€ framework_detector.py
โ”‚   โ”‚   โ””โ”€โ”€ project_classifier.py
โ”‚   โ”œโ”€โ”€ diagrams/            # Diagram generation
โ”‚   โ”‚   โ”œโ”€โ”€ architecture_diagrams.py
โ”‚   โ”‚   โ”œโ”€โ”€ database_diagrams.py
โ”‚   โ”‚   โ””โ”€โ”€ mermaid_generator.py
โ”‚   โ”œโ”€โ”€ export/              # Format exporters
โ”‚   โ”‚   โ””โ”€โ”€ format_exporter.py
โ”‚   โ”œโ”€โ”€ generators/          # Documentation generators
โ”‚   โ”‚   โ”œโ”€โ”€ documentation_generator.py
โ”‚   โ”‚   โ”œโ”€โ”€ interactive_doc_generator.py
โ”‚   โ”‚   โ”œโ”€โ”€ professional_doc_generator.py
โ”‚   โ”‚   โ””โ”€โ”€ readme_template.py
โ”‚   โ”œโ”€โ”€ pagination/          # Pagination and chunking
โ”‚   โ”‚   โ”œโ”€โ”€ chunker.py
โ”‚   โ”‚   โ”œโ”€โ”€ context.py
โ”‚   โ”‚   โ”œโ”€โ”€ strategies.py
โ”‚   โ”‚   โ””โ”€โ”€ token_estimator.py
โ”‚   โ”œโ”€โ”€ parsers/             # Language parsers
โ”‚   โ”‚   โ”œโ”€โ”€ ast_analyzer.py
โ”‚   โ”‚   โ”œโ”€โ”€ javascript_parser.py
โ”‚   โ”‚   โ”œโ”€โ”€ python_parser.py
โ”‚   โ”‚   โ””โ”€โ”€ mcp_analyzer.py
โ”‚   โ”œโ”€โ”€ processing/          # Background processing
โ”‚   โ”‚   โ”œโ”€โ”€ background_processor.py
โ”‚   โ”‚   โ””โ”€โ”€ concurrent_analyzer.py
โ”‚   โ”œโ”€โ”€ security/            # Security features
โ”‚   โ”‚   โ”œโ”€โ”€ content_filter.py
โ”‚   โ”‚   โ””โ”€โ”€ validation.py
โ”‚   โ”œโ”€โ”€ tools/               # MCP tools
โ”‚   โ”‚   โ””โ”€โ”€ consolidated_documentation_tools.py
โ”‚   โ”œโ”€โ”€ schemas.py           # Data schemas
โ”‚   โ””โ”€โ”€ server.py            # MCP server
โ”œโ”€โ”€ run_server.py            # Server entry point
โ”œโ”€โ”€ requirements.txt         # Python dependencies
โ”œโ”€โ”€ pyproject.toml          # Project metadata
โ””โ”€โ”€ README.md               # This file

๐Ÿ—๏ธ Architecture

System Overview

flowchart TB
    Client[Claude/MCP Client] --> Server[MCP Server]
    Server --> Analyzer[Codebase Analyzer]
    Server --> Generator[Doc Generator]
    Server --> Exporter[Format Exporter]
    
    Analyzer --> AST[AST Parser]
    Analyzer --> Framework[Framework Detector]
    Analyzer --> Database[DB Analyzer]
    Analyzer --> Security[Security Scanner]
    
    Generator --> Interactive[Interactive HTML]
    Generator --> Professional[Professional Docs]
    Generator --> Diagrams[Mermaid Diagrams]
    
    Exporter --> PDF[PDF Export]
    Exporter --> HTML[HTML Export]
    Exporter --> Markdown[Markdown Export]
    Exporter --> DOCX[DOCX Export]
Loading

Component Interaction

sequenceDiagram
    participant Client
    participant Server
    participant Analyzer
    participant Generator
    participant Exporter
    
    Client->>Server: analyze_codebase()
    Server->>Analyzer: Parse & Analyze
    Analyzer->>Analyzer: AST, Framework, DB, Security
    Analyzer-->>Server: Analysis Results
    Server-->>Client: analysis_id
    
    Client->>Server: generate_documentation()
    Server->>Generator: Create Docs
    Generator->>Generator: Format, Theme, Diagrams
    Generator-->>Server: Documentation
    Server-->>Client: Generated Files
    
    Client->>Server: export_documentation()
    Server->>Exporter: Convert Formats
    Exporter->>Exporter: PDF, HTML, DOCX
    Exporter-->>Server: Exported Files
    Server-->>Client: Export Results
Loading

๐Ÿ› ๏ธ Development

Setup Development Environment

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dev dependencies
pip install -r requirements.txt
pip install black pytest pytest-cov mypy

# Run tests
pytest tests/

# Format code
black src/

# Type checking
mypy src/

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html

# Run specific test file
pytest tests/test_analyzer.py

Code Quality

# Format with Black
black src/ tests/

# Lint with Flake8
flake8 src/ tests/

# Type check with MyPy
mypy src/

๐Ÿค Contributing

Contributions are welcome! Please follow these guidelines:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m 'Add amazing feature'
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a Pull Request

Contribution Guidelines

  • Follow PEP 8 style guide
  • Add tests for new features
  • Update documentation
  • Ensure all tests pass
  • Keep commits atomic and well-described

๐Ÿ“Š Performance

Benchmarks

Operation Small Repo (<100 files) Medium Repo (100-1000 files) Large Repo (>1000 files)
Analysis ~5 seconds ~30 seconds ~2-5 minutes
Doc Generation ~2 seconds ~10 seconds ~30 seconds
Export (all formats) ~3 seconds ~15 seconds ~45 seconds

Optimization Tips

  • Use pagination_strategy="smart" for large repositories
  • Enable include_security_analysis=False if not needed
  • Limit max_files for faster analysis
  • Use concurrent processing for multiple projects

๐Ÿ› Troubleshooting

Common Issues

Issue: ModuleNotFoundError: No module named 'src'

# Solution: Add project root to PYTHONPATH
export PYTHONPATH="${PYTHONPATH}:/path/to/Document-Automation"

Issue: GitHub rate limiting

# Solution: Set GitHub token
export GITHUB_TOKEN="your_token_here"

Issue: Memory errors with large repositories

# Solution: Use pagination
analyze_codebase(path="...", max_files=500, pagination_strategy="smart")

๐Ÿ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.


๐Ÿ™ Acknowledgments


๐Ÿ“ฎ Contact

Vedant Parmar - @vedantparmar12

Project Link: https://github.com/vedantparmar12/Document-Automation


๐Ÿ—บ๏ธ Roadmap

  • Support for more programming languages (Go, Rust, Java)
  • Real-time collaboration features
  • Cloud storage integration (S3, GCS)
  • API documentation auto-generation from OpenAPI specs
  • Enhanced security scanning with CVE database
  • Performance profiling and optimization suggestions
  • Custom template support
  • CLI tool for standalone use

โญ Star History

If you find this project useful, please consider giving it a star! โญ


Made with โค๏ธ by Vedant Parmar

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages