Automated GitHub repository analysis tool using GitIngest CLI and Claude Code.
- Installation
- Quick Start
- Commands
- How It Works
- Common Use Cases
- Troubleshooting
- Developer Setup
- Testing
- Project Structure
- Phase Roadmap
- Python 3.12 or higher - Download Python
- uv package manager - Install uv
- GitIngest CLI - Install with: uv tool install gitingest
- Git - For cloning the repository
# Clone the repository
git clone https://github.com/DAESA24/gitingest-agent-project.git
cd gitingest-agent-project/execute
# Sync environment and install dependencies
uv sync
# Install in development mode
uv pip install -e .
# Verify installation
uv run gitingest-agent --help
Note: This tool is currently distributed via GitHub only. Installation from PyPI will be available in a future release.
Get started with GitIngest Agent in 3 simple steps:
# From the execute/ directory
uv run gitingest-agent check-size https://github.com/octocat/Hello-World
Output:
Checking repository size...
Token count: 47 tokens
Route: full extraction
# From the execute/ directory
uv run gitingest-agent extract-full https://github.com/octocat/Hello-World
Output:
Extracting full repository...
[OK] Saved to: /home/user/my-project/context/related-repos/Hello-World/digest.txt
Token count: 47 tokens
The extracted content is saved to:
- When in gitingest-agent-project: data/[repo-name]/digest.txt
- When in other directories: context/related-repos/[repo-name]/digest.txt
You can now read and analyze the extracted content with your preferred tool or AI assistant!
GitIngest Agent provides four main commands for repository analysis:
Check repository token count and determine extraction strategy.
uv run gitingest-agent check-size <github-url> [--output-dir PATH]
Examples:
# Basic usage
uv run gitingest-agent check-size https://github.com/fastapi/fastapi
# With custom output directory
uv run gitingest-agent check-size https://github.com/fastapi/fastapi --output-dir ./my-analyses
Output:
Checking repository size...
Token count: 487,523 tokens
Route: selective extraction
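The routing decision shown in the output above can be sketched in a few lines. This is a minimal illustration, assuming the 200k threshold stated in this README; `choose_route` and `TOKEN_THRESHOLD` are hypothetical names, not the tool's actual API.

```python
# Hypothetical sketch of the size-based routing; names are illustrative only.
TOKEN_THRESHOLD = 200_000  # threshold stated in this README


def choose_route(token_count: int, threshold: int = TOKEN_THRESHOLD) -> str:
    """Return the extraction route for a repository of the given token count."""
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    # Below the threshold the whole repository fits; otherwise extract selectively.
    return "full extraction" if token_count < threshold else "selective extraction"


if __name__ == "__main__":
    print(choose_route(47))       # full extraction
    print(choose_route(487_523))  # selective extraction
```

A repository with exactly 200,000 tokens is routed to selective extraction here, matching the ">= 200k" wording used for extract-tree.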
Extract entire repository content (recommended for repos < 200k tokens).
uv run gitingest-agent extract-full <github-url> [--output-dir PATH]
Examples:
# Basic usage
uv run gitingest-agent extract-full https://github.com/octocat/Hello-World
# With custom output directory
uv run gitingest-agent extract-full https://github.com/octocat/Hello-World --output-dir ./my-analyses
Output:
Extracting full repository...
[OK] Saved to: /home/user/project/context/related-repos/Hello-World/digest.txt
Token count: 47 tokens
Extract repository tree structure without full content (for large repos >= 200k tokens).
uv run gitingest-agent extract-tree <github-url> [--output-dir PATH]
Examples:
# Basic usage
uv run gitingest-agent extract-tree https://github.com/fastapi/fastapi
# With custom output directory
uv run gitingest-agent extract-tree https://github.com/fastapi/fastapi --output-dir ./my-analyses
Output:
Extracting tree structure...
[OK] Saved to: /home/user/project/context/related-repos/fastapi/tree.txt
Token count: 8,234 tokens
Repository structure:
README.md
fastapi/
__init__.py
applications.py
routing.py
...
docs/
tutorial/
...
Extract specific content using filters with automatic overflow prevention.
uv run gitingest-agent extract-specific <github-url> --type <content-type> [--output-dir PATH]
Content Types:
- docs - Documentation files (*.md, docs/**, README)
- installation - Setup files (README, setup.py, package.json, requirements.txt)
- code - Source code (src/**/*.py, lib/**/*.py)
- auto - Automatic selection (README + key docs)
Examples:
# Extract only documentation
uv run gitingest-agent extract-specific https://github.com/fastapi/fastapi --type docs
# Extract installation files
uv run gitingest-agent extract-specific https://github.com/fastapi/fastapi --type installation
# Extract source code
uv run gitingest-agent extract-specific https://github.com/fastapi/fastapi --type code
# Automatic selection with custom output
uv run gitingest-agent extract-specific https://github.com/fastapi/fastapi --type auto --output-dir ./analyses
Output:
Extracting specific content (type: docs)...
[OK] Saved to: /home/user/project/context/related-repos/fastapi/docs-content.txt
Token count: 125,430 tokens
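To make the content types more concrete, here is a rough sketch of how a type could map to file patterns and to an output file name. The pattern lists are assumptions reconstructed from the descriptions above, `select_files` and `output_filename` are hypothetical helpers, and the `auto` type is left out because its "key docs" selection is not specified here.

```python
from fnmatch import fnmatchcase

# Assumed patterns, reconstructed from the content-type descriptions in this README.
# Note: fnmatch's "*" also matches "/", so "src/*.py" covers nested directories.
CONTENT_PATTERNS = {
    "docs": ["*.md", "docs/*", "README*"],
    "installation": ["README*", "setup.py", "package.json", "requirements.txt"],
    "code": ["src/*.py", "lib/*.py"],
}


def select_files(paths: list[str], content_type: str) -> list[str]:
    """Keep the paths that match any pattern of the given content type."""
    patterns = CONTENT_PATTERNS[content_type]
    return [p for p in paths if any(fnmatchcase(p, pat) for pat in patterns)]


def output_filename(content_type: str) -> str:
    """File name following the docs-content.txt naming seen in the sample output."""
    return f"{content_type}-content.txt"


if __name__ == "__main__":
    tree = ["README.md", "docs/tutorial/intro.md", "src/app/main.py", "setup.py"]
    print(select_files(tree, "docs"))
    print(output_filename("docs"))
```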
All commands support the --output-dir parameter to specify a custom output location:
# Save to custom directory
uv run gitingest-agent extract-full https://github.com/user/repo --output-dir ./my-custom-folder
# Use absolute path
uv run gitingest-agent extract-full https://github.com/user/repo --output-dir /home/user/analyses
Default Behavior (without --output-dir):
- In gitingest-agent-project: Saves to data/[repo-name]/
- In other directories: Saves to context/related-repos/[repo-name]/
# Show all commands
uv run gitingest-agent --help
# Get help for specific command
uv run gitingest-agent check-size --help
uv run gitingest-agent extract-full --help
uv run gitingest-agent extract-tree --help
uv run gitingest-agent extract-specific --help
GitIngest Agent uses intelligent location detection to determine where to save extracted repository content.
When running from the gitingest-agent-project directory, the tool uses the original Phase 1.0 behavior:
Detection Logic:
- Checks if current directory contains execute/cli.py and execute/main.py
- OR if current directory is the execute/ subdirectory itself
Output Location: data/[repo-name]/
Example:
cd ~/work/dev/gitingest-agent-project/execute
uv run gitingest-agent extract-full https://github.com/octocat/Hello-World
# Creates: ~/work/dev/gitingest-agent-project/data/Hello-World/digest.txt
Folder Structure:
gitingest-agent-project/
├── execute/
│ ├── cli.py
│ └── main.py
├── data/ # Phase 1.0 output location
│ └── Hello-World/
│ └── digest.txt
When running from any other directory (React projects, Vue projects, Node.js apps, etc.), the tool uses the Phase 1.5 convention:
Output Location: context/related-repos/[repo-name]/
Benefits:
- Clear purpose: "context" folder indicates external reference materials
- No pollution: Avoids creating random data/ folders in your projects
- Universal standard: Same convention works across all project types
- Auto-creation: Automatically creates context/ and related-repos/ if they don't exist
Example:
cd ~/work/dev/my-react-app
uv run ~/work/dev/gitingest-agent-project/execute/gitingest-agent extract-full https://github.com/facebook/react
# Creates: ~/work/dev/my-react-app/context/related-repos/react/digest.txt
Folder Structure:
my-react-app/
├── src/
├── public/
├── package.json
└── context/ # Phase 1.5 universal convention
└── related-repos/
└── react/
└── digest.txt
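The two location rules described above can be sketched as follows. This is an approximation based only on the detection logic listed in this README. `find_project_root` and `default_output_dir` are hypothetical names, and the check for the execute/ subdirectory (directory name plus a cli.py file) is an assumption about how that case might be detected.

```python
from pathlib import Path
from typing import Optional


def find_project_root(cwd: Path) -> Optional[Path]:
    """Return the gitingest-agent-project root if cwd is inside it, else None."""
    # Case 1: cwd is the project root and contains execute/cli.py and execute/main.py.
    if (cwd / "execute" / "cli.py").is_file() and (cwd / "execute" / "main.py").is_file():
        return cwd
    # Case 2 (assumed check): cwd is the execute/ subdirectory itself.
    if cwd.name == "execute" and (cwd / "cli.py").is_file():
        return cwd.parent
    return None


def default_output_dir(cwd: Path, repo_name: str) -> Path:
    """Choose data/[repo-name]/ inside the project, context/related-repos/ elsewhere."""
    root = find_project_root(cwd)
    if root is not None:
        return root / "data" / repo_name
    return cwd / "context" / "related-repos" / repo_name


if __name__ == "__main__":
    print(default_output_dir(Path.cwd(), "Hello-World"))
```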
You can always override the automatic detection with --output-dir:
# Save to completely custom location
uv run gitingest-agent extract-full https://github.com/user/repo --output-dir ./my-analyses
# Creates: ./my-analyses/digest.txt
Path Validation:
- Creates directory if it doesn't exist (with confirmation prompt)
- Supports both relative and absolute paths
- Validates write permissions
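The validation steps above could look roughly like this. It is a sketch under stated assumptions: `resolve_output_dir` is a hypothetical helper, and the confirmation prompt is modeled as a `confirm` callback so the example stays non-interactive.

```python
import os
from pathlib import Path
from typing import Callable


def resolve_output_dir(raw: str, confirm: Callable[[Path], bool] = lambda p: True) -> Path:
    """Resolve a relative or absolute --output-dir value and make sure it is writable."""
    path = Path(raw).expanduser().resolve()
    if not path.exists():
        # Stand-in for the confirmation prompt mentioned above.
        if not confirm(path):
            raise FileNotFoundError(f"Directory not created: {path}")
        path.mkdir(parents=True)
    if not path.is_dir():
        raise NotADirectoryError(f"Not a directory: {path}")
    if not os.access(path, os.W_OK):
        raise PermissionError(f"No write permission: {path}")
    return path


if __name__ == "__main__":
    target = resolve_output_dir("./my-analyses")
    print(target / "digest.txt")
```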
For repositories under 200k tokens, use full extraction:
# Check size first
uv run gitingest-agent check-size https://github.com/octocat/Hello-World
# Output: Token count: 47 tokens
# Route: full extraction
# Extract full content
uv run gitingest-agent extract-full https://github.com/octocat/Hello-World
# Output: [OK] Saved to: /current/directory/context/related-repos/Hello-World/digest.txt
When to use: READMEs, small utilities, simple examples, config files
For repositories >= 200k tokens, use selective extraction:
# Check size first
uv run gitingest-agent check-size https://github.com/fastapi/fastapi
# Output: Token count: 487,523 tokens
# Route: selective extraction
# Extract tree structure to understand layout
uv run gitingest-agent extract-tree https://github.com/fastapi/fastapi
# Extract only what you need
uv run gitingest-agent extract-specific https://github.com/fastapi/fastapi --type installation
# Output: [OK] Saved to: /current/directory/context/related-repos/fastapi/installation-content.txt
When to use: Large frameworks, complex applications, extensive documentation
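If you want to script the check-then-extract workflow above, one possible approach is to parse the "Token count" line from check-size output and build the follow-up commands. This sketch assumes the output format shown in this README; `parse_token_count` and `plan_commands` are hypothetical helpers, and only commands and flags documented here are used.

```python
import re

# Matches lines such as "Token count: 487,523 tokens" from the sample output.
TOKEN_LINE = re.compile(r"Token count:\s*([\d,]+)\s*tokens")


def parse_token_count(output: str) -> int:
    """Extract the token count from check-size output."""
    match = TOKEN_LINE.search(output)
    if match is None:
        raise ValueError("no 'Token count' line found")
    return int(match.group(1).replace(",", ""))


def plan_commands(url: str, token_count: int, content_type: str = "installation",
                  threshold: int = 200_000) -> list[list[str]]:
    """Return the CLI invocations to run next, as argument lists."""
    base = ["uv", "run", "gitingest-agent"]
    if token_count < threshold:
        return [base + ["extract-full", url]]
    # Large repository: inspect the tree first, then extract one content type.
    return [
        base + ["extract-tree", url],
        base + ["extract-specific", url, "--type", content_type],
    ]


if __name__ == "__main__":
    sample = "Checking repository size...\nToken count: 487,523 tokens\nRoute: selective extraction"
    for cmd in plan_commands("https://github.com/fastapi/fastapi", parse_token_count(sample)):
        print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run` from the execute/ directory.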
Save analyses to a dedicated folder outside your project:
# Create dedicated analyses directory
mkdir -p ~/repo-analyses
# Extract to custom location
uv run gitingest-agent extract-full https://github.com/user/repo --output-dir ~/repo-analyses
# Result: ~/repo-analyses/digest.txt
When to use: Centralized analysis storage, shared team folder, backup location
Analyze related repositories while working in your frontend project:
# In your React project
cd ~/work/dev/my-react-app
# Analyze React source for reference
uv run ~/work/dev/gitingest-agent-project/execute/gitingest-agent extract-specific https://github.com/facebook/react --type docs
# Analyze related libraries
uv run ~/work/dev/gitingest-agent-project/execute/gitingest-agent extract-full https://github.com/reduxjs/redux
# All saved to: my-react-app/context/related-repos/
Folder structure:
my-react-app/
├── src/
├── public/
├── package.json
└── context/
└── related-repos/
├── react/
│ └── docs-content.txt
└── redux/
└── digest.txt
Benefits:
- Reference materials organized alongside your code
- No pollution of project root
- Easy to .gitignore (add context/ to .gitignore)
Cause: Repository not cloned or not running from execute/ directory.
Solutions:
- Clone the repository:
git clone https://github.com/DAESA24/gitingest-agent-project.git
cd gitingest-agent-project/execute
uv sync
- Run from execute/ directory:
cd gitingest-agent-project/execute
uv run gitingest-agent --help
- Use absolute path from anywhere:
uv run ~/path/to/gitingest-agent-project/execute/gitingest-agent --help
Cause: Invalid repository URL or repository is private/doesn't exist.
Solutions:
- Verify the URL format:
# Correct format:
https://github.com/owner/repository
# Examples:
https://github.com/facebook/react ✓
https://github.com/octocat/Hello-World ✓
# Incorrect:
github.com/owner/repo ✗
http://github.com/owner/repo ✗
- Check repository exists:
  - Visit the URL in your browser
  - Ensure repository is public (private repos not supported)
- Check your internet connection:
# Test GitHub connectivity
curl -I https://github.com
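The URL format rules above (https scheme, github.com host, one owner segment and one repository segment) can be checked before calling the tool. The pattern below is an illustrative assumption, not the validation the tool itself performs, and `parse_github_url` is a hypothetical name.

```python
import re

# Assumed pattern for https://github.com/{owner}/{repository}; the tool's own
# validation may be stricter or looser than this.
GITHUB_URL = re.compile(
    r"^https://github\.com/(?P<owner>[A-Za-z0-9-]+)/(?P<repo>[A-Za-z0-9._-]+)$"
)


def parse_github_url(url: str) -> tuple[str, str]:
    """Return (owner, repository) or raise ValueError for an unexpected format."""
    match = GITHUB_URL.match(url.strip())
    if match is None:
        raise ValueError(f"Expected https://github.com/owner/repository, got: {url}")
    return match["owner"], match["repo"]


if __name__ == "__main__":
    print(parse_github_url("https://github.com/octocat/Hello-World"))
```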
Cause: URL doesn't match expected GitHub format.
Solution:
Ensure URL follows the pattern: https://github.com/{owner}/{repository}
# Correct format
gitingest-agent extract-full https://github.com/octocat/Hello-World
# Incorrect formats (will fail)
gitingest-agent extract-full github.com/octocat/Hello-World
gitingest-agent extract-full www.github.com/octocat/Hello-World
gitingest-agent extract-full https://github.com/octocat
Cause: Even after filtering, extracted content still exceeds 200k token limit.
Solution:
Narrow your selection further:
# If 'docs' is too large, try 'installation' (more specific)
uv run gitingest-agent extract-specific https://github.com/large-repo --type installation
# Or use 'auto' for minimal content (README + key docs)
uv run gitingest-agent extract-specific https://github.com/large-repo --type auto
Overflow Prevention:
- The tool will warn you if content exceeds limits
- You'll be prompted to narrow selection or proceed with partial content
- Use more specific content types for better control
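The narrowing advice above amounts to trying progressively smaller content types until one fits. A minimal sketch, assuming the order docs, installation, auto suggested in this section; `first_type_within_limit` is a hypothetical helper and the token counts in the usage example are made up for illustration.

```python
from typing import Optional

TOKEN_LIMIT = 200_000  # limit stated in this README
# Assumed narrowing order, taken from the suggestions in this section.
NARROWING_ORDER = ["docs", "installation", "auto"]


def first_type_within_limit(token_counts: dict[str, int],
                            limit: int = TOKEN_LIMIT) -> Optional[str]:
    """Return the first content type whose extraction does not exceed the limit."""
    for content_type in NARROWING_ORDER:
        count = token_counts.get(content_type)
        if count is not None and count <= limit:
            return content_type
    return None  # nothing fits; the selection needs to be narrowed by hand


if __name__ == "__main__":
    # Illustrative numbers only, not measurements of a real repository.
    counts = {"docs": 350_000, "installation": 12_000, "auto": 4_000}
    print(first_type_within_limit(counts))
```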
Cause: Insufficient write permissions in target directory.
Solutions:
- Check directory permissions:
# Check current directory permissions
ls -ld .
# Ensure you have write access
- Use --output-dir to specify writable location:
uv run gitingest-agent extract-full https://github.com/user/repo --output-dir ~/my-analyses
- Create directory manually first:
mkdir -p context/related-repos
chmod 755 context/related-repos
uv run gitingest-agent extract-full https://github.com/user/repo
Cause: GitIngest (external dependency) may fail on Windows with UTF-8 files.
Symptoms:
Error reading file with 'cp1252': 'charmap' codec can't decode byte...
Solutions:
- Set Python encoding environment variable:
# PowerShell
$env:PYTHONIOENCODING = "utf-8"
uv run gitingest-agent extract-full https://github.com/user/repo
# CMD
set PYTHONIOENCODING=utf-8
uv run gitingest-agent extract-full https://github.com/user/repo
- Use WSL (Windows Subsystem for Linux):
wsl
cd gitingest-agent-project/execute
uv run gitingest-agent extract-full https://github.com/user/repo
- Try selective extraction (fewer files = fewer encoding issues):
uv run gitingest-agent extract-specific https://github.com/user/repo --type installation
Note: This is a known issue with the external GitIngest library on Windows. The tool will detect and warn about encoding errors, but extraction will continue with available content.
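When the tool is launched from a Python script rather than a shell, the same PYTHONIOENCODING workaround can be applied through the child process environment. This is a sketch, not part of the project: `utf8_env` and `run_extract_full` are hypothetical helpers, and it is assumed the command is run from the execute/ directory.

```python
import os
import subprocess
from typing import Mapping, Optional


def utf8_env(base: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Copy the environment and force UTF-8 for Python's standard streams."""
    env = dict(os.environ if base is None else base)
    env["PYTHONIOENCODING"] = "utf-8"
    return env


def run_extract_full(url: str) -> subprocess.CompletedProcess:
    """Run extract-full with the UTF-8 environment (assumes cwd is execute/)."""
    return subprocess.run(
        ["uv", "run", "gitingest-agent", "extract-full", url],
        env=utf8_env(),
        capture_output=True,
        text=True,
        encoding="utf-8",
        errors="replace",  # keep going if a stray byte cannot be decoded
    )


if __name__ == "__main__":
    print(utf8_env({"PATH": "/usr/bin"}))
```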
This section is for developers contributing to GitIngest Agent.
- Python 3.12 or higher
- UV package manager installed
- GitIngest CLI installed globally (uv tool install gitingest)
- Git for version control
# Clone the repository
git clone https://github.com/your-username/gitingest-agent-project.git
cd gitingest-agent-project
# Navigate to implementation directory
cd execute
# Sync environment (creates .venv and installs dependencies)
uv sync
# Install in development mode
uv pip install -e .
# Verify installation
uv run gitingest-agent --help
uv run pytest --version
Important: All development commands must be run from the execute/ directory.
This project follows the BMAD (Breakthrough Method for AI-driven Agile Development) methodology with integrated QA.
Workflow:
- Story Creation: @sm *draft
- Story Implementation: @dev *develop-story {story}
- Story Review: @qa *review {story}
- Risk Assessment: @qa *risk {story}
See .bmad-core/enhanced-ide-development-workflow.md for complete workflow details.
# Navigate to execute directory
cd execute
# Run CLI in development mode
uv run gitingest-agent --help
uv run gitingest-agent check-size https://github.com/user/repo
# Run tests
uv run pytest
# Run linting
uv run ruff check .
# Format code
uv run ruff format .
All test commands must be run from the execute/ directory:
# Navigate to execute directory
cd execute
# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov
# Run specific test file
uv run pytest tests/test_token_counter.py
# Run with verbose output
uv run pytest -v
# Run tests matching pattern
uv run pytest -k "test_check_size"
Test Suite Stats:
- 190+ tests across all modules
- 96%+ code coverage
- Integration tests for CLI commands
- Unit tests for core functionality
Test Structure:
execute/tests/
├── test_cli.py # CLI command tests
├── test_token_counter.py # Token counting logic tests
├── test_workflow.py # Display formatting tests
├── test_storage.py # Storage layer tests
├── test_extractor.py # GitIngest integration tests
└── test_exceptions.py # Exception handling tests
gitingest-agent-project/
├── .bmad-core/ # BMAD framework
├── docs/ # Planning & story documents
│ ├── prd.md # Product requirements
│ ├── architecture.md # System design
│ ├── stories/ # Implementation stories
│ └── handoffs/ # QA handoff documents
├── explore/ # Research documents
├── plan/ # Planning work
├── user-context/ # User-provided contextual files
├── docker/ # Development tools
│ └── toon-test/ # TOON format testing environment
├── execute/ # Implementation directory (all Python code)
│ ├── .venv/ # UV virtual environment
│ ├── tests/ # Test suite (190 tests, 96%+ coverage)
│ ├── cli.py # CLI entry point (Click framework)
│ ├── token_counter.py # Token counting & routing logic
│ ├── workflow.py # Display formatting utilities
│ ├── storage.py # File management & analysis storage
│ ├── storage_manager.py # Dynamic path resolution (Phase 1.5)
│ ├── extractor.py # GitIngest API wrapper
│ ├── exceptions.py # Custom exception classes
│ ├── pyproject.toml # Python project configuration
│ └── uv.lock # Dependency lock file
├── analyze/ # Generated analyses storage (runtime)
├── data/ # Repository extraction storage (runtime)
├── CLAUDE.md # Agent configuration (Claude Code behavior)
├── CLAUDE_ANALYSIS_GUIDE.md # Analysis generation specifications
├── CHANGELOG.md # Version history
└── README.md # This file
Root Level:
- BMAD Framework: .bmad-core/, docs/, explore/, plan/, user-context/
- Agent Configuration: CLAUDE.md, CLAUDE_ANALYSIS_GUIDE.md (Claude Code runtime config)
- Project Documentation: README.md, CHANGELOG.md
execute/ Directory:
- All Python Implementation: Source code, tests, and Python environment
- Self-Contained: Complete Python project with its own pyproject.toml and .venv
- Working Directory: All development commands run from execute/
High-fidelity replication of proven design from AI LABS video.
Implemented Features:
- ✅ Token size checking and routing (200k threshold)
- ✅ Full and selective extraction workflows
- ✅ Claude Code automation via CLAUDE.md
- ✅ Analysis generation (4 types: installation, workflow, architecture, custom)
- ✅ Analysis storage with metadata headers
- ✅ Token overflow prevention with iterative refinement
- ✅ CLI with 4 commands (check-size, extract-full, extract-tree, extract-specific)
- ✅ 190 tests passing, 96%+ coverage
Implementation Stats:
- 13 stories completed (1.2-1.14)
- 5 core modules (cli, token_counter, workflow, storage, extractor)
- 15 files changed, 4,926+ lines added
- Complete documentation and workflow automation
Enhanced storage capabilities for cross-project usage.
Implemented Features:
- ✅ BMAD project detection (gitingest-agent-project vs. other directories)
- ✅ context/related-repos/ universal convention
- ✅ --output-dir parameter on all commands
- ✅ Work from any directory without project pollution
- ✅ Automatic directory creation with validation
- ✅ StorageManager abstraction for dynamic path resolution
Status: Released in v1.1.0
Multi-repository analysis at scale through token optimization and parallel processing.
Key Features:
- TOON Format Integration - 15-25% token savings on GitHub API data (validated)
- Multi-Agent Architecture - Parallel sub-agent processing for 5+ repositories
- Multi-Repo Comparison - Synthesized analysis across multiple codebases
- GitHub API Integration - Commit history, issues, PRs with TOON optimization
Validation Completed:
- ✅ Docker testing infrastructure (docker/toon-test/)
- ✅ Real token savings verified (15-25% on API data)
- ✅ TOON CLI integration tested and working
Status: Feature request complete, ready for story creation
See user-context/v2-toon-multiagent-feature-request.md for complete V2.0 specification.
- PRD - Product requirements and user stories
- Architecture - System design and technical details
- CLAUDE.md - Complete workflow automation guide
- CLAUDE_ANALYSIS_GUIDE.md - Analysis generation specifications
- Stories - Implementation stories
- CHANGELOG.md - Version history
- Research - Exploration phase research documents
This is a personal development project following BMAD methodology. Development is tracked through:
- User stories in docs/stories/
- Quality gates via Test Architect (@qa)
- Git commits with co-authorship by Claude
MIT License (To be finalized)
Current Version: v1.1.0
Development Status: Phase 1.5 Complete ✅ - Fully functional CLI tool with universal context convention.
Next Steps:
- ✅ Phase 1.0 Complete - All 13 stories implemented and tested
- ✅ Phase 1.5 Complete - Multi-location output and --output-dir parameter
- ✅ V2.0 Research Complete - TOON format validated
- 🎯 Ready for V2.0 Story Creation - BMAD workflow planning phase
Built with Claude Code using BMAD methodology