Born out of pure frustration with codebase bloat.
Every developer knows the feeling. You start with a clean, organized project. Then months pass. Features get added, experiments are tried, demos are created, tests are written and forgotten. Before you know it, your once-pristine codebase has become a digital junkyard.
I hit that wall with my financial portfolio management system. What started as a focused project had grown into 959 Python files scattered across countless directories. I found myself asking:
- "Which files actually matter?"
- "What can I safely delete?"
- "Is this script still being used?"
- "Why do I have three different config managers?"
Traditional tools tell you about code quality, but they don't answer the fundamental question: "What's actually important in my codebase?"
So I built Codebase Bloodhound. Not because I wanted to build another tool, but because I needed to reclaim my sanity.
Codebase Bloodhound analyzes your project's architecture and sorts files into three groups:
- Critical files
  - Files with many dependents
  - Main application entry points
  - Configuration files
  - Recently active, substantial code
- Archival candidates
  - Unused demo files
  - Old test files with no dependencies
  - Abandoned experiments
  - Duplicate utilities
- Normal files
  - Everything else that's actively part of your project
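To make the idea concrete, here is a minimal sketch of how a file might be sorted into critical, archival, or normal from a few signals. It is not the tool's actual scoring code: the `FileStats` class, the `classify` function, and the order in which the rules are applied are all assumptions for illustration. The threshold names are borrowed from the sample config shown later in this README.

```python
from dataclasses import dataclass

# Hypothetical thresholds. The key names mirror the sample config in this
# README, but how the real tool combines them is an assumption.
THRESHOLDS = {
    "min_dependencies": 3,
    "recent_activity_days": 90,
    "min_lines_critical": 50,
    "max_dependencies": 1,
    "inactive_days": 180,
    "max_lines_small": 20,
}


@dataclass
class FileStats:
    path: str
    dependents: int           # number of files that import this one
    lines: int                # lines of code
    days_since_modified: int  # e.g. derived from git history


def classify(stats: FileStats, t: dict = THRESHOLDS) -> str:
    """Return 'critical', 'archival', or 'normal' from a few simple signals."""
    # Many incoming dependencies: other code relies on this file.
    if stats.dependents >= t["min_dependencies"]:
        return "critical"
    # Recently touched and substantial in size.
    recently_active = stats.days_since_modified <= t["recent_activity_days"]
    if recently_active and stats.lines >= t["min_lines_critical"]:
        return "critical"
    # Barely referenced, untouched for a long time, and small.
    if (
        stats.dependents <= t["max_dependencies"]
        and stats.days_since_modified >= t["inactive_days"]
        and stats.lines <= t["max_lines_small"]
    ):
        return "archival"
    return "normal"


if __name__ == "__main__":
    print(classify(FileStats("src/utils/logging_utils.py", 22, 50, 10)))
    print(classify(FileStats("demo_old_experiment.py", 0, 15, 180)))
```

A rule-based pass like this is easy to explain in a report, since each returned label maps to a specific condition that can be printed as the reason.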
When I ran this on my own project, it identified:
- 178 critical files (18.6%) - The core architecture
- 59 archival candidates (6.2%) - Safe to clean up
- 722 normal files (75.3%) - Standard project code
The cleanup process removed 67 files and organized 48 demo files, transforming chaos back into clarity.
- 🔗 Dependency Analysis: Builds a complete dependency graph
- 📅 Git Integration: Considers commit history and file activity
- 🏷️ Smart Classification: Uses multiple signals to classify files
- 📊 Detailed Metrics: Lines of code, dependencies, activity patterns
- 📄 Human-Readable Reports: Clear explanations for every decision
- ⚙️ Configurable: Adapt to your project's specific patterns
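For a rough picture of what dependency analysis involves, the sketch below uses Python's standard `ast` module to find which local modules import which. It is an illustration under simplifying assumptions, not the tool's implementation: the function names `imported_modules` and `build_dependents` are hypothetical, relative imports are ignored, and a package's `__init__.py` is treated as just another module.

```python
import ast
from collections import defaultdict
from pathlib import Path


def imported_modules(source: str) -> set:
    """Return dotted names that a piece of Python source may import."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module)
            # "from pkg import mod" may refer to the module pkg.mod
            modules.update(f"{node.module}.{alias.name}" for alias in node.names)
    return modules


def build_dependents(root: Path) -> dict:
    """Map each local module name to the set of local modules that import it."""
    files = {}
    for path in root.rglob("*.py"):
        name = ".".join(path.relative_to(root).with_suffix("").parts)
        files[name] = path

    dependents = defaultdict(set)
    for name, path in files.items():
        try:
            imports = imported_modules(path.read_text(encoding="utf-8"))
        except (SyntaxError, ValueError):
            continue  # skip files that cannot be decoded or parsed
        for target in imports:
            # Only count imports that resolve to a file inside the project.
            if target in files and target != name:
                dependents[target].add(name)
    return dict(dependents)


if __name__ == "__main__":
    for module, users in sorted(build_dependents(Path(".")).items()):
        print(f"{module}: {len(users)} dependents")
```

Counting the size of each set gives the "dependents" number that appears in the report.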
# Clone the repository
git clone https://github.com/TeldridgeLDN/code_base_bloodhound.git
cd code_base_bloodhound
# Run directly (no dependencies required beyond Python 3.6+)
python code_base_bloodhound.py /path/to/your/project
# Markdown report
python code_base_bloodhound.py /path/to/your/project -o analysis_report.md
# JSON data for further processing
python code_base_bloodhound.py /path/to/your/project -j results.json
# Use custom analysis rules
python code_base_bloodhound.py /path/to/your/project -c custom_config.json
🔍 Codebase Architecture Analysis Report
Generated on: 2024-12-22 17:30:15
📊 Summary
- Total Files Analyzed: 959
- Critical Files: 178
- Normal Files: 722
- Archival Candidates: 59
🎯 Critical Files (Keep These!)
### src/utils/logging_utils.py
Score: 45.0
Reasons: High incoming dependencies (22)
Stats: 50 lines, 22 dependents
📦 Archival Candidates
### demo_old_experiment.py
Score: -3.0
Reasons: Example/demo file, No incoming dependencies, Small file (15 lines)
Last Modified: 180 days ago
Currently supports:
- Python (full AST analysis)
- JavaScript/TypeScript (import pattern analysis)
Easy to extend for other languages by adding import pattern extractors.
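As an illustration of what an import pattern extractor could look like, here is a small regex-based sketch for JavaScript/TypeScript sources. The function name `extract_js_imports` and the pattern itself are assumptions, not the tool's actual extension interface, and the pattern deliberately ignores dynamic `import()` calls and imports hidden inside comments or strings.

```python
import re

# Hypothetical pattern: captures the module specifier of static imports,
# re-exports, and CommonJS require() calls.
JS_IMPORT_RE = re.compile(
    r"""(?:\bimport\s+(?:[^'"]*?\s+from\s+)?"""  # import x from '...' / import '...'
    r"""|\bexport\s+[^'"]*?\s+from\s+"""         # export ... from '...'
    r"""|\brequire\s*\(\s*)"""                   # require('...')
    r"""['"]([^'"]+)['"]"""                      # the quoted module specifier
)


def extract_js_imports(source: str) -> list:
    """Return module specifiers in the order they appear in the source."""
    return JS_IMPORT_RE.findall(source)


if __name__ == "__main__":
    sample = """
import React from 'react';
import { a, b } from "./utils";
const fs = require('fs');
"""
    print(extract_js_imports(sample))
```

A similar function per language, each returning a list of imported names, would be enough to feed the same dependency graph.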
Customize the analysis by providing a JSON config file:
{
"critical_indicators": {
"min_dependencies": 3,
"recent_activity_days": 90,
"min_lines_critical": 50
},
"archive_indicators": {
"max_dependencies": 1,
"inactive_days": 180,
"max_lines_small": 20
}
}
Good code isn't just about what you write; it's about what you keep.
Codebase Bloodhound helps you make informed decisions about your project's architecture. It doesn't delete anything automatically (that would be reckless), but it gives you the confidence to clean up with surgical precision.
Found a bug? Have an idea? This tool was born from real frustration with real problems. If you're facing similar challenges, let's make it better together.
MIT License - Use it, modify it, share it. Just help keep codebases clean.
"The best code is the code you don't have to maintain."
Built with ❤️ and a healthy dose of frustration by a developer who just wanted a clean codebase.