A powerful tool for deep analysis of GitHub repositories, providing file-by-file insights, call hierarchy visualization, and AI-powered summaries using OpenAI's API.
RepoSage combines advanced reasoning models for multi-step, context-aware analysis of complex project structures with agile general models for fast, high-level summarization. This dual-model approach captures nuanced interdependencies and delivers more accurate, reliable insights than traditional one-shot methods.
Take a quick look at a sample report here.
The uniqueness of this project lies in its multi-step, AI-powered approach that goes far beyond traditional static or one-shot methods:
- Multi-Layered Analysis: It starts by fetching repository metadata, file trees, and README content, then leverages this context to analyze project structure, filter essential files, examine individual code files, generate a detailed call hierarchy, and finally produce a comprehensive project summary.
- Hybrid AI Strategy: It dynamically chooses between reasoning models (for in-depth, logical, multi-step tasks like call hierarchy generation and smart file filtering) and general models (for broad summarization tasks), ensuring both nuanced insights and efficient processing.
- Smart File Filtering: Instead of analyzing every file, it employs an AI-driven filter to identify only the most relevant files for analysis, using a robust fallback mechanism if needed.
- Token and Response Management: The analyzer monitors token usage to avoid limits by summarizing content when necessary and logs API responses for transparency and debugging.
By combining these elements, analyzer.js provides a context-aware, detailed, and accurate understanding of a repository's structure and functionality, delivering insights that traditional one-shot methods simply can't match.
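The hybrid strategy above can be sketched as a simple task-to-model dispatch. This is an illustrative sketch only: the task names, model names, and function shape are assumptions, not the actual `analyzer.js` API.

```javascript
// Hypothetical task-to-model mapping illustrating the dual-model dispatch
// described above. Model names and task keys are assumptions for illustration.
const MODEL_CONFIG = {
  callHierarchy: { model: "o3-mini", type: "reasoning" },    // multi-step logic
  fileFiltering: { model: "o3-mini", type: "reasoning" },    // smart filtering
  projectSummary: { model: "gpt-4o-mini", type: "general" }, // fast summarization
};

// Return the model assigned to a given analysis task.
function pickModel(task) {
  const entry = MODEL_CONFIG[task];
  if (!entry) throw new Error(`Unknown task: ${task}`);
  return entry.model;
}
```

In this scheme, swapping the config object (e.g. pointing every task at a general model) is all it takes to trade depth for cost, which mirrors the `openaiConfig.json` / `openaiConfigQA.json` split described later.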
- Repository Analysis: Comprehensive examination of GitHub repositories
- AI-Powered Insights: Integration with OpenAI for intelligent summarization
- Call Hierarchy Visualization: Understand project structure and function relationships
- File-by-File Breakdown: Detailed analysis of individual files
- Modern Web Interface: Responsive React frontend with Vite build system
- Modular Architecture: Clean separation between backend and frontend components
- Language: JavaScript (Node.js)
- Framework: Express.js (assumed based on common practices)
- APIs: OpenAI API for analysis and summarization
- Utilities: Custom utilities for file handling and GitHub API interactions
- Library: React
- Build Tool: Vite
- Styling: CSS
- Node.js (v18+)
- npm (v9+)
- OpenAI API Key
This project utilizes the GitHub API and OpenAI API for repository analysis and summarization. To run the project, you will need:
- A GitHub token to access GitHub's API.
- An OpenAI API key to interact with OpenAI's services.
Please be aware that this project is not free to run. It makes multiple API calls to OpenAI, and the cost will depend on the size of the repository being analyzed. The number of API calls varies based on the repository's complexity and size.
You can customize the following settings to optimize performance and cost:
- Model Selection:
  - The default model configuration is recommended for the best performance.
  - You can change the model used for each function call in the `backend/openaiConfig.json` file.
  - For a more cost-effective analysis, use the configuration specified in the `backend/openaiConfigQA.json` file. (You can change it in the `backend/src/analyzer.js` file.)
- Token Limits:
  - Adjust the `MAX_TOKENS_PER_REQUEST` setting to control the token usage per repository.
- Recommendation:
  - For most users, the default configuration provides the best balance of performance and accuracy. However, if cost is a concern, consider switching to the cost-effective model as described above.
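A token budget like `MAX_TOKENS_PER_REQUEST` is typically enforced with a rough length estimate before each request. The sketch below is an assumption about how such a guard could work; the ~4-characters-per-token heuristic is a common approximation, not the project's actual tokenizer.

```javascript
// Illustrative token-budget guard in the spirit of MAX_TOKENS_PER_REQUEST.
// The value and the counting heuristic are assumptions for illustration.
const MAX_TOKENS_PER_REQUEST = 8000;

// Rough estimate: ~4 characters per token (heuristic, not a real tokenizer).
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Check whether adding `text` keeps the request under the configured budget.
function fitsBudget(text, tokensUsed = 0) {
  return tokensUsed + estimateTokens(text) <= MAX_TOKENS_PER_REQUEST;
}
```

When a file fails this check, the analyzer's summarize-then-analyze fallback (described above) would kick in instead of sending the raw content.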
Each function call to the OpenAI API is logged in the `apiResponsesLog.txt` file in the root of the project.
- Clone Repository

  ```bash
  git clone https://github.com/your-username/github-repo-bot.git
  cd github-repo-bot
  ```

- Backend Setup

  ```bash
  cd backend
  npm install
  ```

- Create `.env` file in the `backend` folder:

  ```
  OPENAI_API_KEY=your_openai_key_here
  GITHUB_TOKEN=your_github_token_here
  ```

- Frontend Setup

  ```bash
  cd ../frontend
  npm install
  ```

- Configuration
  - Update OpenAI settings in `backend/openaiConfig.json`. The default file is `backend/openaiConfigQA.json` (it costs significantly less to run as it does not use reasoning models). To use `openaiConfig.json`, change the `configPath` variable in `backend/src/analyzer.js`.
  - Modify the server port in `backend/index.js` if needed.
In separate terminals, run the following commands:

```bash
cd backend
npm start
```

```bash
cd frontend
npm run dev
```

The default port for the frontend is 5173. If it is already in use, you can change it in the `frontend/vite.config.js` file.
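Changing the dev-server port is a one-line edit in the Vite config. The sketch below shows where that setting lives; the rest of the file's contents (plugins, other options) are assumptions and may differ from the actual `frontend/vite.config.js`.

```javascript
// Hedged sketch of frontend/vite.config.js with a custom dev-server port;
// the plugin setup is assumed and the actual file may differ.
import { defineConfig } from "vite";
import react from "@vitejs/plugin-react";

export default defineConfig({
  plugins: [react()],
  server: {
    port: 5174, // any free port; Vite's default is 5173
  },
});
```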
- Enter GitHub repository URL in the input field
- Click "Analyze Repository"
- View real-time analysis progress
- Download and explore generated Markdown reports containing:
  - Project summary
  - File-by-file analysis
  - Call hierarchy diagram
  - Code insights powered by OpenAI
Contributions are welcome! If you have suggestions, improvements, or bug fixes, please:
- Fork the repository.
- Create a new branch (`git checkout -b feature/your-feature`).
- Commit your changes (`git commit -m 'Add some feature'`).
- Push to the branch (`git push origin feature/your-feature`).
- Open a Pull Request.
This project is licensed under the MIT License. See the LICENSE file for details.
- OpenAI: For providing the API that powers the analysis.
- Vite: For the modern and fast frontend build tooling.
- React: For the robust library used in building the user interface.
For questions or feedback, please contact me at [email protected].
