πŸ€– Tech Stack Demo Generator

An intelligent agentic system that takes a technology stack description and generates comprehensive IPython notebook demos using ADK orchestration and MCP servers.

Features

  • 🧠 Intelligent Parsing: Uses LLM to parse and categorize technologies from natural language descriptions
  • πŸ“š Multi-Source Discovery: Discovers documentation from Context7, DeepWiki, and Tavily MCP servers in parallel
  • βš–οΈ LLM-as-a-Judge: Evaluates documentation quality and determines if more retrieval is needed
  • πŸ““ Notebook Generation: Creates comprehensive IPython notebooks with explanations and working code
  • πŸ”„ ADK Orchestration: Uses the ADK (Agent Development Kit) framework for workflow management
  • 🎨 Rich CLI: Beautiful terminal interface with interactive feedback
  • πŸ”§ Flexible LLM Support: Works with both Anthropic Claude and Google Gemini

Architecture

User Query β†’ Parser Agent β†’ User Approval β†’ Discovery Loop β†’ Notebook Generator
                                              ↓           ↑
                                         LLM Judge β†β”€β”€β”€β”€β”€β”˜

Workflow Steps

  1. Technology Parsing: Breaks down user's tech stack description into individual technologies
  2. User Approval: Presents parsed technologies for user review and feedback
  3. Parallel Discovery: Queries multiple MCP servers simultaneously for each technology:
    • Context7: Library documentation and API references
    • DeepWiki: GitHub repository documentation and wikis
    • Tavily: Web search for additional context (optional)
  4. LLM-as-a-Judge: Evaluates if discovered data is sufficient (with retry loop)
  5. Notebook Generation: Creates integrated IPython notebook demo
  6. User Feedback: Collects satisfaction feedback
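
The steps above can be sketched as a minimal orchestration loop. The helper functions here are hypothetical stand-ins for the real agents (which live in src/); the interactive approval and notebook-generation steps are noted but stubbed out:

```python
# Minimal sketch of the six workflow steps; helper functions are
# hypothetical stand-ins for the real agents in src/.

def parse_technologies(query: str) -> list[str]:
    # Step 1: the real system uses an LLM; here we just split on delimiters.
    return [t.strip() for t in query.replace(" and ", ", ").split(",") if t.strip()]

def discover(techs: list[str]) -> dict[str, list[str]]:
    # Step 3: the real code queries Context7/DeepWiki/Tavily in parallel.
    return {t: [f"docs for {t}"] for t in techs}

def judge(discoveries: dict[str, list[str]]) -> bool:
    # Step 4: the real code asks an LLM judge; here "sufficient" = non-empty.
    return bool(discoveries) and all(discoveries.values())

def run_workflow(query: str, max_iterations: int = 3) -> dict[str, list[str]]:
    techs = parse_technologies(query)        # 1. Technology Parsing
    # (Step 2, user approval, is interactive and omitted here.)
    discoveries, sufficient, i = {}, False, 0
    while not sufficient and i < max_iterations:
        discoveries = discover(techs)        # 3. Parallel Discovery
        sufficient = judge(discoveries)      # 4. LLM-as-a-Judge, with retry
        i += 1
    return discoveries                       # 5/6: generation + feedback omitted

result = run_workflow("React, FastAPI and PostgreSQL")
```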

Installation

Prerequisites

  • Python 3.11+
  • Cursor with MCP servers configured
  • API keys for your chosen LLM provider

Setup

  1. Clone the repository and change into it:
cd wandb_test
  2. Install dependencies:
pip install -r requirements.txt
  3. Configure environment:
cp .env.example .env
# Edit .env with your API keys
  4. Ensure MCP servers are configured in ~/.cursor/mcp.json:
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    },
    "deepwiki": {
      "url": "https://mcp.deepwiki.com/sse"
    }
  }
}

Configuration

Environment Variables (.env)

# LLM Provider ("anthropic" or "google")
LLM_PROVIDER=anthropic

# API Keys
ANTHROPIC_API_KEY=your_key_here
GOOGLE_API_KEY=your_key_here

# Model Names
ANTHROPIC_MODEL=claude-sonnet-4-20250514
GOOGLE_MODEL=gemini-2.0-flash-exp

# MCP Server Configuration
CONTEXT7_ENABLED=true
DEEPWIKI_ENABLED=true
TAVILY_ENABLED=false
TAVILY_API_KEY=your_tavily_key_here

# Workflow Configuration
MAX_RETRIEVAL_ITERATIONS=3
PARALLEL_DISCOVERY=true
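
For illustration, the variables above might be loaded into a typed config object like this (field names mirror the .env keys; the actual loader is src/utils/config.py):

```python
from dataclasses import dataclass

# Sketch of loading the .env variables above into a typed config.
# This mirrors a few of the keys; the real loader is src/utils/config.py.

@dataclass
class Config:
    llm_provider: str = "anthropic"
    max_retrieval_iterations: int = 3
    parallel_discovery: bool = True

def load_config(env: dict[str, str]) -> Config:
    return Config(
        llm_provider=env.get("LLM_PROVIDER", "anthropic"),
        max_retrieval_iterations=int(env.get("MAX_RETRIEVAL_ITERATIONS", "3")),
        parallel_discovery=env.get("PARALLEL_DISCOVERY", "true").lower() == "true",
    )

cfg = load_config({"LLM_PROVIDER": "google", "MAX_RETRIEVAL_ITERATIONS": "5"})
```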

Usage

Basic Usage

python run.py

The CLI will guide you through:

  1. Describing your technology stack
  2. Reviewing parsed technologies
  3. Approving or providing feedback
  4. Watching the discovery and generation process
  5. Receiving your generated notebook

Example Queries

"I want to build a web app with React, FastAPI, and PostgreSQL"
"Create a machine learning pipeline with PyTorch, Pandas, and MLflow"
"A microservices architecture with Docker, Kubernetes, and MongoDB"
"Build a data dashboard using Streamlit, Plotly, and SQLite"

Output

Generated notebooks are saved to the output/ directory with timestamps:

output/tech_demo_20241011_143022.ipynb

Open with Jupyter or VS Code:

jupyter notebook output/tech_demo_*.ipynb
# or
code output/tech_demo_*.ipynb

Project Structure

wandb_test/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ agents/           # Agent implementations
β”‚   β”‚   β”œβ”€β”€ parser.py           # Tech stack parser
β”‚   β”‚   β”œβ”€β”€ discovery.py        # Documentation discovery
β”‚   β”‚   β”œβ”€β”€ judge.py            # LLM-as-a-Judge evaluator
β”‚   β”‚   └── notebook_generator.py  # Notebook creator
β”‚   β”œβ”€β”€ models/           # Data models
β”‚   β”‚   └── state.py            # Shared agent state
β”‚   β”œβ”€β”€ services/         # Service layer
β”‚   β”‚   β”œβ”€β”€ llm_service.py      # LLM integration
β”‚   β”‚   └── mcp_service.py      # MCP server integration
β”‚   β”œβ”€β”€ utils/            # Utilities
β”‚   β”‚   β”œβ”€β”€ config.py           # Configuration management
β”‚   β”‚   └── cli.py              # CLI utilities
β”‚   β”œβ”€β”€ workflow.py       # ADK workflow orchestration
β”‚   └── main.py          # CLI entry point
β”œβ”€β”€ output/              # Generated notebooks
β”œβ”€β”€ run.py              # Convenience runner
β”œβ”€β”€ requirements.txt    # Dependencies
β”œβ”€β”€ .env.example       # Environment template
└── README.md          # This file

Components

Models (Pydantic)

  • AgentState: Shared state across workflow stages
  • Technology: Individual technology representation
  • TechnologyList: Structured output from parser
  • DiscoveryResult: Documentation discovery results
  • JudgementResult: LLM judge evaluation
  • NotebookMetadata: Generated notebook metadata

Agents

All agents follow the ReAct pattern where applicable:

  1. Parser Agent: Analyzes natural language β†’ structured technologies
  2. Discovery Agent: Parallel MCP queries for documentation
  3. Judge Agent: Evaluates data quality and sufficiency
  4. Notebook Generator: Synthesizes notebook from discoveries

Services

  • LLMService: Unified interface for Anthropic/Google LLMs

    • Text generation
    • Structured output (JSON mode)
    • Context-aware generation
  • MCPService: Interface to MCP servers

    • Context7: Library documentation
    • DeepWiki: GitHub repository docs
    • Tavily: Web search

Workflow (ADK)

The ADK workflow orchestrates all agents with:

  • Sequential execution with state passing
  • Retry loops for discovery
  • User interaction points
  • Error handling

Advanced Features

Structured Output

Uses Pydantic models for type-safe, validated outputs from LLMs:

result = llm.generate_structured(
    prompt=prompt,
    response_model=TechnologyList,
    system_prompt=system_prompt
)
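
One plausible way generate_structured works under the hood is to request JSON from the LLM, parse it, and validate it into the response model. The sketch below uses a hypothetical call_llm callable and a simplified demo model rather than the real LLMService internals:

```python
import json
from dataclasses import dataclass

# Sketch of how generate_structured might be implemented: ask for JSON,
# parse it, validate into the model. `call_llm` is a hypothetical stand-in
# for the provider call; with Pydantic, construction also validates types.

def generate_structured(prompt: str, response_model, call_llm):
    raw = call_llm(prompt + "\n\nRespond with JSON only.")
    return response_model(**json.loads(raw))

@dataclass
class TechList:
    technologies: list[str]

result = generate_structured(
    "List the technologies in: React and FastAPI",
    TechList,
    call_llm=lambda p: '{"technologies": ["React", "FastAPI"]}',
)
```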

Parallel Discovery

Discovers from multiple MCP servers simultaneously:

discoveries = await mcp_service.discover_all(
    technology="React",
    topic="hooks",
    repo_name="facebook/react"
)
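
Under the hood, a fan-out like this is typically done with asyncio.gather. The sketch below shows the pattern with hypothetical per-source query functions standing in for the real MCP calls:

```python
import asyncio

# Sketch of the parallel fan-out behind discover_all: each MCP source is
# queried concurrently via asyncio.gather. The query functions are
# hypothetical stand-ins for real MCP server calls.

async def query_context7(tech: str) -> str:
    return f"context7 docs for {tech}"

async def query_deepwiki(repo: str) -> str:
    return f"deepwiki docs for {repo}"

async def discover_all(technology: str, repo_name: str) -> dict[str, str]:
    c7, dw = await asyncio.gather(
        query_context7(technology),
        query_deepwiki(repo_name),
    )
    return {"context7": c7, "deepwiki": dw}

results = asyncio.run(discover_all("React", "facebook/react"))
```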

Retry Loop with Judge

Automatically retries discovery if data is insufficient:

iterations = 0
while not sufficient_data and iterations < max_iterations:
    discoveries = discover()
    sufficient_data = judge(discoveries)
    iterations += 1

Limitations & Future Work

Current Limitations (MVP)

  1. MCP integration is structured but needs actual MCP SDK calls
  2. User feedback in approval step doesn't trigger re-parsing
  3. Limited error recovery in generation
  4. No notebook execution validation

Planned Enhancements

  • Full MCP SDK integration
  • Iterative refinement based on user feedback
  • Notebook execution and testing
  • Multi-turn conversation for clarification
  • Support for more MCP servers
  • Template-based notebook generation
  • Version control integration
  • Export to multiple formats (HTML, PDF)

Troubleshooting

"Configuration error: ANTHROPIC_API_KEY is required"

Ensure your .env file has the correct API key:

ANTHROPIC_API_KEY=sk-ant-...

"No discoveries found"

  1. Check MCP servers are configured in ~/.cursor/mcp.json
  2. Verify MCP servers are enabled in .env
  3. Check network connectivity

"Notebook generation failed"

  • Try with a simpler tech stack
  • Check LLM token limits
  • Review error messages in console

Contributing

This is an MVP implementation. Contributions welcome for:

  • Full MCP SDK integration
  • Additional MCP servers
  • Enhanced notebook templates
  • Better error handling
  • Test coverage

License

MIT License - see LICENSE file for details
