Open-source agentic infrastructure for planning, running, and evaluating product experiments.
- Agent-Based Architecture: Extensible agent system with base classes and registry
- Multi-Provider LLM Support: Works with OpenAI, Anthropic, and Mistral
- Workflow Orchestration: Build multi-step experiment workflows with dependencies, conditional execution, and data flow between steps
- Production Ready: Logging, metrics, error handling, and retry logic
- CLI Interface: Command-line tools for quick experimentation
- Web API: FastAPI-based REST API for integration
- Type Safe: Full type hints and Pydantic models
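The "base classes and registry" pattern mentioned above can be sketched roughly as follows. All names here (`Agent`, `AgentRegistry`, `register`, `create`) are illustrative assumptions, not ExperimentKit's actual API:

```python
# Minimal sketch of an extensible agent system: a base class plus a
# registry that maps agent names to classes. Illustrative only.
from abc import ABC, abstractmethod


class Agent(ABC):
    """Base class that all agents extend."""

    @abstractmethod
    def run(self, **inputs) -> dict:
        ...


class AgentRegistry:
    """Maps agent names to classes so workflows can look agents up by name."""

    def __init__(self):
        self._agents: dict[str, type[Agent]] = {}

    def register(self, name: str):
        def decorator(cls: type[Agent]) -> type[Agent]:
            self._agents[name] = cls
            return cls
        return decorator

    def create(self, name: str) -> Agent:
        return self._agents[name]()


registry = AgentRegistry()


@registry.register("echo")
class EchoAgent(Agent):
    """Trivial agent that returns its inputs unchanged."""

    def run(self, **inputs) -> dict:
        return {"result": inputs}
```

A workflow step can then refer to an agent purely by its registered name, which is what makes the system extensible: adding an agent is one decorated class, with no changes to the orchestration code.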
- Ensure you have Python 3.10+ installed:

  ```bash
  python -V
  ```

- (Recommended) Create and activate a virtual environment:

  ```bash
  python -m venv .venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  ```

- Install the package:

  ```bash
  # For development (includes dev dependencies)
  pip install -e ".[dev]"

  # Or for production only
  pip install -e .
  ```

Note: Dependencies are managed in pyproject.toml. The [dev] extra includes testing, linting, and development tools.
Configure API Keys
ExperimentKit requires API keys from at least one LLM provider to function. Supported providers:
- OpenAI (default) - Get your API key from OpenAI Platform
- Anthropic - Get your API key from Anthropic Console
- Mistral - Get your API key from Mistral AI Platform
Create a `.env` file in the project root directory:

```bash
# .env file
OPENAI_API_KEY=your_openai_api_key_here
# ANTHROPIC_API_KEY=your_anthropic_api_key_here
# MISTRAL_API_KEY=your_mistral_api_key_here
```
Note: You only need to set the API key for the provider(s) you plan to use. The default provider is OpenAI. Uncomment and set the API keys for other providers if you want to use them.
Security: Never commit your `.env` file to version control. It should already be in `.gitignore`.
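At runtime, provider selection reduces to checking which of the environment variables above is set. A short sketch of that logic (the `pick_provider` helper is hypothetical, not part of ExperimentKit):

```python
import os

# Provider -> environment variable, mirroring the .env file above.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "mistral": "MISTRAL_API_KEY",
}


def pick_provider(preferred: str = "openai") -> str:
    """Return the preferred provider if its key is set, else the first
    provider that has a key configured. Sketch only."""
    if os.environ.get(PROVIDER_ENV_VARS[preferred]):
        return preferred
    for provider, var in PROVIDER_ENV_VARS.items():
        if os.environ.get(var):
            return provider
    raise RuntimeError("No LLM provider API key configured; set one in .env")
```

This is why setting a single key is enough: with only `MISTRAL_API_KEY` present, a lookup like this would fall through from the OpenAI default to Mistral.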
```bash
# Refine a hypothesis
experimentkit refine \
    "Users who see personalized onboarding screens are more likely to upgrade."

# Analyze a hypothesis
experimentkit analyze "Refined hypothesis text here"

# Run the complete workflow
experimentkit workflow "Your hypothesis here"

# Show configuration
experimentkit config
```

```python
from src.workflows.hypothesis import HypothesisRefinementWorkflow

# Run the complete workflow
workflow = HypothesisRefinementWorkflow()
result = workflow.execute(initial_inputs={"hypothesis": "Your hypothesis"})

print(result.steps["refine"]["result"])   # Refined hypothesis
print(result.steps["analyze"]["result"])  # Analysis
print(result.steps["revise"]["result"])   # Revised hypothesis
```

```python
from src.agents import hypothesis_refiner, hypothesis_analyzer, hypothesis_reviser

# Refine a hypothesis
refined = hypothesis_refiner("Your hypothesis")

# Analyze it
analysis = hypothesis_analyzer(refined)

# Revise based on analysis
revised = hypothesis_reviser(refined, analysis)
```

Start the API server:

```bash
uvicorn --factory src.api.app:create_app --reload
```

Then access the API at http://localhost:8000:
- `GET /api/v1/agents/list` - List available agents
- `POST /api/v1/agents/execute` - Execute an agent
- `POST /api/v1/workflows/hypothesis-refinement/execute` - Run the hypothesis refinement workflow
See the interactive API docs at http://localhost:8000/docs.
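Assuming the server is running, the execute endpoint can be called from any HTTP client. The request-body shape below (`agent_name`/`inputs`) is a plausible guess, not a documented contract; check http://localhost:8000/docs for the actual schema:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_execute_request(agent_name: str, inputs: dict) -> urllib.request.Request:
    """Build a POST request for the agent-execute endpoint.

    The payload keys are assumptions; consult the interactive API docs
    for the real request schema.
    """
    payload = json.dumps({"agent_name": agent_name, "inputs": inputs}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/agents/execute",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To actually send it (requires the server to be running):
# with urllib.request.urlopen(
#     build_execute_request("hypothesis_refiner", {"hypothesis": "Your hypothesis"})
# ) as resp:
#     print(json.load(resp))
```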
Create multi-step workflows with dependencies and conditional execution:
```python
from src.workflows.workflow import Workflow


class MyWorkflow(Workflow):
    def __init__(self):
        super().__init__("my_workflow")
        self.add_step(
            name="step1",
            agent_name="agent1",
            inputs={"data": "$input"},
        ).add_step(
            name="step2",
            agent_name="agent2",
            inputs={"data": "$step1"},  # Uses the result from step1
            depends_on=["step1"],       # Must run after step1
        )
```

See docs/WORKFLOWS.md for detailed workflow documentation.
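The `$input` / `$step1` placeholders suggest a simple reference-resolution pass before each step runs: `$input` expands to the workflow's initial inputs, and `$<step>` to a prior step's result. A minimal sketch of that idea (not ExperimentKit's actual implementation):

```python
def resolve_inputs(inputs: dict, initial: dict, results: dict) -> dict:
    """Expand "$input" to the workflow's initial inputs and "$<step>"
    to a previous step's result. Sketch only; the real resolution
    rules may differ."""
    resolved = {}
    for key, value in inputs.items():
        if value == "$input":
            resolved[key] = initial
        elif isinstance(value, str) and value.startswith("$"):
            resolved[key] = results[value[1:]]  # e.g. "$step1" -> results["step1"]
        else:
            resolved[key] = value  # literal value, passed through unchanged
    return resolved
```

Under this reading, `step2` above receives `{"data": <step1's result>}`, and `depends_on` guarantees `step1` has finished before the reference is resolved.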
```
experimentkit/
├── src/
│   ├── agents/          # Agent implementations
│   │   └── hypothesis/  # Hypothesis-related agents
│   ├── api/             # FastAPI web application
│   ├── config/          # Configuration management
│   ├── core/            # Core components (agents, registry, etc.)
│   ├── models/          # Pydantic data models
│   ├── services/        # Service layer
│   ├── utils/           # Utility functions
│   └── workflows/       # Workflow orchestration
├── tests/               # Test suite
├── examples/            # Example scripts
└── docs/                # Documentation
```

```bash
# All tests
make test

# Unit tests only
make test-unit

# Integration tests only
make test-integration

# With coverage
make test-cov
```

```bash
# Format code
make format

# Run linters
make lint

# Type checking
make type-check
```

Pre-commit hooks are automatically installed with development dependencies. They run formatting, linting, and type checking before each commit.
See CONTRIBUTING.md for guidelines on contributing to ExperimentKit.
MIT License - see LICENSE for details.