Contributing to cube-harness

For contribution philosophy, DCO requirements, RFC process, and community guidelines, see the canonical CONTRIBUTING.md in cube-standard.

OpenSpec — how we manage contracts

We follow the OpenSpec methodology. Each layer of cube-harness has a living spec in openspec/specs/<layer>/spec.md that defines its public API, invariants, and gotchas. Before modifying a layer, read its spec. After a PR that changes a public contract, run /update-openspec in Claude Code to sync the spec.

For breaking changes, write a short delta proposal in openspec/changes/<name>/ before coding — this makes the contract change visible to the team before code lands.

Full workflow, delta format, and examples: openspec/README.md.
The methodology reference is in cube-standard's openspec/README.md.
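
The path conventions above can be sketched as a small helper, for example to check that every layer touched by a PR has a spec to read first. This is only an illustration, not tooling that ships with the repo: the function names and the layer and change names ("agent", "tool", "rename-step") are hypothetical, and only the openspec/specs/<layer>/spec.md and openspec/changes/<name>/ layouts come from the text.

```python
from pathlib import Path
from typing import Iterable, List


def spec_path(layer: str, root: Path = Path(".")) -> Path:
    """Location of a layer's living spec: openspec/specs/<layer>/spec.md."""
    return root / "openspec" / "specs" / layer / "spec.md"


def change_dir(name: str, root: Path = Path(".")) -> Path:
    """Directory for a delta proposal: openspec/changes/<name>/."""
    return root / "openspec" / "changes" / name


def missing_specs(layers: Iterable[str], root: Path = Path(".")) -> List[str]:
    """Return the layers (hypothetical names) that have no spec file under root."""
    return [layer for layer in layers if not spec_path(layer, root).is_file()]
```

Calling missing_specs(["agent", "tool"]) from the repo root would list whichever of those hypothetical layers lacks a spec.md, which is a cheap pre-review check before editing a layer.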

Setup

git clone https://github.com/The-AI-Alliance/cube-harness.git
cd cube-harness
make install           # uv sync --all-extras
pre-commit install --hook-type pre-commit --hook-type commit-msg
make lint    # ruff check + format (auto-fix)
make test    # pytest tests/

Every commit needs a DCO sign-off: git commit -s -m "...". Running make install sets up a git hook that adds the sign-off automatically.
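
For reference, git commit -s appends a "Signed-off-by: Name <email>" trailer to the commit message. The sketch below shows one way to check a message for that trailer. It is an illustration only, not the hook that make install sets up; the function name and the regular expression are assumptions, and the pattern is deliberately loose.

```python
import re

# Matches a trailer line such as "Signed-off-by: Jane Doe <jane@example.com>".
# The pattern is an assumption for illustration, not an official DCO validator.
SIGN_OFF = re.compile(r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE)


def has_sign_off(message: str) -> bool:
    """Return True if the commit message contains a Signed-off-by trailer."""
    return SIGN_OFF.search(message) is not None
```

A commit created without -s, and without the hook installed, would fail this check and can be fixed with git commit --amend -s.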

Repo Layout

src/cube_harness/
  agent.py          # Agent protocol and AgentConfig base
  benchmark.py      # Benchmark interface for task collections
  core.py           # Data structures: Action, Observation, Trajectory, Task
  environment.py    # Environment and EnvConfig abstractions
  episode.py        # Episode execution and trajectory persistence
  experiment.py     # Experiment configuration and statistics
  exp_runner.py     # Sequential and Ray-based parallel execution
  llm.py            # LLM wrapper using LiteLLM
  storage.py        # Trajectory storage backends
  tool.py           # Tool abstraction for action spaces
  agents/           # Agent implementations (ReAct, Genny, …)
  tools/            # Tool implementations (Playwright, BrowserGym, …)
  benchmarks/       # Benchmark wrappers (MiniWob, WorkArena, …)
  metrics/          # Telemetry and tracing (OpenTelemetry-based)
  action_spaces/    # Browser action space protocols
  analyze/          # Trajectory analysis and XRay inspection utilities
  mcp/              # MCP server for exposing tools via Model Context Protocol
recipes/            # Example experiment scripts
tests/              # Test suite

Licenses

Community

See also the AI Alliance community repo for cross-project guidelines and the Code of Conduct.