Release v0.0.4

akhatua2 · akhatua2 · commit f81013765cae · 2026-02-14T17:53:58.000-08:00
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -5,6 +5,31 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.4] - 2026-02-14
+
+### Added
+
+- **Token usage tracking** - `AgentResult` now reports `input_tokens`, `output_tokens`, `cache_read_tokens`, and `cache_write_tokens` throughout the pipeline
+- **Fallback cost calculator** - New `pricing.py` module computes cost from token counts when litellm doesn't report it, with a manual pricing table for custom endpoints
+- **Log directory passthrough** - Runners now pass the `log_dir` path to agents for downstream logging (PR #32)
+- **Eval stats in summary** - Run summary JSON now includes pass rate and per-task eval results when auto-eval is enabled
+- **Gold conflict checker** - New `scripts/check_gold_conflicts.py` to detect merge conflicts between gold patches across all tasks using parallel Modal sandboxes
+- **Benchmark runner script** - New `scripts/run_benchmark.sh` for quick experiment launches
+- **Model smoke test** - New `scripts/test_model.py` to verify models work via LiteLLM
+
+### Changed
+
+- **Improved cooperation prompt** - Replaced the situational-awareness prompt with an explicit numbered workflow (plan → coordinate → summarize) after A/B testing showed better coordination and lower cost
+- **OpenHands SDK dependencies promoted to core** - Moved from `[openhands]` optional extra into base dependencies for simpler installation
+- **HTTP retry logic** - Remote conversation requests now retry on 5xx errors with exponential backoff (via tenacity)
+- **Patch extraction timing** - Patches are now extracted while the sandbox is still alive, before stats collection
+
+### Fixed
+
+- **Pricing calculation** - Fixed cost reporting for models where litellm returns zero cost
+- **MaxIterationsReached handling** - Now caught inside the conversation loop instead of as an outer exception, preventing lost patches
+- **Custom API base URLs** - `ANTHROPIC_BASE_URL` and `OPENAI_BASE_URL` now forwarded to sandboxes
+
 ## [0.0.3] - 2026-02-04
 
 ### Added
diff --git a/src/cooperbench/__about__.py b/src/cooperbench/__about__.py
@@ -1,3 +1,3 @@
 """Version information for CooperBench."""
 
-__version__ = "0.0.3"
+__version__ = "0.0.4"
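
Note on the fallback cost calculator entry: the changelog says `pricing.py` computes cost from token counts when litellm reports none, using a manual pricing table for custom endpoints. The sketch below illustrates that idea only; it is not the contents of `pricing.py`. The names `Price`, `MANUAL_PRICING`, `fallback_cost`, and the model key `my-custom-model` are hypothetical, the prices are placeholders, and it assumes the four token counts are disjoint (cached tokens are not also counted as input tokens).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Price:
    """USD per one million tokens. Placeholder numbers, not real rates."""
    input: float
    output: float
    cache_read: float = 0.0
    cache_write: float = 0.0


# Hypothetical manual table for endpoints the provider library cannot price.
MANUAL_PRICING = {
    "my-custom-model": Price(input=1.0, output=4.0, cache_read=0.1, cache_write=1.25),
}


def fallback_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    cache_read_tokens: int = 0,
    cache_write_tokens: int = 0,
    reported_cost: Optional[float] = None,
) -> float:
    """Return the reported cost if it is nonzero, else price the token counts."""
    if reported_cost:  # a missing or zero cost falls through to the table
        return reported_cost
    price = MANUAL_PRICING.get(model)
    if price is None:
        return 0.0  # unknown model: nothing to compute from
    total = (
        input_tokens * price.input
        + output_tokens * price.output
        + cache_read_tokens * price.cache_read
        + cache_write_tokens * price.cache_write
    )
    return total / 1_000_000
```

With the placeholder table, `fallback_cost("my-custom-model", 1_000_000, 500_000)` prices one million input tokens and half a million output tokens, while passing a nonzero `reported_cost` returns that value unchanged.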

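
Note on the HTTP retry logic entry: the changelog says remote conversation requests retry on 5xx errors with exponential backoff via tenacity. The sketch below shows the same retry pattern using only the standard library, so it is a stand-in for the tenacity-based code and not a copy of it. `ServerError`, `call_with_retry`, and the `send` callable are hypothetical names, and the attempt count and delays are assumed values.

```python
import time
from typing import Callable, Tuple


class ServerError(Exception):
    """Raised when every attempt ended in a 5xx response."""

    def __init__(self, status: int) -> None:
        super().__init__(f"server error {status}")
        self.status = status


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-5xx status or attempts run out.

    send() is assumed to return (status_code, body). The wait doubles after
    each failed attempt: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status < 500:
            return status, body  # success or a client error: do not retry
        if attempt == max_attempts:
            raise ServerError(status)
        sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the backoff schedule testable without real waiting. 4xx responses are returned immediately because retrying a client error is unlikely to change the outcome.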