chore: bump version to 0.1.1 and update CHANGELOG

Lawhy · claude · Lawhy · commit 95df06c7ca61 · 2026-02-06T02:07:17.000-08:00
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -7,47 +7,34 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
-### Added
-
-- **`Evaluator`**: Concurrent evaluation orchestrator with checkpointing, resume, and pass@k metrics.
-  - tqdm progress bar with `logging_redirect_tqdm` for clean output
-  - `n_samples_per_prompt` for pass@k evaluation
-  - JSONL checkpointing with automatic resume
-- **`AIMEEvaluator`**: AIME benchmark evaluator subclass.
-- **`MathRewardFunction`**: Math reward using `math-verify` for symbolic equivalence checking.
-- **`utils/sglang.py`**: SGLang client caching utilities.
-  - `get_cached_client(base_url, max_connections)` with `lru_cache`
-  - `get_cached_client_from_slime_args(args)` for slime RL training integration
-- **`utils/aws.py`**: AWS boto3 session caching utilities.
-  - `get_boto3_session(region, profile_name)` with `lru_cache`
-  - `get_assumed_role_session(role_arn, region)` with `RefreshableCredentials` for auto-refresh
-- **`tools/code_interpreter.py`**: `CodeInterpreterToolkit` for AWS Bedrock AgentCore Code Interpreter.
-  - `execute_code` tool for running Python code
-  - `execute_command` tool for running shell commands
-- **`environments/code_sandbox/`**: `CodeSandboxEnv` using AWS Bedrock AgentCore Code Interpreter.
-  - `CodeMode` enum for configurable tool availability (CODE, TERMINAL, CODE_AND_TERMINAL)
-  - Async `cleanup()` for session cleanup
-- **`environments/calculator/`**: `CalculatorEnv` renamed from SimpleMathEnv for clarity.
-- Added `boto3`, `datasets`, `tqdm` to main dependencies.
-
-## [0.0.2] - 2026-02-03
-
-### Fixed
-
-- Replace git dependency (`strands-sglang @ git+...`) with PyPI package (`strands-sglang>=0.1.2`) to fix PyPI upload rejection.
-
-## [0.0.1] - 2026-02-03 [yanked]
-
-Initial release — core abstractions only. Environments will be added in future releases.
+## [0.1.1] - 2026-02-06
 
 ### Added
 
-- **`Environment`** base class: `step()`, `reset()`, `cleanup()`, `get_tools()`, `get_hooks()`, `compute_metrics()`.
-- **`Action` / `TaskContext`**: User message + ground truth, conversation history, and arbitrary metadata (`extra="allow"`).
+- **Environments**
+  - `CalculatorEnv`: Simple calculator tool for math problems.
+  - `CodeSandboxEnv`: AWS Bedrock AgentCore Code Interpreter with `CodeMode` enum.
+- **Evaluation**
+  - `Evaluator`: Concurrent evaluation with checkpointing, resume, and pass@k metrics.
+  - `AIMEEvaluator`: AIME benchmark evaluator.
+  - `MathRewardFunction`: Math reward using `math-verify` for symbolic equivalence.
+- **Utilities**
+  - `utils/sglang.py`: SGLang client caching with `lru_cache`.
+  - `utils/aws.py`: AWS boto3 session caching with `RefreshableCredentials` for auto-refresh.
+- **Tools**
+  - `CodeInterpreterToolkit`: `execute_code` and `execute_command` for sandboxed execution.
+- **Examples**
+  - `aime_eval.py`: Support `--env chat` and `--env code` modes with `--role-arn` option.
+  - `common.py`: Use cached SGLang client with connection pooling.
+
+## [0.1.0] - 2026-02-03
+
+Initial release with core abstractions.
+
+- **`Environment`** base class: `step()`, `reset()`, `cleanup()`, `get_tools()`, `get_hooks()`.
+- **`Action` / `TaskContext`**: User message + ground truth, conversation history, and arbitrary metadata.
 - **`Observation`**: Step messages, metrics, and optional `TokenObservation` for TITO training.
 - **`StepResult`**: Bundles observation, reward, and termination reason.
-- **`TerminationReason`**: Maps agent exceptions (`MaxToolIterationsReachedError`, `MaxTokensReachedException`, timeouts) to enum values via cause-chain walking.
+- **`TerminationReason`**: Maps agent exceptions to enum values via cause-chain walking.
 - **`RewardFunction` / `RewardResult`**: Abstract reward interface with scalar reward + diagnostics.
-- **`ModelFactory`** type and factory functions for SGLang, Bedrock, and OpenAI backends.
-- **`examples/math_env.py`**: Calculator tool example with exact-match reward, supporting SGLang and Bedrock.
-- CI/CD: GitHub Actions for testing (lint + unit tests on Python 3.10–3.12) and PyPI publishing.
+- **`ModelFactory`**: Factory functions for SGLang, Bedrock, and OpenAI backends.
diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "strands-env"
-version = "0.1.0"
+version = "0.1.1"
 description = "RL environments for Strands Agents — step, observe, reward."
 authors = [{name = "Yuan He", email = "yuanhe.cs.ai@gmail.com"}]
 requires-python = ">=3.10"