docs: update rllm integration status and add training example links (#45)

luyuzhe111 · web-flow · commit ff1b63e86c6c · 2026-03-27T14:47:55.000-07:00
diff --git a/AGENTS.md b/AGENTS.md
@@ -17,7 +17,6 @@ This document provides context, patterns, and guidelines for AI coding assistant
 - [Environment Variables](#environment-variables)
 - [Common Tasks](#common-tasks)
 - [Development Tips](#development-tips)
-- [Known Limitations & TODOs](#known-limitations--todos)
 - [External References](#external-references)
 
 ---
@@ -494,13 +493,6 @@ uv run pre-commit install
 
 ---
 
-## Known Limitations & TODOs
-
-### Design Improvements
-- **Model gateway (in preview)**: [rllm-model-gateway](https://github.com/rllm-org/rllm/tree/main/rllm-model-gateway) replaces the need for `vLLMModel` client-side token collection. The gateway proxies inference requests and captures token IDs + logprobs transparently at the HTTP layer. Integration with rllm training backends is under active development. The legacy `vLLMModel` under `frameworks/strands/` is retained for backward compatibility.
-
----
-
 ## External References
 
 - **ACR Documentation**: https://docs.aws.amazon.com/bedrock-agentcore/
@@ -510,4 +502,5 @@ uv run pre-commit install
 - **Runtime SDK Overview**: https://aws.github.io/bedrock-agentcore-starter-toolkit/user-guide/runtime/overview.html
 - **HTTP Protocol Contract**: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-http-protocol-contract.html#container-requirements-http
 - **rLLM SDK (reference)**: https://rllm-project.readthedocs.io/en/latest/core-concepts/sdk/#1-define-your-agent-function
-- **rllm-model-gateway** (token capture proxy for RL training): https://github.com/rllm-org/rllm/tree/main/rllm-model-gateway
+- **rllm-model-gateway** (token capture proxy for RL training): https://github.com/rllm-org/rllm/tree/main/rllm-model-gateway | [PyPI](https://pypi.org/project/rllm-model-gateway/)
+- **AgentCore math training example** (rLLM + Tinker backend): https://github.com/rllm-org/rllm/blob/main/examples/agentcore_math/train_agentcore_math_tinker.sh
diff --git a/README.md b/README.md
@@ -127,7 +127,7 @@ In practice, this is infrastructure managed by the training framework:
 - **During training**: the training engine points `base_url` through the gateway automatically
 - **During evaluation**: `base_url` points directly to any OpenAI-compatible endpoint (vLLM, SGLang, LiteLLM, etc.), or you can use `BedrockModel` via the Bedrock API — no gateway involved
 
-The gateway is currently in preview. See the [rllm-model-gateway repo](https://github.com/rllm-org/rllm/tree/main/rllm-model-gateway) for details.
+The gateway is [available on PyPI](https://pypi.org/project/rllm-model-gateway/) (`pip install rllm-model-gateway`). See the [rllm-model-gateway repo](https://github.com/rllm-org/rllm/tree/main/rllm-model-gateway) for details.
 
 ## Client-Side: Invoking Agents and Collecting Results
 
@@ -267,7 +267,8 @@ The training architecture follows a **decoupled design** where agent rollouts an
 This architecture enables parallel and highly efficient rollouts with secure execution during RL training. The decoupled design means training libraries only need the agent's container image to start training—agent code and dependencies stay completely separate from the training library.
 
 **Supported Training Libraries:**
-- [rllm](https://github.com/rllm-org/rllm) integration coming soon (supports multiple backends: veRL, Tinker, and more)
+- [rLLM](https://github.com/rllm-org/rllm) — supports multiple backends (veRL, Tinker, and more)
+  - [Math Agent](examples/strands_math_agent/): [Tinker](https://github.com/rllm-org/rllm/blob/main/examples/agentcore_math/train_agentcore_math_tinker.sh)
 
 ### Prepare Your Agent Container