Hardening: Prevent API Server OOM via Rollout Backpressure #459

## Describe the Issue

The Atropos API server (`atroposlib/api/server.py`) has no mechanism to limit the size of its rollout trajectory queue.

In high-throughput setups where rollout workers generate data faster than the Trainer can consume it, trajectories accumulate indefinitely in the server's memory. RAM usage grows without bound, and the operating system's OOM (Out of Memory) killer eventually terminates the API server process.
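The rate mismatch can be illustrated with a toy model. This is only a sketch: the function name, the rates, and the cap are hypothetical numbers chosen for illustration, not measurements from Atropos.

```python
def backlog_over_time(produce_rate, consume_rate, seconds, max_queue_size=None):
    """Simulate queued trajectories per second for fixed producer/consumer rates.

    All parameters are hypothetical; the model ignores batching and bursts.
    """
    backlog = 0
    history = []
    for _ in range(seconds):
        backlog += produce_rate                    # workers submit rollouts
        if max_queue_size is not None:
            backlog = min(backlog, max_queue_size)  # excess submissions are rejected
        backlog = max(0, backlog - consume_rate)    # trainer drains what it can
        history.append(backlog)
    return history


unbounded = backlog_over_time(produce_rate=50, consume_rate=20, seconds=4)
bounded = backlog_over_time(produce_rate=50, consume_rate=20, seconds=4, max_queue_size=100)
```

Without a cap the backlog grows by the rate difference every second; with a cap it levels off.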

## Environment/API Details

- **Environment Class/Name:** `atroposlib/api/server.py`
- **API Endpoint/Method Involved:** `/scored_data` (submission of trajectories)

## Steps to Reproduce

1. Launch a training run with a high number of parallel rollout workers.
2. Slow down the Trainer (e.g., by increasing gradient accumulation steps or using a very large model).
3. Monitor the RAM usage of the API server process.
4. Observe that memory usage grows steadily until the OOM killer terminates the process.

## Interaction Details (if applicable)

- **Expected Behavior:**
  1. The API server should expose a configurable `MAX_QUEUE_SIZE`.
  2. When the queue is full, the server should respond to rollout workers with `HTTP 503 Service Unavailable`, so that they wait and retry instead of adding more data (backpressure).
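The expected behavior above could be sketched roughly as follows. This is a framework-agnostic illustration, not the actual Atropos code: `BoundedRolloutQueue`, `handle_scored_data`, the response bodies, the `retry_after_seconds` field, and the default of 1000 are all assumptions. In a FastAPI handler, the rejected case would typically map to raising `HTTPException(status_code=503)`.

```python
import os
from collections import deque
from threading import Lock

# Setting name taken from the issue text; the default value is an arbitrary assumption.
MAX_QUEUE_SIZE = int(os.environ.get("MAX_QUEUE_SIZE", "1000"))


class BoundedRolloutQueue:
    """Thread-safe FIFO that refuses new items once it holds max_size entries."""

    def __init__(self, max_size=MAX_QUEUE_SIZE):
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self._items = deque()
        self._max_size = max_size
        self._lock = Lock()

    def try_put(self, item):
        """Append item and return True, or return False if the queue is full."""
        with self._lock:
            if len(self._items) >= self._max_size:
                return False
            self._items.append(item)
            return True

    def get_batch(self, n):
        """Remove and return up to n of the oldest items (trainer side)."""
        with self._lock:
            batch = []
            while self._items and len(batch) < n:
                batch.append(self._items.popleft())
            return batch

    def __len__(self):
        with self._lock:
            return len(self._items)


def handle_scored_data(queue, payload):
    """Return the (status_code, body) a /scored_data handler could send back."""
    if queue.try_put(payload):
        return 200, {"status": "received"}
    # Hypothetical body: tells the worker to back off and retry later.
    return 503, {"status": "queue_full", "retry_after_seconds": 1}
```

Rejecting at admission time keeps memory bounded by roughly `MAX_QUEUE_SIZE` trajectories; the trade-off is that workers must handle the 503 and resubmit.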

## Setup Details

- **OS:** Linux
- **Python Version:** 3.10+
- **Atropos Version:** commit c20c852
- **Relevant Libraries/Versions:** `fastapi`, `uvicorn`

## Additional Context & Logs

Implementing backpressure ensures that the entire training system remains stable even when there is a mismatch between data generation and consumption rates.
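On the worker side, backpressure only helps if a rejected submission is retried after a delay rather than dropped or resent immediately. A minimal sketch of that retry loop is shown below; `submit_with_backoff`, its parameters, and the `send` callable (standing in for whatever HTTP call posts to `/scored_data`) are hypothetical and not part of Atropos.

```python
import random
import time


def submit_with_backoff(send, payload, max_attempts=5, base_delay=0.5,
                        max_delay=30.0, sleep=time.sleep):
    """Call send(payload) until it returns something other than 503.

    send is a hypothetical callable returning an HTTP status code.
    Delays double on each rejected attempt, capped at max_delay, with up to
    10% random jitter so that workers do not all retry at the same moment.
    Returns the last status code seen.
    """
    status = 503
    for attempt in range(max_attempts):
        status = send(payload)
        if status != 503:
            return status
        if attempt < max_attempts - 1:
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay * 0.1))
    return status
```

The `sleep` parameter is injectable so the loop can be exercised in tests without actually waiting.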
