📋 Implementation Tracker: Issue #508 - Episodic Return Logging Bug
Original Issue
Issue #508: When using multiple parallel environments (num_envs > 1), episodic returns were logged at the same TensorBoard step, causing data loss.
Problem Description
When multiple environments finish episodes simultaneously:
# OLD CODE - BROKEN
for info in infos["final_info"]. if info and "episode" in info:
writer.add_scalar("charts/episodic_return", info["episode"]["r"], global_step)
# All envs log at the SAME step - data is overwritten!
Result: Only the last episode's return is visible in TensorBoard.
Solution Implemented
# NEW CODE - FIXED
for i, info in enumerate(infos["final_info"]). if info and "episode" in info:
logging_step = global_step - args.num_envs + i
writer.add_scalar("charts/episodic_return", info["episode"]["r"], logging_step)
# Each env logs at a UNIQUE step - all data preserved!
Implementation Status
| Component |
Status |
Details |
| ✅ Code Changes |
Complete |
30 files updated |
| ✅ Unit Tests |
Complete |
6 tests, all passing |
| ✅ Demo Script |
Complete |
demo_fix.py visualization |
| ⏳ CI Tests |
Blocked |
JAX dependency issue #540 |
| ⏳ Code Review |
Pending |
Awaiting maintainer approval |
| ⏳ Merge |
Pending |
Blocked by CI |
Files Modified
Algorithm Files (25):
cleanrl/ppo.py - Added enumerate and logging_step
cleanrl/ppo_atari.py - Added enumerate and logging_step
cleanrl/ppo_atari_envpool.py - Added enumerate and logging_step
cleanrl/ppo_atari_envpool_xla.py - Added enumerate and logging_step
cleanrl/ppo_atari_jax.py - Added enumerate and logging_step
cleanrl/ppo_continuous_action.py - Added enumerate and logging_step
cleanrl/ppo_continuous_action_jax.py - Added enumerate and logging_step
cleanrl/ppo_procgen.py - Removed break, added logging_step
cleanrl/sac_continuous_action.py - Removed break, added logging_step
cleanrl/sac_atari.py - Removed break, added logging_step
cleanrl/sac_ae_continuous_action.py - Removed break, added logging_step
cleanrl/td3_continuous_action.py - Removed break, added logging_step
cleanrl/td3_continuous_action_jax.py - Added enumerate and logging_step
cleanrl/ddpg_continuous_action.py - Added enumerate and logging_step
cleanrl/ddpg_continuous_action_jax.py - Added enumerate and logging_step
cleanrl/dqn.py - Added enumerate and logging_step
cleanrl/dqn_atari.py - Added enumerate and logging_step
cleanrl/dqn_atari_jax.py - Added enumerate and logging_step
cleanrl/dqn_jax.py - Added enumerate and logging_step
cleanrl/c51.py - Added enumerate and logging_step
cleanrl/c51_atari.py - Added enumerate and logging_step
cleanrl/c51_atari_jax.py - Added enumerate and logging_step
cleanrl/c51_jax.py - Added enumerate and logging_step
cleanrl/qdagger_dqn_atari_jax_impalacnn.py - Added enumerate and logging_step
cleanrl/trpo_continuous_action.py - Added enumerate and logging_step
cleanrl/ppo_atari_envpool_xla_jax.py - Added enumerate and logging_step
cleanrl/ppo_atari_envpool_xla_jax_scan.py - Added enumerate and logging_step
cleanrl/ppo_lstm_atari.py - Added enumerate and logging_step
cleanrl/ppo_atari_lstm.py - Added enumerate and logging_step
cleanrl/ppo_atari_envpool_xla_jax_scan.py - Added enumerate and logging_step
Test Coverage
New Test File: tests/test_episodic_logging.py
def test_single_episode_logging():
"""Test that single episode logs correctly"""
def test_multiple_episodes_same_step():
"""Test that multiple episodes at same step log at unique steps"""
def test_no_duplicate_steps():
"""Test that all logging steps are unique"""
def test_all_episodes_logged():
"""Test that all episode returns are logged"""
def test_per_env_step_calculation():
"""Test the per-env logging_step formula"""
def test_logging_step_doesnt_exceed_global_step():
"""Test that logging_step <= global_step"""
Result:
tests/test_episodic_logging.py::test_single_episode_logging PASSED
tests/test_episodic_logging.py::test_multiple_episodes_same_step PASSED
tests/test_episodic_logging.py::test_no_duplicate_steps PASSED
tests/test_episodic_logging.py::test_all_episodes_logged PASSED
tests/test_episodic_logging.py::test_per_env_step_calculation PASSED
tests/test_episodic_logging.py::test_logging_step_doesnt_exceed_global_step PASSED
6 passed
Visualization
Demo Script: demo_fix.py
Shows the before/after comparison:
- BEFORE: All episodes log at step 1000 → only last value (30.0) is visible
- AFTER: Episodes log at steps 997, 998, 999, 1000 → all values visible
Pull Request
PR #539: #539
Blocking Issues
Issue #540: JAX CI tests failing (repository-wide issue)
- Affects ALL recent PRs
- Needs separate fix to update JAX dependencies
- Not related to the episodic logging fix itself
Validation Script
Script: test_ci_fix.py
Validates:
- ✅ pyproject.toml requires-python is correct
- ✅ JAX dependencies are present
- ✅ Episodic logging fix is in place
- ✅ Break statements were removed
- ✅ Unit tests pass
- ✅ Count of fixed files (31 files)
Impact
Before Fix:
- Only one episode's return visible when multiple envs finish simultaneously
- Biased logging (first or last episode depending on implementation)
- Inaccurate training metrics
After Fix:
- All episode returns logged at unique TensorBoard steps
- Unbiased, complete logging
- Accurate training metrics
Related Issues/PRs
Checklist
Labels: enhancement, bug-fix, logging, testing, ready-for-merge
📋 Implementation Tracker: Issue #508 - Episodic Return Logging Bug
Original Issue
Issue #508: When using multiple parallel environments (
num_envs > 1), episodic returns were logged at the same TensorBoard step, causing data loss.Problem Description
When multiple environments finish episodes simultaneously:
Result: Only the last episode's return is visible in TensorBoard.
Solution Implemented
Implementation Status
demo_fix.pyvisualizationFiles Modified
Algorithm Files (25):
cleanrl/ppo.py- Added enumerate and logging_stepcleanrl/ppo_atari.py- Added enumerate and logging_stepcleanrl/ppo_atari_envpool.py- Added enumerate and logging_stepcleanrl/ppo_atari_envpool_xla.py- Added enumerate and logging_stepcleanrl/ppo_atari_jax.py- Added enumerate and logging_stepcleanrl/ppo_continuous_action.py- Added enumerate and logging_stepcleanrl/ppo_continuous_action_jax.py- Added enumerate and logging_stepcleanrl/ppo_procgen.py- Removed break, added logging_stepcleanrl/sac_continuous_action.py- Removed break, added logging_stepcleanrl/sac_atari.py- Removed break, added logging_stepcleanrl/sac_ae_continuous_action.py- Removed break, added logging_stepcleanrl/td3_continuous_action.py- Removed break, added logging_stepcleanrl/td3_continuous_action_jax.py- Added enumerate and logging_stepcleanrl/ddpg_continuous_action.py- Added enumerate and logging_stepcleanrl/ddpg_continuous_action_jax.py- Added enumerate and logging_stepcleanrl/dqn.py- Added enumerate and logging_stepcleanrl/dqn_atari.py- Added enumerate and logging_stepcleanrl/dqn_atari_jax.py- Added enumerate and logging_stepcleanrl/dqn_jax.py- Added enumerate and logging_stepcleanrl/c51.py- Added enumerate and logging_stepcleanrl/c51_atari.py- Added enumerate and logging_stepcleanrl/c51_atari_jax.py- Added enumerate and logging_stepcleanrl/c51_jax.py- Added enumerate and logging_stepcleanrl/qdagger_dqn_atari_jax_impalacnn.py- Added enumerate and logging_stepcleanrl/trpo_continuous_action.py- Added enumerate and logging_stepcleanrl/ppo_atari_envpool_xla_jax.py- Added enumerate and logging_stepcleanrl/ppo_atari_envpool_xla_jax_scan.py- Added enumerate and logging_stepcleanrl/ppo_lstm_atari.py- Added enumerate and logging_stepcleanrl/ppo_atari_lstm.py- Added enumerate and logging_stepcleanrl/ppo_atari_envpool_xla_jax_scan.py- Added enumerate and logging_stepTest Coverage
New Test File:
tests/test_episodic_logging.pyResult:
Visualization
Demo Script:
demo_fix.pyShows the before/after comparison:
Pull Request
PR #539: #539
Blocking Issues
Issue #540: JAX CI tests failing (repository-wide issue)
Validation Script
Script:
test_ci_fix.pyValidates:
Impact
Before Fix:
After Fix:
Related Issues/PRs
Checklist
Labels: enhancement, bug-fix, logging, testing, ready-for-merge