Storage V2: lazy ExperimentResult.episodes + iterator-based bulk methods #264

@recursix

Problem

ExperimentResult.episodes() reads all episode.metadata.json files eagerly on first access. load_all_trajectories() and load_all_trajectory_metadata() return list[Trajectory], loading everything into memory.

At 100k episodes, that means 100k metadata file reads on first access and every trajectory held in memory at once, which is a significant memory and latency bottleneck.

Proposed Fix

  • Make ExperimentResult.episodes() yield EpisodeResult wrappers without reading metadata. The wrapper already loads metadata lazily on the first .metadata() call.
  • Return iterators instead of lists from load_all_trajectories() and load_all_trajectory_metadata().
  • Consider a ThreadPoolExecutor to parallelize metadata reads for callers that do need the full list.
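
A minimal sketch of what the three points above could look like, not the actual Storage V2 code. Several things here are assumptions made for illustration: the on-disk layout (one subdirectory per episode, each containing episode.metadata.json), the constructor signatures, plain dicts standing in for Trajectory objects, and the hypothetical method name load_all_trajectory_metadata_parallel().

```python
from __future__ import annotations

import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Iterator, Optional

METADATA_FILE = "episode.metadata.json"  # file name taken from the issue text


class EpisodeResult:
    """Hypothetical wrapper: holds only a path until metadata is requested."""

    def __init__(self, episode_dir: Path) -> None:
        self.episode_dir = episode_dir
        self._metadata: Optional[dict[str, Any]] = None

    def metadata(self) -> dict[str, Any]:
        # First call reads and caches the JSON file; later calls reuse the cache.
        if self._metadata is None:
            with open(self.episode_dir / METADATA_FILE, encoding="utf-8") as f:
                self._metadata = json.load(f)
        return self._metadata


class ExperimentResult:
    """Hypothetical stand-in; assumes one subdirectory per episode."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def episodes(self) -> Iterator[EpisodeResult]:
        # Generator: lists and sorts directory paths, opens no metadata files.
        for entry in sorted(self.root.iterdir()):
            if entry.is_dir():
                yield EpisodeResult(entry)

    def load_all_trajectory_metadata(self) -> Iterator[dict[str, Any]]:
        # Streams one metadata dict at a time instead of building a list.
        for episode in self.episodes():
            yield episode.metadata()

    def load_all_trajectory_metadata_parallel(
        self, max_workers: int = 8
    ) -> list[dict[str, Any]]:
        # For callers that need the full list: overlap the file reads across
        # threads. Executor.map returns results in input order.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(EpisodeResult.metadata, self.episodes()))
```

With this shape, `for ep in result.episodes(): ...` touches a metadata file only when `ep.metadata()` is called, and a caller that stops early never pays for the remaining episodes. The threaded variant is kept separate so the memory cost of materializing the full list is an explicit choice by the caller.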

Context

Identified during review of PR #262 (Storage V2).
