Make IterToMap load more lazily #454

Open
@ejguan

Description

🚀 The feature

Currently, IterToMap loads all data from the prior IterDataPipe the first time __getitem__ is invoked.

We could instead stop loading from the prior IterDataPipe as soon as the requested index is found. We might also need a flag to prevent the data from being loaded multiple times.
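
To make the proposal concrete, here is a minimal sketch in plain Python of what "stop at the requested index, plus a flag" could look like. It is not the torchdata implementation: the class name `LazyIterToMap`, the `_exhausted` flag, the `_load_until` helper, and the choice to raise `KeyError` for a missing key are all hypothetical, and the source is assumed to yield `(key, value)` pairs.

```python
from typing import Any, Dict, Hashable, Iterable, Iterator, Optional, Tuple

# Sentinel that never equals a real key; used to force a full load.
_NO_KEY = object()


class LazyIterToMap:
    """Hypothetical stand-in for IterToMap (not the torchdata class)."""

    def __init__(self, source: Iterable[Tuple[Hashable, Any]]) -> None:
        self._source = source
        self._iterator: Optional[Iterator[Tuple[Hashable, Any]]] = None
        self._cache: Dict[Hashable, Any] = {}
        # The flag from the proposal: True once the source is fully consumed,
        # so later misses do not try to read the source again.
        self._exhausted = False

    def _load_until(self, key: Any) -> None:
        """Pull items from the source until `key` is seen or the source ends."""
        if self._exhausted:
            return
        if self._iterator is None:
            self._iterator = iter(self._source)
        # Resumes where the previous call stopped, since the iterator is kept.
        for k, v in self._iterator:
            self._cache[k] = v
            if k == key:
                return
        self._exhausted = True

    def __getitem__(self, key: Hashable) -> Any:
        if key not in self._cache:
            self._load_until(key)
        # Raises KeyError if the key never appeared (an assumption of this sketch).
        return self._cache[key]

    def __len__(self) -> int:
        # The length is only known after reading everything.
        self._load_until(_NO_KEY)
        return len(self._cache)


if __name__ == "__main__":
    pulled = []

    def source():
        for i in range(5):
            pulled.append(i)  # record how far the source has been read
            yield i, i * 10

    dp = LazyIterToMap(source())
    value = dp[1]
    print(value, pulled)
```

In this sketch, accessing keys in the order the source produces them pulls one item per lookup, while `__len__` (or a lookup of a missing key) still forces a full read.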

Motivation, pitch

This would improve performance when users simply iterate over the MapDataPipe, since we would not need to pre-load everything at the start of iteration. In effect, it simulates the behavior of an IterDataPipe.
