Open
Description
To reduce localfile read latency, we use the page cache to prefetch data from the localfile.
Currently, I'm applying this strategy to our internal Spark jobs to observe the potential performance improvement. This issue tracks those metrics and any follow-up improvements.
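For readers unfamiliar with the idea, here is a minimal sketch of page-cache prefetching. It is not this project's implementation. It assumes a Linux-like system and uses `posix_fadvise` with `POSIX_FADV_WILLNEED` as one common way to ask the kernel to load an upcoming range into the page cache. The names `read_with_prefetch` and `ahead_batches` are hypothetical and chosen only for illustration.

```python
import os


def read_with_prefetch(path, offset, length, ahead_batches=2):
    """Read `length` bytes at `offset` and hint the kernel to prefetch what follows.

    Hypothetical helper: `ahead_batches` is the number of same-sized batches
    after the current read that we ask the kernel to pull into the page cache.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):
            try:
                # Advisory hint only: the range after this read will be needed soon.
                os.posix_fadvise(
                    fd,
                    offset + length,
                    length * ahead_batches,
                    os.POSIX_FADV_WILLNEED,
                )
            except OSError:
                # The hint is an optimization, so a failure must not break the read.
                pass
        # Positional read of the requested range (may be short at end of file).
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)
```

If the hint works, a later sequential read of the next batch should be served from memory rather than disk, which is the kind of effect the hit and miss metrics tracked below are meant to show.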
Impressive improvement of the current implementation
Bug fixes
- Incorrect metric of read_ahead_misses #482
- Hang with read ahead when reading #475
- fix(read-ahead): Make sequential param as false by default #474
- feat(read-ahead): Correct hit metric and introduce 2 metrics to indicate latency #484
Potential improvements
- Slow read ahead system operation
- Apply this mechanism into the huge partition tasks
- feat(read-ahead): Introduce separated options to control ahead batch number/size for client #481
- feat(read-ahead): followup #481 to respect options setting by client for read-ahead batch number/size #485