Motivation
According to the paper io_uring: Rethinking Asynchronous I/O for Storage Systems, io_uring can significantly improve CPU utilization and I/O latency for storage systems, especially those with a buffer manager.
Based on this observation, Riffle introduces an io_uring-based I/O handler (UringIO) to optimize local shuffle storage performance.
Prerequisites
- Linux kernel version >= 5.10
- Currently verified on Anolis OS 8
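The kernel requirement above can be checked before enabling UringIO. The following is a minimal sketch, not part of Riffle: the function names `parse_kernel_release` and `kernel_supports_uring_io` are hypothetical, and the release strings in the usage note are illustrative examples only.

```python
import re

# Minimum kernel version taken from the Prerequisites list (>= 5.10).
MIN_KERNEL = (5, 10)


def parse_kernel_release(release):
    """Extract (major, minor) from a kernel release string.

    Hypothetical helper: accepts strings shaped like '5.10.134-16.an8.x86_64'.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_supports_uring_io(release, minimum=MIN_KERNEL):
    """Return True if the release meets the minimum (major, minor) version."""
    # Tuples compare element by element, so (5, 10) >= (5, 10) and (6, 1) >= (5, 10).
    return parse_kernel_release(release) >= minimum
```

On a live host, the release string could be taken from `platform.release()` in the Python standard library, for example `kernel_supports_uring_io(platform.release())`. This only compares version numbers; it does not detect whether io_uring has been disabled by the distribution or by a sandbox.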
Conclusions
- With io_uring enabled and 16 threads per disk (still under tuning), CPU load is approximately 3× lower than with the non-io_uring implementation.
  Update: Further tuning shows that 2 threads per disk are sufficient, achieving comparable overall throughput.
- After enabling io_uring:
  - Write throughput reaches 5 GB/s, compared to 3.75 GB/s without io_uring (about 33% higher; equivalently, the non-io_uring figure is 25% lower).
  - Read throughput shows no significant improvement, which is expected because:
    - Read requests are not intensive.
    - Each read operation transfers a relatively large data block.
    - io_uring provides limited benefit for this access pattern.
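The write throughput comparison can be worked through explicitly, since the percentage depends on which figure is used as the denominator. This is a small arithmetic sketch using only the two throughput numbers reported above; the variable and function names are illustrative.

```python
# Write throughput figures reported in the Conclusions (GB/s).
WITH_URING_GBPS = 5.0
WITHOUT_URING_GBPS = 3.75


def relative_change(new, baseline):
    """Change of `new` relative to `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline


# Gain measured against the non-io_uring baseline.
gain = relative_change(WITH_URING_GBPS, WITHOUT_URING_GBPS)

# The same difference measured against the io_uring figure instead.
shortfall = (WITH_URING_GBPS - WITHOUT_URING_GBPS) / WITH_URING_GBPS

print(f"gain over baseline: {gain:.1%}")      # 33.3%
print(f"baseline shortfall: {shortfall:.1%}")  # 25.0%
```

Measured against the non-io_uring baseline, 5 GB/s is about 33% higher; the 25% figure corresponds to the baseline being 25% lower than the io_uring result.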
Benchmark
The benchmark runs a 2.2 TB TeraSort application, with a single Riffle server handling all shuffle data.
Performance Results
| Type | Write Time | Read Time |
|---|---|---|
| With io_uring | 3.4 min | 6.3 min |
| Without io_uring | 4.1 min | 8.3 min |
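For readers who want the table expressed as relative changes, the sketch below derives the time reduction and speedup for each phase. It uses only the four durations in the table; the names `TIMES_MIN` and `summarize` are illustrative, and no throughput is derived because the issue does not state how many bytes each phase moves.

```python
# Durations from the Performance Results table, in minutes:
# phase -> (with io_uring, without io_uring)
TIMES_MIN = {
    "write": (3.4, 4.1),
    "read": (6.3, 8.3),
}


def summarize(times):
    """Return per-phase time reduction (fraction) and speedup (ratio)."""
    summary = {}
    for phase, (with_uring, without_uring) in times.items():
        summary[phase] = {
            # Fraction of the baseline duration that was saved.
            "reduction": (without_uring - with_uring) / without_uring,
            # How many times faster the phase completed.
            "speedup": without_uring / with_uring,
        }
    return summary


for phase, stats in summarize(TIMES_MIN).items():
    print(f"{phase}: {stats['reduction']:.1%} shorter, {stats['speedup']:.2f}x speedup")
```

With the table's values, the write phase is about 17.1% shorter (1.21x) and the read phase about 24.1% shorter (1.32x).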
Figure: CPU load comparison