Maybe we should simplify the code so that we just fill the io_uring submission queue until len(sq) + len(cq) - MAX_ENTRIES_PER_ITERATION >= SQ_LEN, and not bother with the HIGH_WATER_MARK idea. I'm not sure HIGH_WATER_MARK makes sense.
I should probably benchmark though!
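
For reference, a rough sketch of what the simplified fill loop might look like, taking the stop condition above literally. This is only a model: plain deques stand in for the real rings, the values of SQ_LEN and MAX_ENTRIES_PER_ITERATION are made up, and the helper names are hypothetical. The extra len(sq) < SQ_LEN guard is an added assumption, since with an empty completion queue the condition alone would not trigger until len(sq) reaches SQ_LEN + MAX_ENTRIES_PER_ITERATION.

```python
from collections import deque

# Hypothetical values, chosen only for illustration.
SQ_LEN = 64
MAX_ENTRIES_PER_ITERATION = 16


def should_stop_filling(sq_len: int, cq_len: int) -> bool:
    """The stop condition from the note, taken literally."""
    return sq_len + cq_len - MAX_ENTRIES_PER_ITERATION >= SQ_LEN


def fill_submission_queue(sq: deque, cq: deque, pending: deque) -> int:
    """Move entries from `pending` into `sq` until the stop condition holds.

    Returns the number of entries pushed. The `len(sq) < SQ_LEN` guard is an
    assumption added here so the model never holds more than SQ_LEN entries.
    """
    pushed = 0
    while (
        pending
        and len(sq) < SQ_LEN
        and not should_stop_filling(len(sq), len(cq))
    ):
        sq.append(pending.popleft())
        pushed += 1
    return pushed


if __name__ == "__main__":
    sq, cq = deque(), deque(range(40))  # 40 completions not yet reaped
    pending = deque(range(200))
    print(fill_submission_queue(sq, cq, pending), len(sq), len(pending))
```

Something like this could also serve as a baseline when benchmarking against the HIGH_WATER_MARK variant.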