This repository was archived by the owner on Jul 17, 2025. It is now read-only.

Unbounded TLB WorkQueue #290

@hunhoffe

Describe the bug

The number of requests that can be outstanding in the TLB WorkQueue is currently not bounded. The queue holds both shootdown requests (which are important to handle) and advance-replica work requests (which are less important to handle).

Currently, if the queue is full when enqueue is called, the error is ignored: https://github.com/vmware-labs/node-replicated-kernel/blob/fc25186d57ca400c8e4a7cb313deb8eabd21d971/kernel/src/arch/x86_64/tlb.rs#L112

If that line is changed to check the result, it becomes clear that requests can be dropped when the queue is full.

Reproduction steps

  1. Change the line linked above to check for a failed enqueue (use expect to unwrap the result).
  2. Run the fxmark benchmark with roughly 96 cores.
  3. Most of the time, this causes an error.
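
The difference between the current behavior and step 1 can be sketched as follows. This is a minimal, single-threaded illustration and not the kernel's code: `WorkItem`, `BoundedQueue`, `enqueue_ignoring_errors`, and `enqueue_checked` are hypothetical names, and the queue is a plain `VecDeque` wrapper standing in for the real WorkQueue.

```rust
use std::collections::VecDeque;

/// Hypothetical work items mirroring the two request kinds named in this issue.
#[derive(Debug, Clone, PartialEq)]
pub enum WorkItem {
    /// Illustrative payload: a virtual address to flush.
    Shootdown(u64),
    /// Illustrative payload: a replica/log identifier.
    AdvanceReplica(usize),
}

/// Hypothetical fixed-capacity queue standing in for the TLB WorkQueue.
pub struct BoundedQueue {
    items: VecDeque<WorkItem>,
    capacity: usize,
}

impl BoundedQueue {
    pub fn new(capacity: usize) -> Self {
        Self { items: VecDeque::with_capacity(capacity), capacity }
    }

    /// Hands the item back to the caller when the queue is full.
    pub fn push(&mut self, item: WorkItem) -> Result<(), WorkItem> {
        if self.items.len() >= self.capacity {
            return Err(item);
        }
        self.items.push_back(item);
        Ok(())
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }
}

/// Pattern described in the issue: the result is discarded, so a full
/// queue silently drops the request.
pub fn enqueue_ignoring_errors(q: &mut BoundedQueue, item: WorkItem) {
    let _ = q.push(item);
}

/// Reproduction step 1: panic as soon as a request cannot be enqueued.
pub fn enqueue_checked(q: &mut BoundedQueue, item: WorkItem) {
    q.push(item).expect("work queue is full; request would be lost");
}
```

With a capacity of 1, two calls to `enqueue_ignoring_errors` leave only the first request in the queue, while the second call to `enqueue_checked` panics, which is the kind of failure the reproduction steps are meant to surface.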

Expected behavior

We would like the queue to have a theoretical bound on the number of outstanding requests, so that we can ensure it is always possible to enqueue. This property is important because shootdowns must not be lost.
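
One possible way to obtain such a bound is sketched below. It is not a proposal taken from the codebase and rests on two assumptions that would need to be checked against the kernel: (a) each sending core has at most one unacknowledged shootdown per target at a time, and (b) advance-replica requests can be coalesced into a single pending flag. The type `BoundedWork` and its methods are hypothetical, and the sketch is single-threaded (a real version would need atomics or locking).

```rust
/// Hypothetical per-target work state with a fixed upper bound:
/// one shootdown slot per sending core plus one coalesced advance flag.
pub struct BoundedWork {
    /// Index = sending core id; `Some(addr)` means a shootdown is pending.
    shootdown_slots: Vec<Option<u64>>,
    /// Assumption: repeated advance-replica requests can be merged into one.
    advance_pending: bool,
}

impl BoundedWork {
    pub fn new(num_cores: usize) -> Self {
        Self { shootdown_slots: vec![None; num_cores], advance_pending: false }
    }

    /// Posts a shootdown from `sender`. Fails only if that sender already has
    /// one outstanding, which assumption (a) rules out in normal operation.
    /// Panics if `sender` is not a valid core id.
    pub fn post_shootdown(&mut self, sender: usize, addr: u64) -> Result<(), u64> {
        let slot = &mut self.shootdown_slots[sender];
        if slot.is_some() {
            return Err(addr);
        }
        *slot = Some(addr);
        Ok(())
    }

    /// Advance requests never fail: they only set a flag.
    pub fn request_advance(&mut self) {
        self.advance_pending = true;
    }

    /// Takes all pending work, freeing every slot for reuse.
    pub fn drain(&mut self) -> (Vec<u64>, bool) {
        let addrs = self
            .shootdown_slots
            .iter_mut()
            .filter_map(|slot| slot.take())
            .collect();
        let advance = std::mem::replace(&mut self.advance_pending, false);
        (addrs, advance)
    }
}
```

Under these assumptions the pending work per target is at most one shootdown per core plus one flag, so posting a shootdown cannot fail because of lower-priority advance-replica traffic.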


Labels: bug
