Summary
In data-parallel (DP) deployments, each rank maintains its own KV-cache. Ranks can be deployed either as separate vLLM processes (different API servers/frontends) or within the same vLLM frontend.
KVEvents carry a data_parallel_rank field in every message, but it is currently ignored. Sending the rank per message makes sense for the latter kind of deployment; for the former, the DP rank can instead be assigned at the connection/subscription level.
This gap should be closed in coordination with the DP-aware wide-ep work.
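
To make the two ways of obtaining a DP rank concrete, here is a minimal sketch of how a consumer might resolve the rank for an incoming event batch. It is illustrative only: `Subscription`, `resolve_dp_rank`, and `cache_key` are hypothetical names, and the precedence (message field first, then the subscription-level rank, then rank 0) is an assumption rather than the behavior of any existing code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Subscription:
    """Hypothetical per-connection state held by the event consumer."""
    pod_id: str
    # Set when one connection maps to exactly one DP rank
    # (ranks deployed as separate vLLM processes); None otherwise.
    dp_rank: Optional[int] = None


def resolve_dp_rank(sub: Subscription, message_rank: Optional[int]) -> int:
    """Pick the DP rank for one event batch.

    Assumed policy: prefer the per-message data_parallel_rank, fall back
    to the rank assigned at subscription time, and default to rank 0.
    """
    if message_rank is not None:
        return message_rank
    if sub.dp_rank is not None:
        return sub.dp_rank
    return 0


def cache_key(sub: Subscription, message_rank: Optional[int]) -> Tuple[str, int]:
    """Key KV-cache state per (pod, DP rank) instead of per pod only."""
    return (sub.pod_id, resolve_dp_rank(sub, message_rank))
```

Under this sketch, ranks that share one frontend are told apart by the message field, while a rank running as its own process is identified by the rank stored on its subscription.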