Summary
Add HMA (Hybrid Memory Architecture) support to the llmd-fs-backend by adopting the new vllm offloading connector HMA interfaces.
This enables KV cache offloading for models with multiple KV cache groups, such as full attention + sliding window + Mamba (e.g., Jamba, Zamba, Command-A).
Related to vllm-project/vllm#33689 (KV Offloading Roadmap — HMA Support).
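To illustrate why hybrid models need this: layers with the same cache spec form one KV cache group, so a hybrid model ends up with several groups that must be offloaded independently. The sketch below is a hedged illustration of that grouping idea, not the actual vLLM implementation; the layer specs are made up.

```python
from collections import defaultdict

# Hypothetical layer layout for a hybrid model (illustrative only):
# each layer is tagged with its cache spec, and layers sharing a spec
# belong to the same KV cache group.
layer_specs = [
    ("full_attention", 0), ("sliding_window", 1), ("full_attention", 2),
    ("sliding_window", 3), ("mamba", 4),
]

groups = defaultdict(list)
for spec, layer_id in layer_specs:
    groups[spec].append(layer_id)

# Three groups -> three independent sets of blocks the FS backend
# must track when offloading.
assert groups == {"full_attention": [0, 2],
                  "sliding_window": [1, 3],
                  "mamba": [4]}
```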
What needs to happen
- Adopt the new vllm HMA interfaces: `CanonicalKVCaches`, `OffloadKey`, and the updated `OffloadingManager` API
- Update FS connector worker, manager, spec, and mediums modules
- Update tests
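The adoption work above can be sketched roughly as follows. This is a toy stand-in, not the real vLLM API: the class and field names mirror the interfaces listed above, but the signatures, the `group_id`/`block_hash` fields, and the `store`/`lookup` methods are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class OffloadKey:
    """Hypothetical key for one offloaded block: which KV cache group
    it belongs to (full attention, sliding window, Mamba, ...) plus the
    block's content hash."""
    group_id: int
    block_hash: str


class FSOffloadingManager:
    """Toy stand-in for an FS-backed offloading manager that keys blocks
    per KV cache group, so hybrid models don't collide across groups."""

    def __init__(self) -> None:
        self._store: Dict[OffloadKey, bytes] = {}

    def store(self, key: OffloadKey, block: bytes) -> None:
        self._store[key] = block

    def lookup(self, key: OffloadKey) -> Optional[bytes]:
        return self._store.get(key)


# Usage: the same block hash in two different groups must map to
# distinct entries, which a single flat key space could not express.
mgr = FSOffloadingManager()
mgr.store(OffloadKey(group_id=0, block_hash="abc"), b"full-attn block")
mgr.store(OffloadKey(group_id=1, block_hash="abc"), b"sliding-window block")
assert mgr.lookup(OffloadKey(0, "abc")) != mgr.lookup(OffloadKey(1, "abc"))
```

The design point is simply that the offload key must include the KV cache group, since hybrid models produce blocks with the same hash but different layouts per group.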