On the compact branch, PreemptionMode is referenced on line 270 of lmcache_vllm/scheduler_adapter.py but is never imported or defined in that module. As a result, a NameError is raised whenever a sequence group gets preempted, e.g. when there is not enough GPU space left for the KV cache. To reproduce, run any offline batch inference workload with a large batch size.
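
For context, a minimal sketch of the likely fix, assuming the adapter is meant to mirror upstream vLLM's scheduler, where PreemptionMode is the enum defined in vllm.core.scheduler; the branch shown in the comments is illustrative only, not the actual code at line 270:

```python
# Likely fix for lmcache_vllm/scheduler_adapter.py: import the enum that the
# preemption path references. Assumes upstream vLLM, where PreemptionMode is
# defined in vllm.core.scheduler with RECOMPUTE and SWAP members.
from vllm.core.scheduler import PreemptionMode

# Illustrative shape of the branch that currently fails (not the actual code
# at line 270):
#
#     if preemption_mode == PreemptionMode.RECOMPUTE:
#         self._preempt_by_recompute(seq_group)
#     else:
#         self._preempt_by_swap(seq_group, blocks_to_swap_out)
#
# Without the import above, the first time a sequence group is preempted this
# branch raises: NameError: name 'PreemptionMode' is not defined.
```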