🔥 Remove all continuous batching tests#693
joerunde wants to merge 17 commits into 2.0-release-prep
Signed-off-by: Joe Runde <joe@joerun.de>
👋 Hi! Thank you for contributing to vLLM support on Spyre. We also recommend installing prek and configuring it to check your code before every local commit.
repo: "git+https://github.com/vllm-project/vllm --branch main"
test_suite:
  - name: "chunked prefill"
    markers: "cpu and chunked_prefill and not prefix_caching and not quantized and not multimodal"
The chunked_prefill mark still exists, but I don't think it makes sense to keep maintaining it, since every decoder test will be using chunked prefill.
We could clean up mode: str, along with these marks, into something like disable_prefix_caching: bool. I think that would make sense once we swap the default behavior to prefix caching; then we can selectively disable it for the tests that require it.
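A minimal sketch of what that cleanup could look like, assuming the legacy mode string only takes the values "cp" (chunked prefill without prefix caching) and "pc" (prefix caching enabled). Those values, and the names DecoderCase and from_legacy_mode, are hypothetical and chosen for illustration; only disable_prefix_caching comes from the comment above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecoderCase:
    """Hypothetical description of one decoder test case."""

    name: str
    # Chunked prefill is always on, so prefix caching is the only switch left.
    # Defaults to False, i.e. prefix caching is enabled unless a test opts out.
    disable_prefix_caching: bool = False


def from_legacy_mode(name: str, mode: str) -> DecoderCase:
    """Translate an assumed legacy mode string into the boolean flag."""
    # Assumed legacy values: "cp" = chunked prefill only, "pc" = prefix caching.
    if mode == "cp":
        return DecoderCase(name, disable_prefix_caching=True)
    if mode == "pc":
        return DecoderCase(name, disable_prefix_caching=False)
    # Anything else (for example a continuous batching mode) is rejected.
    raise ValueError(f"unsupported mode: {mode!r}")
```

With a flag like this, a test that cannot run with prefix caching would opt out explicitly, and every other test would pick up the default.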
Description
Starting on #679: this PR deletes all of the continuous batching (CB) tests and refactors the cases that were written only for CB to run with chunked prefill (CP) instead.
This PR is limited to test-only changes to keep the diff small. We can work separately on actually removing all of the CB-related code.