[CPU] Support head_size 512 in cpu_attn#38676

Merged
Isotr0py merged 1 commit into vllm-project:main from bigPYJ1151:attn_512
Apr 1, 2026

bigPYJ1151 (Member) commented Apr 1, 2026

Purpose

Add support for head_size 512 in cpu_attn, for upcoming models.

Test Plan

CI tests

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

mergify bot commented Apr 1, 2026

Documentation preview: https://vllm--38676.org.readthedocs.build/en/38676/

mergify bot added the documentation, cpu, and v1 labels Apr 1, 2026
bigPYJ1151 added the ready label Apr 1, 2026
gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request adds support for a head size of 512 to the CPU attention backend. The changes include updating the dispatch generation script, documentation, unit tests, and the backend's supported head sizes list. I have no feedback to provide.
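To illustrate what extending a backend's "supported head sizes list" can look like, here is a minimal sketch. It is not taken from the PR diff: the names `HYPOTHETICAL_SUPPORTED_HEAD_SIZES`, `validate_head_size`, and `with_head_size`, as well as the placeholder sizes, are assumptions made for illustration and may not match vLLM's actual code.

```python
# Hypothetical illustration only: the names and values below are assumptions,
# not vLLM's actual implementation.
HYPOTHETICAL_SUPPORTED_HEAD_SIZES = [64, 128, 256]  # placeholder values


def validate_head_size(head_size: int, supported: list[int]) -> None:
    """Raise ValueError if head_size is not in the supported list."""
    if head_size not in supported:
        raise ValueError(
            f"Head size {head_size} is not supported; "
            f"supported sizes: {sorted(supported)}"
        )


def with_head_size(supported: list[int], new_size: int) -> list[int]:
    """Return a new sorted list that also contains new_size."""
    return sorted(set(supported) | {new_size})


# Adding 512 to the (placeholder) list makes the validation pass.
updated_sizes = with_head_size(HYPOTHETICAL_SUPPORTED_HEAD_SIZES, 512)
validate_head_size(512, updated_sizes)
```

In a real backend, a list like this would typically have to stay in sync with the kernels that are actually compiled, which may be why the summary mentions the dispatch generation script alongside the list.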

Signed-off-by: jiang1.li <jiang1.li@intel.com>
Isotr0py enabled auto-merge (squash) April 1, 2026 04:15
Isotr0py merged commit 36d7f19 into vllm-project:main Apr 1, 2026
66 checks passed
Labels

  • cpu: Related to CPU backends
  • documentation: Improvements or additions to documentation
  • ready: ONLY add when PR is ready to merge/full CI is needed
  • v1

3 participants