[2/N] Pass model_config to the Attention constructors #38661

Open
MatthewBonanni wants to merge 5 commits into vllm-project:main from MatthewBonanni:thread-model-config

MatthewBonanni (Collaborator) commented Mar 31, 2026

Purpose

We already pass cache_config and quant_config as arguments to Attention.__init__(), but model_config is routinely fetched from get_current_vllm_config. #38124 requires accessing model_config.dtype much more frequently, so this PR passes model_config as an explicit argument, which standardizes the constructor signature and reduces reliance on get_current_vllm_config.
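
The change in pattern can be sketched roughly as follows. This is a minimal illustration, not the vLLM code: ModelConfig here is a hypothetical stand-in that models only dtype, get_current_model_config is a hypothetical substitute for the ambient lookup done by get_current_vllm_config, and the two Attention classes are simplified placeholders for the real constructor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfig:
    """Hypothetical stand-in for vLLM's ModelConfig; only dtype is modeled."""

    dtype: str


# Hypothetical stand-in for the ambient state that a global config lookup reads.
_CURRENT_MODEL_CONFIG: Optional[ModelConfig] = None


def get_current_model_config() -> ModelConfig:
    """Return the ambient config, or fail if nothing has set one."""
    if _CURRENT_MODEL_CONFIG is None:
        raise RuntimeError("no ambient config is set")
    return _CURRENT_MODEL_CONFIG


class AttentionImplicit:
    """Before: the layer pulls its config from ambient global state."""

    def __init__(self, num_heads: int) -> None:
        self.num_heads = num_heads
        self.dtype = get_current_model_config().dtype


class AttentionExplicit:
    """After: the caller passes model_config alongside the other configs."""

    def __init__(self, num_heads: int, model_config: ModelConfig) -> None:
        self.num_heads = num_heads
        self.dtype = model_config.dtype


# The explicit variant works without any ambient state having been set up.
layer = AttentionExplicit(num_heads=8, model_config=ModelConfig(dtype="bfloat16"))
```

With the explicit argument, the layer's dependency on the model dtype is visible in its signature, and constructing it outside a config context no longer fails.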

Test Plan

CI

Test Result

TBD


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>

mergify bot commented Mar 31, 2026

Hi @MatthewBonanni, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request updates the model executor layers and attention mechanisms across various model implementations to explicitly pass the model_config object. It also introduces a mapping from torch.dtype to KV cache string representations in vllm/utils/torch_utils.py to ensure correct cache configuration when cache_config is not provided. I have no further feedback to provide.
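
A fallback of the kind described above might look roughly like the sketch below. Everything in it is an assumption made for illustration: DType is a stand-in for torch.dtype so the example runs without PyTorch, and the table name, the string values, and resolve_kv_cache_dtype are hypothetical rather than taken from vllm/utils/torch_utils.py.

```python
from enum import Enum
from typing import Optional


class DType(Enum):
    """Stand-in for torch.dtype so the sketch runs without PyTorch."""

    FLOAT16 = "float16"
    BFLOAT16 = "bfloat16"
    FLOAT32 = "float32"


# Hypothetical table; the names and strings used in the PR may differ.
DTYPE_TO_KV_CACHE_STR = {
    DType.FLOAT16: "half",
    DType.BFLOAT16: "bfloat16",
    DType.FLOAT32: "float",
}


def resolve_kv_cache_dtype(cache_dtype: Optional[str], model_dtype: DType) -> str:
    """Prefer an explicit cache setting; otherwise derive it from the model dtype."""
    if cache_dtype is not None:
        return cache_dtype
    try:
        return DTYPE_TO_KV_CACHE_STR[model_dtype]
    except KeyError:
        raise ValueError(f"no KV cache string for {model_dtype}") from None
```

The point of the lookup is that a layer built without a cache_config can still pick a cache dtype consistent with the model_config it was handed.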

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
MatthewBonanni added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Mar 31, 2026
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>

Labels

  • deepseek: Related to DeepSeek models
  • gpt-oss: Related to GPT-OSS models
  • llama: Related to Llama models
  • qwen: Related to Qwen models
  • ready: ONLY add when PR is ready to merge/full CI is needed

Projects

Status: To Triage
