Add support for tensors/heads not divisible by GPUs #3

Triggered via pull request on February 3, 2025 at 18:37
Status: Failure
Total duration: 4m 10s

Workflow: pre-commit.yml (on: pull_request)
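
Assuming the workflow runs Ruff through pre-commit, the same checks can usually be reproduced locally with `pre-commit run --all-files`, or by invoking Ruff directly with `ruff check .`.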

Annotations (10 errors)
vllm/config.py:690:9: F841 Local variable `total_num_attention_heads` is assigned to but never used
vllm/config.py:692:9: F841 Local variable `tensor_parallel_size` is assigned to but never used
vllm/model_executor/layers/fused_moe/layer.py:453:9: F841 Local variable `tp_rank` is assigned to but never used
vllm/model_executor/layers/linear.py:290:81: E501 Line too long (93 > 80)
vllm/model_executor/layers/linear.py:377:13: F841 Local variable `shard_size` is assigned to but never used
vllm/model_executor/layers/linear.py:678:9: F841 Local variable `tp_size` is assigned to but never used
vllm/model_executor/layers/linear.py:1127:9: F841 Local variable `tp_rank` is assigned to but never used
vllm/model_executor/layers/linear.py:1151:13: F841 Local variable `shard_size` is assigned to but never used
vllm/model_executor/layers/quantization/base_config.py:43:81: E501 Line too long (83 > 80)
vllm/model_executor/layers/quantization/fp8.py:170:9: F841 Local variable `tp_chunk` is assigned to but never used
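
All ten findings come from two Ruff rules: F841 (a local variable is assigned but never read, typically leftover state from a refactor) and E501 (a line exceeds the configured 80-character limit). A minimal sketch of what each rule flags and a typical fix; the function and variable names below are hypothetical, not the actual vllm code:

```python
# Illustrative sketch of the two Ruff rules behind the annotations.
# All names here are hypothetical, not the actual vllm code.

# F841: local variable assigned but never used.
def heads_per_rank_before(total_heads: int, tp_size: int) -> int:
    remainder = total_heads % tp_size  # F841: `remainder` is never read
    return total_heads // tp_size

def heads_per_rank_after(total_heads: int, tp_size: int) -> int:
    # Fix: drop the dead assignment, or actually use the value.
    if total_heads % tp_size != 0:
        raise ValueError("heads not divisible by tensor parallel size")
    return total_heads // tp_size

# E501: line longer than 80 characters.
# Fix: wrap long strings or expressions inside parentheses.
ERROR_MSG = (
    "Total number of attention heads must be divisible by "
    "tensor parallel size."
)
```

Ruff's default dummy-variable pattern exempts names like `_`, so renaming can silence F841, but deleting the dead assignment is usually the cleaner fix when the value is truly unused.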