Reduce K8s plugin template duplication #320

Open
khluu wants to merge 1 commit into main from cleanup/k8s-plugin-template-dedup

khluu (Collaborator) commented Mar 29, 2026

Summary

  • Replace three near-identical template dicts (h100_plugin_template, nebius_h200_plugin_template, a100_plugin_template) with shared constants (_COMMON_ENV, _COMMON_VOLUME_MOUNTS, _COMMON_VOLUMES) and a _build_k8s_template() builder function
  • Fix a bug on line 135 where the comparison used DeviceType.A100.value instead of DeviceType.A100; this was inconsistent with the other comparisons and could never match, since step.device is a DeviceType enum member
  • Raise ValueError for unsupported device types instead of silently leaving plugin as None and later crashing with a TypeError on line 140

Test plan

  • H100: no nodeSelector, no priorityClassName, image gets pull-through-cache replacement
  • A100: has nodeSelector for A100, has priorityClassName "ci"
  • H200: has nodeSelector for gpu-h200-sxm, hardcoded 8 GPUs
  • Unsupported device raises ValueError
  • All configs include common env (VLLM_USAGE_SOURCE, HF_TOKEN secret ref)
  • num_devices is correctly set in resource limits
  • Two calls return independent dicts (no shared mutation)
  • All 13 tests pass locally

🤖 Generated with Claude Code

Replace three large template dicts with shared constants (_COMMON_ENV,
_COMMON_VOLUME_MOUNTS, _COMMON_VOLUMES) and a _build_k8s_template()
builder function. Also fixes inconsistent .value comparison and adds
error handling for unsupported device types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>