260424 sync upstream/master to odh/master#1445
vivekk16 wants to merge 9 commits into opendatahub-io:master from
Conversation
…rve#5429) Signed-off-by: Jooho Lee <jlee@redhat.com>
Signed-off-by: Jooho Lee <jlee@redhat.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
…ve#5407) Signed-off-by: Vivek Karunai Kiri Ragavan <vkarunai@redhat.com> (the enforce GitHub Action has an issue, so this was merged manually)
Signed-off-by: James Ostrander <jostrand@redhat.com>
…serve#5449) Signed-off-by: Jooho Lee <jlee@redhat.com>
…serve#5433) Signed-off-by: Vivek Karunai Kiri Ragavan <vkarunai@redhat.com>
Signed-off-by: Vivek Karunai Kiri Ragavan <vkarunai@redhat.com> # Conflicts: # test/e2e/pytest.ini
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: vivekk16. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Details: Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
📝 Walkthrough

This pull request introduces comprehensive autoscaling support for KServe LLM InferenceService via HPA and KEDA pathways. It adds new Helm-based infrastructure management scripts for Prometheus, Prometheus Adapter, and Workload Variant Autoscaler (WVA) deployment. The scheduler configuration migration system now handles disaggregated profile parameter transformations and plugin renaming. E2E testing is expanded with matrix-driven autoscaling test jobs and a new quick-install workflow. Minor updates include container CLI argument flag changes (

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Security observations:
- TLS verification bypass in configuration: multiple new shell scripts (
- Shell script credential/command injection risks: the new Helm management scripts use
- Arbitrary code execution in test load generation:
- Missing input validation in scheduler migration:

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
/retest
Actionable comments posted: 19
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
hack/setup/quick-install/kserve-knative-mode-dependency-install.sh (1)
647-668: ⚠️ Potential issue | 🟡 Minor

Dead autoscaling variables in a Knative-only install script (SC2034).

`PROMETHEUS_VERSION`, `PROMETHEUS_ADAPTER_VERSION`, `PROMETHEUS_NAMESPACE`, `PROMETHEUS_ADAPTER_NAMESPACE`, and `WVA_NAMESPACE` are declared but never consumed in this script — `main()` (lines 1325-1360) only installs helm/kustomize/yq/cert-manager/istio/knative-operator. Since the file is auto-generated (lines 19-24), fix it in the generator: either (a) only emit variables a given script actually uses, or (b) prefix unused defaults with `export` or `# shellcheck disable=SC2034` where they are intentionally passed through to child processes. Otherwise every regeneration will keep tripping shellcheck.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/quick-install/kserve-knative-mode-dependency-install.sh` around lines 647-668: the file declares unused autoscaling variables (PROMETHEUS_VERSION, PROMETHEUS_ADAPTER_VERSION, PROMETHEUS_NAMESPACE, PROMETHEUS_ADAPTER_NAMESPACE, WVA_NAMESPACE) while main() only installs knative-related components, which trips SC2034 on regen; update the generator that emits this script so it either (a) only emits the variables actually referenced by this Knative-only install script (remove PROMETHEUS_* and WVA_* from output) or (b) marks intentionally-unused defaults so shellcheck stops complaining (prefix with export or add a comment like "# shellcheck disable=SC2034" next to the unused declarations), and ensure the generator logic that produces KEDA_NAMESPACE, KSERVE_NAMESPACE, etc. continues to emit those still in use (refer to the main() entrypoint in this script when deciding which vars to include).

.github/workflows/e2e-test-quick-install.yaml (1)
30-190: ⚠️ Potential issue | 🟠 Major

Add explicit job permissions and use environment variables for event data.

None of the three jobs declare `permissions:` blocks. Each inherits the default `GITHUB_TOKEN` scope, allowing far more repository operations than needed (CWE-732). Additionally, `${{ github.event.pull_request.base.ref }}` is interpolated directly in shell commands at lines 41, 44, 76, 79, 143, and 146, violating script injection safeguards (CWE-94). Store event data in environment variables and pass them safely to shell commands.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/e2e-test-quick-install.yaml around lines 30-190: the jobs detect-changes, controller-image-build and test-quick-install currently lack explicit minimal permissions and interpolate github.event data directly into shell commands; add a permissions: block to each job (e.g., permissions: contents: read) to restrict GITHUB_TOKEN scope and at the job level define environment variables for event data (e.g., env: PR_BASE_REF: ${{ github.event.pull_request.base.ref || '' }} and EVENT_NAME: ${{ github.event_name }}), then update the "Merge target branch" step in each job (the step that runs git fetch/merge) to use those env vars ($PR_BASE_REF and $EVENT_NAME) instead of inline ${{ ... }} to avoid script injection and unsafe interpolation. Ensure the shell tests use quoted variables and fallbacks when PR_BASE_REF is empty.
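The difference between the two patterns above can be sketched in a few lines of plain shell. The `PR_BASE_REF`/`BASE_REF` names and the payload below are hypothetical stand-ins for `${{ github.event.pull_request.base.ref }}`; this is a minimal illustration, not the workflow's actual code:

```shell
# Hypothetical malicious branch name, standing in for untrusted event data.
PR_BASE_REF='$(echo INJECTED)'

# Unsafe: the value is spliced into the code string the shell parses,
# so the command substitution inside it executes.
unsafe=$(sh -c "echo branch: $PR_BASE_REF")

# Safe: the value travels through the environment and is only expanded
# inside quotes, so the shell treats it as literal data.
safe=$(BASE_REF="$PR_BASE_REF" sh -c 'echo branch: "$BASE_REF"')

echo "unsafe: $unsafe"
echo "safe:   $safe"
```

Running this prints `branch: INJECTED` for the unsafe variant, while the safe variant preserves the hostile string as plain text.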
🟡 Minor comments (21)
charts/kserve-runtime-configs/files/llmisvcconfigs/resources.yaml-1626-1627 (1)
1626-1627: ⚠️ Potential issue | 🟡 Minor

Quote the version value to stay consistent with the sibling config.

`config/llmisvcconfig/config-llm-scheduler.yaml` (line 26) uses `"0.7.0"` (quoted). Here it's bare. While `0.7.0` parses as a string in YAML (two dots ⇒ not numeric), leaving it unquoted invites future drift if someone changes it to `0.8` (which would parse as a float and break `semver.NewVersion` in `pkg/controller/v1alpha2/llmisvc/scheduler.go`).

Proposed fix:

```diff
 annotations:
-  app.kubernetes.io/version: 0.7.0
+  app.kubernetes.io/version: "0.7.0"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/kserve-runtime-configs/files/llmisvcconfigs/resources.yaml` around lines 1626-1627: the annotations entry uses an unquoted semantic version value which risks being parsed as a number if changed (e.g., 0.8) and can break semver parsing; update the annotations mapping key "app.kubernetes.io/version" in resources.yaml to use a quoted string value (e.g., change 0.7.0 to "0.7.0") so it matches the sibling config and always parses as a string for code like pkg/controller/v1alpha2/llmisvc/scheduler.go that uses semver.NewVersion.

hack/setup/quick-install/llmisvc-dependency-install.sh-647-648 (1)
647-648: ⚠️ Potential issue | 🟡 Minor

`PROMETHEUS_*` and `WVA_NAMESPACE` variables are unused — either remove them from generator output or export them if downstream scripts depend on them.

Lines 647-648 and 666-668 define `PROMETHEUS_VERSION`, `PROMETHEUS_ADAPTER_VERSION`, `PROMETHEUS_NAMESPACE`, `PROMETHEUS_ADAPTER_NAMESPACE`, and `WVA_NAMESPACE`, but `main()` (lines 1238-1279) never references them and does not invoke any prometheus or wva installation. Since they are not `export`ed, they won't propagate to any child shells either. These variables are dead code in this script and should be omitted from the generator (`llmisvc-dependency-install.definition`) unless downstream scripts in the same process require them — in which case, add `export` directives.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/quick-install/llmisvc-dependency-install.sh` around lines 647-648: the PROMETHEUS_* and WVA_NAMESPACE variables (PROMETHEUS_VERSION, PROMETHEUS_ADAPTER_VERSION, PROMETHEUS_NAMESPACE, PROMETHEUS_ADAPTER_NAMESPACE, WVA_NAMESPACE) are defined but never used by main(); either remove these variable declarations from the generator output (llmisvc-dependency-install.definition) to eliminate dead code, or if downstream scripts rely on them, change their declarations to exported form (export PROMETHEUS_VERSION=... etc.) and ensure main() or the invoked child scripts actually consume them (call or document the dependent install functions); update the generator so the script contains only variables that are either exported or referenced by main()/called functions.

hack/setup/quick-install/kserve-standard-mode-full-install-with-manifests.sh-3673-3674 (1)
3673-3674: ⚠️ Potential issue | 🟡 Minor

Quote the version string to preserve it as a string in YAML.

`0.7.0` is fine as-is for a semver, but annotation values in Kubernetes must be strings, and some YAML parsers will happily turn an unquoted scalar that looks numeric-ish into something else on re-serialization. The consumer in `pkg/controller/v1alpha2/llmisvc/scheduler.go` is doing a string comparison for the migration gate — a parser quirk here would silently skip the migration path.

Proposed fix:

```diff
 annotations:
-  app.kubernetes.io/version: 0.7.0
+  app.kubernetes.io/version: "0.7.0"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/quick-install/kserve-standard-mode-full-install-with-manifests.sh` around lines 3673-3674: the YAML annotation value app.kubernetes.io/version should be quoted to guarantee it is treated as a string (change app.kubernetes.io/version: 0.7.0 to app.kubernetes.io/version: "0.7.0"); update the manifest in kserve-standard-mode-full-install-with-manifests.sh so the version is emitted as a quoted string to avoid YAML parsers converting it to a non-string type, because pkg/controller/v1alpha2/llmisvc/scheduler.go relies on a string comparison for the migration gate and could silently skip migration if the annotation is not a string.

hack/setup/quick-install/kserve-standard-mode-full-install-with-manifests.sh-647-648 (1)
647-648: ⚠️ Potential issue | 🟡 Minor

Remove unused version variables or document their purpose.

`PROMETHEUS_VERSION` and `PROMETHEUS_ADAPTER_VERSION` at lines 647-648 are defined but never referenced within this file. This is a manifest-based installer that does not perform Helm installations, so these version pins serve no function. Either remove them or, if they are required by an external caller or sourced helper, explicitly `export` them and document the dependency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/quick-install/kserve-standard-mode-full-install-with-manifests.sh` around lines 647-648: PROMETHEUS_VERSION and PROMETHEUS_ADAPTER_VERSION are declared but never used; either remove these unused variables or make their intent explicit by exporting them and adding a comment describing who consumes them (e.g., external caller or sourced helper). Locate the two variables PROMETHEUS_VERSION and PROMETHEUS_ADAPTER_VERSION in the script and either delete the declarations or change them to exported variables (export PROMETHEUS_VERSION=... and export PROMETHEUS_ADAPTER_VERSION=...) and add a one-line comment above explaining the external dependency that requires them.

docs/samples/llmisvc/multi-gpu-vendor/llm-inference-service-qwen2-7b-amd-no-scheduler.yaml-20-32 (1)
20-32: ⚠️ Potential issue | 🟡 Minor

Sample omits readiness/liveness probes and securityContext.

Users copy-paste docs samples verbatim into production. With no `readinessProbe`, the `InferencePool` will route traffic to pods before vLLM finishes loading the 7B Qwen weights (which can take minutes on ROCm), causing initial 5xx spikes. Consider adding probes consistent with the other fixtures in this PR (see `test/e2e/llmisvc/fixtures.py:109-122`) and a minimal `securityContext` (`runAsNonRoot`, no root FS writes) so the sample works on OpenShift SCCs without edits.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/samples/llmisvc/multi-gpu-vendor/llm-inference-service-qwen2-7b-amd-no-scheduler.yaml` around lines 20-32: add readiness and liveness probes and a minimal securityContext to the container named "main" in the template so pods aren't routed traffic before vLLM finishes loading and the sample works on OpenShift SCCs; specifically, mirror the probes used in test/e2e/llmisvc/fixtures.py (a readiness probe that checks the inference health endpoint with an initialDelaySeconds tuned for large model load times, and a simple liveness HTTP GET or TCP check) and add a securityContext with runAsNonRoot: true (and an appropriate runAsUser if used elsewhere) plus a readOnlyRootFilesystem setting to prevent root FS writes so the sample will run under OpenShift SCCs.

pkg/controller/v1alpha2/llmisvc/config_merge_test.go-1347-1414 (1)
1347-1414: ⚠️ Potential issue | 🟡 Minor

Test prescribes label-map union semantics — confirm this is the intended contract.

The "custom pool selector merge base selector" case asserts that when both base and override supply `Selector.MatchLabels`, the result is the union of all keys (`llm-pool`, `kserve.io/component`, `x`). This is what Kubernetes strategic merge patch does by default for maps, but the README for the multi-GPU-vendor sample relies on the override replacing the default workload selector, not augmenting it. If a base config ever sets the default `app.kubernetes.io/name` triplet and the user adds `llm-pool: qwen2-7b`, the merged selector will still include the default labels and the `InferencePool` will not match the AMD pods as documented. Consider adding a test case that mirrors the multi-GPU-vendor scenario (base = default triplet, override = custom labels) to pin down the actual behavior seen in production.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/v1alpha2/llmisvc/config_merge_test.go` around lines 1347-1414: add a new test case in config_merge_test.go mirroring the multi-GPU-vendor scenario: create a base LLMInferenceServiceSpec whose Scheduler.Pool.Spec.Selector.MatchLabels contains the default app.kubernetes.io/name/app.kubernetes.io/instance triplet and an override spec that only supplies {"llm-pool":"qwen2-7b"}; assert the merged result in the test (for the relevant merge function exercised by these tests) contains the union of labels (i.e., both the default triplet and "llm-pool": "qwen2-7b") so the actual map-merge semantics for InferencePoolSpec.Selector.MatchLabels are pinned and will surface if behavior changes. Ensure you reference the existing case "custom pool selector merge base selector" and reuse RouterSpec/SchedulerSpec/InferencePoolSpec/Selector.MatchLabels structures to construct the new test entry.

docs/samples/llmisvc/multi-gpu-vendor/README.md-21-21 (1)
21-21: ⚠️ Potential issue | 🟡 Minor

Missing fenced code block language (MD040).

Line 21's ASCII architecture diagram block has no language tag. Mark it as `text` to silence markdownlint.

Proposed fix:

````diff
-```
+```text
 ┌──────────────┐
 │   Gateway    │
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/samples/llmisvc/multi-gpu-vendor/README.md` at line 21: the ASCII architecture diagram fenced code block (the triple-backtick block showing the Gateway diagram) is missing a language tag; update the opening fence from ``` to ```text so the block becomes a text fenced code block and resolves the MD040 markdownlint warning.

docs/samples/llmisvc/multi-gpu-vendor/README.md-104-111 (1)
104-111: ⚠️ Potential issue | 🟡 Minor

Fix incorrect scheduler component label in kubectl command.

Line 111's command uses `app.kubernetes.io/component=llminferenceservice-scheduler`, but the controller actually labels scheduler pods as `app.kubernetes.io/component=llminferenceservice-router-scheduler` (see `pkg/controller/v1alpha2/llmisvc/scheduler.go:1354`). The command will return no results unless corrected.

Additionally, line 105's verification command remains ambiguous: once the AMD instance overrides `app.kubernetes.io/name` to `qwen2-7b-instruct-nvidia`, both vendor pods match that label, defeating the verification intent. Show distinct selectors — by node affinity, GPU label, or deployment hash — to disambiguate which pods are running where.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/samples/llmisvc/multi-gpu-vendor/README.md` around lines 104-111: update the README lines: correct the scheduler pod selector to use the actual label key/value used by the controller (app.kubernetes.io/component=llminferenceservice-router-scheduler as set in scheduler.go at the scheduler creation) so the kubectl logs command returns scheduler pods, and replace the ambiguous qwen2-7b-instruct-nvidia pod verification selector with a disambiguating selector that targets vendor-specific attributes (for example use nodeName/node affinity, the GPU vendor label such as nvidia.com/gpu vs AMD label, or the deployment/pod-template-hash annotation) so you can reliably confirm which replicas are on which GPU nodes and verify InferencePool behavior.

hack/setup/quick-install/llmisvc-full-install-with-manifests.sh-647-648 (1)
647-648: ⚠️ Potential issue | 🟡 Minor

Remove unused `PROMETHEUS_VERSION` and `PROMETHEUS_ADAPTER_VERSION` — they are never consumed.

These variables are defined but never referenced in this file. They appear to be copy-pasted from the `*-dependency-install.sh` variants, which use Helm to install Prometheus and pass these pins via `--version`. This script embeds manifests directly, so the pins are dead code. Keeping them creates a false impression of version pinning without actually enforcing any constraint.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/quick-install/llmisvc-full-install-with-manifests.sh` around lines 647-648: remove the unused variable definitions PROMETHEUS_VERSION and PROMETHEUS_ADAPTER_VERSION from the script: delete the lines that assign PROMETHEUS_VERSION=83.4.0 and PROMETHEUS_ADAPTER_VERSION=5.3.0 and any now-stale references to those variable names; if you intended to actually pin Prometheus versions in this script, instead implement the explicit Helm/manifest pinning logic (or move the pins back into the corresponding *-dependency-install.sh) so that version variables are not left defined but unused.

hack/setup/infra/manage.wva-helm.sh-24-24 (1)
24-24: ⚠️ Potential issue | 🟡 Minor

Doc/default mismatch: `http` in the header, `https` in the code.

Line 24 documents the default as `http://...:9090`; line 49 actually uses `https://...:9090`. Combined with `insecureSkipVerify=true` on line 86, this silently disables TLS verification against a service that most in-cluster Prometheus deployments expose on plain HTTP. Pick one and make them consistent; if HTTP is acceptable inside the cluster, drop the `insecureSkipVerify` smell.
http://...:9090; line 49 actually useshttps://...:9090. Combined withinsecureSkipVerify=trueon line 86, this silently disables TLS verification against a service that most in-cluster Prometheus deployments expose on plain HTTP. Pick one and make them consistent; if HTTP is acceptable inside the cluster, drop theinsecureSkipVerifysmell.Also applies to: 49-49
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/infra/manage.wva-helm.sh` at line 24: the documentation and code disagree on the default WVA_PROMETHEUS_URL (the header shows http://..., but the script uses https://...:9090 and sets insecureSkipVerify=true); update them to be consistent by picking one approach: either change the documented default to https://...:9090 to match the script, or change the script to use http://...:9090 and remove the insecureSkipVerify usage; specifically edit the WVA_PROMETHEUS_URL default value and the TLS verification flag (insecureSkipVerify) references so the variable WVA_PROMETHEUS_URL and the TLS handling are aligned.

hack/setup/infra/manage.wva-helm.sh-52-60 (1)
52-60: ⚠️ Potential issue | 🟡 Minor

`uninstall` force-deletes the entire namespace, not just the release.

If `WVA_NAMESPACE` was created by or is shared with anything outside this Helm release (custom secrets, co-located workloads, operators), `kubectl delete all --all` + `kubectl delete namespace --force --grace-period=0` will blow it away. At minimum, guard the namespace delete behind a label/annotation ownership check, or only run it when the namespace was created by this script (e.g., carries a `kserve.io/managed-by=manage.wva-helm.sh` label applied at install time).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/infra/manage.wva-helm.sh` around lines 52-60: the uninstall() function currently force-deletes everything in ${WVA_NAMESPACE}; change it to only remove resources belonging to this Helm release and only delete the namespace if it was created/owned by this script. Concretely: stop using kubectl delete all --all; instead delete resources by release/helm labels (use -l "app.kubernetes.io/instance=${WVA_RELEASE_NAME}" or the chart's release label) and/or run helm uninstall "${WVA_RELEASE_NAME}" -n "${WVA_NAMESPACE}" (already present) to remove release-managed objects; before calling kubectl delete namespace "${WVA_NAMESPACE}" check the namespace's labels (e.g., require kserve.io/managed-by=manage.wva-helm.sh) by querying kubectl get namespace "${WVA_NAMESPACE}" -o jsonpath=... and only delete the namespace when that label exists; otherwise skip namespace deletion and log a warning.

hack/setup/quick-install/kserve-standard-mode-dependency-install.sh-647-648 (1)
647-648: ⚠️ Potential issue | 🟡 Minor

Export `PROMETHEUS_VERSION` and `PROMETHEUS_ADAPTER_VERSION` or remove them.

Lines 647-648 declare both variables but they are never used in this file. They are not exported, so child processes spawned from this script will not inherit them. If downstream Helm installation scripts (e.g., `manage.prometheus-helm.sh`, `manage.prometheus-adapter-helm.sh`) source this file or receive these vars, an undefined/empty value silently defaults to `latest` in `helm install --version`, breaking reproducibility and introducing supply-chain risk. Either export them if they are intended for child processes, or remove them if they are dead code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/quick-install/kserve-standard-mode-dependency-install.sh` around lines 647-648: the two variables PROMETHEUS_VERSION and PROMETHEUS_ADAPTER_VERSION are declared but not exported or used, causing child processes (like manage.prometheus-helm.sh or manage.prometheus-adapter-helm.sh) to see empty values and default to "latest"; either remove these dead variables or export them so downstream scripts inherit explicit versions — update the file to either delete the PROMETHEUS_VERSION and PROMETHEUS_ADAPTER_VERSION declarations if unused, or change them to exported variables (export PROMETHEUS_VERSION=83.4.0 and export PROMETHEUS_ADAPTER_VERSION=5.3.0) so helm install --version receives deterministic values.

hack/setup/quick-install/kserve-knative-mode-full-install-with-manifests.sh-647-648 (1)
647-648: ⚠️ Potential issue | 🟡 Minor

Remove or use unused version pins `PROMETHEUS_VERSION` and `PROMETHEUS_ADAPTER_VERSION` (SC2034).

These variables are defined at lines 647-648 but never referenced in this script. Sibling `-helm.sh` and `-dependency-install.sh` scripts use them for Helm-based Prometheus deployment; this manifest-based script does not. Either remove them or integrate a Prometheus installation step that consumes them.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/quick-install/kserve-knative-mode-full-install-with-manifests.sh` around lines 647-648: the variables PROMETHEUS_VERSION and PROMETHEUS_ADAPTER_VERSION are declared but never used; either remove these unused variable declarations or wire them into a manifest-based Prometheus install step so they are consumed. To fix, delete the PROMETHEUS_VERSION and PROMETHEUS_ADAPTER_VERSION lines if this script will not install Prometheus, or add a manifest apply/templating step (e.g., curl or kubectl apply of generated YAML) that uses PROMETHEUS_VERSION and PROMETHEUS_ADAPTER_VERSION to fetch/render the correct Prometheus and adapter manifests (or pass them as vars into a templating command) so the variables are actually referenced.

test/e2e/llmisvc/test_llm_autoscaling.py-919-946 (1)
919-946: ⚠️ Potential issue | 🟡 Minor

Replace the fixed `time.sleep(5)` with polling — guaranteed CI flake.

After `patch_llmisvc` there is no guarantee the controller has reconciled, re-synced the HPA/ScaledObject, and propagated `maxReplicas=5` within 5 seconds. Under kubelet/apiserver load the next `_get_custom_resource` reads a stale `maxReplicas=<old>` and the test fails non-deterministically. Use the existing `wait_for` helper with an assertion on the expected value.

🛠️ Proposed fix (HPA; mirror for KEDA)

```diff
-    time.sleep(5)
-    assert_scaling_resources_exist(service_name, actuator="hpa")
-
-    hpa = _get_custom_resource(
-        HPA_GROUP,
-        HPA_VERSION,
-        HPA_PLURAL,
-        hpa_name(service_name),
-        KSERVE_TEST_NAMESPACE,
-    )
-    assert hpa is not None
-    assert hpa["spec"]["maxReplicas"] == 5, (
-        f"Expected maxReplicas=5 after update, got {hpa['spec']['maxReplicas']}"
-    )
+    def _max_is_five():
+        hpa = _get_custom_resource(
+            HPA_GROUP, HPA_VERSION, HPA_PLURAL,
+            hpa_name(service_name), KSERVE_TEST_NAMESPACE,
+        )
+        assert hpa is not None, "HPA not found"
+        assert hpa["spec"]["maxReplicas"] == 5, (
+            f"Expected maxReplicas=5, got {hpa['spec']['maxReplicas']}"
+        )
+    wait_for(_max_is_five, timeout=60, interval=2.0)
```

Also applies to: 993-1026
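The same retry-until-deadline idea applies to the shell scripts in this PR as well. A portable sketch of the pattern, where the `poll` helper, its name, and the 1-second interval are illustrative and not taken from the repo:

```shell
# poll TIMEOUT_SECONDS CMD [ARGS...]: rerun CMD until it succeeds or the
# deadline passes, instead of sleeping a fixed amount and hoping.
poll() {
  _deadline=$(( $(date +%s) + $1 ))
  shift
  until "$@"; do
    if [ "$(date +%s)" -ge "$_deadline" ]; then
      return 1
    fi
    sleep 1
  done
}

# Demo condition: a marker file that appears roughly 2 seconds from now,
# standing in for "the controller has finished reconciling".
marker="/tmp/poll_demo.$$"
rm -f "$marker"
( sleep 2; touch "$marker" ) &
poll 15 test -f "$marker"
poll_rc=$?
echo "poll exit code: $poll_rc"
rm -f "$marker"
```

Unlike a fixed sleep, the helper returns as soon as the condition holds and only fails after the full timeout, so slow environments get the whole budget without fast ones paying for it.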
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/llmisvc/test_llm_autoscaling.py` around lines 919-946: replace the fixed time.sleep(5) after patch_llmisvc with a polling-based wait_for that repeatedly calls _get_custom_resource (using HPA_GROUP/HPA_VERSION/HPA_PLURAL and hpa_name(service_name)) and asserts hpa["spec"]["maxReplicas"] == 5, so the test waits until the controller has reconciled; keep the existing assert_scaling_resources_exist(service_name, actuator="hpa") check but remove the unconditional sleep and use wait_for to avoid CI flakes.

pkg/controller/v1alpha2/llmisvc/scheduler.go-1124-1132 (1)
1124-1132: ⚠️ Potential issue | 🟡 Minor

Metric flags are stripped from container args before we confirm the plugin injection succeeded.

`extractDeprecatedMetricFlags` mutates `d.Spec.Template.Spec.Containers[*].Args` in place. If `mutateSchedulerConfig` subsequently no-ops (e.g. the `--config-text` value isn't valid YAML — see lines 779-783 where the error is intentionally swallowed, or `--config-file` is used instead of `--config-text`), the CLI flags are gone and the `core-metrics-extractor` plugin never lands. Net effect on a v0.7 binary: the scheduler starts with no custom metric specs and silently uses defaults.

Order the mutation first (accumulate the plugin), and only strip the flags once injection actually happened, or short-circuit when no `--config-text` arg is present.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/v1alpha2/llmisvc/scheduler.go` around lines 1124-1132: the call to extractDeprecatedMetricFlags currently mutates container args before mutateSchedulerConfig runs, which can remove metric flags even when plugin injection no-ops; change the flow so you first build/accumulate the withCoreMetricsExtractorPlugin argument (call extractDeprecatedMetricFlags only to compute the plugin payload without mutating in-place), invoke mutateSchedulerConfig with withMigrateDisaggHeadersHandler, withMigrateDisaggProfileHandler, WithRemoveHashBlockSize, and the prepared withCoreMetricsExtractorPlugin, and only mutate (i.e., strip) container Args via extractDeprecatedMetricFlags if mutateSchedulerConfig actually applied the mutation (and short-circuit entirely when no --config-text arg is present on the container to avoid stripping flags prematurely).

hack/setup/infra/manage.prometheus-adapter-helm.sh-60-68 (1)
60-68: ⚠️ Potential issue | 🟡 Minor

Existence check matches on substring, not release name.

`helm list ... | grep -q "prometheus-adapter"` returns true for any release whose name contains `prometheus-adapter` (e.g. `my-prometheus-adapter-dev`). Uninstall on line 54 then targets the literal `prometheus-adapter`, so the detected release is never actually removed and the `helm install` below fails. Use `helm status` or `helm list -q` with an anchored match.

🛠️ Proposed fix

```diff
-  if helm list -n "${PROMETHEUS_ADAPTER_NAMESPACE}" 2>/dev/null | grep -q "prometheus-adapter"; then
+  if helm status prometheus-adapter -n "${PROMETHEUS_ADAPTER_NAMESPACE}" >/dev/null 2>&1; then
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/infra/manage.prometheus-adapter-helm.sh` around lines 60-68: the existence check for the release currently uses `helm list ... | grep -q "prometheus-adapter"` which matches substrings and can detect the wrong release; replace that logic with an exact release check (for example call `helm list -n "${PROMETHEUS_ADAPTER_NAMESPACE}" -q` and test for an anchored match like `^prometheus-adapter$`, or use `helm status prometheus-adapter -n "${PROMETHEUS_ADAPTER_NAMESPACE}" >/dev/null 2>&1`), and ensure the subsequent `uninstall` call operates on the same literal release name (`prometheus-adapter`) so the detected release is actually removed before installing.

hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh-279-287 (1)
279-287: ⚠️ Potential issue | 🟡 Minor

Dead error branch — `kubectl wait` already aborts under `set -e`.

With `set -o errexit` (line 27), a non-zero exit from `kubectl wait` terminates the script before line 281 runs, so the `if [ $? -eq 0 ]` branch is unreachable on failure. Either drop `set -e` coverage for this one call (`kubectl wait ... || rc=$?`) or remove the conditional.

🛠️ Proposed fix

```diff
-  kubectl wait --timeout="$timeout" -n "$namespace" deployment/"$deployment_name" --for=condition=Available
-
-  if [ $? -eq 0 ]; then
-    log_success "Deployment '$deployment_name' in namespace '$namespace' is available!"
-  else
+  if kubectl wait --timeout="$timeout" -n "$namespace" deployment/"$deployment_name" --for=condition=Available; then
+    log_success "Deployment '$deployment_name' in namespace '$namespace' is available!"
+  else
     log_error "Deployment '$deployment_name' in namespace '$namespace' failed to become available within $timeout"
     return 1
   fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh` around lines 279-287: the current block uses "kubectl wait" under global "set -o errexit", so failures will abort the script before the subsequent "if [ $? -eq 0 ]" conditional runs; update the logic around the kubectl wait call (referencing kubectl wait and variables deployment_name, namespace, timeout) to capture its exit status explicitly (for example run kubectl wait ... || rc=$? or use an if kubectl wait ...; then ...; else ...; fi style) and then use that captured exit code to decide whether to call log_success or log_error and return 1.

test/e2e/llmisvc/test_llm_autoscaling.py-208-214 (1)
208-214:⚠️ Potential issue | 🟡 MinorSilent
except Exception: passhides connection failures in the load generator.If the endpoint is wrong (DNS, port, wrong proxy), every request fails and the test still "succeeds" the load phase before asserting
wait_for_pod_count(..., min_count=2), which then times out — you'll debug the autoscaler when in fact zero load reached it. At minimum, count failures and log a summary so the scale-up failure mode is distinguishable from a load-delivery failure mode.🛠️ Proposed fix
- def _worker(): + def _worker(): + errors = 0 while time.time() < deadline: try: requests.post(endpoint, json=payload, headers=headers, timeout=30) - except Exception: - pass + except requests.RequestException as e: + errors += 1 + if errors == 1 or errors % 50 == 0: + logger.warning("load worker request failed (%d so far): %s", errors, e) time.sleep(0.1)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/llmisvc/test_llm_autoscaling.py` around lines 208 - 214, The load generator's _worker currently swallows all exceptions from requests.post which hides connection failures; update _worker to track request outcomes by introducing counters (e.g., total_requests and failed_requests) in the enclosing scope, increment total_requests on each attempt and increment failed_requests inside the except Exception block, and periodically or at the end log a concise summary including endpoint, total_requests, failed_requests and an error rate so scale-up failures can be distinguished from load-delivery failures; reference the _worker function and the requests.post call (using endpoint, payload, headers, timeout) when making these additions and ensure the summary is asserted or surfaced after the load phase before calling wait_for_pod_count.

hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh (1)
562-585: ⚠️ Potential issue | 🟡 Minor

`eval` on an externally-derived identifier is a CWE-94 liability.

`eval "current_value=\${${var_name}}"` / `eval "export $var_name=\"$component_value\""` executes arbitrary shell if `var_name` or `component_value` ever originates from user input. Today all callers are in-repo, but the function has no validation, no `[[ "$var_name" =~ ^[A-Z_][A-Z0-9_]*$ ]]` guard, and `component_value` is interpolated unquoted into the eval string — a value containing `"; rm -rf ~; #` would execute. Prefer `declare -n` (bash 4.3+) or `printf -v`.

🛠️ Proposed fix
-    local current_value
-    eval "current_value=\${${var_name}}"
+    local current_value="${!var_name:-}"
@@
-    if [ -n "$component_value" ]; then
-        eval "export $var_name=\"$component_value\""
-    elif [ -n "$global_value" ]; then
-        eval "export $var_name=\"$global_value\""
-    fi
+    if [ -n "$component_value" ]; then
+        printf -v "$var_name" '%s' "$component_value"
+        export "$var_name"
+    elif [ -n "$global_value" ]; then
+        printf -v "$var_name" '%s' "$global_value"
+        export "$var_name"
+    fi

As per coding guidelines: "Avoid eval and dynamic command execution".
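For contrast, the same priority-override logic is injection-proof in a language with a first-class environment map. The following Python sketch is illustrative only (the function name and the assumed precedence — existing env wins, then component value, then global value — mirror the shell helper but are assumptions, not the repo's code); the key points are the identifier whitelist and the fact that the value is only ever assigned, never evaluated:

```python
import os
import re

# Only plain uppercase identifiers are acceptable variable names.
_SAFE_NAME = re.compile(r"^[A-Z_][A-Z0-9_]*$")

def set_env_with_priority(var_name: str, component_value: str = "", global_value: str = "") -> str:
    """Keep an already-set variable; otherwise prefer component over global."""
    if not _SAFE_NAME.match(var_name):
        # Defense against injection: reject anything that is not a bare identifier.
        raise ValueError(f"unsafe variable name: {var_name!r}")
    current = os.environ.get(var_name, "")
    if current:
        return current
    value = component_value or global_value
    if value:
        # Plain dict assignment: the value is data, never executed as code.
        os.environ[var_name] = value
    return os.environ.get(var_name, "")
```

A value like `"; rm -rf ~; #` is stored verbatim here, whereas the `eval`-based shell version would execute it.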
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh` around lines 562 - 585, The function set_env_with_priority uses eval with external identifiers (eval "current_value=\${${var_name}}" and eval "export $var_name=\"$component_value\""), which is unsafe; validate var_name against a safe identifier regex (e.g. [[ "$var_name" =~ ^[A-Z_][A-Z0-9_]*$ ]]) and reject/exit if it fails, then replace eval usage with safe indirection and assignment: read current value via indirect expansion (current_value="${!var_name}") and set the value using printf -v "$var_name" '%s' "$component_value" (or declare -n ref="$var_name" on bash ≥4.3) and export with export "$var_name" (after setting) — also ensure component_value and global_value are used only as data (quoted) when assigning to avoid command injection.

test/e2e/llmisvc/test_llm_autoscaling.py (1)
140-145: ⚠️ Potential issue | 🟡 Minor

`_child_name` uses naive truncation instead of hashing like `kmeta.ChildName`.

Knative's `kmeta.ChildName` appends a hash suffix when the concatenated name exceeds the DNS label limit and uses a hash-based format when the suffix alone exceeds 31 characters. The Python helper instead naively concatenates and slices to 63 chars. When a `service_name` is long enough (≥ ~45 chars + suffix like `-kserve-prefill-hpa`), this generates a different string than the actual controller produces, causing `assert_scaling_resources_exist` and `assert_scaling_resources_deleted` assertions to fail for the wrong reason.

Current test names are short enough to avoid this, but this is a latent defect for any future test with a longer service name.
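The shape of a hash-based helper can be sketched as follows. This is an illustrative approximation only: the real `kmeta.ChildName` differs in hash choice and exact layout, so any test helper must be verified byte-for-byte against actual controller output before relying on it.

```python
import hashlib

MAX_DNS_LABEL = 63  # RFC 1123 label limit enforced by Kubernetes

def child_name(parent: str, suffix: str) -> str:
    """Deterministic child name that never exceeds 63 chars.

    Illustrative only; kmeta.ChildName's exact truncation/hash layout differs.
    """
    if len(parent) + len(suffix) <= MAX_DNS_LABEL:
        return parent + suffix
    digest = hashlib.md5((parent + suffix).encode()).hexdigest()  # 32 hex chars
    if len(suffix) > 31:
        # Suffix alone is too long: fall back to a purely hash-based name.
        return (digest + suffix)[:MAX_DNS_LABEL]
    # Keep as much of the parent as fits alongside the hash and the suffix.
    keep = MAX_DNS_LABEL - 32 - len(suffix)
    return parent[:keep] + digest + suffix
```

Unlike naive slicing, two long parents that share a 63-char prefix still map to distinct, stable names.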
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/llmisvc/test_llm_autoscaling.py` around lines 140 - 145, The _child_name helper currently truncates parent+suffix naively which diverges from knative.dev/pkg/kmeta.ChildName; replace its logic so it mirrors kmeta.ChildName: when parent+suffix exceeds 63 chars compute a deterministic hash (e.g., SHA-1/sha256) of the parent and embed a short hex hash segment instead of simple truncation; also handle the case where the suffix alone is long (suffix > 31) by using the hash-based format used by kmeta (produce a stable name comprised of a truncated parent, a separator, hex hash, and the suffix) so names produced by _child_name match the controller and fix failing assertions like assert_scaling_resources_exist/assert_scaling_resources_deleted that rely on identical naming.

pkg/controller/v1alpha2/llmisvc/scheduler.go (1)
1041-1047: ⚠️ Potential issue | 🟡 Minor

Threshold comparison via `fmt.Sprintf("%v", …) != "0"` is fragile.

YAML decoders yield numeric types (`int`, `float64`) for unquoted scalars. The string comparison breaks whenever a numerically-zero value formats as something other than `"0"` — for instance a quoted YAML scalar that decodes to the string `"0.0"`. Use a numeric cast (`thresholdVal.(float64) == 0` or similar) or `reflect.Value.Zero()` to compare by actual value, not string representation.

Occurs at lines 1045 and 1195.
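The failure mode is easy to reproduce outside Go as well. The sketch below (in Python, with a hypothetical `is_zero` helper mirroring the type switch suggested above) shows why comparing a formatted string is not a reliable zero test, and what a type-aware check looks like:

```python
def is_zero(v) -> bool:
    """Numeric-zero check across the types a YAML decoder may produce."""
    if isinstance(v, bool):          # bool is an int subclass; exclude it
        return False
    if isinstance(v, (int, float)):
        return v == 0
    if isinstance(v, str):           # quoted scalar such as "0.0"
        try:
            return float(v) == 0.0
        except ValueError:
            return False
    return False

# String formatting is not a reliable zero test:
assert f"{0.0}" == "0.0"            # formats with a decimal point, not as "0"
assert is_zero(0) and is_zero(0.0) and is_zero("0.0")
assert not is_zero(5) and not is_zero("0.5") and not is_zero(True)
```

The same idea in Go is a `switch v := thresholdVal.(type)` over `int`, `float64`, and `string`.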
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/v1alpha2/llmisvc/scheduler.go` around lines 1041 - 1047, The current check using fmt.Sprintf("%v", thresholdVal) != "0" is fragile; update the logic around thresholdVal/thresholdFound (the unstructured.NestedFieldNoCopy call in scheduler.go) to perform a numeric zero check instead: do a type switch on thresholdVal handling float64, int (and string as fallback by parsing) and compare numeric value to 0 (e.g., float64 == 0.0 or int == 0); ensure you treat missing/unsupported types as non-zero or continue. Replace the string-format comparison with this type-aware numeric comparison wherever thresholdVal is used (locations referenced by thresholdVal, thresholdFound, pluginMap).
🧹 Nitpick comments (9)
python/kserve/kserve/model_server.py (1)
131-141: Flag name uses hyphens while most KServe flags use underscores. `--timeout-keep-alive` follows the newer `--event-loop` style, but the majority of flags in this parser are underscore-delimited (`--http_port`, `--max_threads`, `--log_config_file`, etc.). `argparse` won't accept `--timeout_keep_alive` as an alias unless added explicitly, so any user/script copying the existing underscore convention will get a parse error. Consider either registering an underscore alias for backward/ergonomic consistency, or documenting the intentional shift.

♻️ Optional alias
 parser.add_argument(
-    "--timeout-keep-alive",
+    "--timeout-keep-alive", "--timeout_keep_alive",
     dest="timeout_keep_alive",

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/kserve/kserve/model_server.py` around lines 131 - 141, The flag name uses hyphens while most KServe flags use underscores; update the parser.add_argument call that defines "--timeout-keep-alive" (dest="timeout_keep_alive") to also accept an underscore alias so scripts using the underscore style won't break: add the second option string "--timeout_keep_alive" (keeping dest="timeout_keep_alive", default=65, type=int, help unchanged) in the same parser.add_argument call that currently registers "--timeout-keep-alive" so both forms parse identically.

python/kserve/kserve/protocol/rest/server.py (1)
80-101: No bounds check on `timeout_keep_alive`. Negative or zero values are accepted here and handed straight to `uvicorn.Config`, where behavior is undefined/loosely defined. Low risk since this is CLI-sourced, but a `if timeout_keep_alive < 0: raise ValueError(...)` (or clamp) would make misconfiguration fail loudly instead of producing connections that drop unexpectedly.
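The guard can be as small as a two-line check at construction time. A minimal sketch (class and attribute names here are illustrative, not the actual KServe source):

```python
class RESTServerSketch:
    """Illustrative constructor-time validation for timeout_keep_alive."""

    def __init__(self, timeout_keep_alive: int = 65):
        if timeout_keep_alive <= 0:
            # Fail loudly at startup instead of handing uvicorn an invalid value.
            raise ValueError(
                f"timeout_keep_alive must be > 0, got {timeout_keep_alive}"
            )
        self.timeout_keep_alive = timeout_keep_alive
```

Failing in `__init__` surfaces the misconfiguration in the process logs immediately, rather than as connections that drop mid-request.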
Verify each finding against the current code and only fix it if needed. In `@python/kserve/kserve/protocol/rest/server.py` around lines 80 - 101, The constructor accepts timeout_keep_alive and passes it to uvicorn.Config without validation; add a bounds check in the initializer (before creating self.config) to reject non-positive values (e.g., if timeout_keep_alive <= 0: raise ValueError("timeout_keep_alive must be > 0")) or clamp it to a sensible minimum, so the parameter used in uvicorn.Config is always valid; update the code around the timeout_keep_alive parameter handling where self.config is created to perform this check and raise a clear error referencing timeout_keep_alive and uvicorn.Config.

python/kserve/kserve/protocol/rest/multiprocess/server.py (1)
139-149: Positional forwarding into `RESTServer` is brittle. Every one of these nine arguments is passed positionally, so any future reorder or insertion in `RESTServer.__init__` (e.g., adding a parameter between `grace_period` and `event_loop`) will silently misalign `event_loop` ↔ `timeout_keep_alive` and produce a nonsense uvicorn config with no type error. Switching to keyword arguments removes this foot-gun at zero cost.

♻️ Proposed refactor
 self._rest_server = RESTServer(
-    app,
-    data_plane,
-    model_repository_extension,
-    http_port,
-    access_log_format,
-    workers,
-    grace_period,
-    event_loop,
-    timeout_keep_alive,
+    app=app,
+    data_plane=data_plane,
+    model_repository_extension=model_repository_extension,
+    http_port=http_port,
+    access_log_format=access_log_format,
+    workers=workers,
+    grace_period=grace_period,
+    event_loop=event_loop,
+    timeout_keep_alive=timeout_keep_alive,
 )

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/kserve/kserve/protocol/rest/multiprocess/server.py` around lines 139 - 149, The call to RESTServer in server.py passes nine arguments positionally which is brittle; change the instantiation to use explicit keyword arguments matching RESTServer.__init__ (e.g., app=app, data_plane=data_plane, model_repository_extension=model_repository_extension, http_port=http_port, access_log_format=access_log_format, workers=workers, grace_period=grace_period, event_loop=event_loop, timeout_keep_alive=timeout_keep_alive) so parameters cannot be misaligned if the constructor signature changes.

python/kserve/test/test_rest_server.py (1)
50-52: Misleading inline comments in parametrize cases. `(65, 65)` is passed explicitly, so it isn't exercising the "default value" code path (that's what `test_config_timeout_keep_alive_default` covers). Similarly `(5, 5) # uvicorn default` passes 5 explicitly — this test only verifies propagation into `uvicorn.Config`, not that uvicorn's own default is 5. Consider rewording or removing the comments to avoid confusing future readers about what's actually under test.
Verify each finding against the current code and only fix it if needed. In `@python/kserve/test/test_rest_server.py` around lines 50 - 52, The inline comments next to the parametrized tuples (65, 65), (120, 120), and (5, 5) are misleading because those tuples are explicit values (not exercising uvicorn/default code paths); update the comments in python/kserve/test/test_rest_server.py for the parametrized test that covers timeout_keep_alive (and related test_config_timeout_keep_alive_default) to either remove the incorrect notes or replace them with accurate notes such as "explicit value" for (65,65) and (120,120) and "explicit propagation check into uvicorn.Config" for (5,5), so readers understand these cases validate value propagation rather than uvicorn defaults.

python/custom_transformer/pyproject.toml (1)
26-33: Consolidate duplicated `[tool.uv.sources]` and `[[tool.uv.index]]` blocks into a uv workspace.

The pytorch-cpu index definition is copy-pasted identically between `python/custom_model/pyproject.toml` (lines 22–29) and `python/custom_transformer/pyproject.toml` (lines 26–33). Future updates to the URL, name, or `explicit` flag must be mirrored in both locations, creating drift risk. Since no uv workspace root exists in `python/pyproject.toml`, promote this shared configuration to a workspace-level declaration to maintain a single source of truth.

Non-blocking for this PR.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/custom_transformer/pyproject.toml` around lines 26 - 33, Duplicate [tool.uv.sources] and [[tool.uv.index]] blocks for the pytorch-cpu index (entries for torch and torchvision, and the index with name = "pytorch-cpu", url = "https://download.pytorch.org/whl/cpu", explicit = true) should be moved to the workspace-level pyproject.toml under a single [tool.uv.sources] and [[tool.uv.index]] declaration; remove the duplicate blocks from the module pyproject.toml files so they inherit from the workspace root, ensuring the single source of truth for the index configuration.

test/e2e/pytest.ini (1)
24-25: Marker registration duplicated in `test/e2e/llmisvc/conftest.py`.

`autoscaling_hpa` and `autoscaling_keda` are now declared in both `pytest.ini` and via `config.addinivalue_line(...)` in `test/e2e/llmisvc/conftest.py`. Not harmful, but it's two places to keep in sync on rename. Consider consolidating to `pytest.ini` only (it already applies repo-wide for the e2e suite) and dropping the `addinivalue_line` calls for these two from `conftest.py`.
Verify each finding against the current code and only fix it if needed. In `@test/e2e/pytest.ini` around lines 24 - 25, Remove the duplicate marker registration by deleting the two config.addinivalue_line(...) calls for "autoscaling_hpa" and "autoscaling_keda" from test/e2e/llmisvc/conftest.py and rely on the declarations already present in pytest.ini; specifically locate the addinivalue_line usages in conftest.py that reference the autoscaling_hpa and autoscaling_keda markers and remove those lines so the markers are declared only in pytest.ini.

test/scripts/gh-actions/setup-deps.sh (1)
110-121: Unknown `LLMISVC_AUTOSCALER` values silently do nothing.

A typo like `hap` or `kda` passed from the workflow input produces no error and no autoscaler install, which will surface later as a confusing e2e failure. Consider validating the value and failing fast (only `none|hpa|keda` allowed).

🔧 Proposed fix
 if [[ $LLMISVC_AUTOSCALER == "hpa" ]]; then
     ...
 elif [[ $LLMISVC_AUTOSCALER == "keda" ]]; then
     ...
+elif [[ $LLMISVC_AUTOSCALER != "none" ]]; then
+    echo "ERROR: Unknown LLMISVC_AUTOSCALER='${LLMISVC_AUTOSCALER}' (expected one of: none, hpa, keda)" >&2
+    exit 1
 fi

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/scripts/gh-actions/setup-deps.sh` around lines 110 - 121, Validate the LLMISVC_AUTOSCALER value before branching: check that the environment variable LLMISVC_AUTOSCALER is one of the allowed values ("none", "hpa", "keda") and if not, print a clear error including the invalid value and exit with non-zero status; update the existing conditional block that currently handles "hpa" and "keda" to include an explicit "none" case (no-op) and add an else branch that fails fast, so typos like "hap" or "kda" will cause the script to abort immediately.

hack/setup/infra/manage.wva-helm.sh (1)
96-96: `${WVA_EXTRA_ARGS:-}` is intentionally unquoted — document the contract.

The unquoted expansion is needed so callers can pass multiple flags via one env var, but that means values containing spaces (e.g., `--set foo="a b"`) will be mis-split. Either document that `WVA_EXTRA_ARGS` must be shell-safe, or switch to an array (`WVA_EXTRA_ARGS=(--set k=v ...)` expanded as `"${WVA_EXTRA_ARGS[@]}"`) for robustness. Shellcheck SC2086 is a true positive on the quoting rule, but the split is intentional here.
Verify each finding against the current code and only fix it if needed. In `@hack/setup/infra/manage.wva-helm.sh` at line 96, The unquoted expansion of WVA_EXTRA_ARGS is intentional to allow multiple flags but risks word-splitting for values with spaces; either document the contract that WVA_EXTRA_ARGS must be shell-safe (no unescaped spaces, e.g., callers must quote or escape inner spaces) or change the variable into a bash array and expand it safely; specifically update the script references to use the WVA_EXTRA_ARGS array (and all callers) and expand as the array form to preserve tokens, and add a short comment above the variable expansion explaining the chosen contract so ShellCheck/maintainers understand the intent.

hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh (1)
666-691: Duplicate variable definitions silently override earlier values.

`PROMETHEUS_NAMESPACE` (lines 666, 687) and `WVA_NAMESPACE` (lines 668, 689) are defined twice with identical defaults. Not a bug today, but if the "Global" block is ever retuned (e.g. a different default namespace or an override chain via `set_env_with_priority`), the second assignment clobbers the first and the divergence will be invisible. Drop the duplicates in the "Component-Specific" section.
Verify each finding against the current code and only fix it if needed. In `@hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh` around lines 666 - 691, Remove the duplicate environment variable assignments in the "Component-Specific" section: delete the redundant PROMETHEUS_NAMESPACE and WVA_NAMESPACE lines so the global defaults (PROMETHEUS_NAMESPACE and WVA_NAMESPACE set earlier) are the authoritative definitions; keep the other component-specific variables (PROMETHEUS_RELEASE_NAME, WVA_RELEASE_NAME, WVA_PROMETHEUS_URL) intact and verify there are no other duplicated vars in that block.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 7dc152a9-162e-4bd4-adb9-abb07b3bf516
⛔ Files ignored due to path filters (3)
- `go.sum` is excluded by `!**/*.sum`
- `python/custom_model/uv.lock` is excluded by `!**/*.lock`
- `python/custom_transformer/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (50)
- `.github/actions/kserve-dep-setup/action.yml`
- `.github/workflows/comment-cherry-pick.yml`
- `.github/workflows/e2e-test-llmisvc.yaml`
- `.github/workflows/e2e-test-quick-install.yaml`
- `.github/workflows/e2e-test.yml`
- `charts/kserve-runtime-configs/files/llmisvcconfigs/resources.yaml`
- `config/llmisvcconfig/config-llm-decode-template.yaml`
- `config/llmisvcconfig/config-llm-decode-worker-data-parallel.yaml`
- `config/llmisvcconfig/config-llm-scheduler.yaml`
- `config/overlays/test/s3-local-backend/seaweedfs-deployment.yaml`
- `docs/samples/llmisvc/multi-gpu-vendor/README.md`
- `docs/samples/llmisvc/multi-gpu-vendor/llm-inference-service-qwen2-7b-amd-no-scheduler.yaml`
- `docs/samples/llmisvc/multi-gpu-vendor/llm-inference-service-qwen2-7b-nvidia-with-scheduler.yaml`
- `go.mod`
- `hack/setup/global-vars.env`
- `hack/setup/infra/manage.prometheus-adapter-helm.sh`
- `hack/setup/infra/manage.prometheus-helm.sh`
- `hack/setup/infra/manage.wva-helm.sh`
- `hack/setup/quick-install/definitions/llmisvc/llmisvc-autoscaling-hpa-dependency-install.definition`
- `hack/setup/quick-install/definitions/llmisvc/llmisvc-autoscaling-keda-dependency-install.definition`
- `hack/setup/quick-install/keda-dependency-install.sh`
- `hack/setup/quick-install/kserve-knative-mode-dependency-install.sh`
- `hack/setup/quick-install/kserve-knative-mode-full-install-helm.sh`
- `hack/setup/quick-install/kserve-knative-mode-full-install-with-manifests.sh`
- `hack/setup/quick-install/kserve-standard-mode-dependency-install.sh`
- `hack/setup/quick-install/kserve-standard-mode-full-install-helm.sh`
- `hack/setup/quick-install/kserve-standard-mode-full-install-with-manifests.sh`
- `hack/setup/quick-install/llmisvc-autoscaling-hpa-dependency-install.sh`
- `hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh`
- `hack/setup/quick-install/llmisvc-dependency-install.sh`
- `hack/setup/quick-install/llmisvc-full-install-helm.sh`
- `hack/setup/quick-install/llmisvc-full-install-with-manifests.sh`
- `kserve-deps.env`
- `pkg/controller/v1alpha2/llmisvc/config_merge_test.go`
- `pkg/controller/v1alpha2/llmisvc/config_presets_test.go`
- `pkg/controller/v1alpha2/llmisvc/controller_int_scheduler_config_test.go`
- `pkg/controller/v1alpha2/llmisvc/scheduler.go`
- `pkg/controller/v1alpha2/llmisvc/scheduler_test.go`
- `python/custom_model/pyproject.toml`
- `python/custom_transformer/pyproject.toml`
- `python/kserve/kserve/model_server.py`
- `python/kserve/kserve/protocol/rest/multiprocess/server.py`
- `python/kserve/kserve/protocol/rest/server.py`
- `python/kserve/test/test_rest_server.py`
- `test/e2e/llmisvc/conftest.py`
- `test/e2e/llmisvc/fixtures.py`
- `test/e2e/llmisvc/test_llm_autoscaling.py`
- `test/e2e/pytest.ini`
- `test/scripts/gh-actions/setup-deps.sh`
- `test/scripts/gh-actions/setup-kserve.sh`
  run: |
    shopt -s nocasematch
    echo "Selected network layer ${{ inputs.network-layer }}"
-   ./test/scripts/gh-actions/setup-deps.sh ${{ inputs.deployment-mode }} "${{ inputs.network-layer }}" "${{ inputs.enable-keda }}" "${{ inputs.deploy-llmisvc }}"
+   ./test/scripts/gh-actions/setup-deps.sh ${{ inputs.deployment-mode }} "${{ inputs.network-layer }}" "${{ inputs.enable-keda }}" "${{ inputs.deploy-llmisvc }}" "${{ inputs.autoscaler }}"
🧩 Analysis chain
🏁 Script executed:
cat -n .github/actions/kserve-dep-setup/action.yml | head -40

Repository: opendatahub-io/kserve
Length of output: 1709
🌐 Web query:
GitHub Actions inputs shell command injection escaping behavior
💡 Result:
GitHub Actions does not automatically escape inputs (from workflow_dispatch, workflow_call, or action with) for safe use in shell commands. When ${{ inputs.name }} or ${{ github.event.inputs.name }} is used directly in a run: block, GitHub evaluates the expression and substitutes the raw string value into the shell script before the shell executes it. This raw substitution allows shell command injection if the input contains shell metacharacters like ", ;,
Citations:
- 1: GHSA-3g6g-gq4r-xjm9
- 2: https://docs.github.com/en/actions/concepts/security/script-injections
- 3: https://securitylab.github.com/resources/github-actions-untrusted-input
- 4: https://github.blog/2023-08-09-four-tips-to-keep-your-github-actions-workflows-secure
Quote inputs.deployment-mode before passing to shell.
${{ inputs.deployment-mode }} is unquoted at line 32 while adjacent inputs are properly quoted. GitHub Actions does not auto-escape inputs in run: blocks—raw substitution allows shell metacharacters to break argument boundaries and inject arbitrary commands (CWE-78). The input defaults to serverless or raw per the action definition, but remains user-controlled and must be quoted.
🛠️ Proposed fix
- ./test/scripts/gh-actions/setup-deps.sh ${{ inputs.deployment-mode }} "${{ inputs.network-layer }}" "${{ inputs.enable-keda }}" "${{ inputs.deploy-llmisvc }}" "${{ inputs.autoscaler }}"
+ ./test/scripts/gh-actions/setup-deps.sh "${{ inputs.deployment-mode }}" "${{ inputs.network-layer }}" "${{ inputs.enable-keda }}" "${{ inputs.deploy-llmisvc }}" "${{ inputs.autoscaler }}"

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
run: |
  shopt -s nocasematch
  echo "Selected network layer ${{ inputs.network-layer }}"
  ./test/scripts/gh-actions/setup-deps.sh ${{ inputs.deployment-mode }} "${{ inputs.network-layer }}" "${{ inputs.enable-keda }}" "${{ inputs.deploy-llmisvc }}"
  ./test/scripts/gh-actions/setup-deps.sh ${{ inputs.deployment-mode }} "${{ inputs.network-layer }}" "${{ inputs.enable-keda }}" "${{ inputs.deploy-llmisvc }}" "${{ inputs.autoscaler }}"
run: |
  shopt -s nocasematch
  echo "Selected network layer ${{ inputs.network-layer }}"
  ./test/scripts/gh-actions/setup-deps.sh "${{ inputs.deployment-mode }}" "${{ inputs.network-layer }}" "${{ inputs.enable-keda }}" "${{ inputs.deploy-llmisvc }}" "${{ inputs.autoscaler }}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/actions/kserve-dep-setup/action.yml around lines 29 - 32, The run
block in action.yml passes the unquoted GitHub Actions input ${{
inputs.deployment-mode }} to the setup script, which can allow shell
metacharacter injection; update the command that invokes
./test/scripts/gh-actions/setup-deps.sh so that the deployment-mode argument is
wrapped in quotes just like the adjacent inputs (i.e., pass "${{
inputs.deployment-mode }}" alongside "${{ inputs.network-layer }}" etc.),
ensuring all inputs are consistently quoted before being forwarded to the shell.
test-llmisvc-wva-autoscaling-hpa:
  runs-on: ubuntu-latest
  needs: [detect-changes, llmisvc-image-build]
  strategy:
    fail-fast: false
    matrix:
      install-method: ${{ fromJSON(needs.detect-changes.outputs.install_methods) }}
  steps:
    - name: Checkout source
      uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1

    - name: Merge target branch
      if: github.event_name == 'pull_request'
      run: |
        git fetch --unshallow origin
        git fetch origin ${{ github.event.pull_request.base.ref }}
        git config user.email "ci@kserve.io"
        git config user.name "CI Bot"
        git merge --no-edit origin/${{ github.event.pull_request.base.ref }}

    - name: Load KServe environment variables
      run: ./kserve-images.sh --ci

    - name: Free-up disk space
      uses: ./.github/actions/free-up-disk-space

    - name: Setup Go
      uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5.6.0
      with:
        go-version-file: go.mod

    - name: Setup Python
      uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
      with:
        python-version: "3.9"

    - name: Setup Minikube
      uses: ./.github/actions/minikube-setup

    - name: Download LLMISvc controller image
      uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
      with:
        name: ${{ env.LLMISVC_CONTROLLER_IMG }}-${{ env.TAG }}
        path: ./tmp

    - name: Download storage-initializer image
      uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
      with:
        name: ${{ env.STORAGE_INIT_IMG }}-${{ env.TAG }}
        path: ./tmp

    - name: Load LLMISvc controller image into minikube
      run: |
        minikube image load ./tmp/${{ env.LLMISVC_CONTROLLER_IMG }}-${{ env.TAG }}
        echo "Images in minikube:"
        minikube image ls | grep ${{ env.LLMISVC_CONTROLLER_IMG }} || echo "No llmisvc-controller images found"

    - name: Load storage-initializer image into minikube
      run: |
        minikube image load ./tmp/${{ env.STORAGE_INIT_IMG }}-${{ env.TAG }}
        echo "Images in minikube:"
        minikube image ls | grep ${{ env.STORAGE_INIT_IMG }} || echo "No storage-initializer images found"

    - name: KServe dependency setup (HPA autoscaling)
      uses: ./.github/actions/kserve-dep-setup
      with:
        deploy-llmisvc: "true"
        autoscaler: "hpa"

    - name: Install UV
      run: ./test/scripts/gh-actions/setup-uv.sh

    - name: Install KServe
      run: |
        ${{ matrix.install-method == 'helm' && 'export INSTALL_METHOD="helm"' || '' }}
        ./test/scripts/gh-actions/setup-kserve.sh

    - name: Verify HPA autoscaling setup
      run: |
        echo "Verifying HPA autoscaling infrastructure..."

        echo "Prometheus:"
        kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus || true

        echo "Prometheus Adapter:"
        kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus-adapter || true

        echo "WVA:"
        kubectl get pods -n wva-system || true

        echo "CRDs:"
        kubectl get crd variantautoscalings.llmd.ai || true

        echo "External Metrics API:"
        kubectl get apiservice v1beta1.external.metrics.k8s.io || true

        echo ""
        echo "HPA autoscaling setup verification complete!"

    - name: Run HPA autoscaling E2E tests
      id: autoscaling-hpa-tests
      timeout-minutes: 40
      run: |
        ./test/scripts/gh-actions/run-e2e-tests.sh "llminferenceservice and autoscaling_hpa and cluster_cpu" 0 "envoy-gateway"

    - name: Check system status
      if: always()
      run: |
        ./test/scripts/gh-actions/status-check.sh "llmisvc"
🧩 Analysis chain
🏁 Script executed:
cat .github/workflows/e2e-test-llmisvc.yaml

Repository: opendatahub-io/kserve
Length of output: 16862
Declare explicit permissions: blocks on all jobs to enforce least-privilege token scope (CWE-732).
The workflow lacks job-level permissions: declarations. Every job (detect-changes, llmisvc-image-build, test-llmisvc, and both autoscaling jobs) inherits the repository default token scope, allowing full GITHUB_TOKEN access if any step is compromised. Additionally, the "Merge target branch" steps across multiple jobs directly interpolate ${{ github.event.pull_request.base.ref }} in shell commands, violating the requirement to avoid event data interpolation in run: blocks (CWE-94). Use github.base_ref or intermediate environment variables instead.
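As an illustrative sketch only (the exact scopes each job needs must be verified against what its steps do; `contents: read` is a common least-privilege baseline), a hardened job might combine both fixes like this:

```yaml
jobs:
  test-llmisvc:                    # hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      contents: read               # checkout only; grant write scopes per job only where required
    env:
      BASE_REF: ${{ github.base_ref }}   # event data enters via env, not inline in run:
    steps:
      - name: Merge target branch
        run: git fetch origin "$BASE_REF"
```

Because `$BASE_REF` is expanded by the shell from an environment variable rather than spliced into the script text, a malicious ref name cannot break out of the command.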
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/e2e-test-llmisvc.yaml around lines 244 - 353, The job
lacks explicit least-privilege job-level permissions and unsafe interpolation of
event data in run blocks: add a permissions: block to each job (e.g.,
detect-changes, llmisvc-image-build, test-llmisvc,
test-llmisvc-wva-autoscaling-hpa and the other autoscaling job) granting only
the scopes required (likely minimal read for contents and write only if needed
for status/artifacts), and remove direct use of ${{
github.event.pull_request.base.ref }} inside the "Merge target branch" run
steps; instead expose the base ref via a safe context or env var (e.g., set env:
BASE_REF: ${{ github.base_ref }} on the job) and reference that env var (e.g.,
$BASE_REF) in the git commands, or use action inputs like github.base_ref in a
preceding step to avoid interpolating event payloads in run:.
- name: Merge target branch
  if: github.event_name == 'pull_request'
  run: |
    git fetch --unshallow origin
    git fetch origin ${{ github.event.pull_request.base.ref }}
    git config user.email "ci@kserve.io"
    git config user.name "CI Bot"
    git merge --no-edit origin/${{ github.event.pull_request.base.ref }}
🧩 Analysis chain
🏁 Script executed:
# First, check if the file exists and its line count
wc -l .github/workflows/e2e-test-llmisvc.yaml

Repository: opendatahub-io/kserve
Length of output: 109
🏁 Script executed:
# View lines 255-262 to confirm the snippet
sed -n '255,262p' .github/workflows/e2e-test-llmisvc.yaml

Repository: opendatahub-io/kserve
Length of output: 436
🏁 Script executed:
# View lines 365-372 to verify the second instance
sed -n '365,372p' .github/workflows/e2e-test-llmisvc.yaml

Repository: opendatahub-io/kserve
Length of output: 436
Stop injecting github.event.pull_request.base.ref straight into the shell.
Splicing event data directly into run: blocks exposes these steps to script injection (CWE-94). Pass the ref through env: and quote it in the shell instead.
🔒 Proposed fix
- name: Merge target branch
if: github.event_name == 'pull_request'
+ env:
+ BASE_REF: ${{ github.event.pull_request.base.ref }}
run: |
git fetch --unshallow origin
- git fetch origin ${{ github.event.pull_request.base.ref }}
+ git fetch origin "$BASE_REF"
git config user.email "ci@kserve.io"
git config user.name "CI Bot"
- git merge --no-edit origin/${{ github.event.pull_request.base.ref }}
+ git merge --no-edit "origin/$BASE_REF"

Same issue at lines 365-372.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Merge target branch | |
| if: github.event_name == 'pull_request' | |
| run: | | |
| git fetch --unshallow origin | |
| git fetch origin ${{ github.event.pull_request.base.ref }} | |
| git config user.email "ci@kserve.io" | |
| git config user.name "CI Bot" | |
| git merge --no-edit origin/${{ github.event.pull_request.base.ref }} | |
| - name: Merge target branch | |
| if: github.event_name == 'pull_request' | |
| env: | |
| BASE_REF: ${{ github.event.pull_request.base.ref }} | |
| run: | | |
| git fetch --unshallow origin | |
| git fetch origin "$BASE_REF" | |
| git config user.email "ci@kserve.io" | |
| git config user.name "CI Bot" | |
| git merge --no-edit "origin/$BASE_REF" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/e2e-test-llmisvc.yaml around lines 255 - 262, The workflow
step "Merge target branch" injects github.event.pull_request.base.ref directly
into the run shell, which risks script injection; change the step to pass the
ref via an env variable (e.g., BASE_REF: ${{ github.event.pull_request.base.ref
}}) and then reference and quote that env var in the run commands (use
"$BASE_REF" when calling git fetch and git merge) to ensure safe shell
expansion; apply the same env-based fix to the other Merge/merge-ref step found
later in the file.
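Outside of Actions, the difference between the two patterns can be reproduced in plain shell. This is an illustrative sketch (the ref value and commands are hypothetical): ${{ ... }} behaves like textual substitution into the script body before the shell parses it, whereas an env var expanded with quotes stays inert data.

```shell
#!/usr/bin/env bash
# A branch name an attacker could choose when opening a PR from a fork.
ref='main; echo INJECTED'

# Unsafe: mimics ${{ ... }} interpolation, which splices the value into the
# script text before the shell parses it, so ';' becomes a command separator.
unsafe_output=$(bash -c "echo fetching $ref")

# Safe: the value travels through the environment and is expanded quoted,
# so the shell treats it as a single word of data, never as syntax.
safe_output=$(BASE_REF="$ref" bash -c 'echo fetching "$BASE_REF"')

echo "unsafe: $unsafe_output"
echo "safe:   $safe_output"
```

The unsafe variant executes the payload and prints an extra INJECTED line; the safe variant echoes the payload verbatim as one argument.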
| - name: Merge target branch | ||
| if: github.event_name == 'pull_request' | ||
| run: | | ||
| git fetch --unshallow origin | ||
| git fetch origin ${{ github.event.pull_request.base.ref }} | ||
| git config user.email "ci@kserve.io" | ||
| git config user.name "CI Bot" | ||
| git merge --no-edit origin/${{ github.event.pull_request.base.ref }} | ||
| - uses: dorny/paths-filter@de90cc6fb38fc0963ad72b210f1f284cd68cea36 # v3.0.2 |
🧩 Analysis chain
🏁 Script executed:
# Find and display the workflow file
find . -name "e2e-test-quick-install.yaml" -o -name "e2e-test-quick-install.yml"
🏁 Script executed:
# Read the workflow file to examine the flagged sections
cat -n .github/workflows/e2e-test-quick-install.yaml
Move github.event.pull_request.base.ref to environment variables and quote all uses.
Direct event data interpolation in run: blocks enables script injection (CWE-94). Pass event values via env: with proper shell quoting instead.
This pattern appears at lines 41, 44, 76, 79, 143, and 146—all instances of the same vulnerability across the three "Merge target branch" steps.
Proposed fix
- name: Merge target branch
if: github.event_name == 'pull_request'
+ env:
+ BASE_REF: ${{ github.event.pull_request.base.ref }}
run: |
git fetch --unshallow origin
- git fetch origin ${{ github.event.pull_request.base.ref }}
+ git fetch origin "$BASE_REF"
git config user.email "ci@kserve.io"
git config user.name "CI Bot"
- git merge --no-edit origin/${{ github.event.pull_request.base.ref }}
+ git merge --no-edit "origin/$BASE_REF"

Apply this to all three "Merge target branch" steps (lines 37–45, 72–80, 139–147).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/e2e-test-quick-install.yaml around lines 37 - 45, In each
"Merge target branch" step, avoid direct event interpolation in the run block:
add an env entry (e.g., TARGET_REF: ${{ github.event.pull_request.base.ref }})
to the step and replace direct uses of ${{ github.event.pull_request.base.ref }}
in the run script with the shell-quoted variable "$TARGET_REF"; specifically
update the git fetch origin ... and git merge --no-edit origin/... commands in
the "Merge target branch" steps (also apply the same change to the other two
identical steps) so all event data comes from env and every shell use is
properly quoted.
| name: mlpipeline-s3-artifact | ||
| key: secretkey | ||
| image: docker.io/chrislusf/seaweedfs:4.07@sha256:10fa7df90911dd83439f4d3d792a1c5c6c630121cb2094ba209f42d4b0ca975d | ||
| image: quay.io/flysangel/chrislusf/seaweedfs@sha256:a27e9c432f1dfaafb5bb3f5b39065c0df7a15423a3894025c714b3f06f998aeb |
🧩 Analysis chain
🌐 Web query:
What is the official SeaweedFS container image repository (Docker Hub and Quay), maintained by the SeaweedFS project?
💡 Result:
The official SeaweedFS container image repository maintained by the SeaweedFS project is on Docker Hub at chrislusf/seaweedfs. There is no official Quay.io repository identified in the sources.
Citations:
- 1: https://hub.docker.com/r/chrislusf/seaweedfs
- 2: https://github.com/seaweedfs/seaweedfs/tree/master/docker
- 3: https://github.com/seaweedfs/seaweedfs
- 4: https://hub.docker.com/r/chrislusf/seaweedfs/tags
- 5: https://github.com/seaweedfs/seaweedfs/wiki/SeaweedFS-in-Docker-Swarm
- 6: https://seaweedfs.org/
Critical security issue: image sourced from unofficial registry introduces supply-chain compromise risk (CWE-829, CWE-345).
Exploit scenario: non-official Quay.io repositories bypass upstream provenance verification. Account takeover or registry misconfiguration allows attacker-controlled binaries to execute with cluster credentials and full network access.
Remediation (use official Docker Hub upstream)
- image: quay.io/flysangel/chrislusf/seaweedfs@sha256:a27e9c432f1dfaafb5bb3f5b39065c0df7a15423a3894025c714b3f06f998aeb
+ image: docker.io/chrislusf/seaweedfs@sha256:a27e9c432f1dfaafb5bb3f5b39065c0df7a15423a3894025c714b3f06f998aeb

SeaweedFS maintains its official image on Docker Hub (chrislusf/seaweedfs), not Quay.io. Enforce signature verification and registry allowlists via Kyverno/Gatekeeper admission policies so digest pinning alone does not become a false sense of security.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@config/overlays/test/s3-local-backend/seaweedfs-deployment.yaml` at line 34,
Replace the non-official image reference in the image: line
(quay.io/flysangel/chrislusf/seaweedfs@sha256:...) with the official SeaweedFS
image (chrislusf/seaweedfs) pinned to a verified digest or explicit trusted tag;
update the image source to Docker Hub (or your approved internal registry) and
ensure you verify signatures (not just digest) before deploying. Additionally,
add/enable admission controls (Kyverno or Gatekeeper) to enforce a registry
allowlist and image signature verification so future diffs cannot reintroduce
quay.io/flysangel/* images without policy approval.
| helm uninstall "${PROMETHEUS_RELEASE_NAME}" -n "${PROMETHEUS_NAMESPACE}" 2>/dev/null || true | ||
| kubectl delete all --all -n "${PROMETHEUS_NAMESPACE}" --force --grace-period=0 2>/dev/null || true | ||
| kubectl delete namespace "${PROMETHEUS_NAMESPACE}" --wait=true --timeout=60s --force --grace-period=0 2>/dev/null || true |
kubectl delete all --all -n <ns> is a blast radius grenade.
The uninstall paths in uninstall_prometheus_helm, uninstall_keda_helm, and uninstall_wva_helm blindly wipe every "all"-category resource in monitoring / keda / wva-system. On a cluster where those namespaces are shared (monitoring is a very common shared namespace), unrelated tenants lose workloads with --force --grace-period=0. helm uninstall alone already removes the resources it tracks in the release; this line is unnecessary and actively dangerous.
Same pattern in all three uninstall functions.
🛠️ Proposed fix
helm uninstall "${PROMETHEUS_RELEASE_NAME}" -n "${PROMETHEUS_NAMESPACE}" 2>/dev/null || true
- kubectl delete all --all -n "${PROMETHEUS_NAMESPACE}" --force --grace-period=0 2>/dev/null || true
- kubectl delete namespace "${PROMETHEUS_NAMESPACE}" --wait=true --timeout=60s --force --grace-period=0 2>/dev/null || true
+ # Namespace deletion is only safe for dedicated test namespaces; remove if sharing.
+ kubectl delete namespace "${PROMETHEUS_NAMESPACE}" --wait=true --timeout=60s 2>/dev/null || true

At minimum, gate the nuke behind an explicit flag (e.g. --purge-namespace) so a CI teardown can't strip a shared cluster.
Also applies to: 860-861, 903-905
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh`
around lines 771 - 773, The uninstall functions uninstall_prometheus_helm,
uninstall_keda_helm, and uninstall_wva_helm currently run "kubectl delete all
--all -n <namespace> --force --grace-period=0" which will indiscriminately
delete non-chart resources; remove these kubectl delete all calls or wrap them
behind an explicit opt-in flag (e.g. --purge-namespace) checked in the uninstall
functions so the default uninstall only uses helm uninstall (and optional
kubectl delete namespace if desired) and the destructive "delete all" is only
executed when the purge flag is provided.
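A minimal sketch of that gate, with echo standing in for the real helm/kubectl calls and all names illustrative: the destructive branch only runs when --purge-namespace is passed explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical teardown: safe helm-only path by default, namespace wipe opt-in.
uninstall_prometheus() {
  local purge=false
  [[ "${1:-}" == "--purge-namespace" ]] && purge=true

  # Always safe: helm only removes the resources recorded in the release.
  echo "helm uninstall prometheus -n monitoring"

  if [[ "$purge" == true ]]; then
    # Destructive: only for dedicated test namespaces, never shared ones.
    echo "kubectl delete namespace monitoring --wait=true --timeout=60s"
  fi
}

uninstall_prometheus                    # default: helm uninstall only
uninstall_prometheus --purge-namespace  # explicit opt-in: also delete namespace
```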
| --set wva.prometheus.baseURL="${WVA_PROMETHEUS_URL}" \ | ||
| --set wva.prometheus.tls.insecureSkipVerify=true \ | ||
| --set wva.prometheus.serviceAccountName="prometheus-kube-prometheus-prometheus" \ | ||
| --set wva.prometheus.monitoringNamespace="${PROMETHEUS_NAMESPACE:-monitoring}" \ | ||
| --set wva.metrics.secure=false \ |
🧩 Analysis chain
🌐 Web query:
workload-variant-autoscaler helm chart prometheus tls caSecretName values
💡 Result:
Answer: The workload-variant-autoscaler Helm chart does not expose a value named prometheus.tls.caSecretName for configuring Prometheus TLS CA verification. Instead, it expects you to provide the Prometheus CA certificate via a file/inline value (caCert) which is then mounted/used by the controller, and it uses the env/config keys PROMETHEUS_CA_CERT_PATH (path) plus PROMETHEUS_TLS_INSECURE_SKIP_VERIFY (boolean) for TLS behavior.

What to use instead (chart/repo-supported knobs):
1) In the WVA controller deployment, Prometheus TLS verification is configured via:
- PROMETHEUS_BASE_URL (must be HTTPS in production)
- PROMETHEUS_CA_CERT_PATH (path to the CA cert for TLS verification)
- PROMETHEUS_TLS_INSECURE_SKIP_VERIFY (dev/test only)
2) In the Helm values, Prometheus TLS CA is provided as a certificate value/file (shown in the repo's install/values guidance as a file upload flag; and in values snippets as wva.prometheus.caCert / tls.caCertPath).

Practical implication:
- If you were looking for a "caSecretName" pattern (like many charts use), it's not present in this chart's documented configuration.
- You should configure CA verification by supplying the CA cert content/file so the chart can place it into the controller pod and set PROMETHEUS_CA_CERT_PATH accordingly.

Example (from the repo's documented Helm install flow):
helm upgrade -i workload-variant-autoscaler ./charts/workload-variant-autoscaler \
  --namespace workload-variant-autoscaler-system \
  --set-file prometheus.caCert=/tmp/prometheus-ca.crt \
  ...

Evidence:
- The Prometheus integration doc lists PROMETHEUS_CA_CERT_PATH and PROMETHEUS_TLS_INSECURE_SKIP_VERIFY as the TLS controls.
- The install/README shows --set-file prometheus.caCert=... as the way to provide the CA cert.
- A Helm values snippet in an issue shows the chart's Prometheus TLS section uses tls.caCertPath (path) and caCert (content), not caSecretName.
Citations:
- 1: https://github.com/llm-d/llm-d-workload-variant-autoscaler/blob/main/docs/integrations/prometheus.md
- 2: https://github.com/llm-d/llm-d-workload-variant-autoscaler/blob/main/charts/workload-variant-autoscaler/README.md
- 3: https://github.com/llm-d-incubation/workload-variant-autoscaler/blob/main/docs/user-guide/configuration.md
- 4: https://github.com/llm-d/llm-d-workload-variant-autoscaler/blob/main/docs/integrations/hpa-integration.md
🏁 Script executed:
#!/bin/bash
# Search for WVA/workload-variant chart references and values files
fd -t f 'values.*\.ya?ml' | head -20
🏁 Script executed:
#!/bin/bash
# Search for Prometheus TLS config patterns in the installation script
rg 'prometheus.*tls|caSecretName|caCert|PROMETHEUS' hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh -A 2 -B 2
🏁 Script executed:
#!/bin/bash
# Look for WVA chart documentation or references within the repo
fd -t f -i 'readme|doc.*prometheus|chart.*values' | grep -i wva
🏁 Script executed:
#!/bin/bash
# Search for WVA-related chart files
fd -t f 'values.*\.ya?ml' charts/ | xargs grep -l 'prometheus\|wva' 2>/dev/null
🏁 Script executed:
#!/bin/bash
# Look for WVA chart directory
find charts/ -type d -iname '*wva*' -o -iname '*workload*' 2>/dev/null
🏁 Script executed:
#!/bin/bash
# Search within the install script for WVA helm chart source/repo
rg 'helm.*repo|helm.*add|wva.*chart' hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh -B 2 -A 2
🏁 Script executed:
#!/bin/bash
# Find WVA helm repo source and install command in the script
rg 'helm.*install.*wva|helm.*add.*wva|llm-d-wva' hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh -B 5 -A 15
🏁 Script executed:
#!/bin/bash
# Get context around line 933 where the WVA settings are
sed -n '900,950p' hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh
🏁 Script executed:
#!/bin/bash
# Search for any WVA or llm-d documentation in the repo
rg -i 'llm-d|workload.*variant|wva' --type md hack/setup/
🏁 Script executed:
#!/bin/bash
# Look for environment variable setup or certificate extraction logic
rg 'prometheus.*tls|caCert|PROMETHEUS_CA' hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh -B 3 -A 3
🏁 Script executed:
#!/bin/bash
# Check if there's a way to extract the prometheus-tls cert in the script
rg 'kubectl.*secret|kubectl.*get.*prometheus-tls' hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh
🏁 Script executed:
#!/bin/bash
# Check for any secret extraction or certificate handling in the script
rg 'kubectl.*extract|kubectl.*get.*secret|base64|tls.crt|ca.crt' hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh | head -20
🏁 Script executed:
#!/bin/bash
# Look for any helm install patterns that use --set-file with secrets
rg '\-\-set-file|extract.*secret|ca.*crt' hack/setup/quick-install/ -r
🏁 Script executed:
#!/bin/bash
# Get full WVA install section to see all parameters
sed -n '925,950p' hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh
insecureSkipVerify=true on Prometheus TLS config — CWE-295 (Improper Certificate Validation).
WVA connects to Prometheus over HTTPS (baseURL is https://...) but skips hostname/issuer verification, defeating the cert-manager-issued cert created above and enabling MITM on the Prometheus control-plane traffic (credential sniffing, metric spoofing that drives autoscaling decisions → replica flaps). Combined with wva.metrics.secure=false, the install posture is "TLS off by default".
The self-signed CA is already minted and stored in the prometheus-tls secret — mount it and verify instead. The WVA chart does not support caSecretName; use prometheus.caCert instead:
🛠️ Proposed direction
Extract the CA certificate from the prometheus-tls secret and pass it to WVA:
- --set wva.prometheus.tls.insecureSkipVerify=true \
+ --set wva.prometheus.tls.insecureSkipVerify=false \
+ --set-file wva.prometheus.caCert=<(kubectl get secret prometheus-tls -n monitoring -o jsonpath='{.data.ca\.crt}' | base64 -d) \

Or inline the cert content directly if prometheus.caCert is the parameter. Verify against the WVA chart's values.yaml for exact parameter names (may be prometheus.caCert, prometheus.tls.caCert, or similar).
📝 Committable suggestion
| --set wva.prometheus.baseURL="${WVA_PROMETHEUS_URL}" \ | |
| --set wva.prometheus.tls.insecureSkipVerify=true \ | |
| --set wva.prometheus.serviceAccountName="prometheus-kube-prometheus-prometheus" \ | |
| --set wva.prometheus.monitoringNamespace="${PROMETHEUS_NAMESPACE:-monitoring}" \ | |
| --set wva.metrics.secure=false \ | |
| --set wva.prometheus.baseURL="${WVA_PROMETHEUS_URL}" \ | |
| --set wva.prometheus.tls.insecureSkipVerify=false \ | |
| --set-file wva.prometheus.caCert=<(kubectl get secret prometheus-tls -n monitoring -o jsonpath='{.data.ca\.crt}' | base64 -d) \ | |
| --set wva.prometheus.serviceAccountName="prometheus-kube-prometheus-prometheus" \ | |
| --set wva.prometheus.monitoringNamespace="${PROMETHEUS_NAMESPACE:-monitoring}" \ | |
| --set wva.metrics.secure=false \ |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@hack/setup/quick-install/llmisvc-autoscaling-keda-dependency-install.sh`
around lines 933 - 937, The Helm install currently disables TLS verification
(wva.prometheus.tls.insecureSkipVerify=true) and sets wva.metrics.secure=false;
change this to enable verification by removing or setting
wva.prometheus.tls.insecureSkipVerify=false and enabling secure metrics, and
provide the Prometheus CA to the WVA chart using the chart's CA parameter (e.g.,
prometheus.caCert or wva.prometheus.caCert as defined in the chart values).
Extract the CA certificate stored in the prometheus-tls secret and pass its PEM
content to the Helm command's caCert parameter (or update the values file) so
WVA validates Prometheus' server cert instead of skipping verification; ensure
references to the prometheus-tls secret and the chart parameter names
(prometheus-tls, wva.prometheus.tls.insecureSkipVerify, wva.metrics.secure,
prometheus.caCert / wva.prometheus.caCert) are used to locate the change.
| // Verify no scheduler deployment was created for the AMD instance | ||
| amdScheduler := &appsv1.Deployment{} | ||
| err = envTest.Get(ctx, client.ObjectKey{ | ||
| Name: amdSvcName + "-kserve-epp", | ||
| Namespace: testNs.Name, | ||
| }, amdScheduler) | ||
| Expect(errors.IsNotFound(err)).To(BeTrue(), | ||
| "AMD instance should not create a scheduler deployment") |
Check the real scheduler Deployment name in this negative assertion.
The rest of this file resolves scheduler Deployments as -kserve-router-scheduler, but this test looks up -kserve-epp. That makes the assertion pass even if the AMD service actually creates a scheduler Deployment, so the regression check is ineffective.
🐛 Proposed fix
amdScheduler := &appsv1.Deployment{}
err = envTest.Get(ctx, client.ObjectKey{
- Name: amdSvcName + "-kserve-epp",
+ Name: amdSvcName + "-kserve-router-scheduler",
Namespace: testNs.Name,
}, amdScheduler)

📝 Committable suggestion
| // Verify no scheduler deployment was created for the AMD instance | |
| amdScheduler := &appsv1.Deployment{} | |
| err = envTest.Get(ctx, client.ObjectKey{ | |
| Name: amdSvcName + "-kserve-epp", | |
| Namespace: testNs.Name, | |
| }, amdScheduler) | |
| Expect(errors.IsNotFound(err)).To(BeTrue(), | |
| "AMD instance should not create a scheduler deployment") | |
| // Verify no scheduler deployment was created for the AMD instance | |
| amdScheduler := &appsv1.Deployment{} | |
| err = envTest.Get(ctx, client.ObjectKey{ | |
| Name: amdSvcName + "-kserve-router-scheduler", | |
| Namespace: testNs.Name, | |
| }, amdScheduler) | |
| Expect(errors.IsNotFound(err)).To(BeTrue(), | |
| "AMD instance should not create a scheduler deployment") |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/controller/v1alpha2/llmisvc/controller_int_scheduler_config_test.go`
around lines 1571 - 1578, The test's negative assertion checks the wrong
Deployment name: it looks up amdSvcName + "-kserve-epp" but all other tests
expect scheduler Deployments named amdSvcName + "-kserve-router-scheduler", so
the check can pass incorrectly; update the Get() call in the test to use
amdSvcName + "-kserve-router-scheduler" (the same scheduler name used elsewhere)
and keep the Expect(errors.IsNotFound(err)).To(BeTrue()) assertion to ensure no
scheduler Deployment was created for the AMD instance.
| func schedulerTransform(ctx context.Context, d *appsv1.Deployment) error { | ||
| version, ok := d.Spec.Template.Annotations["app.kubernetes.io/version"] | ||
| if !ok || version == "" { | ||
| version = "0.0.0" | ||
| } | ||
| v, err := semver.NewVersion(version) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to parse version %q: %w", version, err) | ||
| } | ||
| if v.Compare(*semver.New("0.7.0")) < 0 { | ||
| return nil | ||
| } |
🧩 Analysis chain
🌐 Web query:
github.com/coreos/go-semver NewVersion behavior with leading "v" prefix and non-strict versions
💡 Result:
The github.com/coreos/go-semver package provides semver.NewVersion(version string) (*Version, error), which parses the input string via the Version.Set method. This splits off metadata (+), prerelease (-), and then splits the core version into major.minor.patch using strings.SplitN(version, ".", 3). It validates each numeric part as int64 and identifiers with validateIdentifier.

For a leading "v" prefix: no explicit stripping or handling is mentioned in the available source snippets. Since examples use semver.New("1.2.3") without "v" and parsing starts directly on the version string, a leading "v" like "v1.2.3" causes SplitN to produce ["v1","2","3"], failing numeric parsing for major (cannot parse "v1" as int64). Thus, NewVersion("v1.2.3") returns an error. Users should strip the "v" prefix manually before passing to NewVersion if present.

For non-strict versions: the library enforces semver rules strictly. Numeric parts must parse to int64 without leading zeros, identifiers are limited to [0-9A-Za-z-], and no empty parts are allowed. Invalid formats like "1.2" (missing patch), leading zeros "01.2.3", or invalid chars fail with errors like "invalid dotted-tri format" or validation errors. It does not coerce incomplete versions to ".0"; it requires full major.minor.patch.
Citations:
- 1: https://github.com/coreos/go-semver
- 2: https://github.com/coreos/go-semver/blob/main/semver/semver.go
- 3: https://pkg.go.dev/github.com/coreos/go-semver@v0.3.0/semver
🏁 Script executed:
# Find the scheduler.go file and examine the function call chain
fd "scheduler.go" -t f | head -5
🏁 Script executed:
# Search for expectedSchedulerDeployment and its calls to schedulerTransform
rg "expectedSchedulerDeployment|reconcileSchedulerDeployment" --type go -A 3 -B 1
🏁 Script executed:
# Verify the call chain by looking for where schedulerTransform is called
rg "schedulerTransform" --type go -B 2 -A 2
🏁 Script executed:
# Examine the scheduler_test.go file around schedulerTransform tests
rg "schedulerTransform" -A 15 -B 5 pkg/controller/v1alpha2/llmisvc/scheduler_test.go | head -80
🏁 Script executed:
# Look for test cases with version annotations to understand expected behavior
rg "app.kubernetes.io/version" --type go -B 3 -A 3
🏁 Script executed:
# Find all version test cases in the scheduler_test.go
rg "version.*=" pkg/controller/v1alpha2/llmisvc/scheduler_test.go | grep -E "version\s*:" | head -30
🏁 Script executed:
# Look at TestSchedulerTransformThreshold test structure to see version values
sed -n '/func TestSchedulerTransformThreshold/,/^func /p' pkg/controller/v1alpha2/llmisvc/scheduler_test.go | head -100
🏁 Script executed:
# Check if there are any tests with malformed versions or error cases
rg "TestSchedulerTransform|malformed|invalid|error" pkg/controller/v1alpha2/llmisvc/scheduler_test.go | head -20
🏁 Script executed:
# Look more carefully at all test cases to see if any test invalid versions
sed -n '/version.*:/,/validateConfig/p' pkg/controller/v1alpha2/llmisvc/scheduler_test.go | grep -E 'version|name:' | head -40
🏁 Script executed:
# Check for any test with version like "v0.7.0" or malformed versions
rg '"v[0-9]|malformed|invalid' pkg/controller/v1alpha2/llmisvc/scheduler_test.go
🏁 Script executed:
# Look for all test cases to see the full range of tested versions
sed -n '/tests := \[\]/,/for _, tt := range tests/p' pkg/controller/v1alpha2/llmisvc/scheduler_test.go | grep -E 'name:|version:' | head -50
🏁 Script executed:
# Check if there's admission control or validation webhook for this annotation
fd "admission|webhook|validat" --type f | grep -i "llm\|scheduler"
🏁 Script executed:
# Search for any version validation or admission webhook in the controller
rg "ValidatingWebhook|app.kubernetes.io/version" --type go -B 3 -A 3 | head -40
🏁 Script executed:
# Let's see how the version annotation might be set - check user-facing docs or examples
fd "README|example|sample" -t f | xargs grep -l "app.kubernetes.io/version" 2>/dev/null
🏁 Script executed:
# Check if the annotation is set anywhere in the codebase - look for where it's created
rg "app.kubernetes.io/version" --type go | grep -v "test\|//"
🏁 Script executed:
# Examine the validation webhooks to see if they validate pod annotations
head -100 pkg/apis/serving/v1alpha2/llm_inference_service_validation.go
🏁 Script executed:
# Check if scheduler.go is actually in a controller package to verify it's part of reconciliation
head -50 pkg/controller/v1alpha2/llmisvc/scheduler.go
🏁 Script executed:
# Verify that the annotation is actually user-provided by checking if Deployment is created from LLMInferenceService spec
rg "LLMInferenceService" pkg/controller/v1alpha2/llmisvc/scheduler.go | head -5
Malformed app.kubernetes.io/version annotation blocks scheduler reconciliation.
semver.NewVersion from github.com/coreos/go-semver is strict: a leading v (e.g., v0.7.0), missing patch component (0.7), or any non-SemVer2 format returns an error. This error propagates through expectedSchedulerDeployment → reconcileSchedulerDeployment and blocks all subsequent reconciliation attempts until the annotation is manually corrected. A typo or common format variation (e.g., pasting a release tag with v prefix) becomes a permanent outage for the scheduler.
Trim the leading v, default to 0.0.0 on parse error with logging, or reject invalid annotations at admission time rather than at reconcile time.
Suggested fix
version, ok := d.Spec.Template.Annotations["app.kubernetes.io/version"]
if !ok || version == "" {
version = "0.0.0"
}
- v, err := semver.NewVersion(version)
+ v, err := semver.NewVersion(strings.TrimPrefix(version, "v"))
if err != nil {
- return fmt.Errorf("failed to parse version %q: %w", version, err)
+ log.FromContext(ctx).Info("Unparseable app.kubernetes.io/version, skipping v0.7 scheduler migrations", "version", version, "error", err.Error())
+ return nil
}
if v.Compare(*semver.New("0.7.0")) < 0 {
return nil
}

📝 Committable suggestion
| func schedulerTransform(ctx context.Context, d *appsv1.Deployment) error { | |
| version, ok := d.Spec.Template.Annotations["app.kubernetes.io/version"] | |
| if !ok || version == "" { | |
| version = "0.0.0" | |
| } | |
| v, err := semver.NewVersion(version) | |
| if err != nil { | |
| return fmt.Errorf("failed to parse version %q: %w", version, err) | |
| } | |
| if v.Compare(*semver.New("0.7.0")) < 0 { | |
| return nil | |
| } | |
| func schedulerTransform(ctx context.Context, d *appsv1.Deployment) error { | |
| version, ok := d.Spec.Template.Annotations["app.kubernetes.io/version"] | |
| if !ok || version == "" { | |
| version = "0.0.0" | |
| } | |
| v, err := semver.NewVersion(strings.TrimPrefix(version, "v")) | |
| if err != nil { | |
| log.FromContext(ctx).Info("Unparseable app.kubernetes.io/version, skipping v0.7 scheduler migrations", "version", version, "error", err.Error()) | |
| return nil | |
| } | |
| if v.Compare(*semver.New("0.7.0")) < 0 { | |
| return nil | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/controller/v1alpha2/llmisvc/scheduler.go` around lines 1111 - 1122,
schedulerTransform currently fails reconciliation when the
app.kubernetes.io/version annotation is non-strict SemVer; instead, sanitize and
tolerate common formats: in schedulerTransform trim a leading "v" (and
whitespace) from d.Spec.Template.Annotations["app.kubernetes.io/version"] before
calling semver.NewVersion, and if semver.NewVersion still returns an error, log
the parse failure and set version to "0.0.0" (do not return the error) so
expectedSchedulerDeployment/reconcileSchedulerDeployment continue reconciling;
ensure log messages include the raw annotation and the fallback decision for
debugging.
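For illustration, the tolerant policy described above (strip a leading v, validate strictly, fall back to 0.0.0 instead of erroring) can be sketched in shell; this mirrors the intent, not the actual Go code:

```shell
#!/usr/bin/env bash
# Hypothetical normalizer mirroring the suggested fallback behavior.
normalize_version() {
  local v="${1#v}"   # tolerate a leading "v" (e.g. pasted release tags)
  if [[ "$v" =~ ^[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z.-]+)?$ ]]; then
    printf '%s\n' "$v"
  else
    printf '%s\n' "0.0.0"   # log-and-default instead of blocking reconciliation
  fi
}

normalize_version v0.7.0  # -> 0.7.0
normalize_version 0.7     # -> 0.0.0 (missing patch is not strict SemVer)
normalize_version junk    # -> 0.0.0
```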
| if [[ $LLMISVC_AUTOSCALER == "hpa" ]]; then | ||
| echo "Installing LLMISVC HPA autoscaling components (Prometheus, Prometheus Adapter, WVA)..." | ||
| ${REPO_ROOT}/hack/setup/infra/manage.prometheus-helm.sh | ||
| ${REPO_ROOT}/hack/setup/infra/manage.prometheus-adapter-helm.sh | ||
| ${REPO_ROOT}/hack/setup/infra/manage.wva-helm.sh | ||
| elif [[ $LLMISVC_AUTOSCALER == "keda" ]]; then | ||
| echo "Installing LLMISVC KEDA autoscaling components (Prometheus, KEDA, WVA)..." | ||
| ${REPO_ROOT}/hack/setup/infra/manage.prometheus-helm.sh | ||
| ${REPO_ROOT}/hack/setup/infra/manage.keda-helm.sh | ||
| ${REPO_ROOT}/hack/setup/infra/manage.wva-helm.sh | ||
| fi |
🛠️ Refactor suggestion | 🟠 Major
Quote ${REPO_ROOT} expansions in the new autoscaler invocations.
Shellcheck SC2086 flags lines 112–114 and 118–120. If REPO_ROOT ever resolves to a path with whitespace or glob metachars, word splitting will break the invocation. As per coding guidelines (**/*.sh: "Quote all variables to prevent injection").
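The failure mode is easy to demonstrate in isolation (the paths here are made up): with a space in REPO_ROOT, the unquoted expansion word-splits into two arguments.

```shell
#!/usr/bin/env bash
# Word-splitting demo: an unquoted expansion of a path containing a space.
REPO_ROOT="/tmp/repo root"

unquoted=( ${REPO_ROOT}/hack/setup.sh )    # splits on the space: two words
quoted=( "${REPO_ROOT}/hack/setup.sh" )    # stays a single argument

echo "unquoted argc: ${#unquoted[@]}"      # -> 2
echo "quoted argc:   ${#quoted[@]}"        # -> 1
```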
🔧 Proposed fix

```diff
 if [[ $LLMISVC_AUTOSCALER == "hpa" ]]; then
   echo "Installing LLMISVC HPA autoscaling components (Prometheus, Prometheus Adapter, WVA)..."
-  ${REPO_ROOT}/hack/setup/infra/manage.prometheus-helm.sh
-  ${REPO_ROOT}/hack/setup/infra/manage.prometheus-adapter-helm.sh
-  ${REPO_ROOT}/hack/setup/infra/manage.wva-helm.sh
+  "${REPO_ROOT}/hack/setup/infra/manage.prometheus-helm.sh"
+  "${REPO_ROOT}/hack/setup/infra/manage.prometheus-adapter-helm.sh"
+  "${REPO_ROOT}/hack/setup/infra/manage.wva-helm.sh"
 elif [[ $LLMISVC_AUTOSCALER == "keda" ]]; then
   echo "Installing LLMISVC KEDA autoscaling components (Prometheus, KEDA, WVA)..."
-  ${REPO_ROOT}/hack/setup/infra/manage.prometheus-helm.sh
-  ${REPO_ROOT}/hack/setup/infra/manage.keda-helm.sh
-  ${REPO_ROOT}/hack/setup/infra/manage.wva-helm.sh
+  "${REPO_ROOT}/hack/setup/infra/manage.prometheus-helm.sh"
+  "${REPO_ROOT}/hack/setup/infra/manage.keda-helm.sh"
+  "${REPO_ROOT}/hack/setup/infra/manage.wva-helm.sh"
 fi
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```sh
if [[ $LLMISVC_AUTOSCALER == "hpa" ]]; then
  echo "Installing LLMISVC HPA autoscaling components (Prometheus, Prometheus Adapter, WVA)..."
  "${REPO_ROOT}/hack/setup/infra/manage.prometheus-helm.sh"
  "${REPO_ROOT}/hack/setup/infra/manage.prometheus-adapter-helm.sh"
  "${REPO_ROOT}/hack/setup/infra/manage.wva-helm.sh"
elif [[ $LLMISVC_AUTOSCALER == "keda" ]]; then
  echo "Installing LLMISVC KEDA autoscaling components (Prometheus, KEDA, WVA)..."
  "${REPO_ROOT}/hack/setup/infra/manage.prometheus-helm.sh"
  "${REPO_ROOT}/hack/setup/infra/manage.keda-helm.sh"
  "${REPO_ROOT}/hack/setup/infra/manage.wva-helm.sh"
fi
```
🧰 Tools

🪛 Shellcheck (0.11.0)

[info] SC2086 on lines 112–114 and 118–120: Double quote to prevent globbing and word splitting.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/scripts/gh-actions/setup-deps.sh` around lines 110 - 121, The shell
invocations using ${REPO_ROOT} (e.g., calls to manage.prometheus-helm.sh,
manage.prometheus-adapter-helm.sh, manage.wva-helm.sh, and manage.keda-helm.sh)
are unquoted and trigger ShellCheck SC2086; update each invocation to quote the
variable so the commands use
"${REPO_ROOT}/hack/setup/infra/manage.prometheus-helm.sh" (and likewise for
manage.prometheus-adapter-helm.sh, manage.wva-helm.sh, manage.keda-helm.sh) to
prevent word-splitting or glob expansion if REPO_ROOT contains spaces or
metacharacters.
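The failure mode SC2086 warns about is easy to reproduce. The sketch below is a standalone illustration; the `/tmp/repo root` path and `run.sh` script are hypothetical stand-ins for `REPO_ROOT` and the Helm management scripts, not files from this repository:

```shell
#!/usr/bin/env bash
set -euo pipefail

# A path containing a space, as REPO_ROOT might on a developer machine.
dir="/tmp/repo root"
mkdir -p "$dir"
printf '#!/bin/sh\necho ok\n' > "$dir/run.sh"
chmod +x "$dir/run.sh"

# Unquoted: word splitting turns one path into two arguments
# ("/tmp/repo" and "root/run.sh"), and the invocation fails.
if ${dir}/run.sh 2>/dev/null; then
  echo "unquoted: ran"
else
  echo "unquoted: failed"
fi

# Quoted: the path survives intact and the script runs.
"${dir}/run.sh"
```

Quoting costs nothing when the path is clean and saves the invocation when it is not, which is why the guideline applies it unconditionally.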
|
@vivekk16: The following test has Failed: OCI Artifact Browser URLInspecting Test Artifacts ManuallyTo inspect your test artifacts manually, follow these steps:
mkdir -p oras-artifacts
cd oras-artifacts
oras pull quay.io/opendatahub/odh-ci-artifacts:kserve-group-test-9rr7z |
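Once pulled, the artifacts land as plain files in the working directory. A generic way to inspect them is sketched below; the file names (`junit.log`, `logs.tar.gz`) are hypothetical, since the actual layout depends on the CI job, and the pull itself is simulated here so the snippet is self-contained:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Simulate a pulled artifact directory (stand-in for oras-artifacts).
workdir="$(mktemp -d)"
cd "$workdir"
echo "test log line" > junit.log
tar -czf logs.tar.gz junit.log && rm junit.log

# Unpack every compressed tarball that was pulled; the guard skips the
# loop body when the glob matches nothing.
for t in *.tar.gz; do
  [ -e "$t" ] || continue
  tar -xzf "$t"
done
ls
```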
@vivekk16: The following test failed, say

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
As per title
Summary by CodeRabbit

Release Notes
- New Features
- Improvements
- Infrastructure