
fix: add jobset and argo events to devtools #3124

Open
saikonen wants to merge 2 commits into master from fix/add-jobset-and-argo-events-to-devtools



@saikonen saikonen commented Apr 17, 2026

PR Type

  • Bug fix
  • New feature
  • Core Runtime change (higher bar -- see CONTRIBUTING.md)
  • Docs / tooling
  • Refactoring

Summary

A recent rework of the metaflow-dev setup removed support for argo-events and jobsets. This PR adds them back.

Issue

Fixes #

Reproduction

Runtime:

Commands to run:

# paste exact commands

Where evidence shows up:

Before (error / log snippet)
paste here
After (evidence that fix works)
paste here

Root Cause

Why This Fix Is Correct

Failure Modes Considered

Tests

  • Unit tests added/updated
  • Reproduction script provided (required for Core Runtime)
  • CI passes
  • If tests are impractical: explain why below and provide manual evidence above

Non-Goals

AI Tool Usage

  • No AI tools were used in this contribution
  • AI tools were used (describe below)

@saikonen saikonen requested a review from npow April 17, 2026 12:59

greptile-apps Bot commented Apr 17, 2026

Greptile Summary

Restores argo-events and jobset support to the devtools Tilt environment, which was lost in a prior rework. The change adds two new tiltfiles, their supporting Kubernetes manifests, and wires them into the main Tiltfile and service picker following the existing patterns.

Confidence Score: 5/5

Safe to merge; all findings are P2 style suggestions that do not affect correctness.

Both new components follow the established devtools patterns closely. The two P2 comments (EventBus replicas=3 and curl without checksum) are improvement suggestions for a developer-only environment and do not block merge.

Files worth a closer look: devtools/tilt/k8s/argo-events-eventbus.yaml (replicas=3 in a local dev cluster) and devtools/tilt/jobset.tiltfile (remote curl without integrity check).

Important Files Changed

| Filename | Overview |
| --- | --- |
| devtools/Tiltfile | Adds version variables, component graph entries, tiltfile loads, and ctx struct fields for argo-events and jobset; clean and consistent with the existing pattern. |
| devtools/pick_services.sh | Adds argo-events and jobset to the interactive service picker menu; straightforward two-line addition. |
| devtools/tilt/argo_events.tiltfile | New setup function for argo-events via Helm; versions[0] uses :latest image tags (versions[1] with pinned tags is what the EventBus actually uses, so low practical impact), and the port-forward local_resource is functional. |
| devtools/tilt/jobset.tiltfile | New setup function for jobset; fetches manifests at runtime via curl without checksum verification. This works but is a supply-chain hygiene concern for a dev tool. |
| devtools/tilt/k8s/argo-events-eventbus.yaml | Defines the default EventBus with replicas=3; unnecessarily resource-heavy for a single-node devtools cluster. |
| devtools/tilt/k8s/argo-events-eventsource.yaml | Defines the webhook EventSource for metaflow-event on port 12000; looks correct. |
| devtools/tilt/k8s/argo-events-rbac.yaml | Creates operate-workflow-sa with workflow management permissions and a view-events role for the argo-workflows SA; scoped to the default namespace, which is appropriate for devtools. |
| devtools/tilt/k8s/argo-events-webhook-svc.yaml | Creates a companion Service with a -tilt suffix for port-forwarding, avoiding conflicts with the auto-generated eventsource service; selector labels look correct. |
| devtools/tilt/k8s/jobset-rbac.yaml | Grants the default SA cluster-wide full access to jobsets; broad but expected for a devtools environment where workflow pods use the default service account. |

Reviews (1): Last reviewed commit: "add argo-events and jobset to pick servi..."

spec:
  jetstream:
    version: "2.9.15"
    replicas: 3

P2 EventBus replicas=3 is heavy for a dev environment

replicas: 3 launches three JetStream pods, each consuming 100m CPU and 128Mi memory, for a total of 300m CPU and 384Mi of memory. In a typical single-node local cluster (kind/minikube/docker-desktop) this provides no HA benefit, since all three pods land on the same node. replicas: 1 is the usual choice for devtools setups and keeps resource consumption in line with the rest of the stack.

Suggested change
-    replicas: 3
+    replicas: 1
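
The totals quoted in this comment can be checked with a short sketch. The per-pod figures (100m CPU, 128Mi memory) are taken from the comment itself rather than read from the manifest, and the helper name `total_requests` is hypothetical.

```python
from typing import Tuple


def total_requests(replicas: int,
                   cpu_millicores: int = 100,
                   memory_mib: int = 128) -> Tuple[int, int]:
    """Return (total CPU in millicores, total memory in MiB) for identical pods.

    The defaults are the per-pod figures quoted in the review comment;
    they are assumptions, not values parsed from the EventBus manifest.
    """
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return replicas * cpu_millicores, replicas * memory_mib


if __name__ == "__main__":
    # Compare the current setting (3) with the suggested one (1).
    for n in (3, 1):
        cpu, mem = total_requests(n)
        print("replicas=%d: %dm CPU, %dMi memory" % (n, cpu, mem))
```

With these assumed per-pod figures, dropping from three replicas to one saves 200m CPU and 256Mi of memory on the local node.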



def setup_jobset(ctx):
    jobset_manifest_url = "https://github.com/kubernetes-sigs/jobset/releases/download/%s/manifests.yaml" % ctx.versions.jobset
    k8s_yaml(local("curl -sSL %s" % jobset_manifest_url))

P2 Remote manifest fetched without integrity check

curl -sSL downloads the jobset manifest from GitHub at every tilt up without verifying its checksum. While the URL is version-pinned, an upstream compromise or MITM between the developer and GitHub would apply arbitrary manifests to the cluster. Consider caching the manifest in-tree (as done with the RBAC file) or at least documenting the known SHA so developers can verify out-of-band.
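
One way to act on this suggestion is to compare the downloaded manifest against a pinned SHA-256 digest before applying it. The following is a minimal sketch, not code from the PR: `verify_manifest` and `fetch_verified_manifest` are hypothetical helpers, and the expected digest would have to be computed once from a trusted copy of the release and stored in-tree, since the PR does not publish one.

```python
import hashlib
import urllib.request


def verify_manifest(data: bytes, expected_sha256: str) -> bytes:
    """Return `data` unchanged if its SHA-256 matches, else raise ValueError."""
    actual = hashlib.sha256(data).hexdigest()
    # hexdigest() is lowercase hex, so normalise the pinned value the same way.
    if actual != expected_sha256.strip().lower():
        raise ValueError("manifest digest mismatch: got %s" % actual)
    return data


def fetch_verified_manifest(url: str, expected_sha256: str) -> bytes:
    """Download `url` and verify the body before returning it.

    Hypothetical helper: the caller supplies both the version-pinned URL
    and the digest recorded for that version.
    """
    with urllib.request.urlopen(url, timeout=30) as resp:
        return verify_manifest(resp.read(), expected_sha256)
```

A script along these lines could print the verified manifest to stdout so that the Tiltfile passes its output to `k8s_yaml` in place of the raw `curl` output; that wiring is an assumption and is not part of this PR. Vendoring the manifest in-tree, as the comment also suggests, avoids the network fetch entirely.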
