feat(dev): add lightweight KFP local dev cluster via k3d#753

Open
cordeirops wants to merge 3 commits into kubeflow:main from cordeirops:feat/k3d-local-dev-cluster

Conversation

@cordeirops
Contributor

Summary

Running a full Kubeflow Pipelines stack for local development currently requires minikube (or an equivalent), a kubectl port-forward, and enough memory to host both the Kubernetes control plane (~2 GB) and the KFP services (~2 GB). This is impractical for contributors on resource-constrained machines, and for those who only need to work on Kale's frontend or backend rather than on actual pipeline execution.

This PR introduces a k3d-based local development cluster as a lightweight, one-command alternative.

k3d runs k3s (a certified, minimal Kubernetes distribution) inside Docker containers. Its control plane uses an embedded SQLite store instead of etcd, reducing cluster overhead from ~2 GB (minikube) to ~512 MB, bringing the total footprint for a full KFP stack down to approximately 2.5 GB.

Changes

scripts/kfp-dev-setup.sh (new)

An idempotent Bash script that handles the full first-time setup:

  1. Verifies docker and kubectl are available
  2. Installs k3d automatically via its official install script if not present
  3. Creates a k3d cluster named kale-kfp (Traefik ingress disabled to save memory)
  4. Deploys KFP v2.16 using the official platform-agnostic kustomize manifests from github.com/kubeflow/pipelines
  5. Waits for all KFP pods to reach Ready state
  6. Starts a kubectl port-forward in the background, tracking its PID in .kfp-dev-pf.pid for clean teardown
  7. Smoke-tests the UI endpoint and prints the next steps

The script is safe to re-run — it skips cluster creation and KFP deployment if they already exist.
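The re-run safety described above can be sketched roughly as follows. The helper names, defaults, and k3d flags here are illustrative assumptions based on the PR description, not the script's actual contents:

```shell
#!/usr/bin/env bash
set -euo pipefail

CLUSTER_NAME="${1:-kale-kfp}"   # hypothetical default, per the description above

require() {
  # Step 1: verify a prerequisite binary is on PATH
  command -v "$1" >/dev/null 2>&1 || { echo "ERROR: $1 not found" >&2; exit 1; }
}

cluster_exists() {
  # k3d prints one cluster per line with the name in the first column
  k3d cluster list 2>/dev/null | grep -q "^${CLUSTER_NAME}[[:space:]]"
}

create_cluster() {
  # Step 3: create only if absent, so re-runs are harmless
  if cluster_exists; then
    echo "Cluster '${CLUSTER_NAME}' already exists, skipping creation."
  else
    # Traefik ingress disabled to save memory, as described above
    k3d cluster create "${CLUSTER_NAME}" --k3s-arg "--disable=traefik@server:0"
  fi
}

wait_for_kfp() {
  # Step 5: block until every pod in the kubeflow namespace is Ready
  kubectl wait --for=condition=Ready pod --all -n kubeflow --timeout=600s
}
```

The same "check, then skip or act" shape applies to the KFP deployment step, which is what makes the whole script idempotent.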

Makefile (updated)

Five new targets under a dedicated "KFP Local Cluster" section:

Target               Description
make kfp-dev-setup   First-time cluster creation and KFP deployment (~5 min)
make kfp-dev-start   Daily driver: start the cluster and port-forward the UI to localhost:8080
make kfp-dev-stop    Stop the port-forward and pause the cluster (all pipeline data is preserved)
make kfp-dev-delete  Wipe the cluster entirely and free all resources
make kfp-dev-status  Show k3d cluster state and KFP pod status

All targets are configurable via Makefile variables (KFP_CLUSTER_NAME, KFP_PIPELINE_VERSION, KFP_LOCAL_PORT).

The clean target is updated to also remove the port-forward PID file.
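The PID-file pattern behind the port-forward and its teardown might look like this. The service name (ml-pipeline-ui), namespace, and file path are assumptions inferred from the description, not the actual script:

```shell
PID_FILE="${KFP_PID_FILE:-.kfp-dev-pf.pid}"
LOCAL_PORT="${KFP_LOCAL_PORT:-8080}"

start_port_forward() {
  # Forward the KFP UI service to localhost in the background
  kubectl -n kubeflow port-forward svc/ml-pipeline-ui "${LOCAL_PORT}:80" \
    >/dev/null 2>&1 &
  echo $! > "${PID_FILE}"   # track the PID for clean teardown
}

stop_port_forward() {
  # Kill the tracked process if still running, then remove the PID file
  if [ -f "${PID_FILE}" ]; then
    kill "$(cat "${PID_FILE}")" 2>/dev/null || true
    rm -f "${PID_FILE}"
  fi
}
```

Tracking the PID in a file is what lets both make kfp-dev-stop and make clean tear the forward down without guessing at process names.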

Developer workflow

# First time only (~5 min)
make kfp-dev-setup

# Every day
make kfp-dev-start
make kfp-run NB=examples/base/candies_sharing.ipynb KFP_HOST=http://localhost:8080

# End of day
make kfp-dev-stop

# When done with the cluster entirely
make kfp-dev-delete

Requirements

  • Docker (Docker Desktop on macOS/Windows, or Docker Engine on Linux)
  • kubectl (available via brew install kubectl or bundled with Docker Desktop)
  • k3d — installed automatically by kfp-dev-setup if not present

Resource comparison

Setup                Cluster overhead   Total (cluster + KFP)
minikube + KFP       ~2 GB              ~4–6 GB
k3d + KFP (this PR)  ~512 MB            ~2.5 GB

Introduce a k3d-based local development cluster as a low-resource
alternative to running minikube + KFP for contributors who only need
to work on Kale's frontend or backend.

k3d (k3s in Docker) uses ~512 MB of cluster overhead compared to ~2 GB
for minikube, bringing the total footprint for KFP down to ~2.5 GB vs
~4–6 GB with the previous approach. No Kubernetes knowledge is required
beyond installing Docker and kubectl.

Changes:
- Add scripts/kfp-dev-setup.sh: idempotent setup script that installs
  k3d if missing, creates a cluster, deploys KFP v2.16 via the official
  platform-agnostic kustomize manifests, waits for all pods to be Ready,
  and starts a background port-forward with PID tracking
- Add five Makefile targets under a new "KFP Local Cluster" section:
    make kfp-dev-setup   # first-time cluster creation (~5 min)
    make kfp-dev-start   # daily: start cluster + port-forward to :8080
    make kfp-dev-stop    # pause cluster, preserves all pipeline data
    make kfp-dev-delete  # wipe cluster entirely
    make kfp-dev-status  # inspect k3d and KFP pod status
- Update clean target to remove the port-forward PID file

After kfp-dev-start, the KFP UI and API are available at
http://localhost:8080, compatible with the existing kfp-run target.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Pedro Sbaraini Cordeiro <pedro.sbarainicordeiro@gmail.com>
@google-oss-prow google-oss-prow Bot requested a review from ederign April 13, 2026 00:44
@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ederign for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Collaborator

@ada333 ada333 left a comment

Hi, @cordeirops I really like this idea - thank you for making this PR!
I left a few suggestions.

Comment thread: scripts/kfp-dev-setup.sh (Outdated)

deploy_kfp() {
# Idempotent: if the kubeflow namespace already has the ML pipeline CRD, skip
if kubectl get namespace kubeflow >/dev/null 2>&1 && \
Collaborator

I would do this differently so KFP can be upgraded on an existing cluster without the need to delete it; or maybe a new function like upgrade_kfp could be added?

Comment thread: scripts/kfp-dev-setup.sh
if k3d cluster list 2>/dev/null | grep -q "^${CLUSTER_NAME}[[:space:]]"; then
warn "Cluster '${CLUSTER_NAME}' already exists — skipping creation."
info "Starting cluster in case it was stopped..."
k3d cluster start "${CLUSTER_NAME}"
Collaborator

Before this we should switch the kubectl context, something like:
kubectl config use-context k3d-kale-kfp

Comment thread: Makefile
@bash scripts/kfp-dev-setup.sh "$(KFP_CLUSTER_NAME)" "$(KFP_PIPELINE_VERSION)" "$(KFP_LOCAL_PORT)" "$(KFP_PID_FILE)"

kfp-dev-start: ## Start existing cluster and port-forward KFP UI to localhost:8080
@printf "$(BLUE)Starting k3d cluster '$(KFP_CLUSTER_NAME)'...\n$(NC)"
Collaborator

also here: kubectl config use-context k3d-kale-kfp

Comment thread: Makefile

kfp-dev-status: ## Show cluster and KFP pod status
@printf "$(BLUE)k3d clusters:\n$(NC)"
@k3d cluster list 2>/dev/null || printf "$(YELLOW)k3d not installed\n$(NC)"
Collaborator

also here: kubectl config use-context k3d-kale-kfp

- Switch kubectl context to k3d-{cluster} after cluster start/create
  to ensure all subsequent kubectl commands target the correct cluster,
  regardless of the active context before the script runs.

- Extract shared manifest-apply logic into _apply_kfp_manifests() and
  introduce upgrade_kfp() alongside the existing deploy_kfp(). The
  deploy function skips if KFP is already present; the upgrade function
  always re-applies the manifests, allowing in-place version bumps
  without deleting the cluster and losing experiment/run history.

- Add make kfp-dev-upgrade target (accepts KFP_PIPELINE_VERSION=X.Y.Z)
  to expose the upgrade path from the Makefile.

- Add kubectl context switch to kfp-dev-start Makefile target for
  consistency with the setup script.

Signed-off-by: Pedro Sbaraini Cordeiro <pedro.sbarainicordeiro@gmail.com>
@cordeirops
Contributor Author

Thanks for the thorough review, @ada333! I've addressed all three suggestions in the latest commit (933ff5b).


Changes made

1. kubectl config use-context k3d-{cluster} — two locations

Added an explicit context switch in both places flagged:

  • In create_cluster() (script) — after cluster creation or start, kubectl is immediately pointed at k3d-${CLUSTER_NAME} before any kubectl apply or kubectl wait calls. This prevents commands from silently targeting a different cluster (e.g. a leftover minikube context).
  • In kfp-dev-start (Makefile) — same switch added right after k3d cluster start, so the daily-driver workflow is also safe for users who switch between multiple clusters.
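For a cluster named kale-kfp, k3d registers the kubeconfig context as k3d-kale-kfp, so the switch is a one-liner; sketched here with an assumed helper name:

```shell
CLUSTER_NAME="${CLUSTER_NAME:-kale-kfp}"

ensure_context() {
  # k3d names its kubeconfig contexts "k3d-<cluster-name>"
  kubectl config use-context "k3d-${CLUSTER_NAME}"
}
```

Calling this immediately after cluster create/start guarantees every later kubectl apply and kubectl wait hits the k3d cluster, whatever context was active before.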

2. KFP upgrade without deleting the cluster

Refactored deploy_kfp() by extracting the shared manifest-apply logic into a private helper _apply_kfp_manifests(). Two distinct functions now exist:

  • deploy_kfp() — unchanged behaviour: skips if KFP is already present, avoids re-applying on every setup run.
  • upgrade_kfp() — always calls _apply_kfp_manifests(), allowing kubectl apply -k to reconcile the diff between the currently installed version and the target version in-place, preserving all experiment and run history.
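The split might be sketched as follows. The manifest paths follow the upstream KFP standalone kustomize layout, and the deployment-presence check (ml-pipeline) is an assumption rather than the script's actual test:

```shell
PIPELINE_VERSION="${KFP_PIPELINE_VERSION:-2.16.0}"
MANIFESTS="github.com/kubeflow/pipelines/manifests/kustomize"

_apply_kfp_manifests() {
  # Shared by both paths: cluster-scoped resources first, then the
  # platform-agnostic environment, both pinned to the requested version
  kubectl apply -k "${MANIFESTS}/cluster-scoped-resources?ref=${PIPELINE_VERSION}"
  kubectl apply -k "${MANIFESTS}/env/platform-agnostic?ref=${PIPELINE_VERSION}"
}

deploy_kfp() {
  # First-time setup: skip if KFP already looks deployed
  if kubectl get deployment ml-pipeline -n kubeflow >/dev/null 2>&1; then
    echo "KFP already deployed, skipping."
    return 0
  fi
  _apply_kfp_manifests
}

upgrade_kfp() {
  # Always re-apply so the diff to the target version is reconciled in place
  _apply_kfp_manifests
}
```

Because kubectl apply is declarative, re-applying newer manifests upgrades the installation without touching the persistent volumes that hold experiment and run history.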

A new make kfp-dev-upgrade Makefile target exposes this:

# Upgrade to a newer KFP version on an existing cluster
make kfp-dev-upgrade KFP_PIPELINE_VERSION=2.17.0

Let me know if anything else needs adjusting!

@cordeirops cordeirops requested a review from ada333 April 13, 2026 13:30
Signed-off-by: Pedro Sbaraini Cordeiro <pedro.sbarainicordeiro@gmail.com>
@cordeirops
Contributor Author

cordeirops commented Apr 13, 2026

Build failure (browser_check timeout)

The build job failed with a page.waitForSelector: Timeout 100000ms exceeded error in uv run python -m jupyterlab.browser_check. This step spins up a headless browser to verify that JupyterLab loads; it is unrelated to the changes in this PR (a Makefile and a bash script; no frontend code is touched).

The same workflow passed on PR #754, which ran earlier the same day. This is known flaky behaviour in CI environments when the runner is under resource pressure.

I've pushed an empty commit to re-trigger the workflow.

@google-oss-prow google-oss-prow Bot added the lgtm label Apr 13, 2026