roadmap: KSail strategy & roadmap (June 2026) #4988

@devantler

🤖 Generated by the Daily AI Assistant

KSail Monthly Strategy, June 2026. This is the roadmap home for KSail. Per the portfolio
strategy scheme, epic / theme-level items carry the roadmap label; their actionable children use
the normal labels (enhancement, security, bug, …). This issue assesses where the flagship is,
groups the existing feature backlog into coherent themes (problem → direction → rough size), and sets
priorities. It replaces ad-hoc backlog sprawl: KSail previously had no roadmap label and no roadmap
structure despite ~10 substantive open feature issues.

Where KSail is today

KSail is the single SDK for local and cloud Kubernetes: Kind / K3d / Talos / vCluster / KWOK
locally (Docker-only) and EKS in the cloud, plus declarative GitOps bootstrap (Flux/Argo), an MCP server,
a desktop app, and a Homebrew tap. Recent shipping (late May → early June 2026) has deliberately been
a quality / release-hardening phase: test-coverage backfill (uiserver, kwok, kubescape,
fsutil), Homebrew-cask correctness (desc/homepage/verified/brew-style), govulncheck
risk-acceptance, validate-go workflow bumps, Talos/Hetzner scale-up fixes, and desktop polish. The core
is mature; the open backlog is almost entirely forward feature expansion, which is what this roadmap
organises.

Operational health note (not roadmap, tracked separately): two system-test lanes have been red for 6+
weeks on infrastructure/credential issues: Hetzner HCLOUD_TOKEN invalid (#4972) and Omni "no machines
registered" (#4973). These are environment blockers, not product work, but they currently mean the
cloud-provider lanes ship without live E2E coverage, which is relevant context for Theme A below.

Themes

Theme A: Cloud-provider & distribution expansion → epic #4627 (roadmap)

Problem. KSail's cloud story today is Hetzner (Talos) + AWS EKS; users want one SDK across the major
clouds and distributions. Direction. Broaden provider/distribution coverage behind the existing
provider abstraction, with no new required external dependencies (cloud SDKs/credentials stay optional).
Children: #3983 (Hetzner → K3s & Vanilla/Kind), #4328 (complete AWS EKS), #4510 (add GKE & AKS).
Size: L (multi-release). Priority: High, but gated on restoring the Hetzner/Omni system-test
lanes (#4972/#4973) so new providers ship with live E2E coverage rather than blind.
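
As a language-neutral illustration of the "cloud SDKs stay optional" direction, here is a minimal Python sketch of a provider registry that only checks for a provider's SDK when that provider is actually requested. Everything in it is hypothetical: the registry, the provider names, and the module name `hypothetical_gke_sdk` are assumptions for illustration and do not describe KSail's actual provider abstraction.

```python
from importlib.util import find_spec
from typing import Callable, Dict, Optional, Tuple


class ProviderUnavailable(RuntimeError):
    """Raised when a provider's optional SDK module is not installed."""


# Hypothetical registry: provider name -> (optional SDK module name, factory).
_REGISTRY: Dict[str, Tuple[Optional[str], Callable[[], object]]] = {}


def register(name: str, factory: Callable[[], object],
             sdk_module: Optional[str] = None) -> None:
    """Register a provider; sdk_module=None means it needs no extra dependency."""
    _REGISTRY[name] = (sdk_module, factory)


def get_provider(name: str) -> object:
    """Build a provider, checking for its optional SDK only at this point."""
    sdk_module, factory = _REGISTRY[name]
    if sdk_module is not None and find_spec(sdk_module) is None:
        raise ProviderUnavailable(
            f"provider {name!r} needs optional module {sdk_module!r}"
        )
    return factory()


# Illustrative registrations (names are assumptions, not KSail identifiers).
register("docker", lambda: "docker-provider")
register("gke", lambda: "gke-provider", sdk_module="hypothetical_gke_sdk")
```

With this shape, users of the local providers never need a cloud SDK installed; a missing SDK surfaces as a clear error only when the corresponding cloud provider is selected.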

Theme B: Supply-chain security & verification

Problem. GitOps artifacts KSail generates aren't cryptographically verified end-to-end.
Direction. Make signed-and-verified the default path. Children: #4987 (cosign spec.verify on
the generated Flux OCIRepository). Size: M. Priority: High; it is security-labelled, self-contained,
and aligns with the portfolio-wide supply-chain push (cf. platform #1570 cosign-verify infra) → strong
candidate for the next feature PR. Gap to fill: sign-on-publish for KSail's own OCI artifacts is a
natural follow-up child if not already covered.
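
To make the target of #4987 concrete, the sketch below builds a Flux OCIRepository manifest with `spec.verify` set to keyless cosign verification and prints it as JSON (which `kubectl apply -f -` accepts). It is a sketch under stated assumptions, not KSail's generator: the function name, resource name, registry URL, issuer, and subject pattern are all illustrative, and the `apiVersion` should be adjusted to whatever the installed Flux CRDs serve.

```python
import json


def oci_repository_with_verify(name, namespace, url, tag, issuer, subject,
                               api_version="source.toolkit.fluxcd.io/v1"):
    """Build an OCIRepository manifest with keyless cosign verification.

    api_version is an assumption; older Flux installs serve v1beta2 instead.
    """
    return {
        "apiVersion": api_version,
        "kind": "OCIRepository",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "interval": "5m",
            "url": url,
            "ref": {"tag": tag},
            "verify": {
                "provider": "cosign",
                # issuer and subject are matched as patterns against the
                # signing certificate's OIDC identity.
                "matchOIDCIdentity": [{"issuer": issuer, "subject": subject}],
            },
        },
    }


# All values below are placeholders for illustration only.
manifest = oci_repository_with_verify(
    name="example-manifests",
    namespace="flux-system",
    url="oci://registry.example.com/example/manifests",
    tag="latest",
    issuer="^https://token.actions.githubusercontent.com$",
    subject="^https://github.com/example-org/example-repo.*$",
)
print(json.dumps(manifest, indent=2))
```

Key-based verification would instead reference a secret holding the cosign public key via `spec.verify.secretRef`; which mode becomes the default is a design decision for the child issue.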

Theme C: Inner-loop developer experience

Problem. The local dev loop stops at "cluster up"; onboarding and remote/local bridging are manual.
Direction. Make KSail the full inner-loop tool. Children: #4777 (DevContainer scaffolding in
cluster init), #4521 (local↔remote service mirroring, Telepresence/mirrord-style). Size: M each.
Priority: Medium; high user value, additive, no infra risk. Candidate to split into its own epic if
it grows.

Theme D: Workload observability

Problem. No first-class network/traffic visibility for workloads. Direction. Surface eBPF-based
observability through KSail. Children: #4778 (Hubble eBPF traffic observability). Size: M.
Priority: Medium.

Theme E: Cluster/component lifecycle & auth

Problem. Component install and multi-cluster auth are not operator-driven/federated. Direction.
Operator-managed component lifecycle + federated auth. Children: #4899 (operator-driven
component-install lifecycle), #4602 (OIDC federation & multi-cluster auth). Size: L. Priority:
Medium; this is a larger design effort, and each child warrants an ADR before implementation.

Suggested sequencing (June → Q3 2026)

  1. Unblock E2E: restore Hetzner/Omni system-test credentials (#4972 / #4973) so cloud lanes have
    coverage. (Operational; maintainer/infra.)
  2. Theme B, #4987 (cosign verify): smallest high-value security win; ship next.
  3. Theme A: resume provider expansion once (1) is green.
  4. Themes C/D: additive DX/observability features, parallelisable.
  5. Theme E: ADR-first, larger lifecycle/auth work.

How to use this roadmap

  • New feature work should map to a theme above (or propose a new one here first).
  • Implementing PRs close their child issue with Fixes #N; epics close when their children do.
  • This is a living document, refreshed on the monthly KSail strategy cadence.

Strategy pass grounded in the live issue backlog, the existing parent epic #4627, and the May–June 2026
merge history. No new duplicate epics were minted; existing tracking issues are organised in place, and the
roadmap label + this home establish the structure that was missing.
