bug: LogicalCluster conditions are ordering-dependent #3924

@tgoodwin

Background

I'm developing kamera, a tool that systematically explores controller reconciliation ordering, staleness, and fault injection.

Describe the bug

I observe that the LogicalCluster's conditions can diverge depending on which controller reconciles last during workspace initialization.

Both APIBinderInitializerController and DefaultAPIBindingLifecycleController write conditions on the LogicalCluster. If the APIBindingReconciler resets InitialBindingCompleted to False (because the resource-bindings annotation is missing), it triggers the APIBinderInitializer to re-reconcile the LogicalCluster. The final LogicalCluster conditions then depend on which controller commits last:

  • If APIBinderInitializer runs last → its view of conditions is reflected
  • If DefaultAPIBindingLifecycle runs last → its view of conditions is reflected

The same condition write conflict also surfaces when a WorkspaceType's DefaultAPIBindings change mid-initialization (e.g., when a binding is added).

Expected Behaviour

LogicalCluster conditions should converge to the same state regardless of controller reconcile ordering.

Analysis

The two controllers write different condition types: APIBinderInitializer writes WorkspaceAPIBindingsInitialized (apibinder_initializer_reconcile.go:67-251) and DefaultAPIBindingLifecycle writes WorkspaceAPIBindingsReconciled (default_apibinding_lifecycle_reconcile.go:226-235). The condition types themselves don't conflict, but both controllers deep-copy the LogicalCluster, reconcile independently, and commit via the KCP committer. Since both patch the full status, the last writer's view of the status wins and overwrites condition changes made by the other controller's concurrent commit.
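
To make the failure mode concrete, here is a minimal Go sketch of the last-writer-wins behaviour described above. It is a simplified model, not kcp code: `Status`, `commitFullStatus`, and `race` are hypothetical names, conditions are reduced to a map, and the real committer's patch generation is not reproduced. It only assumes what the analysis states, namely that both controllers start from the same snapshot and that each commit carries the controller's full view of the status.

```go
package main

import "fmt"

// Status is a simplified stand-in for the LogicalCluster status:
// condition type -> condition status. It is not kcp's real type.
type Status map[string]string

// deepCopy mimics a controller copying the object before reconciling.
func deepCopy(s Status) Status {
	out := make(Status, len(s))
	for k, v := range s {
		out[k] = v
	}
	return out
}

// commitFullStatus (hypothetical) models a commit that replaces the
// whole stored status with the committing controller's view.
func commitFullStatus(stored *Status, view Status) {
	*stored = deepCopy(view)
}

// race lets both controllers reconcile from the same snapshot and then
// commit in the requested order.
func race(initializerLast bool) Status {
	stored := Status{
		"WorkspaceAPIBindingsInitialized": "False",
		"WorkspaceAPIBindingsReconciled":  "False",
	}

	// Both controllers deep-copy the same starting state.
	initView := deepCopy(stored)
	lifecycleView := deepCopy(stored)

	// Each controller updates only the condition type it owns.
	initView["WorkspaceAPIBindingsInitialized"] = "True"
	lifecycleView["WorkspaceAPIBindingsReconciled"] = "True"

	if initializerLast {
		commitFullStatus(&stored, lifecycleView)
		commitFullStatus(&stored, initView)
	} else {
		commitFullStatus(&stored, initView)
		commitFullStatus(&stored, lifecycleView)
	}
	return stored
}

func main() {
	fmt.Println("initializer last:", race(true))
	fmt.Println("lifecycle last:  ", race(false))
}
```

In this model the two orderings end in different states: whichever controller commits last restores the stale value of the other controller's condition from its own snapshot.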

One possible approach: scope each controller's committer to only patch the conditions it owns, so that concurrent commits don't overwrite each other's condition changes. Another option: if the APIBindingReconciler's condition reset (setting InitialBindingCompleted=False) is what triggers the re-reconciliation cascade, the APIBindingReconciler could avoid resetting that condition when the resource-bindings annotation is simply missing (as opposed to explicitly removed).
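
As a rough illustration of the first option, the sketch below scopes each commit to the condition types the controller owns. This is only a model under stated assumptions: `Status`, `commitOwned`, and `reconcileBoth` are hypothetical names, conditions are reduced to a map, and nothing here reflects how the KCP committer would actually need to be changed.

```go
package main

import "fmt"

// Status is a simplified stand-in for the LogicalCluster status:
// condition type -> condition status. It is not kcp's real type.
type Status map[string]string

const (
	initialized = "WorkspaceAPIBindingsInitialized"
	reconciled  = "WorkspaceAPIBindingsReconciled"
)

// snapshot mimics a controller deep-copying the object before reconciling.
func snapshot(s Status) Status {
	out := make(Status, len(s))
	for k, v := range s {
		out[k] = v
	}
	return out
}

// commitOwned (hypothetical) writes only the owned condition types from
// the controller's view into the stored status and leaves all other
// condition types untouched.
func commitOwned(stored, view Status, owned ...string) {
	for _, t := range owned {
		if v, ok := view[t]; ok {
			stored[t] = v
		} else {
			delete(stored, t)
		}
	}
}

// reconcileBoth lets both controllers reconcile from the same snapshot
// and commit in the requested order, each scoped to its own condition.
func reconcileBoth(initializerLast bool) Status {
	stored := Status{initialized: "False", reconciled: "False"}

	initView := snapshot(stored)
	lifecycleView := snapshot(stored)
	initView[initialized] = "True"
	lifecycleView[reconciled] = "True"

	commitInit := func() { commitOwned(stored, initView, initialized) }
	commitLifecycle := func() { commitOwned(stored, lifecycleView, reconciled) }

	if initializerLast {
		commitLifecycle()
		commitInit()
	} else {
		commitInit()
		commitLifecycle()
	}
	return stored
}

func main() {
	fmt.Println("initializer last:", reconcileBoth(true))
	fmt.Println("lifecycle last:  ", reconcileBoth(false))
}
```

With scoped commits, both orderings converge to the same state in this model, because a stale snapshot of the other controller's condition is never written back.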

Additional Context

This could manifest as non-deterministic LogicalCluster conditions during workspace initialization.

Versions

  • kcp: v0.30.0 (commit 7952f476d)
  • Kubernetes: simulated via kamera (based on k8s.io/client-go v0.35.0 / Kubernetes 1.35)
