Handle API lifecycle #62

Merged
merged 26 commits into from
May 22, 2025
Commits
2285d35
prepare to publish more than 1 version of a given GVK in a PublishedR…
xrstf May 15, 2025
1650e9d
codegen
xrstf May 15, 2025
ce77504
refactor crd-puller to pull all versions (i.e. work on GK basis not G…
xrstf May 16, 2025
3c7647a
create new APIResourceSchemas when CRDs or PRs change, improve mutati…
xrstf May 16, 2025
a8bb1e8
correctly merge and update existing ResourceSchemas with new ones whe…
xrstf May 16, 2025
382725f
update remaining controllers to perform the new routine to determine …
xrstf May 16, 2025
4eb8c44
fix discovery of built-in resources (oh Kubernetes, why u like this s…
xrstf May 16, 2025
4fd3f75
lint
xrstf May 16, 2025
4401446
simplify projecting versions
xrstf May 16, 2025
cf9825a
codegen
xrstf May 16, 2025
7077af0
add discovery e2e tests
xrstf May 20, 2025
e08b7cf
add more unit tests for CRD projections
xrstf May 20, 2025
87d26ce
add more e2e tests for the APIExport controller
xrstf May 20, 2025
23b8553
fix: make apiresourceschema controller watch for CRD changes
xrstf May 20, 2025
bfcb666
make selecting ARS easier by labelling them with the agent name
xrstf May 20, 2025
c19bcfe
this test obviously failed since we fixed the CRD watching issue earl…
xrstf May 20, 2025
b3c56e8
gimps
xrstf May 20, 2025
736a5a8
update docs
xrstf May 20, 2025
bce4988
enable reconcile logger to see when APIExports are updated
xrstf May 20, 2025
dc7ac47
fix docs
xrstf May 21, 2025
caa9af7
rename function
xrstf May 21, 2025
fc2b82c
update existing docs
xrstf May 21, 2025
ff1c788
extend FAQ
xrstf May 21, 2025
f66cfa8
extend docs
xrstf May 21, 2025
5a2456f
gimps
xrstf May 21, 2025
58721d6
fix missing conversion on multi-version ARS
xrstf May 21, 2025
2 changes: 2 additions & 0 deletions cmd/api-syncagent/main.go
@@ -27,6 +27,7 @@ import (
"github.com/kcp-dev/logicalcluster/v3"
"github.com/spf13/pflag"
"go.uber.org/zap"
reconcilerlog "k8c.io/reconciler/pkg/log"

"github.com/kcp-dev/api-syncagent/internal/controller/apiexport"
"github.com/kcp-dev/api-syncagent/internal/controller/apiresourceschema"
@@ -80,6 +81,7 @@ func main() {

// set the logger used by sigs.k8s.io/controller-runtime
ctrlruntimelog.SetLogger(zapr.NewLogger(log.WithOptions(zap.AddCallerSkip(1))))
reconcilerlog.SetLogger(sugar)

if err := run(ctx, sugar, opts); err != nil {
sugar.Fatalw("Sync Agent has encountered an error", zap.Error(err))
2 changes: 2 additions & 0 deletions cmd/crd-puller/.gitignore
@@ -0,0 +1,2 @@
/crd-puller
*.yaml
9 changes: 5 additions & 4 deletions cmd/crd-puller/README.md
@@ -1,17 +1,18 @@
# CRD Puller

The `crd-puller` can be used for testing and development in order to export a
CustomResourceDefinition for any Group/Version/Kind (GVK) in a Kubernetes cluster.
CustomResourceDefinition for any Group/Kind (GK) in a Kubernetes cluster.

The main difference between this and kcp's own `crd-puller` is that this one
works based on GVKs and not resources (i.e. on `apps/v1 Deployment` instead of
works based on GKs and not resources (i.e. on `apps/Deployment` instead of
`apps.deployments`). This is more useful since a PublishedResource publishes a
specific Kind and version.
specific Kind and version. Also, this puller pulls all available versions, not
just the preferred version.

## Usage

```shell
export KUBECONFIG=/path/to/kubeconfig

./crd-puller Deployment.v1.apps.k8s.io
./crd-puller Deployment.apps.k8s.io
```
9 changes: 3 additions & 6 deletions cmd/crd-puller/main.go
@@ -41,13 +41,10 @@ func main() {
pflag.Parse()

if pflag.NArg() == 0 {
log.Fatal("No argument given. Please specify a GVK in the form 'Kind.version.apigroup.com' to pull.")
log.Fatal("No argument given. Please specify a GroupKind in the form 'Kind.apigroup.com' (case-sensitive) to pull.")
}

gvk, _ := schema.ParseKindArg(pflag.Arg(0))
if gvk == nil {
log.Fatal("Invalid GVK, please use the format 'Kind.version.apigroup.com'.")
}
gk := schema.ParseGroupKind(pflag.Arg(0))

loadingRules := clientcmd.NewDefaultClientConfigLoadingRules()
loadingRules.ExplicitPath = kubeconfigPath
@@ -67,7 +64,7 @@ func main() {
log.Fatalf("Failed to create discovery client: %v.", err)
}

crd, err := discoveryClient.RetrieveCRD(ctx, *gvk)
crd, err := discoveryClient.RetrieveCRD(ctx, gk)
if err != nil {
log.Fatalf("Failed to pull CRD: %v.", err)
}
31 changes: 27 additions & 4 deletions deploy/crd/kcp.io/syncagent.kcp.io_publishedresources.yaml
@@ -286,7 +286,7 @@ spec:
type: string
type: array
group:
description: The API group, for example "myservice.example.com".
description: The API group, for example "myservice.example.com". Leave empty to not modify the API group.
type: string
kind:
description: |-
@@ -316,8 +316,20 @@ spec:
type: string
type: array
version:
description: The API version, for example "v1beta1".
description: |-
The API version, for example "v1beta1". Leave empty to not modify the version.

This field must not be set when multiple versions have been selected.

Deprecated: Use .versions instead.
type: string
versions:
additionalProperties:
type: string
description: |-
Versions allows to map API versions onto new values in kcp. Leave empty to not modify the
versions.
type: object
type: object
related:
items:
@@ -674,12 +686,23 @@ spec:
description: The resource Kind, for example "Database".
type: string
version:
description: The API version, for example "v1beta1".
description: |-
The API version, for example "v1beta1". Setting this field will only publish
the given version, otherwise all versions for the group/kind will be
published.

Deprecated: Use .versions instead.
type: string
versions:
description: |-
Versions allows to select a subset of versions to publish. Leave empty
to publish all available versions.
items:
type: string
type: array
required:
- apiGroup
- kind
- version
type: object
required:
- resource
64 changes: 64 additions & 0 deletions docs/content/api-lifecycle.md
@@ -0,0 +1,64 @@
# API Lifecycle

Only in the rarest of cases will the first version of a CRD also be its final version. Usually, CRDs
evolve over time, and Kubernetes has strong, though sometimes hard-to-use, support for managing
different versions of CRDs and their resources.

To understand how CRDs work in the context of the Sync Agent, it's important to first get familiar
with the [regular Kubernetes behaviour](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/)
regarding CRD versioning.

## Basics

The Sync Agent will, whenever a published CRD changes (this can also happen when the projection rules
inside a `PublishedResource` are updated), create a new `APIResourceSchema` (ARS) in kcp. The name and
version of this ARS are based on a hash of the projected CRD. Undoing a change would make the agent
re-use the previously created ARS (ARS are immutable).
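
As an illustration of why undoing a change leads back to the same ARS, here is a minimal Go sketch of content-based naming. It is not the agent's actual code: the `projectedCRD` type, the `schemaName` function, the use of SHA-256 over a JSON encoding and the 8-character prefix are all assumptions made for this example. Only the `<prefix>.<plural>.<group>` shape follows the naming described in these docs.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// projectedCRD is a hypothetical, heavily reduced stand-in for a projected CRD.
type projectedCRD struct {
	Group    string   `json:"group"`
	Plural   string   `json:"plural"`
	Versions []string `json:"versions"`
	Schema   string   `json:"schema"`
}

// schemaName derives a deterministic name from the CRD contents: identical
// input always yields the same name, changed input yields a different one.
// The hash function and the prefix length are assumptions of this sketch.
func schemaName(crd projectedCRD) (string, error) {
	encoded, err := json.Marshal(crd)
	if err != nil {
		return "", err
	}

	sum := sha256.Sum256(encoded)
	prefix := hex.EncodeToString(sum[:])[:8]

	return fmt.Sprintf("v%s.%s.%s", prefix, crd.Plural, crd.Group), nil
}

func main() {
	original := projectedCRD{
		Group:    "cert-manager.io",
		Plural:   "certificates",
		Versions: []string{"v1"},
		Schema:   `{"type":"object"}`,
	}

	changed := original
	changed.Schema = `{"type":"object","description":"now with docs"}`

	first, _ := schemaName(original)
	second, _ := schemaName(changed)
	again, _ := schemaName(original) // "undoing" the change

	fmt.Println(first)
	fmt.Println(second)
	fmt.Println(first == again)
}
```

Because the name depends only on the content, reverting a CRD to an earlier state resolves to the name of the schema created back then, which matches the re-use behaviour described above.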

After every reconciliation, the list of latest resource schemas in the configured `APIExport` is
updated. For this the agent will find all ARS that belong to it (based on an ownership label) and
then merge them into the `APIExport`. Resource schemas for unknown group/resource combinations are
left untouched, so admins are free to add additional resource schemas to an `APIExport`.
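
The merge step can be pictured with the following Go sketch. It is a simplified model under stated assumptions, not the controller's implementation: `claimSuffix` and `mergeSchemas` are hypothetical helpers, schema names are plain strings, and "belonging to a group/resource" is modelled as a name suffix of the form `.<plural>.<group>`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// claimSuffix returns the suffix shared by all schema names of one
// group/resource, for example ".certificates.cert-manager.io".
func claimSuffix(plural, group string) string {
	return "." + plural + "." + group
}

// mergeSchemas replaces every claimed schema in existing with the agent's
// latest schema for that group/resource and keeps all unclaimed schemas.
// The latest map is keyed by claim suffix (an assumption of this sketch).
func mergeSchemas(existing []string, latest map[string]string) []string {
	result := []string{}

	for _, name := range existing {
		claimed := false
		for suffix := range latest {
			if strings.HasSuffix(name, suffix) {
				claimed = true
				break
			}
		}

		// schemas for unknown group/resource combinations are left untouched
		if !claimed {
			result = append(result, name)
		}
	}

	for _, name := range latest {
		result = append(result, name)
	}

	sort.Strings(result)

	return result
}

func main() {
	existing := []string{
		"vaaaaaaaa.certificates.cert-manager.io", // outdated, managed by the agent
		"v1.widgets.example.com",                 // added manually by an admin
	}

	latest := map[string]string{
		claimSuffix("certificates", "cert-manager.io"): "vbbbbbbbb.certificates.cert-manager.io",
	}

	for _, name := range mergeSchemas(existing, latest) {
		fmt.Println(name)
	}
}
```

In this model the manually added `widgets` schema survives every reconciliation, while the outdated `certificates` schema is swapped for the latest one.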

This means that every change to a CRD on the service cluster is applied practically immediately in
each workspace that consumes the `APIExport`. Administrators should therefore act carefully when
working with CRDs on their service cluster. Sometimes it can make sense to turn off the agent before
testing new CRDs, even though this will temporarily suspend the synchronization.

## Single-Version CRDs

A very common scenario is to only ever have a single version inside each CRD and to keep this version
perpetually backwards-compatible. As long as all consumers are aware that certain fields might not
be set yet in older objects, this scheme generally works out fine.

The agent will handle this scenario just fine by itself. Whenever a CRD is updated, it will reflect
those changes back into a new `APIResourceSchema` and update the `APIExport`, making the changes
immediately available to all consumers. Since the agent doesn't much care about the contents of
objects, it is not affected by structural changes in CRDs, as long as it is able to apply them on
the underlying Kubernetes cluster.

## Multi-Version CRDs

Having multiple versions in a single CRD is immediately much more work, since in Kubernetes all
versions of a CRD must be _losslessly_ convertible to every other version. Without CEL expressions
or a dedicated conversion webhook this is practically impossible to achieve.

At the moment kcp does not support CEL-based conversions, and there is no support for configuring a
conversion webhook inside the Sync Agent either. This is because such a webhook would need to run
very close to the kcp shards and it's simply out of scope for such a component to be described and
deployed by the Sync Agent, let alone a trust nightmare for the kcp operators who would have to run
foreign webhooks on their cluster.

Since neither conversion mechanism is usable in the current state of kcp and the Sync Agent,
having multiple versions in a CRD can be difficult to manage.

Generally the Sync Agent itself does not care much about the schemas of each CRD version or the
convertibility between them. The synchronization works by using unstructured clients for the storage
version of the CRD on both sides (in kcp and on the service cluster). Which version is the storage
version is up to the CRD author.

When publishing multiple versions of a CRD

* only those versions marked as `served` can be picked and
* if no `storage` version is picked, the latest (highest) version will be chosen automatically as
the storage version in kcp.
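
The two rules above can be sketched in Go as follows. This is an illustration only: `crdVersion` and `pickStorageVersion` are hypothetical names, the list of available versions is assumed to be ordered from oldest to newest already (the real ordering logic is not shown), and the selection is assumed to be non-empty.

```go
package main

import (
	"errors"
	"fmt"
)

// crdVersion is a hypothetical, reduced view of one version entry in a CRD.
type crdVersion struct {
	Name    string
	Served  bool
	Storage bool
}

// pickStorageVersion returns the version that becomes the storage version in
// kcp. The available versions must be ordered from oldest to newest; this
// sketch does not sort them itself.
func pickStorageVersion(available []crdVersion, selected []string) (string, error) {
	wanted := map[string]bool{}
	for _, name := range selected {
		wanted[name] = true
	}
	if len(wanted) == 0 {
		return "", errors.New("at least one version must be selected")
	}

	latest, storage, found := "", "", 0
	for _, v := range available {
		if !wanted[v.Name] {
			continue
		}

		// only versions marked as served can be picked
		if !v.Served {
			return "", fmt.Errorf("version %s is not served", v.Name)
		}

		found++
		latest = v.Name
		if v.Storage {
			storage = v.Name
		}
	}

	if found != len(wanted) {
		return "", errors.New("a selected version does not exist in the CRD")
	}

	// a selected storage version stays the storage version
	if storage != "" {
		return storage, nil
	}

	// otherwise fall back to the latest selected version
	return latest, nil
}

func main() {
	available := []crdVersion{
		{Name: "v1alpha1", Served: true},
		{Name: "v1beta1", Served: true},
		{Name: "v1", Served: true, Storage: true},
	}

	fmt.Println(pickStorageVersion(available, []string{"v1alpha1", "v1beta1"}))
	fmt.Println(pickStorageVersion(available, []string{"v1beta1", "v1"}))
}
```

With this input, the first call falls back to `v1beta1` because the storage version `v1` was not selected, while the second call keeps `v1`.
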
24 changes: 12 additions & 12 deletions docs/content/faq.md
@@ -17,18 +17,18 @@ Only if you have distinct API groups (and therefore also distinct `PublishedReso
You cannot currently publish the same API group onto multiple kcp setups. See issue #13 for more
information.

## What happens when CRDs are updated?

At the moment, nothing. `APIResourceSchemas` in kcp are immutable and the Sync Agent currently does
not attempt to update existing schemas in an `APIExport`. If you add a _new_ CRD that you want to
publish, that's fine, it will be added to the `APIExport`. But changes to existing CRDs require
manual work.

To trigger an update:

* remove the `APIResourceSchema` from the `latestResourceSchemas`,
* delete the `APIResourceSchema` object in kcp,
* restart the api-syncagent
## Can I have additional resources in APIExports, unmanaged by the Sync Agent?

Yes, you can. The agent will only ever change those resourceSchemas that match group/resource of
the configured `PublishedResources`. So if you configure the agent to publish
`cert-manager.io/Certificate`, this would "claim" all resource schemas ending in
`.certificates.cert-manager.io`. When updating the `APIExport`, the agent will only touch schemas
with this suffix and leave all others alone.

This is also used when a `PublishedResource` is deleted: Since the `APIResourceSchema` remains in kcp,
but is no longer configured in the agent, the agent will simply ignore the schema in the `APIExport`.
This allows for async cleanup processes to happen before an admin ultimately removes the old
schema from the `APIExport`.

## Does the Sync Agent handle permission claims?

41 changes: 28 additions & 13 deletions docs/content/publish-resources.md
@@ -18,8 +18,10 @@ For each of the CRDs on the service cluster that should be published, the servic
`PublishedResource` object, which will contain both which CRD to publish, as well as numerous other
important settings that influence the behaviour around handling the CRD.

When publishing a resource (CRD), exactly one version is published. All others are ignored from the
standpoint of the resource synchronization logic.
When publishing a resource (CRD), service owners can choose to restrict it to a subset of available
versions and even change API group, versions and names in transit (for example publishing a v1 from
the service cluster as v1beta1 within kcp). This process of changing the identity of a CRD is called
"projection" in the agent.

All published resources together form the APIExport. When a service is enabled in a workspace
(i.e. it is bound to it), users can manage objects for the projected resources described by the
@@ -46,11 +48,18 @@ spec:
resource:
kind: Certificate
apiGroup: cert-manager.io
version: v1
versions: [v1]
```

However, you will most likely apply more configuration and use features described below.

You always have to select at least one version, and all selected versions must be marked as `served`
on the service cluster. If the storage version is selected to be published, it stays the storage
version in kcp. If no storage version is selected, the latest selected version becomes the storage
version.

For more information refer to the [API lifecycle](api-lifecycle.md).

### Filtering

The Sync Agent can be instructed to only work on a subset of resources in kcp. This can be restricted
@@ -70,16 +79,18 @@ spec:
foo: bar
```

The configuration above would mean the agent only synchronizes objects from `my-app` namespaces (in
each of the kcp workspaces) that also have a `foo=bar` label on them.

### Schema

**Warning:** The actual CRD schema is always copied verbatim. All projections <!--, mutations -->
etc. have to take into account that the resource contents must be expressible without changes to the
schema, so you cannot define entirely new fields in an object that are not defined by the original
CRD.
**Warning:** The actual CRD schema is always copied verbatim. All projections, mutations etc. have
to take into account that the resource contents must be expressible without changes to the schema,
so you cannot define entirely new fields in an object that are not defined by the original CRD.

### Projection

For stronger separation of concerns and to enable whitelabelling of services, the type meta for
For stronger separation of concerns and to enable whitelabelling of services, the type meta for CRDs
can be projected, i.e. changed between the local service cluster and kcp. You could for example
rename `Certificate` from cert-manager to `Sertifikat` inside kcp.

@@ -103,18 +114,22 @@ metadata:
spec:
resource: ...
projection:
version: v1beta1
# all of these options are optional
kind: Sertifikat
plural: Sertifikater
shortNames: [serts]
versions:
# old version => new version;
# this must not map multiple versions to the same new version.
v1: v1beta1
# categories: [management]
# scope: Namespaced # change only when you know what you're doing
```
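
The constraint noted in the comment above (no two versions may map to the same new version) can be checked mechanically. The following Go sketch shows one way to do so; `projectVersions` is a hypothetical helper, not a function from the agent, and it assumes that versions without an entry in the mapping keep their original name.

```go
package main

import "fmt"

// projectVersions applies a projection mapping (old version => new version)
// to the selected versions. Versions without a mapping keep their name, which
// is an assumption of this sketch. An error is returned when two versions
// would end up with the same projected name.
func projectVersions(selected []string, mapping map[string]string) ([]string, error) {
	seen := map[string]string{} // projected version => original version
	result := make([]string, 0, len(selected))

	for _, old := range selected {
		projected := old
		if mapped, ok := mapping[old]; ok && mapped != "" {
			projected = mapped
		}

		if previous, exists := seen[projected]; exists {
			return nil, fmt.Errorf("versions %s and %s both project to %s", previous, old, projected)
		}

		seen[projected] = old
		result = append(result, projected)
	}

	return result, nil
}

func main() {
	// valid: v1 is renamed, v2 is left alone
	fmt.Println(projectVersions([]string{"v1", "v2"}, map[string]string{"v1": "v1beta1"}))

	// invalid: v1 would collide with the unchanged v2
	fmt.Println(projectVersions([]string{"v1", "v2"}, map[string]string{"v1": "v2"}))
}
```

Note that a collision can also occur between a renamed version and one that is not mentioned in the mapping at all, which is why the check runs over the projected names rather than over the mapping values alone.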

Consumers (end users) in kcp would then ultimately see projected names only. Note that GVK
projection applies only to the synced object itself and has no effect on the contents of these
objects. To change the contents, use external solutions like Crossplane to transform objects.
<!-- To change the contents, use *Mutations*. -->
To change the contents, use *Mutations*.

### (Re-)Naming

@@ -274,7 +289,7 @@ spec:
resource:
kind: Certificate
apiGroup: cert-manager.io
version: v1
versions: [v1]

naming:
# this is where our CA and Issuer live in this example
@@ -360,7 +375,7 @@ spec:
resource:
kind: Certificate
apiGroup: cert-manager.io
version: v1
versions: [v1]

naming:
namespace: kube-system
@@ -445,7 +460,7 @@ spec:
resource:
kind: Certificate
apiGroup: cert-manager.io
version: v1
versions: [v1]

naming:
# this is where our CA and Issuer live in this example
2 changes: 1 addition & 1 deletion docs/mkdocs.yml
@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.


site_name: api-syncagent
repo_url: https://github.com/kcp-dev/api-syncagent
repo_name: kcp-dev/api-syncagent
@@ -24,6 +23,7 @@ nav:
- Getting Started: getting-started.md
- Publishing Resources: publish-resources.md
- Consuming Services: consuming-services.md
- API Lifecycle: api-lifecycle.md
- FAQ: faq.md
- Release Process: releasing.md

10 changes: 4 additions & 6 deletions internal/controller/apiexport/controller.go
@@ -19,6 +19,7 @@ package apiexport
import (
"context"
"fmt"
"slices"

"github.com/kcp-dev/logicalcluster/v3"
"go.uber.org/zap"
@@ -121,12 +122,9 @@ func (r *Reconciler) reconcile(ctx context.Context) error {
}

// filter out those PRs that have not yet been processed into an ARS
filteredPubResources := []syncagentv1alpha1.PublishedResource{}
for i, pubResource := range pubResources.Items {
if pubResource.Status.ResourceSchemaName != "" {
filteredPubResources = append(filteredPubResources, pubResources.Items[i])
}
}
filteredPubResources := slices.DeleteFunc(pubResources.Items, func(pr syncagentv1alpha1.PublishedResource) bool {
return pr.Status.ResourceSchemaName == ""
})

// for each PR, we note down the created ARS and also the GVKs of related resources
arsList := sets.New[string]()