Adds resourceVersion support to k8sObject receiver #46543
Merged: dmitryax merged 26 commits into open-telemetry:main from dhruv-shah-sumo:add-resourceversion-docs on Apr 21, 2026
Commits (all by dhruv-shah-sumo):
8693981 Adds resourceVersion support to k8sObject receiver
0dbdd04 Add resourceVersion persistence feature to k8s receivers
0705e53 Refactor resourceVersion persistence with prioritized checkpoint loading
5a07726 Add persist_resource_version documentation to k8s receivers
dab4bdd Add debug logging and fix test context usage
74a3d2b Fix CI failures: lint, porto, changelog, and module versions
9e92f03 receiver/k8sobjects: add persist_resource_version support for watch mode
93a8187 receiver/k8sobjects: fix lint errors in observer.go
85edef4 receiver/k8sobjects: close race window between sendInitialState and w…
3fd1164 receiver/k8sobjects: improve checkpointer reliability and initial sta…
84ee801 internal/k8sinventory/watch: remove observer tests invalidated by con…
4538dce chore: fix go.mod deps, crosslink replace directives, and gci formatting
e20ade7 chore: go mod tidy after rebase
09ef108 Move persist_resource_version to top-level config in k8sobjectsreceiver
f27a8f0 Fix config.schema.yaml: move persist_resource_version to top-level
ec2223c receiver/k8sobjects: remove persist_resource_version config, auto-per…
1c166c0 receiver/k8sobjects: fix deployment example to use PersistentVolumeCl…
a15da3a docs: add storage caveats and fix watchobserver.New signature
0ad8d4b receiver/k8sobjects: inline getStorageClient to remove pkg/stanza dep…
3cbe3f0 chore: run crosslink to prune stale replace directives
4c19e4b rebase
327e60f fix: go mod tidy for missing go.sum entries
809b419 fix: crosslink and go.sum
e5fa5a6 Merge branch 'main' into add-resourceversion-docs
573b40c fix: bump drainprocessor collector deps
6828119 Merge branch 'main' into add-resourceversion-docs
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,11 @@ | ||
| change_type: enhancement | ||
|
|
||
| component: receiver/k8sobjects | ||
|
|
||
| note: When `storage` is configured, watch-mode objects automatically resume from the last seen resourceVersion across restarts, preventing event duplication. | ||
|
|
||
| issues: [46543] | ||
|
|
||
| subtext: | ||
|
|
||
| change_logs: [user] |
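
The changelog note describes resuming a watch from the last seen resourceVersion after a restart. The following is a minimal sketch of that flow, not the PR's implementation: `memStore` and `startVersion` are hypothetical names, and the in-memory map stands in for whatever storage extension is configured. The key string follows the example format given in the checkpointer's comments.

```go
package main

import "fmt"

// memStore is a hypothetical stand-in for a persistent storage client.
type memStore map[string][]byte

// startVersion returns the resourceVersion a watch should resume from.
// An empty string means no checkpoint exists and the watch starts fresh.
func startVersion(s memStore, key string) string {
	return string(s[key])
}

func main() {
	store := memStore{}
	key := "latestResourceVersion/pods.default"

	// First start: nothing has been persisted yet.
	fmt.Printf("first start resumes from %q\n", startVersion(store, key))

	// The watch observes events and persists the last seen version.
	store[key] = []byte("1042")

	// After a restart, the watch resumes from the persisted version.
	fmt.Printf("restart resumes from %q\n", startVersion(store, key))
}
```

Without a configured store there is nothing to read back, so every restart would behave like the first start in this sketch.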
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,181 @@ | ||
| // Copyright The OpenTelemetry Authors | ||
| // SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| package watch // import "github.com/open-telemetry/opentelemetry-collector-contrib/internal/k8sinventory/watch" | ||
|
|
||
| import ( | ||
| "context" | ||
| "errors" | ||
| "fmt" | ||
| "strconv" | ||
| "sync" | ||
|
|
||
| "go.opentelemetry.io/collector/extension/xextension/storage" | ||
| "go.uber.org/zap" | ||
| ) | ||
|
|
||
| type checkpointer struct { | ||
| client storage.Client | ||
| logger *zap.Logger | ||
|
|
||
| // pending holds the latest resourceVersion per storage key, buffered in | ||
| // memory across all watch streams. Flush() drains it to persistent storage. | ||
| mu sync.Mutex | ||
| pending map[string]string | ||
| } | ||
|
|
||
| const checkpointKeyFormat = "latestResourceVersion/%s" | ||
|
|
||
| func newCheckpointer(client storage.Client, logger *zap.Logger) *checkpointer { | ||
| return &checkpointer{ | ||
| client: client, | ||
| logger: logger, | ||
| pending: make(map[string]string), | ||
| } | ||
| } | ||
|
|
||
| func (c *checkpointer) GetCheckpoint(ctx context.Context, namespace, objectType string) (string, error) { | ||
| if c.client == nil { | ||
| return "", errors.New("storage client is nil") | ||
| } | ||
|
|
||
| checkPointKey := c.getCheckpointKey(namespace, objectType) | ||
| c.logger.Debug("Retrieving checkpoint, key: "+checkPointKey, | ||
| zap.String("namespace", namespace), | ||
| zap.String("objectType", objectType)) | ||
| data, err := c.client.Get(ctx, checkPointKey) | ||
| if err != nil { | ||
| c.logger.Warn("Error retrieving checkpoint", | ||
| zap.String("namespace", namespace), | ||
| zap.String("objectType", objectType), | ||
| zap.Error(err)) | ||
| return "", fmt.Errorf("failed to retrieve checkpoint: %w", err) | ||
| } | ||
|
|
||
| // If the key is not found, both data and err are nil. | ||
| if len(data) == 0 { | ||
| c.logger.Debug("No checkpoint found, starting from the beginning", | ||
| zap.String("key", checkPointKey)) | ||
| return "", nil | ||
| } | ||
| return string(data), nil | ||
| } | ||
|
|
||
| // SetCheckpoint buffers the latest resourceVersion for the given namespace and | ||
| // objectType in memory. Call Flush to persist all buffered values to storage. | ||
| // Only updates the in-memory value if the new resourceVersion is numerically | ||
| // greater than the current one, acting as a high-watermark. This guards against | ||
| // out-of-order resourceVersions from List() responses (which are ordered by | ||
| // object key, not by resourceVersion). | ||
| func (c *checkpointer) SetCheckpoint( | ||
| _ context.Context, | ||
| namespace, objectType, resourceVersion string, | ||
| ) error { | ||
| key := c.getCheckpointKey(namespace, objectType) | ||
| if key == "" { | ||
| return fmt.Errorf("checkpoint key is empty: %s, %s", namespace, objectType) | ||
| } | ||
|
|
||
| newRV, err := strconv.ParseInt(resourceVersion, 10, 64) | ||
| if err != nil { | ||
| return fmt.Errorf("invalid resourceVersion %q: %w", resourceVersion, err) | ||
| } | ||
|
|
||
| c.mu.Lock() | ||
| if existing, ok := c.pending[key]; ok { | ||
| if existingRV, err := strconv.ParseInt(existing, 10, 64); err == nil && newRV <= existingRV { | ||
| c.mu.Unlock() | ||
| return nil | ||
| } | ||
| } | ||
| c.pending[key] = resourceVersion | ||
| c.mu.Unlock() | ||
|
|
||
| c.logger.Debug("buffered resourceVersion checkpoint", | ||
| zap.String("key", key), | ||
| zap.String("resourceVersion", resourceVersion)) | ||
|
|
||
| return nil | ||
| } | ||
|
|
||
| // Flush writes all buffered checkpoints to persistent storage. Only the latest | ||
| // value per key is written, discarding any intermediate updates since the last | ||
| // flush. It is safe to call concurrently from multiple goroutines. | ||
| func (c *checkpointer) Flush(ctx context.Context) error { | ||
| if c.client == nil { | ||
| return errors.New("storage client is nil") | ||
| } | ||
|
|
||
| c.mu.Lock() | ||
| if len(c.pending) == 0 { | ||
| c.mu.Unlock() | ||
| return nil | ||
| } | ||
| snapshot := c.pending | ||
| // Swap in a fresh map so the snapshot can be written without holding the lock | ||
| // and a later Flush with nothing new buffered returns early. | ||
| c.pending = make(map[string]string) | ||
| c.mu.Unlock() | ||
|
|
||
| failed := false | ||
| for key, rv := range snapshot { | ||
| if err := c.client.Set(ctx, key, []byte(rv)); err != nil { | ||
| c.logger.Error("failed to flush checkpoint", | ||
| zap.String("key", key), | ||
| zap.String("resourceVersion", rv), | ||
| zap.Error(err)) | ||
| failed = true | ||
| continue | ||
| } | ||
| c.logger.Debug("flushed resourceVersion checkpoint", | ||
| zap.String("key", key), | ||
| zap.String("resourceVersion", rv)) | ||
| } | ||
| if failed { | ||
| return errors.New("one or more checkpoints failed to be stored") | ||
| } | ||
| return nil | ||
| } | ||
|
|
||
| // DeleteCheckpoint deletes the persisted checkpoint for a given namespace and object type. | ||
| // This is used when the persisted resourceVersion is no longer valid (e.g., after a 410 Gone error). | ||
| func (c *checkpointer) DeleteCheckpoint( | ||
| ctx context.Context, | ||
| namespace, objectType string, | ||
| ) error { | ||
| if c.client == nil { | ||
| return errors.New("storage client is nil") | ||
| } | ||
|
|
||
| key := c.getCheckpointKey(namespace, objectType) | ||
| if key == "" { | ||
| return fmt.Errorf("checkpoint key is empty: %s, %s", namespace, objectType) | ||
| } | ||
|
|
||
| if err := c.client.Delete(ctx, key); err != nil { | ||
| return fmt.Errorf("failed to delete resourceVersion with key %s: %w", key, err) | ||
| } | ||
|
|
||
| c.logger.Debug("Checkpoint deleted with key: "+key, | ||
| zap.String("namespace", namespace), | ||
| zap.String("objectType", objectType)) | ||
|
|
||
| return nil | ||
| } | ||
|
|
||
| // getCheckpointKey returns the storage key under which a resourceVersion is | ||
| // persisted: one key per object type for cluster-wide watch streams (no | ||
| // namespace), or one key per object type and namespace otherwise. | ||
| func (*checkpointer) getCheckpointKey(namespace, objectType string) string { | ||
| // When the watch stream is cluster-wide or the resource is cluster-scoped | ||
| // (no namespace), the resource version is persisted per object type only. | ||
| if namespace == "" { | ||
| // example: latestResourceVersion/nodes, latestResourceVersion/namespaces | ||
| return fmt.Sprintf(checkpointKeyFormat, objectType) | ||
| } | ||
|
|
||
| // When the watch stream is created per namespace, the resource version is | ||
| // persisted per object type per namespace. | ||
| // example: latestResourceVersion/pods.default, latestResourceVersion/configmaps.kube-system | ||
| return fmt.Sprintf("%s.%s", fmt.Sprintf(checkpointKeyFormat, objectType), namespace) | ||
| } | ||
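
The high-watermark buffering and the map swap in `Flush` can be tried in isolation. The sketch below is a simplified illustration rather than the merged code: the `buffer` type and its `set` and `drain` methods are hypothetical names, and the storage write is left out so that only the in-memory behavior is shown.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// buffer is a simplified, hypothetical version of the checkpointer's pending map.
type buffer struct {
	mu      sync.Mutex
	pending map[string]string
}

// set records rv for key only if it is numerically greater than the
// value already buffered, so the buffer acts as a high-watermark.
func (b *buffer) set(key, rv string) error {
	newRV, err := strconv.ParseInt(rv, 10, 64)
	if err != nil {
		return fmt.Errorf("invalid resourceVersion %q: %w", rv, err)
	}
	b.mu.Lock()
	defer b.mu.Unlock()
	if cur, ok := b.pending[key]; ok {
		if curRV, err := strconv.ParseInt(cur, 10, 64); err == nil && newRV <= curRV {
			return nil // older or equal version: keep the current watermark
		}
	}
	b.pending[key] = rv
	return nil
}

// drain swaps in a fresh map and returns the buffered values, so the
// caller can write them out without holding the lock.
func (b *buffer) drain() map[string]string {
	b.mu.Lock()
	defer b.mu.Unlock()
	snapshot := b.pending
	b.pending = make(map[string]string)
	return snapshot
}

func main() {
	b := &buffer{pending: make(map[string]string)}
	// Versions arrive out of order, as they can from a List() response.
	for _, rv := range []string{"100", "250", "180"} {
		if err := b.set("latestResourceVersion/pods.default", rv); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println(b.drain())      // map[latestResourceVersion/pods.default:250]
	fmt.Println(len(b.drain())) // 0
}
```

The comparison is numeric rather than lexical because, compared as strings, "9" sorts after "10". Draining twice shows why the swap matters: the second call finds an empty map and has nothing to write.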