The gardener-extension-otelcol repo provides a Gardener Extension for an
OpenTelemetry Collector, which runs in the shoot control-plane namespace and
forwards observability signals for control-plane components to a remote
OpenTelemetry Collector receiver.
Warning
This extension is in an early stage of development. Do not use it in a production environment.
- Go 1.25.x or later
- GNU Make
- Docker for local development
- Gardener Local Setup for local development
The project repo uses the following code structure.
| Package | Description |
|---|---|
| cmd | Command-line application of the extension |
| pkg/admission | Implementations for the Gardener extension admission Validator and Mutator interfaces |
| pkg/apis | Extension API types, e.g. configuration spec, etc. |
| pkg/actuator | Implementations for the Gardener Extension Actuator interfaces |
| pkg/controller | Utility wrappers for creating Kubernetes reconcilers for Gardener Actuators |
| pkg/imagevector | Image vector for container images |
| pkg/heartbeat | Utility wrappers for creating heartbeat reconcilers for Gardener extensions |
| pkg/metrics | Metrics emitted by the extension |
| pkg/mgr | Utility wrappers for creating controller-runtime managers using functional options API |
| pkg/version | Version metadata information about the extension |
| internal/tools | Go-based tools used for testing and linting the project |
| charts | Helm charts for deploying the extension |
| examples | Example Kubernetes resources, which can be used in a dev environment |
| test | Various files (e.g. schemas, CRDs, etc.), used during testing |
You can enable the extension for a Gardener Shoot cluster by updating the .spec.extensions of your shoot manifest.
The following example shoot manifest snippet enables the extension and configures the OpenTelemetry Collector to emit the signals for the shoot control-plane components via the Debug Exporter.
...
spec:
  extensions:
    - type: otelcol
      providerConfig:
        apiVersion: otelcol.extensions.gardener.cloud/v1alpha1
        kind: CollectorConfig
        spec:
          exporters:
            debug:
              enabled: true
              verbosity: basic # basic, normal or detailed

This configuration, however, is only useful while developing or troubleshooting an issue with the collector, because signals are not actually forwarded to a remote OpenTelemetry Collector receiver.
The following configuration snippet enables the extension for a shoot and configures it to forward the signals of the control-plane components to a remote collector using the OTLP HTTP exporter.
...
spec:
  extensions:
    - type: otelcol
      providerConfig:
        apiVersion: otelcol.extensions.gardener.cloud/v1alpha1
        kind: CollectorConfig
        spec:
          exporters:
            # OTLP HTTP exporter settings
            otlp_http:
              enabled: true
              endpoint: "https://opentelemetry-receiver.example.org"

The following example snippet expands on the previous one by adding TLS configuration settings and Bearer token authentication with the remote collector.
...
spec:
  extensions:
    - type: otelcol
      providerConfig:
        apiVersion: otelcol.extensions.gardener.cloud/v1alpha1
        kind: CollectorConfig
        spec:
          exporters:
            # OTLP HTTP exporter settings
            otlp_http:
              enabled: true
              endpoint: "https://opentelemetry-receiver.example.org"
              token:
                resourceRef:
                  name: otelcol-bearer-token
                  dataKey: token
              tls:
                ca:
                  resourceRef:
                    name: otelcol-tls
                    dataKey: ca.crt
                cert:
                  resourceRef:
                    name: otelcol-tls
                    dataKey: client.crt
                key:
                  resourceRef:
                    name: otelcol-tls
                    dataKey: client.key
  resources:
    - name: otelcol-bearer-token
      resourceRef:
        apiVersion: v1
        kind: Secret
        name: my-otelcol-bearer-token
    - name: otelcol-tls
      resourceRef:
        apiVersion: v1
        kind: Secret
        name: my-otelcol-tls

To provide the otelcol-tls and otelcol-bearer-token secrets from the example above to the extension, first create the respective secrets in the shoot project namespace. They can then be referenced via Gardener Referenced Resources.
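As a rough illustration of creating the referenced secrets, the sketch below wraps two `kubectl create secret generic` calls in a function. The secret names and data keys come from the example manifest above. Everything else is an assumption made for illustration: the function name `create_otelcol_secrets`, the default namespace `garden-local`, the `./tls` directory, and the `OTELCOL_BEARER_TOKEN` variable. It also assumes that `KUBECONFIG` already points at the cluster that holds the shoot project namespace.

```sh
# Sketch only: creates the Secrets referenced by the example shoot manifest.
# Hypothetical names: create_otelcol_secrets, ./tls, OTELCOL_BEARER_TOKEN.
create_otelcol_secrets() {
  namespace="${1:-garden-local}"  # shoot project namespace (assumed default)
  tls_dir="${2:-./tls}"           # directory holding ca.crt, client.crt, client.key

  # TLS material, stored under the data keys used by the manifest.
  kubectl --namespace "$namespace" create secret generic my-otelcol-tls \
    --from-file=ca.crt="${tls_dir}/ca.crt" \
    --from-file=client.crt="${tls_dir}/client.crt" \
    --from-file=client.key="${tls_dir}/client.key"

  # Bearer token, stored under the "token" data key.
  kubectl --namespace "$namespace" create secret generic my-otelcol-bearer-token \
    --from-literal=token="${OTELCOL_BEARER_TOKEN:?set OTELCOL_BEARER_TOKEN first}"
}
```

Calling `create_otelcol_secrets my-project ./certs` would create both secrets in the `my-project` namespace. Appending `--dry-run=client -o yaml` to the `kubectl` calls is one way to review the generated manifests before applying them.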
This example snippet enables the extension to forward the signals of the control-plane components to a remote collector using the OTLP gRPC exporter.
extensions:
  - type: otelcol
    providerConfig:
      apiVersion: otelcol.extensions.gardener.cloud/v1alpha1
      kind: CollectorConfig
      spec:
        # Exporters settings
        exporters:
          # OTLP gRPC exporter settings
          otlp_grpc:
            enabled: true
            endpoint: "https://opentelemetry-receiver.default.svc.cluster.local:4317"
            token:
              resourceRef:
                name: otelcol-bearer-token
                dataKey: token
            tls:
              ca:
                resourceRef:
                  name: otelcol-tls
                  dataKey: ca.crt
              cert:
                resourceRef:
                  name: otelcol-tls
                  dataKey: client.crt
              key:
                resourceRef:
                  name: otelcol-tls
                  dataKey: client.key

For additional configuration settings that can be provided to the extension, see the OTel Extension API spec documentation.
In order to build a binary of the extension, you can use the following command.
make build

The resulting binary can be found in bin/extension.
In order to build a Docker image of the extension, you can use the following command.
make docker-build

Run the following command to get usage info about the available Makefile targets.
make help

For local development of the gardener-extension-otelcol it is recommended that you set up a development Gardener environment.
Please refer to the next sections for more information about deploying and testing the extension in a Gardener development environment.
The extension can also be deployed via the Gardener Operator.
In order to start a local development environment with the Gardener Operator, please refer to the following documentation.
In summary, these are the steps you need to follow in order to start a local development environment with the Gardener Operator. Make sure that you read the documents above for additional details.
make kind-up gardener-up

Before you continue with the next steps, make sure that you configure your KUBECONFIG to point to the kubeconfig file of the cluster that runs the Gardener Operator.
There will be two kubeconfig files created for you, after the dev environment has been created.
| Path | Description |
|---|---|
| /path/to/gardener/dev-setup/kubeconfigs/runtime/kubeconfig | The runtime cluster (gardener-operator runs in it) |
| /path/to/gardener/dev-setup/kubeconfigs/virtual-garden/kubeconfig | The virtual garden cluster |
Throughout this document we will refer to the kubeconfigs for runtime and
virtual clusters as $KUBECONFIG_RUNTIME and $KUBECONFIG_VIRTUAL
respectively.
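One way to define these two variables in your shell is sketched below. `GARDENER_REPO` is a hypothetical helper variable, not something the Gardener tooling sets. It is assumed to hold the path to your local Gardener checkout and falls back to the placeholder path from the table above.

```sh
# GARDENER_REPO is a hypothetical variable: the path to your local Gardener
# checkout. Replace the placeholder default with the real location.
GARDENER_REPO="${GARDENER_REPO:-/path/to/gardener}"

# Kubeconfig of the runtime cluster (gardener-operator runs in it).
export KUBECONFIG_RUNTIME="${GARDENER_REPO}/dev-setup/kubeconfigs/runtime/kubeconfig"

# Kubeconfig of the virtual garden cluster.
export KUBECONFIG_VIRTUAL="${GARDENER_REPO}/dev-setup/kubeconfigs/virtual-garden/kubeconfig"
```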
Before deploying the extension we need to target the runtime cluster, since
this is where the extension resources for gardener-operator reside.
export KUBECONFIG=$KUBECONFIG_RUNTIME

In order to deploy the extension, execute the following command.
make deploy-operator

The deploy-operator target takes care of the following.
- Builds a Docker image of the extension
- Loads the image into the kind cluster nodes
- Packages the Helm charts and pushes them to the local registry
- Deploys the Extension (from group operator.gardener.cloud/v1alpha1) to the runtime cluster
Verify that we have successfully created the
Extension (from group operator.gardener.cloud/v1alpha1) resource.
$ kubectl --kubeconfig $KUBECONFIG_RUNTIME get extop otelcol
NAME      INSTALLED   REQUIRED RUNTIME   REQUIRED VIRTUAL   AGE
otelcol   True        False              False              13s

Verify that the respective ControllerRegistration and ControllerDeployment resources have been created by the gardener-operator in the virtual garden cluster.
$ kubectl --kubeconfig $KUBECONFIG_VIRTUAL get controllerregistrations,controllerdeployments otelcol
NAME                                                 RESOURCES           AGE
controllerregistration.core.gardener.cloud/otelcol   Extension/otelcol   42s

NAME                                               AGE
controllerdeployment.core.gardener.cloud/otelcol   42s

Finally, we can create an example shoot with our extension enabled. The examples/shoot.yaml file provides a ready-to-use shoot manifest with the extension enabled and configured.
The provided example shoot references secrets from the project namespace, which
are used to configure the TLS settings between the exporter and a local dev
receiver, running in the default namespace.
The following commands will create a dev OpenTelemetry receiver in the default namespace, the TLS and bearer token secrets, and a dev shoot configured with the extension.
kubectl --kubeconfig $KUBECONFIG_RUNTIME apply -f examples/opentelemetry-receiver.yaml
kubectl --kubeconfig $KUBECONFIG_VIRTUAL apply -f examples/secret-tls.yaml
kubectl --kubeconfig $KUBECONFIG_VIRTUAL apply -f examples/secret-bearer-token.yaml
kubectl --kubeconfig $KUBECONFIG_VIRTUAL apply -f examples/shoot.yaml

If you already have a running shoot for which you want to enable the extension, follow the instructions from the previous sections to enable and configure the extension manually.
Once we create the shoot cluster, gardenlet will start deploying our
gardener-extension-otelcol, since it is required by our shoot.
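Deploying the extension and reconciling the shoot takes some time, so the resources inspected in the verification steps may not exist right away. The sketch below is a small polling helper for that situation. `retry` is a hypothetical helper name, not part of this repository, and it only repeats a command until it exits successfully. It does not inspect health conditions.

```sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or until it has failed ATTEMPTS times.
# Returns 0 on the first success and 1 when all attempts have failed.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example usage (requires a running dev environment): wait until the
# ManagedResource of the extension exists, checking every 10 seconds.
#   retry 30 10 kubectl --kubeconfig "$KUBECONFIG_RUNTIME" \
#     --namespace shoot--local--local get managedresource external-otelcol
```

The helper relies on `kubectl get` returning a non-zero exit code when a named resource is not found, so it should be used with a resource name rather than with a bare list query.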
Verify that the extension has been successfully installed by checking the
corresponding ControllerInstallation resource for our extension.
$ kubectl --kubeconfig $KUBECONFIG_VIRTUAL get controllerinstallations
NAME            REGISTRATION   SEED    VALID   INSTALLED   HEALTHY   PROGRESSING   AGE
otelcol-8rvmn   otelcol        local   True    True        True      False         64s

After your shoot cluster has been successfully created and reconciled, verify that the extension is healthy.
$ kubectl --kubeconfig $KUBECONFIG_RUNTIME --namespace shoot--local--local get extensions otelcol
NAME      INSTALLED   REQUIRED RUNTIME   REQUIRED VIRTUAL   AGE
otelcol   True        False              True               13m

Verify that the ManagedResource created by the extension is healthy as well.
$ kubectl --kubeconfig $KUBECONFIG_RUNTIME --namespace shoot--local--local get managedresource external-otelcol
NAME               CLASS   APPLIED   HEALTHY   PROGRESSING   AGE
external-otelcol   seed    True      True      False         6m20s

After successful reconciliation we should see the following OpenTelemetry collector in the shoot control-plane namespace.
$ kubectl --kubeconfig $KUBECONFIG_RUNTIME --namespace shoot--local--local get otelcol external-otelcol
NAME MODE VERSION READY AGE IMAGE MANAGEMENT
external-otelcol statefulset 0.141.0 1/1 6m45s europe-docker.pkg.dev/gardener-project/releases/3rd/opentelemetry-collector-releases/opentelemetry-collector-contrib:0.141.0 managed

We should also see that the Collector and Target Allocator are running and healthy.
$ kubectl --kubeconfig $KUBECONFIG_RUNTIME --namespace shoot--local--local get sts external-otelcol-collector
NAME                         READY   AGE
external-otelcol-collector   1/1     3m30s

$ kubectl --kubeconfig $KUBECONFIG_RUNTIME --namespace shoot--local--local get deployment external-otelcol-targetallocator
NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
external-otelcol-targetallocator   1/1     1            1           3m38s

In order to trigger reconciliation of the extension you can annotate the extension resource.
kubectl --kubeconfig $KUBECONFIG_RUNTIME --namespace shoot--local--local annotate extensions otelcol gardener.cloud/operation=reconcile

In order to delete the dev shoot, the secrets and the dev OpenTelemetry receiver you can run the following commands.
kubectl --kubeconfig $KUBECONFIG_VIRTUAL --namespace garden-local annotate shoot local confirmation.gardener.cloud/deletion=true --overwrite
kubectl --kubeconfig $KUBECONFIG_VIRTUAL delete -f examples/shoot.yaml --ignore-not-found=true --wait=false
kubectl --kubeconfig $KUBECONFIG_RUNTIME delete -f examples/opentelemetry-receiver.yaml --ignore-not-found=true --wait=false
kubectl --kubeconfig $KUBECONFIG_VIRTUAL delete -f examples/secret-tls.yaml --ignore-not-found=true --wait=false
kubectl --kubeconfig $KUBECONFIG_VIRTUAL delete -f examples/secret-bearer-token.yaml --ignore-not-found=true --wait=false

This section provides some hints related to troubleshooting the OpenTelemetry Collector, which is managed by the Gardener extension.
Make sure that you check the following official OpenTelemetry documentation:
Check the logs of the deployment/external-otelcol-targetallocator and
statefulset/external-otelcol-collector, e.g.
kubectl --namespace shoot--local--local logs -f deployments/external-otelcol-targetallocator
kubectl --namespace shoot--local--local logs -f statefulset/external-otelcol-collector

The Target Allocator deployed by the extension is configured to discover ServiceMonitor resources with the following labels:
prometheus=shoot
Confirm that ServiceMonitors with these labels exist in the shoot
control-plane namespace, e.g.
kubectl --namespace shoot--local--local get servicemonitors -l prometheus=shoot

The Target Allocator and Collector configmaps are labeled with observability.gardener.cloud/app=external-otelcol. Check and confirm that the configuration settings in these configmaps are correct.
$ kubectl --namespace shoot--local--local get cm -l observability.gardener.cloud/app=external-otelcol
NAME                                      DATA   AGE
external-otelcol-collector-c30d03f4       1      13m
external-otelcol-targetallocator-config   1      13m

The communication between the Target Allocator and the Collector happens over mTLS, so we need the client certificate of the Collector in order to confirm that the Target Allocator has discovered targets for scraping.
First, get the secret which contains the client certificate used by the Collector, e.g.
kubectl --namespace shoot--local--local get secret -l name=otelcol-collector-client

Save the client TLS secret locally:
mkdir -p client-cert
for k in tls.key tls.crt; do
  kubectl --namespace shoot--local--local get secret -l name=otelcol-collector-client -o yaml | \
    yq ".items[0].data.\"${k}\"" | base64 -d > "./client-cert/${k}"
done

Next, we need to port-forward the Target Allocator service locally.
kubectl --namespace shoot--local--local port-forward service/external-otelcol-targetallocator-https 8443:443

Now we can query the Target Allocator and review the jobs and the scrape targets it can dispatch to collectors.
curl -k --cert client-cert/tls.crt --key client-cert/tls.key -X GET 'https://localhost:8443/jobs' | jq '.'

The following command will query the Target Allocator for the scrape configs.
curl -k --cert client-cert/tls.crt --key client-cert/tls.key -X GET 'https://localhost:8443/scrape_configs' | jq '.'

In addition to the API paths served by the Target Allocator you can also inspect the configuration via /debug endpoints using your browser.
In order to do that we can use mitmproxy. Keep in mind that mitmproxy expects to find the key and certificate in a single PEM-encoded file.
cat client-cert/tls.crt client-cert/tls.key > client-cert/mitmproxy.pem

Now we can start mitmproxy.
mitmproxy -k \
  --listen-port=8080 \
  --set client_certs=client-cert/mitmproxy.pem \
  --mode upstream:https://localhost:8443

Open up your browser at http://localhost:8080/ in order to view the Target Allocator jobs, scrape configs, and assigned collectors.
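The combined PEM file prepared for mitmproxy contains the Collector's private key, so it may be worth creating it with restrictive permissions. The sketch below is one possible way to do that. `write_client_pem` is a hypothetical helper name, and the file paths in the usage comment are the ones used earlier in this section.

```sh
# write_client_pem CERT KEY OUT
# Concatenates CERT and KEY into OUT. A newly created OUT is readable and
# writable by the current user only. Permissions of an already existing OUT
# are left unchanged.
write_client_pem() {
  cert="$1"
  key="$2"
  out="$3"

  # Refuse to continue if either input file is missing or empty.
  [ -s "$cert" ] && [ -s "$key" ] || {
    echo "certificate or key is missing or empty" >&2
    return 1
  }

  # The umask only applies inside the subshell.
  ( umask 077 && cat "$cert" "$key" > "$out" )
}

# Example usage:
#   write_client_pem client-cert/tls.crt client-cert/tls.key client-cert/mitmproxy.pem
```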
In order to run the tests use the command below:
make test

In order to test the Helm chart and the manifests provided by it you can run the following command.
make check-helm

In order to test the example resources from the examples/ directory you can run the following command.
make check-examples

Make sure to check the following documents for more information about Gardener Extensions and the available extensions API.
- Gardener: Extensibility Overview
- Gardener: Registering Extension Controllers
- Gardener: Extension Resources
- Gardener: Extensions API Contract
- Gardener: How to Set Up a Gardener Landscape
- Gardener: Extension API Packages (Go)
gardener-extension-otelcol is hosted on GitHub.
Please contribute by reporting issues, suggesting features or by sending patches using pull requests.
This project is Open Source and licensed under Apache License 2.0.
