feat: Add configurable failurePolicy and timeoutSeconds for webhooks #440
Summary
This PR makes the MutatingWebhookConfiguration more flexible by exposing configuration options for failurePolicy and timeoutSeconds. Previously, these values were hardcoded, preventing users from customizing webhook behavior for different environments or requirements.
Changes
New Configuration Options in values.yaml
1. admissionWebhooks.failurePolicy
Default: Fail
Valid values: Fail, Ignore
When to use Fail (default):
When to use Ignore:
2. admissionWebhooks.podFailurePolicy
Default: Ignore
Valid values: Fail, Ignore
Why separate from failurePolicy:
- Ignore prevents blocking critical workloads if the operator is down
- Fail if strict instrumentation enforcement is required
3. admissionWebhooks.timeoutSeconds
Default: null (uses Kubernetes default, typically 10s)
When to adjust:
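Taken together, the three options and the defaults listed above would appear in values.yaml roughly as follows. This is a minimal sketch based only on the option names and defaults stated in this description. The comments are illustrative and are not the comments shipped in the chart.

```yaml
admissionWebhooks:
  # Policy for the Instrumentation webhooks. Valid values: Fail, Ignore.
  failurePolicy: Fail
  # Policy for the Pod webhook, kept separate so that an unavailable
  # operator does not block pod creation by default.
  podFailurePolicy: Ignore
  # Leave as null to fall back to the Kubernetes default (typically 10s).
  timeoutSeconds: null
```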
Webhook Structure
The MutatingWebhookConfiguration contains 4 webhooks:
- minstrumentation-v1beta2.kb.io - path: /mutate-newrelic-com-v1beta2-instrumentation
- minstrumentation-v1beta1.kb.io - path: /mutate-newrelic-com-v1beta1-instrumentation
- minstrumentation-v1alpha2.kb.io - path: /mutate-newrelic-com-v1alpha2-instrumentation
- mpod.kb.io - path: /mutate-v1-pod
Template Changes
Updated charts/k8s-agents-operator/templates/instrumentation-crd.yaml:
- {{ .Values.admissionWebhooks.failurePolicy }} for Instrumentation webhooks
- {{ .Values.admissionWebhooks.podFailurePolicy }} for Pod webhook
- timeoutSeconds when configured
Validation
Added input validation to prevent misconfigurations:
- timeoutSeconds: Must be between 1 and 30 seconds (Kubernetes requirement)
- failurePolicy: Must be either 'Fail' or 'Ignore'
- podFailurePolicy: Must be either 'Fail' or 'Ignore'
Validation errors are raised during Helm template rendering with clear error messages.
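One way such render-time checks and the conditional timeoutSeconds field could be written in a Helm template is sketched below. This is a hypothetical fragment, not the template from this PR: the error message wording and the single-webhook excerpt are assumptions, and only the value names and webhook name come from the description. It relies on the standard Helm and Sprig functions fail, has, list, kindIs and int.

```yaml
{{- /* Hypothetical sketch: reject invalid policies while rendering. */ -}}
{{- $policies := list "Fail" "Ignore" }}
{{- if not (has .Values.admissionWebhooks.failurePolicy $policies) }}
{{- fail "admissionWebhooks.failurePolicy must be either 'Fail' or 'Ignore'" }}
{{- end }}
{{- if not (has .Values.admissionWebhooks.podFailurePolicy $policies) }}
{{- fail "admissionWebhooks.podFailurePolicy must be either 'Fail' or 'Ignore'" }}
{{- end }}
{{- /* A null timeout is allowed and means "use the Kubernetes default". */ -}}
{{- $timeout := .Values.admissionWebhooks.timeoutSeconds }}
{{- if not (kindIs "invalid" $timeout) }}
{{- if or (lt (int $timeout) 1) (gt (int $timeout) 30) }}
{{- fail "admissionWebhooks.timeoutSeconds must be between 1 and 30" }}
{{- end }}
{{- end }}
webhooks:
  - name: mpod.kb.io
    failurePolicy: {{ .Values.admissionWebhooks.podFailurePolicy }}
    {{- if not (kindIs "invalid" $timeout) }}
    timeoutSeconds: {{ $timeout }}
    {{- end }}
```

With timeoutSeconds left as null, the field is omitted from the rendered webhook entirely, so the API server applies its own default.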
Testing
Added comprehensive Helm unit tests (charts/k8s-agents-operator/tests/webhook_configuration_test.yaml):
All tests pass successfully.
Use Cases
Use Case 1: Strict Instrumentation Enforcement
Ensures all instrumentation is validated and applied, blocking deployments if operator is unavailable.
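A values override for this case might look like the following. It is a sketch that assumes strict enforcement means setting both policies to Fail. The exact values used in the original example are not shown in this description.

```yaml
admissionWebhooks:
  # Reject Instrumentation changes when the operator cannot be reached.
  failurePolicy: Fail
  # Assumption: also block pod creation rather than admit uninstrumented pods.
  podFailurePolicy: Fail
```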
Use Case 2: High Availability / Resilient Deployments
Allows deployments to proceed even if operator is temporarily down, prioritizing availability.
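A values override for this case might look like the following. It is a sketch that assumes availability is prioritized by setting both policies to Ignore. The exact values used in the original example are not shown in this description.

```yaml
admissionWebhooks:
  # Admit Instrumentation changes even if the webhook call fails.
  failurePolicy: Ignore
  # Admit pods without mutation if the operator is temporarily down.
  podFailurePolicy: Ignore
```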
Use Case 3: High Latency Environments
Increases timeout to handle network latency while maintaining validation for Instrumentation resources.
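A values override for this case might look like the following. The timeout of 30 is an illustrative choice (the maximum Kubernetes allows), not a value taken from this PR, and the policies shown are the documented defaults.

```yaml
admissionWebhooks:
  # Keep validation strict for Instrumentation resources.
  failurePolicy: Fail
  podFailurePolicy: Ignore
  # Illustrative value: must be between 1 and 30 seconds.
  timeoutSeconds: 30
```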
Backward Compatibility
All changes are fully backward compatible:
Testing Checklist
Documentation
values.yaml with detailed comments explaining each option