[FLINK-39541] Improve operator metrics documentation and bundle addit… #1102
Open
Dennis-Mircea wants to merge 1 commit into apache:main
…ional metric reporters
What is the purpose of the change
This pull request rewrites the Metrics and Logging documentation page of the Flink Kubernetes Operator to make the operator's metric surface discoverable and unambiguous, extends the operator image with two additional metric reporter plugins (Dropwizard, OpenTelemetry), and clarifies in code that every Flink `metrics.*` key is honoured by the operator under the `kubernetes.operator.metrics.*` prefix.

Brief change log
- `docs/content/docs/operations/metrics-logging.md` (+ Chinese translation):
  - `Scope` section with a new `How Metric Identifiers Are Built` subsection explaining the difference between scope components and logical scope, and how non-labeling (SLF4J/JMX/Graphite) vs. labeling (Prometheus/Datadog/InfluxDB) reporters assemble metric identifiers. Added concrete Prometheus and SLF4J/JMX examples for System / Namespace / Resource scopes.
  - `Operator Custom Resource Metrics` table following the Flink metrics reference styling, grouped by `Scope / Resource type / Metrics / Description / Type`, covering `FlinkDeployment`, `FlinkSessionJob`, `FlinkBlueGreenDeployment` and `FlinkStateSnapshot` across System / Namespace / Resource scopes (including autoscaler counters, version / resource-usage gauges, blue-green `Failures` counter, snapshot state gauges).
  - `FlinkDeployment Version and Resource Usage`, `FlinkDeployment / FlinkSessionJob Lifecycle metrics`, `FlinkBlueGreenDeployment Lifecycle metrics`, `FlinkDeployment / FlinkSessionJob JobStatus Tracking`, `FlinkBlueGreenDeployment JobStatus Tracking`, `FlinkStateSnapshot State Tracking`, and `Scaling metrics`.
  - `Scaling metrics` subsection with a high-level paragraph and an alphabetically sorted `<ScalingMetric>` table (previously not documented).
  - `Kubernetes Client Metrics` and `Kubernetes client metrics by Http Response Code` tables: now use the same `<table class="table table-bordered">` styling as the other operator metric tables, with the metric names sorted alphabetically.
  - `JOSDK Metrics`: linked to the upstream JOSDK metrics documentation and clarified that those metrics are subject to the same scope/reporter rules.
  - `Metric Reporters`: updated the bundled-reporters list (adds Dropwizard and OpenTelemetry), added an `Operator-scoped Metric Configuration` subsection explaining the `kubernetes.operator.metrics.*` → `metrics.*` prefix stripping at startup, and a `Configuring Reporters on a FlinkDeployment` example clarifying that `spec.flinkConfiguration` uses the plain `metrics.reporter.*` prefix.
- `flink-kubernetes-operator/pom.xml`: added `flink-metrics-dropwizard` and `flink-metrics-otel` to the `maven-dependency-plugin` `artifactItems` so both plugins end up under `/opt/flink/plugins/` in the operator image.
- `KubernetesOperatorMetricOptions.java`: expanded the class-level javadoc to state that only operator-specific toggles and `k8soperator.*` scope formats are declared here, and that Flink `metrics.*` keys are honoured when prefixed with `kubernetes.operator.` (stripped and forwarded by `OperatorMetricUtils#createMetricConfig`). Reporter options are intentionally not redeclared as typed `ConfigOption`s.

Verifying this change
This change is a documentation / packaging / javadoc change without any new runtime logic.
- `/opt/flink/plugins/flink-metrics-dropwizard` and `/opt/flink/plugins/flink-metrics-otel` are present.
- With `kubernetes.operator.metrics.reporter.prom.factory.class=...` set, confirmed that metrics are exposed on the configured port (no behaviour change expected).

Does this pull request potentially affect one of the following parts:
- CustomResourceDescriptors: no

Documentation
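For readers unfamiliar with the prefix handling that the updated documentation and javadoc describe, the following is a minimal sketch of the idea, not the operator's actual `OperatorMetricUtils#createMetricConfig` code. The class `OperatorMetricPrefixSketch` and the helper `toFlinkMetricKeys` are hypothetical names introduced for illustration, and the reporter class and port are example values only. It assumes the behaviour is simply "drop the leading `kubernetes.operator.` from keys that continue with `metrics.`".

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; not taken from the operator code base. */
public class OperatorMetricPrefixSketch {

    // Prefix under which Flink metric options are written in the operator configuration.
    static final String OPERATOR_PREFIX = "kubernetes.operator.";
    static final String METRICS_PREFIX = "metrics.";

    /**
     * Hypothetical helper: returns the plain metrics.* view of all
     * kubernetes.operator.metrics.* keys and ignores every other key.
     */
    public static Map<String, String> toFlinkMetricKeys(Map<String, String> operatorConf) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : operatorConf.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(OPERATOR_PREFIX + METRICS_PREFIX)) {
                // Strip only the operator prefix so the key starts with "metrics." again.
                result.put(key.substring(OPERATOR_PREFIX.length()), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> operatorConf = new LinkedHashMap<>();
        // Example values; the PR description elides the factory class it used.
        operatorConf.put(
                "kubernetes.operator.metrics.reporter.prom.factory.class",
                "org.apache.flink.metrics.prometheus.PrometheusReporterFactory");
        operatorConf.put("kubernetes.operator.metrics.reporter.prom.port", "9999");
        // A non-metric operator option, which the sketch leaves out of the result.
        operatorConf.put("kubernetes.operator.reconcile.interval", "60 s");

        toFlinkMetricKeys(operatorConf).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Under these assumptions the two reporter keys come out as `metrics.reporter.prom.factory.class` and `metrics.reporter.prom.port`, which is the same plain `metrics.reporter.*` form that `spec.flinkConfiguration` on a FlinkDeployment uses directly.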