Conversation

@paulfantom

paulfantom commented Jun 22, 2023

This PR is a different (IMHO proper) fix to the issue raised in #107. By using a PodMonitor instead of a ServiceMonitor we can simplify and fix a few things:

  1. Using a SVC of type LB won't accidentally expose the /metrics endpoint outside of Kubernetes.
  2. .Values.metrics.port is no longer needed as the PodMonitor attaches to the Pod instead of the SVC.
  3. The SVC object template is a bit less complicated.

The downside is that this is a breaking change. An alternative approach, which is not breaking but also does not fix all those issues, is in
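For illustration, a minimal sketch of what such a PodMonitor could look like (the name, labels, and port name are assumptions, not taken from this chart):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: docker-registry            # hypothetical name
  labels:
    app: docker-registry
spec:
  selector:
    matchLabels:
      app: docker-registry         # assumed pod labels
  podMetricsEndpoints:
    - port: metrics                # scrapes the named container port directly, no Service involved
      path: /metrics
```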

paulfantom changed the title from "podMonitor instead of serviceMonitor to prevent monitoring data leakage" to "Use podMonitor instead of serviceMonitor to prevent monitoring data leakage" Jun 22, 2023
@lucasfcnunes

lucasfcnunes commented Jun 20, 2024

Why not split the service?

  1. *-docker-registry -> 5000 (http) (now the Service type can be set to LB without metrics leakage; see the sketch below)
  2. *-docker-registry-metrics -> 5001 (http-metrics)
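A rough sketch of that split, assuming illustrative names and labels together with the ports above:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-release-docker-registry            # hypothetical release name
spec:
  type: LoadBalancer                          # safe to expose: no metrics port here
  selector:
    app: docker-registry                      # assumed pod labels
  ports:
    - name: http
      port: 5000
      targetPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  name: my-release-docker-registry-metrics
spec:
  type: ClusterIP                             # metrics stay cluster-internal
  selector:
    app: docker-registry
  ports:
    - name: http-metrics
      port: 5001
      targetPort: 5001
```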

@paulfantom
Author

Why not split the service?

At that point, why add another Service?

@lucasfcnunes

lucasfcnunes commented Jan 8, 2025

At that point, why add another Service?

  1. No breaking changes
  2. "(...) to prevent monitoring data leakage"
  3. TargetDown Alert (https://runbooks.prometheus-operator.dev/runbooks/general/targetdown/)

@paulfantom
Author

TargetDown doesn't require a Service. It requires a Prometheus scrape job, which is created by using any of ServiceMonitor, PodMonitor, ScrapeConfig, Probe, etc. (P.S. I wrote that runbook.)

As for preventing data leakage, that's exactly the point of using PodMonitor. You have fewer moving parts and fewer points which can lead to data leakage.

@lucasfcnunes

TargetDown doesn't require a Service. It requires a Prometheus scrape job, which is created by using any of ServiceMonitor, PodMonitor, ScrapeConfig, Probe, etc. (P.S. I wrote that runbook.)

The language is all around Service and ServiceMonitor, so I assumed wrongly.

TargetDown
Meaning
The alert means that one or more prometheus scrape targets are down. It fires when at least 10% of scrape targets in a Service are unreachable.

Full context
Prometheus works by sending an HTTP GET request to all of its “targets” every few seconds. So TargetDown really means that Prometheus just can’t access your service, which may or may not mean it’s actually down. If your service appears to be running fine, a common cause could be a misconfigured ServiceMonitor (maybe the port or path is incorrect), a misconfigured NetworkPolicy, or Service with incorrect labelSelectors that isn’t selecting any Pods.

@joshsizer
Collaborator

Hi @paulfantom, please recommit with signed commits. Also, I would prefer backward-compatible behavior - i.e., if people are using a ServiceMonitor, they should continue to use ServiceMonitors on upgrade.

I see value in being able to decide, at the end-user level, whether to use a PodMonitor or a ServiceMonitor.

@paulfantom
Author

paulfantom commented Apr 12, 2025

I've added my signature.

As for backwards compatibility, IMHO making both objects (PodMonitor and ServiceMonitor) optional puts unnecessary load on maintainers while not providing any tangible value. From the Prometheus perspective, either object (PodMonitor or ServiceMonitor) will result in the same data being gathered, while from a maintenance perspective the ServiceMonitor requires additional logic to be present, e.g. a Service which exposes the metrics port, or defining the metrics port value. Having both doesn't give you any benefits while forcing maintainers to keep more code and presenting an unclear API to end users. However, since it is clearer to see in code, I've created a branch where this should be clearly visible - https://github.com/twuni/docker-registry.helm/compare/main...paulfantom:docker-registry.helm:pod-sm-together?expand=1 - especially check validate.yaml, which would now need to be added to prevent misconfiguration.
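As a rough illustration of that kind of guard (the file name comes from the comment above; the values paths are assumptions), a Helm template can fail the render when both monitors are enabled:

```yaml
# templates/validate.yaml (sketch) - reject ambiguous monitoring configuration
{{- if and .Values.metrics.serviceMonitor.enabled .Values.metrics.podMonitor.enabled }}
{{- fail "Enable either metrics.serviceMonitor or metrics.podMonitor, not both" }}
{{- end }}
```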

From my perspective, the API that should stay backwards compatible is not the set of installed objects (when they provide the same results), but values.yaml. For example, we could remove the .Values.serviceMonitor options but have the templates still react to them. This provides a way to easily deprecate a setting.
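A minimal sketch of that values-level compatibility, assuming a hypothetical .Values.metrics.podMonitor flag and illustrative names: the template keeps rendering a PodMonitor when either the new flag or the legacy serviceMonitor flag is set.

```yaml
# templates/podmonitor.yaml (sketch, not the actual chart template)
# Honour the deprecated serviceMonitor flag so existing values files keep scraping after an upgrade.
{{- if or .Values.metrics.podMonitor.enabled .Values.metrics.serviceMonitor.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: {{ .Release.Name }}-docker-registry   # naming is an assumption
spec:
  selector:
    matchLabels:
      app: docker-registry                     # assumed pod labels
  podMetricsEndpoints:
    - port: metrics                            # assumed container port name
      path: /metrics
{{- end }}
```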


That said, I am not a maintainer and it is up to you to choose a path forward :)

@joshsizer
Collaborator

However, since it is clearer to see in code, I've created a branch where this should be clearly visible - https://github.com/twuni/docker-registry.helm/compare/main...paulfantom:docker-registry.helm:pod-sm-together?expand=1 - especially check validate.yaml, which would now need to be added to prevent misconfiguration.

@paulfantom I actually really like the approach you outlined above ^^

A major problem I see with removing the ServiceMonitor is that if I am setting .Values.metrics.serviceMonitor.enabled: true and then upgrade with the changes in this PR, I would no longer be scraping metrics. I think you are suggesting we can remedy that:

For example, we could remove the .Values.serviceMonitor options but have the templates still react to them. This provides a way to easily deprecate a setting.

But IMO that would be confusing. Why does setting .Values.metrics.serviceMonitor.enabled: true create a podMonitor?

joshsizer self-requested a review April 14, 2025 20:36
