Skip to content

fix(helm): Avoid scraping metrics from gateway pod #13350

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from

Conversation

Skaronator
Copy link

@Skaronator Skaronator commented Jun 28, 2024

What this PR does / why we need it:

Rename the nginx port name to avoid beeing scraped by the ServiceMonitor:

- port: http-metrics
path: /metrics

Which issue(s) this PR fixes:
Fixes #13201

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@Skaronator Skaronator requested a review from a team as a code owner June 28, 2024 09:39
@CLAassistant
Copy link

CLAassistant commented Jun 28, 2024

CLA assistant check
All committers have signed the CLA.

@Skaronator Skaronator changed the title Avoid scraping metrics from gateway pod fix: Avoid scraping metrics from gateway pod Jun 28, 2024
@Skaronator
Copy link
Author

I am not sure if we need to do a major version bump. Technically, it's a breaking change, but I don't think it will affect anyone.

@Skaronator Skaronator changed the title fix: Avoid scraping metrics from gateway pod fix(helm): Avoid scraping metrics from gateway pod Jun 28, 2024
@adinhodovic
Copy link
Contributor

Ah opened a similar PR a while back #12746, was alerting. Seems to resolve the same thing.

@Skaronator Skaronator force-pushed the fix-nginx-metrics-scrape branch from 2fed5f0 to dc02e25 Compare July 5, 2024 06:09
@Skaronator
Copy link
Author

I've rebased and fixed the merge conflicts that have come up since creating this PR.

@Skaronator
Copy link
Author

I've rebased this PR again and fixed the merge conflicts that have come up.

@lblazewski
Copy link

Looking forward for this PR!

@Skaronator
Copy link
Author

Resolved merge conflicts again. Hoping for a timely review.

@Skaronator
Copy link
Author

cc @vlad-diachenko

@@ -101,7 +101,7 @@ spec:
{{ toYaml .Values.enterpriseGateway.extraVolumeMounts | nindent 12 }}
{{- end }}
ports:
- name: http-metrics
- name: http
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when the loki/gel image is run in mode gateway(-target=gateway), it exposes the metrics under /metrics url ....

@vlad-diachenko
Copy link
Member

cc @vlad-diachenko

I believe that we have the issue only when nginx is deployed as a gateway.... so, I would configure nginx to expose the metrics under /metrics path. wdyt?

@vlad-diachenko vlad-diachenko self-assigned this Aug 13, 2024
@francesco-kekko-damico
Copy link

Hi, is there an update on this? We need this fix as we have the service monitor target for nginx pod down for quite long time now

@nlamirault
Copy link
Contributor

any news on this fix ?

@Starefossen
Copy link
Contributor

@vlad-diachenko @hervenicol anything I can do to help unblock a fix for this issue?

@vlad-diachenko
Copy link
Member

@vlad-diachenko @hervenicol anything I can do to help unblock a fix for this issue?

Hey @Starefossen .
We can not merge it because of the thing I mentioned in the comment .
This port should be scraped if GEL image is used for the gateway.
So, if the issue happens only when NGING is used as a gateway, I would add label to the gateway pod prometheus.io/scrape: "false" , so it won't be scraped. (it needs to be tested because I googled this solution)
wdyt?

@Starefossen
Copy link
Contributor

I can confirm that adding the prometheus.io/service-monitor: "false" label to the gateway pods disables the erroneous scraping:

  gateway:
    service:
      labels:
        prometheus.io/service-monitor: "false"

@vlad-diachenko
Copy link
Member

I can confirm that adding the prometheus.io/service-monitor: "false" label to the gateway pods disables the erroneous scraping:

  gateway:
    service:
      labels:
        prometheus.io/service-monitor: "false"

I believe that it's the way we need to fix the issue.
Huge thanks for the confirmation

@adinhodovic
Copy link
Contributor

Can we merge this? #12746 - opened this fix a while back and works locally, just want it upstream.

@vlad-diachenko
Copy link
Member

this PR can be closed because the issue is fixed in #12746

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Default Loki ServiceMonitor is trying to scrape gateways non-existent /metrics endpoint
9 participants