Fix Loki to work with Flux grafana/prometheus examples again by kingdonb · Pull Request #47 · fluxcd/flux2-monitoring-example

kingdonb · 2025-02-12T17:46:15Z

Fix #19, Close #23

I got this working locally, running on Kind

In the grafana dashboard, there is a loki data source configured with address loki-stack:3100

I was unable to find this string in the kube-prometheus-stack chart, so I'm guessing it has been inferred somehow by Grafana's chart? The old loki-stack chart that provided promtail is obsolete.

I updated this to use loki chart instead, that is maintained, there is an OCI chart but it isn't kept up-to-date (the tag is labeled "weekly" but is several months out-of-date, and the latest release is in the legacy helm repository for grafana charts)

The loki chart is kept current. There's a promtail chart in the grafana helm charts repo as well.

I disabled authentication, because I believe that's how kube-prometheus-stack did it. In no way is this a sensible Loki configuration, but it works in kind for demonstrative purposes. I will personally be disabling loki on my clusters, not because this is too hard, but because we're using cloudtrail for logs (which I think has Grafana integration too)

Anyway, we can't demo cloudtrail in a kind cluster so this is better.

It still needs some documentation updates, the README for example shows the loki and promtail pods deployed from loki-stack chart, but mine looks a bit more like this now:

monitoring           kube-prometheus-stack-grafana-58977b8976-vw65n              3/3     Running   0          78m
monitoring           kube-prometheus-stack-kube-state-metrics-676657b78f-sfknq   1/1     Running   0          78m
monitoring           kube-prometheus-stack-operator-7467d457b7-x2jwc             1/1     Running   0          78m
monitoring           kube-prometheus-stack-prometheus-node-exporter-cq2nn        1/1     Running   0          78m
monitoring           loki-canary-w6c8r                                           1/1     Running   0          21m
monitoring           loki-stack-0                                                2/2     Running   0          21m
monitoring           loki-stack-gateway-589dcdd94f-qljpd                         1/1     Running   0          23m
monitoring           loki-stack-minio-0                                          1/1     Running   0          23m
monitoring           prometheus-kube-prometheus-stack-prometheus-0               2/2     Running   0          77m
monitoring           promtail-m4snt                                              1/1     Running   0          10m

I also disabled the two redis instances that loki installs in the default monolithic version, and verified that I can see some logs in the "Flux Logs" dashboard. I am not certain about this config but I will come back and review it ASAP so we can close out long-standing issues like #19 and hopefully avoid removing Loki from the example.

Some of these logs do not look like they are from Flux:

I have a feeling if I look more closely at the config for promtail that I discarded, I'll find that it was meant to select the flux pods somehow. Unsure. I can't spend more time on this right now, but will try to come back with another try before the dev meeting next week. (At a minimum I'll clean up the bootstrap artifacts and other noise from this PR so we can merge it.)

kingdonb · 2025-02-12T17:54:05Z

rebasing from babf064 to squash commits and fix the DCO

kingdonb · 2025-02-12T18:01:56Z

This should be basically in shape to merge. Worth a second test, and someone could possibly look at that Loki Flux Logs dashboard and tell whether it's working properly or not, I am not sure the Flux Events checkbox is performing its function.

kingdonb · 2025-02-12T18:06:17Z

There are a couple of changes I'd like to borrow from #19 and #23 before this merges.

It looks like we don't really need this canary service, and we could use the configmap for data sources to ensure that loki is picked up at the proper address. Also, promtail may have some helm tests that we want to be sure we're not invoking. (??)

I built this PR in isolation without looking at #23 but I should probably go back and check that one out before we call it good here.

kingdonb · 2025-02-12T19:04:53Z

I reviewed the changes in #23 and the only one I can't make sense of is compaction.

Setting the compaction as it was set there results in:

% k -n monitoring logs -f loki-stack-0 -c loki
failed parsing config: /etc/loki/config/config.yaml: yaml: unmarshal errors:
  line 25: field shared_store not found in type compactor.Config. Use `-config.expand-env=true` flag if you want to expand environment variables in your config file

which indicates that shared_store: filesystem is no longer a valid setting.

I'm not a Grafana or Loki expert, but I think that setting compactor: replicas: 0 probably pre-empts this compaction config and renders it inert, anyway. I'll delete it, since I'm sure we are not thinking about retention for this example.

kingdonb · 2025-02-12T19:05:11Z

I did not end up borrowing this section:

https://github.com/fluxcd/flux2-monitoring-example/pull/23/files#diff-5e041afacf25eb055565b4a1c32d5b81201ddce29c84adf13a6ae88463e0832bR62-R82

which would have allowed us to call the release loki instead of loki-stack

It may be worth adopting in a later follow-on PR 👍

kingdonb · 2025-02-12T19:08:38Z

I think that loki must depend on minio now, the error:

caller=log.go:216 msg="error running loki" err="at least one bucket name must be specified

I had it working with minio earlier, but I was trying to remove unnecessary components that might have licensing issues. I don't think this really does, because we aren't distributing a modified minio, just using it in an example, but I don't see the point of deploying a minio when we're trying to use local filesystem for storage. Maybe that just isn't supported anymore.

I'm not sure how to solve this, will come back to it later.

kingdonb · 2025-02-12T19:15:33Z

It does look like the filesystem store just is not supported anymore:

https://grafana.com/docs/loki/latest/operations/storage/filesystem/

kingdonb · 2025-02-12T19:42:18Z

I did wind up getting everything working. Two things I am not sure about:

The dependency order. Does promtail need to go after loki? If it does not, installation could be much faster.
The dashboard connection to the data source. When you go to the Flux Logs page, the Loki datasource is usually selected. But depending on the order, it might need to be selected and you see a data source not found error.

It is working on a kind cluster without issue, I don't think either of those issues are show-stoppers. I did wind up needing that config from #23, in df39529. Without it, the data source was not being created. Maybe there is some auto-detection that happens in some odd cases, or on grafana boot, but since grafana is in kube-prometheus-stack it boots first.

So, upon repeated testing I found that without this config section I would not get a data source for loki at all.

I think this is good to merge 👍

kingdonb · 2025-02-12T20:20:40Z

I would like to resolve the few remaining references to loki-stack out before this merges, sec... (retesting)

* Select latest version of kube-prometheus-stack (69.x.x) * remove loki-stack parts from chart * Grafana loki chart is at 6.26.0 * Disable auth: loki (this is not for production!) * Add back promtail manually (this was previously included in the loki-stack chart) * Continue using the name loki-stack Something in grafana is preconfigured to pull from the data source loki-stack:3100 > Provisioned data source > This data source was added by config and cannot be modified using the UI. Please contact your server admin to update this data source. I was unable to find the location where this string comes from! Perhaps the use of a misleading name is not needed, but this worked? * try out later monolithic Loki config from the docs, with these divergences: * Disable chunksCache that requests 8GB+ memory by default * resultsCache:enabled: false as well Signed-off-by: Kingdon B <kingdon@urmanac.com> fix README for loki-stack change Signed-off-by: Kingdon B <kingdon@urmanac.com> disable minio, use filesystem for logging instead Signed-off-by: Kingdon B <kingdon@urmanac.com> clean up duplicate yaml Signed-off-by: Kingdon B <kingdon@urmanac.com> disable test (loki) Signed-off-by: Kingdon B <kingdon@urmanac.com> remove compactor config Signed-off-by: Kingdon B <kingdon@urmanac.com> just use minio Signed-off-by: Kingdon B <kingdon@urmanac.com> turns out we need this after all Signed-off-by: Kingdon B <kingdon@urmanac.com> update README to match Signed-off-by: Kingdon B <kingdon@urmanac.com> update loki-stack -> loki (strings) Signed-off-by: Kingdon B <kingdon@urmanac.com> depend on kube-prometheus-stack instead don't depend on loki here Signed-off-by: Kingdon B <kingdon@urmanac.com> update README Signed-off-by: Kingdon B <kingdon@urmanac.com>

Signed-off-by: Kingdon B <kingdon@urmanac.com>

kingdonb · 2025-02-12T20:34:57Z

LGTM now 👍

kingdonb force-pushed the fix-loki branch from babf064 to 2d52a8d Compare February 12, 2025 17:58

kingdonb marked this pull request as ready for review February 12, 2025 18:00

kingdonb mentioned this pull request Feb 12, 2025

Replace loki-stack helm chart with loki + promtail #23

Closed

kingdonb marked this pull request as draft February 12, 2025 18:04

kingdonb marked this pull request as ready for review February 12, 2025 19:42

kingdonb mentioned this pull request Feb 12, 2025

loki-stack helm charts is deprecated #19

Closed

kingdonb and others added 3 commits February 12, 2025 15:32

Update actions/checkout action to v4

7ebdda9

Signed-off-by: Kingdon B <kingdon@urmanac.com>

Update helm/kind-action action to v1.12.0

ed808c6

Signed-off-by: Kingdon B <kingdon@urmanac.com>

kingdonb force-pushed the fix-loki branch from f7824c0 to ed808c6 Compare February 12, 2025 20:33

matheuscscp approved these changes Feb 12, 2025

View reviewed changes

matheuscscp merged commit 2703382 into fluxcd:main Feb 12, 2025
2 checks passed

kingdonb deleted the fix-loki branch February 12, 2025 23:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix Loki to work with Flux grafana/prometheus examples again#47

Fix Loki to work with Flux grafana/prometheus examples again#47
matheuscscp merged 3 commits intofluxcd:mainfrom
kingdonb:fix-loki

kingdonb commented Feb 12, 2025 •

edited

Loading

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

kingdonb commented Feb 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

kingdonb commented Feb 12, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

kingdonb commented Feb 12, 2025 •

edited

Loading