Fix Loki to work with Flux grafana/prometheus examples again #47

matheuscscp merged 3 commits into fluxcd:main

Conversation
Rebasing from babf064 to squash commits and fix the DCO.
This should be basically in shape to merge. It's worth a second test, and someone could possibly look at that Loki Flux Logs dashboard and tell whether it's working properly or not; I am not sure the …
There are a couple of changes I'd like to borrow from #19 and #23 before this merges. It looks like we don't really need the canary service, and we could use the ConfigMap for data sources to ensure that Loki is picked up at the proper address. Also, promtail may have some Helm tests that we want to be sure we're not invoking (??). I built this PR in isolation without looking at #23, but I should probably go back and check that one out before we call it good here.
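For what it's worth, dropping the canary is a values-level toggle in the grafana/loki chart; a minimal sketch (the `lokiCanary` and `test` keys are my reading of the chart's values layout, worth verifying against its values.yaml):

```yaml
# values for the grafana/loki Helm chart (toggle names assumed, verify against the chart)
lokiCanary:
  enabled: false   # don't deploy the loki-canary DaemonSet
test:
  enabled: false   # the chart's helm test depends on the canary, so disable it too
```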
I reviewed the changes in #23 and the only one I can't make sense of is compaction. Setting the compactor as it was set there results in: … which indicates that … I'm not a Grafana or Loki expert, but I think that setting …
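For context, a compactor block of the kind #23 might have carried could look roughly like this; the key names follow Loki's documented compactor configuration, but this exact block is an illustration, not the config from #23:

```yaml
# hypothetical compactor config, per Loki's documented options (not the block from #23)
loki:
  compactor:
    working_directory: /var/loki/compactor
    retention_enabled: true
    delete_request_store: filesystem   # recent Loki requires this when retention is enabled
```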
I did not end up borrowing this section: … which would have allowed us to call the release … It may be worth adopting in a later follow-on PR 👍
I think that loki must depend on minio now; the error: … I had it working with minio earlier, but I was trying to remove unnecessary components that might have licensing issues. I don't think minio really does, because we aren't distributing a modified minio, just using it in an example, but I don't see the point of deploying minio when we're trying to use the local filesystem for storage. Maybe that just isn't supported anymore. I'm not sure how to solve this; I will come back to it later.
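If loki really does need minio, the ordering could be expressed in Flux itself via `dependsOn`; a minimal sketch, assuming a `minio` HelmRelease in the same namespace (names and namespace are illustrative):

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: loki
  namespace: monitoring   # namespace is an assumption
spec:
  interval: 10m
  dependsOn:
    - name: minio   # Flux will not reconcile loki until the minio release is Ready
  chart:
    spec:
      chart: loki
      sourceRef:
        kind: HelmRepository
        name: grafana
  # chart values omitted
```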
It does look like the filesystem store just is not supported anymore: https://grafana.com/docs/loki/latest/operations/storage/filesystem/ |
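For reference, a filesystem-backed deployment of the kind those docs describe would look roughly like this in the grafana/loki chart's values; a sketch under the assumption that filesystem storage only works in single-binary mode (keys per the chart's documented layout, worth double-checking):

```yaml
# grafana/loki chart values: single-binary mode with local filesystem storage (sketch)
deploymentMode: SingleBinary   # filesystem storage is only viable in single-binary mode
loki:
  commonConfig:
    replication_factor: 1      # no object store, so no replication
  storage:
    type: filesystem
singleBinary:
  replicas: 1
```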
I did wind up getting everything working. Two things I am not sure about: …

It is working on a kind cluster without issue, and I don't think either of those issues is a show-stopper. I did wind up needing that config from #23, in df39529; without it, the data source was not being created. Maybe there is some auto-detection that happens in some odd cases, or on Grafana boot, but since Grafana is in kube-prometheus-stack it boots first. On repeated testing I found that without this config section I would not get a data source for Loki at all. I think this is good to merge 👍
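Provisioning the data source explicitly, rather than relying on auto-detection, can be done in kube-prometheus-stack's values; a sketch (the service name and port are my assumption based on the `loki-stack` to `loki` rename discussed in this PR):

```yaml
# kube-prometheus-stack values: provision the Loki data source for Grafana explicitly
grafana:
  additionalDataSources:
    - name: Loki
      type: loki
      access: proxy
      url: http://loki:3100   # service name/port assumed from the rename in this PR
```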
I would like to resolve the few remaining references to …
Commit messages squashed into this PR:

* Select latest version of kube-prometheus-stack (69.x.x)
* Remove loki-stack parts from chart
* Grafana loki chart is at 6.26.0
* Disable auth: loki (this is not for production!)
* Add back promtail manually (this was previously included in the loki-stack chart)
* Continue using the name loki-stack. Something in Grafana is preconfigured to pull from the data source loki-stack:3100:
  > Provisioned data source
  > This data source was added by config and cannot be modified using the UI. Please contact your server admin to update this data source.

  I was unable to find the location where this string comes from! Perhaps the use of a misleading name is not needed, but this worked?
* Try out the later monolithic Loki config from the docs, with these divergences:
  * Disable chunksCache, which requests 8GB+ memory by default
  * resultsCache: enabled: false as well
* fix README for loki-stack change
* disable minio, use filesystem for logging instead
* clean up duplicate yaml
* disable test (loki)
* remove compactor config
* just use minio
* turns out we need this after all
* update README to match
* update loki-stack -> loki (strings)
* depend on kube-prometheus-stack instead; don't depend on loki here
* update README

Signed-off-by: Kingdon B <kingdon@urmanac.com>
LGTM now 👍
Fix #19, Close #23
I got this working locally, running on Kind
In the Grafana dashboard, there is a Loki data source configured with the address `loki-stack:3100`. I was unable to find this string in the kube-prometheus-stack chart, so I'm guessing it has been inferred somehow by Grafana's chart? The old `loki-stack` chart that provided promtail is obsolete. I updated this to use the `loki` chart instead, which is maintained; there is an OCI chart, but it isn't kept up-to-date (the tag is labeled "weekly" but is several months out-of-date, and the latest release is in the legacy Helm repository for Grafana charts). The `loki` chart there is kept current, and there's a `promtail` chart in the grafana helm charts repo as well.

I disabled authentication, because I believe that's how kube-prometheus-stack did it. In no way is this a sensible Loki configuration, but it works in kind for demonstration purposes. I will personally be disabling Loki on my clusters, not because this is too hard, but because we're using CloudTrail for logs (which I think has Grafana integration too).
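Disabling authentication is a one-line values change in the loki chart; a sketch (again, not a sensible production setting):

```yaml
# grafana/loki chart values: single-tenant mode, no auth (demo only!)
loki:
  auth_enabled: false   # no X-Scope-OrgID tenant header required
```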
Anyway, we can't demo CloudTrail in a kind cluster, so this is better.
It still needs some documentation updates; the README, for example, shows the loki and promtail pods deployed from the loki-stack chart, but mine looks a bit more like this now: …
I also disabled the two cache instances (chunksCache and resultsCache, per the commit notes above) that loki installs in the default monolithic version, and verified that I can see some logs in the "Flux Logs" dashboard. I am not certain about this config, but I will come back and review it ASAP so we can close out long-standing issues like #19 and hopefully avoid removing Loki from the example.
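Per the commit notes above, turning those caches off is a values change in the grafana/loki chart; a sketch:

```yaml
# grafana/loki chart values: disable the bundled caches
chunksCache:
  enabled: false    # the default requests 8GB+ of memory
resultsCache:
  enabled: false
```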
Some of these logs do not look like they are from Flux:
I have a feeling that if I look more closely at the promtail config I discarded, I'll find that it was meant to select the Flux pods somehow. Unsure. I can't spend more time on this right now, but will try to come back with another pass before the dev meeting next week. (At a minimum, I'll clean up the bootstrap artifacts and other noise from this PR so we can merge it.)
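For the record, restricting promtail to Flux pods would be a relabeling rule along these lines; a hypothetical sketch (the `extraRelabelConfigs` snippet hook is my reading of the promtail chart, and the flux-system namespace selector is my guess at what the discarded config did):

```yaml
# promtail chart values (hypothetical): keep only logs from the flux-system namespace
config:
  snippets:
    extraRelabelConfigs:
      - source_labels: [__meta_kubernetes_namespace]
        regex: flux-system
        action: keep
```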