Extract key value pairs during demuxing #1410
-
By this, do you mean exporting search results as mentioned in the following link? https://go2docs.graylog.org/current/interacting_with_your_log_data/export_search_results.html If so, is the export format being used the "Log file/plain text" one? Perhaps life would be easier if the export was NDJSON and the demux support was updated to work with that?
Yes, the keys at the end, without a clear demarcation of where the message is, make life kinda hard.
-
I went back and requested the logs to be exported as NDJSON, which has improved the situation a lot! I created a log format and got it all nicely structured to a certain extent. All of my remaining complaints fall squarely into the category "garbage in, garbage out", and will probably need manual cleanup.
The log lines that I now get look like this (newlines for readability):
{
  "kubernetes_pod_ip": "10.0.0.1",
  "source": "k8shost1",
  "message": "2025-03-14 06:06:28,629 INFO ...",
  "kubernetes_host": "k8shost1",
  "kubernetes_pod_name": "hdfs-namenode-default-0",
  "timestamp": "2025-03-14T07:06:29.568+01:00"
}
The issues with this are:
I have tried updating the timestamp and log level via SQL with something like:
;update graylog_json set log_level = regexp_match('^\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2},\d+\s+([A-Z]+)\s+.*', log_body)
which didn't throw an error, but also didn't work. I suspect it is due to the fields
My current idea would be to just pipe the input file through
I just wanted to see if there are more elegant ways of achieving this in lnav that I have missed during my investigation?
The ideal solution that occurred to me was that I'd like to be able to specify "calculated" fields in a JSON log format, which I can fill with capture groups from regexes or similar things ... but I am aware that I am probably far off the beaten path with what I am doing here and would probably remain the sole user for this :)
Update:
Does what I was looking for and adds
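The preprocessing idea mentioned above (piping the NDJSON export through a filter before it reaches lnav) could look roughly like the following. This is only a sketch and not the tool or command referred to in the post, which did not survive in the text. The regex mirrors the one in the SQL attempt, and the output field names `app_timestamp` and `app_level` are made up for illustration.

```python
import json
import re

# Matches the application's own prefix inside "message", e.g.
# "2025-03-14 06:06:28,629 INFO ..." (same shape as the regex in the post).
PREFIX = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2},\d+)\s+(?P<level>[A-Z]+)\s+"
)


def enrich(line: str) -> str:
    """Return the NDJSON record with two extra fields when the message matches."""
    record = json.loads(line)
    match = PREFIX.match(str(record.get("message") or ""))
    if match:
        # Hypothetical field names; pick whatever the log format expects.
        record["app_timestamp"] = match.group("ts")
        record["app_level"] = match.group("level")
    return json.dumps(record)


def enrich_lines(lines):
    """Yield enriched records, skipping blank lines."""
    for raw in lines:
        if raw.strip():
            yield enrich(raw)
```

A JSON log format could then point its timestamp and level fields at the two added keys instead of at the Graylog receive time. Records whose message does not start with the expected prefix pass through unchanged.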
-
I have a question on how to treat key value pairs which occur in a file that needs to be demuxed.
I'll explain the scenario first, to give a bit of context, which may make it a bit easier to understand why I would ever consider something like this :)
We run distributed systems on Kubernetes and logs are gathered in a central Graylog, by reading the console output of pods.
So in Graylog, some metadata is captured, but this is all based on Kubernetes
Plus, the actual line that was logged (I'm not even gonna start talking about multiline handling here, that went out the window a long time ago).
Now, if something breaks that needs investigating, we can ask for logs to be exported from Graylog, and we'll receive them in roughly this format:
2024-12-11T05:32:47.143+01:00 source=hostname1 <the actual log line> kubernetes_pod_ip=<ip> kubernetes_pod_name=<podname> kubernetes_pod_id=<podip> kubernetes_host=<host>
The key=value pairs at the end are somewhat flexible: they can change order, some can be missing, and there can be extra fields. It is not really predictable and depends on what the person who does the export clicks.
The first timestamp is from Graylog and records when the log line was received. The actually interesting timestamp is in the body, though, so we can more or less ignore the first one.
I have written a demux pattern to target this, which looks like this:
So basically what it does is remove the prefix (Graylog timestamp and source=...) and use the entire rest of the line as the log body.
In this body it then looks for a key=value pair where the key is "kubernetes_pod_name" and extracts that, but keeps all key=value pairs as part of the body.
This way all key value pairs will be extracted by lnav from the log line and shown in the pretty view (see screenshot).
However, it does change the log line: these fields are now displayed at the end of the line but don't really "belong" there. I'd much rather remove them during demuxing.
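Since the demux pattern itself did not survive in the post, here is a rough Python sketch of the behaviour described above: drop the leading timestamp and source=..., keep the rest of the line as the body, and pull kubernetes_pod_name out of that body without removing it. The regexes and group names are illustrative guesses, not the original pattern and not lnav configuration.

```python
import re

# Hypothetical stand-ins for the pattern in the post: skip the Graylog
# timestamp and "source=...", keep everything else as the body.
PREFIX = re.compile(r"^\S+\s+source=\S+\s+(?P<body>.*)$")
POD_NAME = re.compile(r"\bkubernetes_pod_name=(?P<pod>\S+)")


def demux(line: str):
    """Return (pod_name, body), or None if the line does not fit the layout."""
    prefix = PREFIX.match(line)
    if prefix is None:
        return None
    body = prefix.group("body")
    pod = POD_NAME.search(body)
    if pod is None:
        return None
    # The key=value pairs stay in the body, as described in the post.
    return pod.group("pod"), body
```

Because the body is returned untouched, every key=value pair is still attached to the end of the log line afterwards, which is the side effect the post complains about.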
What I'd like to happen is to extract these key value pairs during the demux stage. I know I can do that with named capture groups in the demux pattern, but I found no way of doing this for keys where I do not know the name up front (dynamic capture group names, I guess).
So basically I'd need to include every possible key that can occur in the regex, but I'd rather just have the exact same behavior that lnav already shows when applying the format in the step after demuxing.
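To make the "unknown key names" idea concrete, one way to do it outside of lnav is to peel key=value pairs off the end of the body one at a time, so no key has to be listed up front. This is only a sketch under two assumptions that the export does not guarantee: values contain no whitespace, and the real log message does not itself end in something shaped like key=value.

```python
import re

# One trailing " key=value" pair: the key is made of word characters and the
# value contains no whitespace (an assumption, see above).
TRAILING_PAIR = re.compile(r"\s+(?P<key>\w+)=(?P<value>\S*)$")


def split_trailing_pairs(body: str):
    """Peel key=value pairs off the end of body; return (message, fields)."""
    fields = {}
    message = body.rstrip()
    while True:
        match = TRAILING_PAIR.search(message)
        if match is None:
            break
        fields[match.group("key")] = match.group("value")
        # Cut the matched pair (and the whitespace before it) off the end.
        message = message[: match.start()]
    return message, fields
```

The loop stops at the first trailing token that is not a key=value pair, so the order of the pairs and any missing or extra keys do not matter. The lack of a clear boundary between message and metadata remains the weak spot: a message that legitimately ends in `foo=bar` would lose that part to the fields.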
I am probably doing a horrible job of explaining this .. hopefully you can make a rough guess at what I am trying to say and I can elaborate further if there are questions :)