-
Notifications
You must be signed in to change notification settings - Fork 101
Open
Labels
type: bugA code related bugA code related bug
Description
A note for the community
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Problem
When trying to parse a string from a source that does not consider \
to be special, and thus does not escape it (as \\
), parse_tokens does not seem to handle tokenizing that string correctly.
Given the following message (which is from haproxy via syslog, but with irrelevant bits removed):
{
"message1": "start \"GET https://vector.dev/\\ HTTP/2.0\" remainder",
"message2": "start \"GET https://vector.dev/ HTTP/2.0\" remainder"
}
(It is of course escaped here because of the JSON encoding)
and the following VRL program:
.tokens_bad = parse_tokens!(.message1)
.tokens_good = parse_tokens!(.message2)
This should result in 3 tokens for each message, but message1
has only two, and the rest of the string is messed up too, now including the surrounding quotes:
{
"tokens_bad": [
"start",
"\"GET https://vector.dev/\\ HTTP/2.0\" remainder"
],
"tokens_good": [
"start",
"GET https://vector.dev/ HTTP/2.0",
"remainder"
]
}
Version
vector 0.47.0 (x86_64-unknown-linux-gnu 3d5af22 2025-05-20 13:53:41.638057046)
Metadata
Metadata
Assignees
Labels
type: bugA code related bugA code related bug