First, thank you for creating Jolt. It is an extremely helpful library, and I use it constantly in my NiFi dataflows.
Unfortunately, I found a bug in the substring function of modify-overwrite-beta when the text contains UTF-8 characters such as emoji.
Please see the example below:
Input
{
  "text" : "Video length 2:19👇 authentication \n\nThe_Castle_Runs_RED_yes🔴 https://t.co/MZWmLJLggL",
  "entities" : {
    "UrlTxT" : [ {
      "indices" : [ 61, 73 ]
    } ],
    "TimeTxT" : [ {
      "indices" : [ 13, 17 ]
    } ]
  }
}
modify-overwrite-beta Transform
[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "entities": {
        "*": {
          "*": {
            "text": "=substring(@(4,text), @(1,indices[0]), @(1,indices[1]))"
          }
        }
      }
    }
  }
]
Expected Output
{
  "text" : "Video length 2:19👇 authentication \n\nThe_Castle_Runs_RED_yes🔴 https://t.co/MZWmLJLggL",
  "entities" : {
    "UrlTxT" : [ {
      "indices" : [ 61, 73 ],
      "text" : "https://t.co"
    } ],
    "TimeTxT" : [ {
      "indices" : [ 13, 17 ],
      "text" : "2:19"
    } ]
  }
}
Jolt Output from https://jolt-demo.appspot.com/#inception
{
  "text" : "Video length 2:19?? authentication \n\nThe_Castle_Runs_RED_yes?? https://t.co/MZWmLJLggL",
  "entities" : {
    "UrlTxT" : [ {
      "indices" : [ 61, 73 ],
      "text" : "? https://t."
    } ],
    "TimeTxT" : [ {
      "indices" : [ 13, 17 ],
      "text" : "2:19"
    } ]
  }
}
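The shifted result looks like a mismatch between index units. Below is a minimal Java sketch of that hypothesis, assuming the indices count Unicode code points while the substring is taken over UTF-16 code units, which is how Java's String.substring behaves. I have not confirmed that this is what Jolt does internally; the class and method names are hypothetical and not part of Jolt.

```java
public class CodePointSubstring {

    // Hypothetical helper (not a Jolt API): interprets start and end as
    // code point indices and converts them to UTF-16 offsets first.
    public static String substringByCodePoints(String s, int start, int end) {
        int from = s.offsetByCodePoints(0, start);
        int to = s.offsetByCodePoints(from, end - start);
        return s.substring(from, to);
    }

    public static void main(String[] args) {
        // Same text as the Input above; the two emoji (U+1F447 and U+1F534)
        // are written as surrogate pairs, so each occupies two UTF-16 units.
        String text = "Video length 2:19\uD83D\uDC47 authentication \n\n"
                + "The_Castle_Runs_RED_yes\uD83D\uDD34 https://t.co/MZWmLJLggL";

        // Before the first emoji both approaches agree.
        System.out.println(text.substring(13, 17));               // 2:19
        System.out.println(substringByCodePoints(text, 13, 17));  // 2:19

        // After two emoji, UTF-16 offsets are shifted by two units, so the
        // plain substring starts on the second half of the red circle emoji.
        String utf16Result = text.substring(61, 73);
        System.out.println(Character.isLowSurrogate(utf16Result.charAt(0))); // true
        System.out.println(utf16Result.substring(1));              // " https://t."

        // Code point indexing returns the span the indices refer to.
        System.out.println(substringByCodePoints(text, 61, 73));  // https://t.co
    }
}
```

The unpaired surrogate at the start of the UTF-16 result would explain the leading "?" in the Jolt output above, and the two-unit shift would explain why the extracted URL is cut short.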