-
Notifications
You must be signed in to change notification settings - Fork 228
need to json.unmarshal data_base64 #1129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@duglin I reckon you should add a test https://github.com/cloudevents/sdk-go/blob/main/v2/event/event_unmarshal_test.go Maybe a test case like this: "base64 json encoded data that has not escaped characters v1.0": {
body: mustJsonMarshal(t, map[string]interface{}{
"specversion": "1.0",
"datacontenttype": "application/octet-stream",
"data_base64": "\\u002B\\u002B\\u002B\\u002B",
"id": "ABC-123",
"time": now.Format(time.RFC3339Nano),
"type": "com.example.test",
"dataschema": "http://example.com/schema",
"source": "http://example.com/source",
}),
want: &event.Event{
Context: event.EventContextV1{
Type: "com.example.test",
Source: *sourceV1,
DataSchema: schemaV1,
ID: "ABC-123",
Time: &now,
DataContentType: event.StringOfTextPlain(),
}.AsV1(),
DataEncoded: []byte("++++"),
DataBase64: true,
},
}, |
7d6dded
to
ddbe284
Compare
@ivan-penchev yup - was working on a testcase we you were testing... |
Sweet! This looks like exactly what I need for the fix. P.S. What about making a new release? Whom do I need to talk with / tag? I can see it has been a while/year :D, since the last release was created. |
@embano1 for a review |
v2/event/event_unmarshal.go
Outdated
@@ -385,9 +386,13 @@ func consumeData(e *Event, isBase64 bool, iter *jsoniter.Iterator) error { | |||
e.DataBase64 = true | |||
|
|||
// Allocate payload byte buffer | |||
base64Encoded := iter.ReadStringAsSlice() | |||
base64Encoded := iter.ReadString() | |||
err := json.Unmarshal([]byte(`"`+base64Encoded+`"`), &base64Encoded) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could there be issues here if base64Encoded
already contains surrounding ""
? For example, if it's a JSON-encoded string e.g., "{\"hello\":\"world\"}"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I don't prefer the in-place mutation of the base64Encoded
variable, better use dedicated variables in case the code is later extended and another developer relies on the original base64Encoded
without noticing the overwrite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: extra quotes... I think we're ok because any " in the string would need to be escaped, as you've done it. The quotes around your json object/string thingy are removed by the iter.ReadString() call, so just {\"hello\":\"world\"}
is passed into Unmarshal (after being re-wrapped with quotes).
I added another variable for the Unmarshal per your suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this is only for base64 encoded stuff, not normal json strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this is only for base64 encoded stuff, not normal json strings.
In layman terms, since I am new to this package, and I need to explain it to myself:
- this only gets triggered when
data_base64
property is populated, and not whendata
property is populated. - the
json.Unmarshal
is used not to unmarshal the object into json, but to unescape any escaped string, hence the base64 string.
Hope this is helpful @embano1
return err | ||
} | ||
e.DataEncoded = make([]byte, base64.StdEncoding.DecodedLen(len(base64DeJSON))) | ||
length, err := base64.StdEncoding.Decode(e.DataEncoded, []byte(base64DeJSON)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw: as I'm trying to understand this code (existing, I know), having a hard time following what it really does without comments. For example why is e.DataEncoded[0:length]
needed if we have e.DataEncoded = make([]byte, base64.StdEncoding.DecodedLen(len(base64DeJSON)))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea. I paused on those lines of code too but decided to not risk breaking anything by trying to clean it up, and instead just focus on the bug I was trying to fix.
v2/event/event_unmarshal_test.go
Outdated
body: []byte(`{ | ||
"specversion": "1.0", | ||
"datacontenttype": "text/plain", | ||
"data_base64": "\\u002B\\u002B\\u002B\\u002B", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quick question: why are we double-escaping here? none of my base64 decoders provide valid output for this, so just wanted to understand the test fixtures better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I tried to update the test with a base64-encoded JSON string IntcImhlbGxvXCI6XCJ3b3JsZFwifSIK
(which is "{\"hello\":\"world\"}")
to see if we break existing users here somehow (with the new "
change). For some reason, I can't get the test to pass when I provide
"data_base64": "IntcImhlbGxvXCI6XCJ3b3JsZFwifSIK"
and expect DataEncoded: []byte(`"{\"hello\":\"world\"}"`)
. I'm just too stupid I guess - can you please tell me what's going wrong here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for your reference, my modified test case (ignore the comment)
"base64 json encoded data v1.0 - escaped json string": {
body: []byte(`{
"specversion": "1.0",
"datacontenttype": "text/plain",
"data_base64": "IntcImhlbGxvXCI6XCJ3b3JsZFwifSIK",
"id": "ABC-123",
"time": "` + now.Format(time.RFC3339Nano) + `",
"type": "com.example.test",
"dataschema": "http://example.com/schema",
"source": "http://example.com/source"
}`),
want: &event.Event{
Context: event.EventContextV1{
Type: "com.example.test",
Source: *sourceV1,
DataSchema: schemaV1,
ID: "ABC-123",
Time: &now,
DataContentType: event.StringOfTextPlain(),
}.AsV1(),
// base64 decode of "++++"
DataEncoded: []byte(`"{\"hello\":\"world\"}"`),
DataBase64: true,
},
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @embano1
It doesn't matter if we do \\u002B\\u002B\\u002B\\u002B
or \u002B\u002B\u002B\u002B
. The test would pass either way, I prefer (and I believe @duglin just copied) the double \ as it gives you extra safety for some parsers, which would remove the \ if it is only one.
Regarding your test case, I think your data_base64 is somehow incorrect?
If I take your test case, and I run it with eyJoZWxsbyI6IndvcmxkIn0=
as a value for data_base64, which I generated via Go, this way:
package main
import (
"encoding/base64"
"encoding/json"
"fmt"
)
func main() {
// Define the JSON object
jsonObj := map[string]string{"hello": "world"}
// Marshal the JSON object to a byte slice
jsonData, err := json.Marshal(jsonObj)
if err != nil {
fmt.Println("Error marshaling JSON:", err)
return
}
// Encode the JSON byte slice to Base64
base64Data := base64.StdEncoding.EncodeToString(jsonData)
// Print the Base64 encoded string
fmt.Println("Base64 Encoded JSON:", base64Data)
}
Your test case passes without issue
Sorry I can't be more of an help, I am a bit clueless how different base64 are generated/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, strange. I quickly wanted to try it, so I used the OSX CLI to generate the test string and wasn't expecting differences in encoding :
echo IntcImhlbGxvXCI6XCJ3b3JsZFwifSIK | base64 -d
"{\"hello\":\"world\"}"
Will try using the Go encoder and verify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't explain why \\
and \
yield the same results.
Notice that I use back-tick and not double-quotes around the entire thing, so you'd think some parser would not be happy about the double-\.
But yes, I just copy-n-pasted it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would point out this though:
$ echo -e '\u002B\u002B\u002B\u002B' | base64 -d
��
(which is correct)
$ echo -e '\\u002B\\u002B\\u002B\\u002B' | base64 -d
base64: invalid input
Which is actually closer to what I would expect.
I can change it to a single \
if you (@embano1 ) want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's a golang thing to be more permissive???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it was me @duglin , I would change it to a \
(single) one if that is passing the base64 test.
Maybe go is a bit more permissive :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok made them single.... testing now.
Once CI is happy I'll merge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@duglin would you mind merging and possibly releasing? 🥇 Thank you in advance. |
Signed-off-by: Doug Davis <[email protected]>
Fixed #1128