-
Notifications
You must be signed in to change notification settings - Fork 227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
need to json.unmarshal data_base64 #1129
base: main
Are you sure you want to change the base?
Conversation
@duglin I reckon you should add a test https://github.com/cloudevents/sdk-go/blob/main/v2/event/event_unmarshal_test.go Maybe a test case like this: "base64 json encoded data that has not escaped characters v1.0": {
body: mustJsonMarshal(t, map[string]interface{}{
"specversion": "1.0",
"datacontenttype": "application/octet-stream",
"data_base64": "\\u002B\\u002B\\u002B\\u002B",
"id": "ABC-123",
"time": now.Format(time.RFC3339Nano),
"type": "com.example.test",
"dataschema": "http://example.com/schema",
"source": "http://example.com/source",
}),
want: &event.Event{
Context: event.EventContextV1{
Type: "com.example.test",
Source: *sourceV1,
DataSchema: schemaV1,
ID: "ABC-123",
Time: &now,
DataContentType: event.StringOfTextPlain(),
}.AsV1(),
DataEncoded: []byte("++++"),
DataBase64: true,
},
}, |
7d6dded
to
ddbe284
Compare
@ivan-penchev yup - was working on a testcase we you were testing... |
Sweet! This looks like exactly what I need for the fix. P.S. What about making a new release? Whom do I need to talk with / tag? I can see it has been a while/year :D, since the last release was created. |
@embano1 for a review |
v2/event/event_unmarshal.go
Outdated
@@ -385,9 +386,13 @@ func consumeData(e *Event, isBase64 bool, iter *jsoniter.Iterator) error { | |||
e.DataBase64 = true | |||
|
|||
// Allocate payload byte buffer | |||
base64Encoded := iter.ReadStringAsSlice() | |||
base64Encoded := iter.ReadString() | |||
err := json.Unmarshal([]byte(`"`+base64Encoded+`"`), &base64Encoded) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could there be issues here if base64Encoded
already contains surrounding ""
? For example, if it's a JSON-encoded string e.g., "{\"hello\":\"world\"}"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I don't prefer the in-place mutation of the base64Encoded
variable, better use dedicated variables in case the code is later extended and another developer relies on the original base64Encoded
without noticing the overwrite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: extra quotes... I think we're ok because any " in the string would need to be escaped, as you've done it. The quotes around your json object/string thingy are removed by the iter.ReadString() call, so just {\"hello\":\"world\"}
is passed into Unmarshal (after being re-wrapped with quotes).
I added another variable for the Unmarshal per your suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this is only for base64 encoded stuff, not normal json strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this is only for base64 encoded stuff, not normal json strings.
In layman terms, since I am new to this package, and I need to explain it to myself:
- this only gets triggered when
data_base64
property is populated, and not whendata
property is populated. - the
json.Unmarshal
is used not to unmarshal the object into json, but to unescape any escaped string, hence the base64 string.
Hope this is helpful @embano1
Signed-off-by: Doug Davis <[email protected]>
return err | ||
} | ||
e.DataEncoded = make([]byte, base64.StdEncoding.DecodedLen(len(base64DeJSON))) | ||
length, err := base64.StdEncoding.Decode(e.DataEncoded, []byte(base64DeJSON)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw: as I'm trying to understand this code (existing, I know), having a hard time following what it really does without comments. For example why is e.DataEncoded[0:length]
needed if we have e.DataEncoded = make([]byte, base64.StdEncoding.DecodedLen(len(base64DeJSON)))
body: []byte(`{ | ||
"specversion": "1.0", | ||
"datacontenttype": "text/plain", | ||
"data_base64": "\\u002B\\u002B\\u002B\\u002B", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quick question: why are we double-escaping here? none of my base64 decoders provide valid output for this, so just wanted to understand the test fixtures better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I tried to update the test with a base64-encoded JSON string IntcImhlbGxvXCI6XCJ3b3JsZFwifSIK
(which is "{\"hello\":\"world\"}")
to see if we break existing users here somehow (with the new "
change). For some reason, I can't get the test to pass when I provide
"data_base64": "IntcImhlbGxvXCI6XCJ3b3JsZFwifSIK"
and expect DataEncoded: []byte(`"{\"hello\":\"world\"}"`)
. I'm just too stupid I guess - can you please tell me what's going wrong here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for your reference, my modified test case (ignore the comment)
"base64 json encoded data v1.0 - escaped json string": {
body: []byte(`{
"specversion": "1.0",
"datacontenttype": "text/plain",
"data_base64": "IntcImhlbGxvXCI6XCJ3b3JsZFwifSIK",
"id": "ABC-123",
"time": "` + now.Format(time.RFC3339Nano) + `",
"type": "com.example.test",
"dataschema": "http://example.com/schema",
"source": "http://example.com/source"
}`),
want: &event.Event{
Context: event.EventContextV1{
Type: "com.example.test",
Source: *sourceV1,
DataSchema: schemaV1,
ID: "ABC-123",
Time: &now,
DataContentType: event.StringOfTextPlain(),
}.AsV1(),
// base64 decode of "++++"
DataEncoded: []byte(`"{\"hello\":\"world\"}"`),
DataBase64: true,
},
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @embano1
It doesn't matter if we do \\u002B\\u002B\\u002B\\u002B
or \u002B\u002B\u002B\u002B
. The test would pass either way, I prefer (and I believe @duglin just copied) the double \ as it gives you extra safety for some parsers, which would remove the \ if it is only one.
Regarding your test case, I think your data_base64 is somehow incorrect?
If I take your test case, and I run it with eyJoZWxsbyI6IndvcmxkIn0=
as a value for data_base64, which I generated via Go, this way:
package main
import (
"encoding/base64"
"encoding/json"
"fmt"
)
func main() {
// Define the JSON object
jsonObj := map[string]string{"hello": "world"}
// Marshal the JSON object to a byte slice
jsonData, err := json.Marshal(jsonObj)
if err != nil {
fmt.Println("Error marshaling JSON:", err)
return
}
// Encode the JSON byte slice to Base64
base64Data := base64.StdEncoding.EncodeToString(jsonData)
// Print the Base64 encoded string
fmt.Println("Base64 Encoded JSON:", base64Data)
}
Your test case passes without issue
Sorry I can't be more of an help, I am a bit clueless how different base64 are generated/
Fixed #1128