need to json.unmarshal data_base64 #1129

Open
wants to merge 1 commit into main

Conversation

duglin
Contributor

@duglin duglin commented Mar 12, 2025

Fixed #1128

@ivan-penchev

ivan-penchev commented Mar 13, 2025

@duglin I reckon you should add a test

https://github.com/cloudevents/sdk-go/blob/main/v2/event/event_unmarshal_test.go

Maybe a test case like this:

		"base64 json encoded data that has not escaped characters v1.0": {
			body: mustJsonMarshal(t, map[string]interface{}{
				"specversion":     "1.0",
				"datacontenttype": "application/octet-stream",
				"data_base64":     "\\u002B\\u002B\\u002B\\u002B",
				"id":              "ABC-123",
				"time":            now.Format(time.RFC3339Nano),
				"type":            "com.example.test",
				"dataschema":      "http://example.com/schema",
				"source":          "http://example.com/source",
			}),
			want: &event.Event{
				Context: event.EventContextV1{
					Type:            "com.example.test",
					Source:          *sourceV1,
					DataSchema:      schemaV1,
					ID:              "ABC-123",
					Time:            &now,
					DataContentType: event.StringOfTextPlain(),
				}.AsV1(),
				DataEncoded: []byte("++++"),
				DataBase64:  true,
			},
		},

@duglin duglin force-pushed the issue1128 branch 2 times, most recently from 7d6dded to ddbe284 on March 13, 2025 00:18
@duglin
Contributor Author

duglin commented Mar 13, 2025

@ivan-penchev yup - was working on a test case while you were testing...

@ivan-penchev

ivan-penchev commented Mar 13, 2025

Sweet! This looks like exactly what I need for the fix.
Do you have in mind who else needs to review it before it can be merged into main, @duglin?

P.S. What about making a new release? Whom do I need to talk to or tag? I can see it has been a while (about a year :D) since the last release was created.

@duglin
Contributor Author

duglin commented Mar 13, 2025

@embano1 for a review

@@ -385,9 +386,13 @@ func consumeData(e *Event, isBase64 bool, iter *jsoniter.Iterator) error {
e.DataBase64 = true

// Allocate payload byte buffer
-base64Encoded := iter.ReadStringAsSlice()
+base64Encoded := iter.ReadString()
+err := json.Unmarshal([]byte(`"`+base64Encoded+`"`), &base64Encoded)
Member

Could there be issues here if base64Encoded already contains surrounding quotes ("")? For example, if it's a JSON-encoded string such as "{\"hello\":\"world\"}"?

Member

nit: I'd rather avoid the in-place mutation of the base64Encoded variable. Better to use a dedicated variable, in case the code is later extended and another developer relies on the original base64Encoded without noticing the overwrite.

Contributor Author

re: extra quotes... I think we're ok because any " in the string would need to be escaped, as you've done it. The quotes around your json object/string thingy are removed by the iter.ReadString() call, so just {\"hello\":\"world\"} is passed into Unmarshal (after being re-wrapped with quotes).

I added another variable for the Unmarshal per your suggestion.
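
The rewrap-and-unmarshal reasoning above can be checked in isolation with only the standard library. This is a minimal sketch, not the SDK's code: `rewrapAndUnmarshal` is a hypothetical helper, and it assumes the string handed to `json.Unmarshal` still carries its JSON escapes, as the comment describes. Whether `iter.ReadString()` actually returns the text in that form is not verified here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rewrapAndUnmarshal mirrors the pattern in the diff: wrap raw in double
// quotes and let encoding/json resolve any escape sequences.
// Hypothetical helper for illustration; it is not part of sdk-go.
func rewrapAndUnmarshal(raw string) (string, error) {
	var out string
	err := json.Unmarshal([]byte(`"`+raw+`"`), &out)
	return out, err
}

func main() {
	// Escaped quotes are valid inside a JSON string and come out unescaped.
	s, err := rewrapAndUnmarshal(`{\"hello\":\"world\"}`)
	fmt.Printf("%q %v\n", s, err)

	// A bare quote terminates the JSON string early, so Unmarshal fails.
	_, err = rewrapAndUnmarshal(`{"hello":"world"}`)
	fmt.Println("bare quote rejected:", err != nil)
}
```

The second call shows the case the reviewer was worried about: the pattern only works as long as every quote in the input is still escaped.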

Contributor Author

Also, this is only for base64 encoded stuff, not normal json strings.

@ivan-penchev ivan-penchev Mar 13, 2025

Also, this is only for base64 encoded stuff, not normal json strings.

In layman's terms, since I am new to this package and need to explain it to myself:

  1. This code path is only triggered when the data_base64 property is populated, not when the data property is populated.
  2. json.Unmarshal is used here not to parse a JSON object, but to unescape any escape sequences in the string, which yields the plain base64 string.

Hope this is helpful @embano1
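
The two points above can be sketched as a small standalone program: first resolve the JSON escapes, then base64-decode the result. `decodeEscapedBase64` is a hypothetical name used for illustration only, and the sketch relies on `encoding/json` and `encoding/base64` rather than the jsoniter-based code in the PR.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// decodeEscapedBase64 is a hypothetical helper: it resolves JSON escape
// sequences in raw, then base64-decodes the unescaped text.
func decodeEscapedBase64(raw string) ([]byte, error) {
	var unescaped string
	if err := json.Unmarshal([]byte(`"`+raw+`"`), &unescaped); err != nil {
		return nil, err
	}
	return base64.StdEncoding.DecodeString(unescaped)
}

func main() {
	// \u002B is the JSON escape for '+', so this input unescapes to "++++".
	data, err := decodeEscapedBase64(`\u002B\u002B\u002B\u002B`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("% x\n", data) // "++++" decodes to the bytes fb ef be
}
```

An input without any escapes, such as eyJoZWxsbyI6IndvcmxkIn0=, passes through the unescape step unchanged and decodes as usual.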

return err
}
e.DataEncoded = make([]byte, base64.StdEncoding.DecodedLen(len(base64DeJSON)))
length, err := base64.StdEncoding.Decode(e.DataEncoded, []byte(base64DeJSON))
Member

btw: as I'm trying to understand this code (existing, I know), I'm having a hard time following what it really does without comments. For example, why is e.DataEncoded[0:length] needed if we already have e.DataEncoded = make([]byte, base64.StdEncoding.DecodedLen(len(base64DeJSON)))?
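
One likely answer to the slicing question: `DecodedLen` returns the maximum number of bytes the input could decode to, while `Decode` reports how many bytes it actually wrote. With padded input the two differ, so the buffer has to be trimmed. A minimal sketch using only `encoding/base64` (`decodeTrimmed` is a hypothetical helper, not SDK code):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeTrimmed allocates a buffer of the maximum decoded size and returns
// it together with the number of bytes Decode actually wrote.
// Hypothetical helper for illustration only.
func decodeTrimmed(encoded string) (buf []byte, n int, err error) {
	buf = make([]byte, base64.StdEncoding.DecodedLen(len(encoded)))
	n, err = base64.StdEncoding.Decode(buf, []byte(encoded))
	return buf, n, err
}

func main() {
	buf, n, err := decodeTrimmed("eyJoZWxsbyI6IndvcmxkIn0=")
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(len(buf), n)    // 18 17: one byte of the buffer is unused
	fmt.Printf("%q\n", buf[:n]) // trimming drops the trailing zero byte
}
```

Without the `buf[:n]` slice, the payload would carry a stray zero byte whenever the base64 input ends in padding.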

body: []byte(`{
"specversion": "1.0",
"datacontenttype": "text/plain",
"data_base64": "\\u002B\\u002B\\u002B\\u002B",
Member

quick question: why are we double-escaping here? none of my base64 decoders provide valid output for this, so just wanted to understand the test fixtures better.

Member

Also, I tried to update the test with a base64-encoded JSON string IntcImhlbGxvXCI6XCJ3b3JsZFwifSIK (which is "{\"hello\":\"world\"}") to see if we break existing users here somehow (with the new " change). For some reason, I can't get the test to pass when I provide

"data_base64": "IntcImhlbGxvXCI6XCJ3b3JsZFwifSIK" and expect DataEncoded: []byte(`"{\"hello\":\"world\"}"`). I'm just too stupid I guess - can you please tell me what's going wrong here?

Member

for your reference, my modified test case (ignore the comment)

"base64 json encoded data v1.0 - escaped json string": {
			body: []byte(`{
				"specversion":     "1.0",
				"datacontenttype": "text/plain",
				"data_base64": "IntcImhlbGxvXCI6XCJ3b3JsZFwifSIK",
				"id":          "ABC-123",
				"time":        "` + now.Format(time.RFC3339Nano) + `",
				"type":        "com.example.test",
				"dataschema":  "http://example.com/schema",
				"source":      "http://example.com/source"
			}`),
			want: &event.Event{
				Context: event.EventContextV1{
					Type:            "com.example.test",
					Source:          *sourceV1,
					DataSchema:      schemaV1,
					ID:              "ABC-123",
					Time:            &now,
					DataContentType: event.StringOfTextPlain(),
				}.AsV1(),
				// base64 decode of "++++"
				DataEncoded: []byte(`"{\"hello\":\"world\"}"`),
				DataBase64:  true,
			},
		}
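
Decoding the fixture value on its own points to one possible cause of the mismatch: the base64 string decodes to the expected text plus a trailing newline, so a byte-for-byte comparison against the expected `DataEncoded` would fail. How the newline got there is a guess (for example, encoding the output of a shell command that appends one). The sketch below uses only `encoding/base64`; `mustDecode` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

// mustDecode base64-decodes s and panics on malformed input.
// Hypothetical helper for illustration only.
func mustDecode(s string) []byte {
	b, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		panic(err)
	}
	return b
}

func main() {
	decoded := mustDecode("IntcImhlbGxvXCI6XCJ3b3JsZFwifSIK")
	want := []byte(`"{\"hello\":\"world\"}"`)

	fmt.Println(len(decoded), len(want))                           // 24 23
	fmt.Println("trailing newline:", bytes.HasSuffix(decoded, []byte("\n")))
	fmt.Println("equal after trim:", bytes.Equal(bytes.TrimSuffix(decoded, []byte("\n")), want))
}
```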

@ivan-penchev ivan-penchev Mar 14, 2025

Hi @embano1

It doesn't matter whether we use \\u002B\\u002B\\u002B\\u002B or \u002B\u002B\u002B\u002B. The test would pass either way. I prefer (and I believe @duglin just copied) the double \ because it gives extra safety with some parsers, which would strip the \ if there is only one.
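
To make the difference between the two spellings concrete, here is a minimal sketch that parses both forms with `encoding/json`. It deliberately does not go through the SDK's jsoniter path, and the `envelope` type and `parse` function are hypothetical names. With a single backslash, one parsing pass already yields "++++". With a double backslash, the first pass yields the literal text \u002B\u002B\u002B\u002B, which needs a second unescape step before it is valid base64.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope holds just the field under discussion.
// Hypothetical type for illustration only.
type envelope struct {
	DataBase64 string `json:"data_base64"`
}

// parse extracts data_base64 from a JSON document using encoding/json.
func parse(doc string) (string, error) {
	var e envelope
	err := json.Unmarshal([]byte(doc), &e)
	return e.DataBase64, err
}

func main() {
	single, _ := parse(`{"data_base64": "\u002B\u002B\u002B\u002B"}`)
	double, _ := parse(`{"data_base64": "\\u002B\\u002B\\u002B\\u002B"}`)

	fmt.Println(single)      // ++++
	fmt.Println(len(double)) // 24: the escapes are still present as literal text
}
```

If the extra json.Unmarshal step from this PR is then applied to each result, both would end up as "++++", which would be consistent with the test passing either way.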

Regarding your test case, I think your data_base64 is somehow incorrect?

If I take your test case, and I run it with eyJoZWxsbyI6IndvcmxkIn0= as a value for data_base64, which I generated via Go, this way:

package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

func main() {
	// Define the JSON object
	jsonObj := map[string]string{"hello": "world"}

	// Marshal the JSON object to a byte slice
	jsonData, err := json.Marshal(jsonObj)
	if err != nil {
		fmt.Println("Error marshaling JSON:", err)
		return
	}

	// Encode the JSON byte slice to Base64
	base64Data := base64.StdEncoding.EncodeToString(jsonData)

	// Print the Base64 encoded string
	fmt.Println("Base64 Encoded JSON:", base64Data)
}

Your test case passes without issue

Sorry I can't be of more help; I am a bit clueless about how the different base64 strings are generated.

Successfully merging this pull request may close these issues.

Unable to deserialize data_base64 non escaped string
3 participants