Skip to content

Decoded bytes use the same underlying array #269

Open
@lovromazgon

Description

@lovromazgon

When decoding data that contains bytes fields, the returned byte slices reference the same underlying array as the input. This can cause unexpected behavior when manipulating those byte slices. Specifically appending to these slices is not a safe operation, because they can change other byte slices.

Click to expand failing test case

Notice how appending to the slice foo had an effect on bar and binary. All three slices use the same underlying array.

package avro

import (
	"bytes"
	"testing"

	"github.com/linkedin/goavro/v2"
)

func TestBytesOverwrite(t *testing.T) {
	schema := `
{
  "name":"example",
  "type":"record",
  "fields":[
    {"name":"foo","type":"bytes"},
    {"name":"bar","type":"bytes"}
  ]
}`

	codec, err := goavro.NewCodec(schema)
	if err != nil {
		t.Fatal(err)
	}

	// binary data contains {foo:[1,2],bar:[3,4]}
	binary := []byte{4, 1, 2, 4, 3, 4}
	gotNative, _, err := codec.NativeFromBinary(binary)
	if err != nil {
		t.Fatal(err)
	}

	got := gotNative.(map[string]interface{})
	foo := got["foo"].([]byte)
	bar := got["bar"].([]byte)

	if want := []byte{1, 2}; !bytes.Equal(foo, want) {
		t.Fatalf("expected foo to be %v, actually got %v", want, foo)
	}
	if want := []byte{3, 4}; !bytes.Equal(bar, want) {
		t.Fatalf("expected bar to be %v, actually got %v", want, bar)
	}

	// because slices use the same underlying array we can overwrite the others
	// by appending to foo
	foo = append(foo, 0, 0)

	if want := []byte{1, 2, 0, 0}; !bytes.Equal(foo, want) {
		t.Fatalf("expected foo to be %v, actually got %v", want, foo)
	}
	if want := []byte{3, 4}; !bytes.Equal(bar, want) {
		// this assertion fails, bar was changed to [0, 4]
		t.Errorf("expected bar to be %v, actually got %v", want, bar)
	}
	if want := []byte{4, 1, 2, 4, 3, 4}; !bytes.Equal(binary, want) {
		// this assertion fails, binary was changed to [4, 1, 2, 0, 0, 4]
		t.Errorf("expected binary to be %v, actually got %v", want, binary)
	}
}

This could probably be fixed by allocating a new slice before returning buf[:size] in this line.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions