Use standard io Writer and Reader where possible#101

Merged
philhassey merged 9 commits into cedar-policy:main from sagikazarmark:marshal-io
Oct 20, 2025
Conversation

@sagikazarmark
Contributor

@sagikazarmark sagikazarmark commented Sep 19, 2025

Description of changes:

This PR contains a few refactors and features:

Refactor: use standard io.Writer and io.Reader where possible, instead of bytes.Buffer and []byte. This is necessary for the feature in this PR, but it's also good hygiene (relying on standard interfaces).

Feature: add a Decoder to the parser package (and x/exp) that allows reading multiple policies from any io.Reader (practically: files). It is modeled after the decoder in the stdlib JSON package.

The primary use case is not having to read the entire contents of a file up front, but letting the component that actually needs the content do the reading (which is good practice).

Naturally, an encoder would similarly make sense, but I wanted to open a PR first to get feedback.

Contributor

@philhassey philhassey left a comment


This looks good; see the linter errors for a few things to fix to appease it. I suggest adding an extra test just to prove trailing line breaks work as expected. Make sure you maintain 100% code coverage in your unit tests as well.

Thanks a ton!

@sagikazarmark
Contributor Author

@philhassey I fixed the lint violations and added the linebreaks.

For the record: those errors were not handled before this change (I think golangci-lint might have a default errcheck rule that skips warning on buf.Write methods).

For now I opted to keep the previous behavior: ignore errors.

But I wonder if that's the right thing to do.

UnmarshalCedar returns an error, I think MarshalCedar should too.

But I think that should be a separate PR. Happy to open an issue with the details if you agree.

@sagikazarmark
Contributor Author

Coverage should be at 100% now.

@sagikazarmark sagikazarmark force-pushed the marshal-io branch 3 times, most recently from fd72b96 to fcfc2ca Compare September 24, 2025 15:40
@sagikazarmark
Contributor Author

@philhassey I made the requested changes.

But I just realized that the last write to the writer would still normally return an error:

_, _ = w.Write(buf.Bytes())

I propose returning an error from MarshalCedar (similar to the Unmarshal one).

If you agree, I can make the change.

Contributor

@philhassey philhassey left a comment


Thanks again, just a few more tweaks to get this over the line.

Signed-off-by: Mark Sagi-Kazar <mark.sagikazar@gmail.com>
@sagikazarmark
Contributor Author

@philhassey I've just realized I've been working with the wrong Policy type the whole time. 🤦

Obviously, encoding/decoding parser.Policy is necessary, but far from enough since that's not the exported Policy. All these policy structs confused me. 😄

Answering your latest round of review:

  • No, the decoder can't be moved out of parser, because it relies on exported features
  • As a result, I think the encoder should remain in the parser package too
  • The exp/parser package is not even necessary, because it's not what I, as a user, want to interact with
  • So I kept the decoder and encoder in the parser package as they are
  • Reverted changes to existing marshaling/unmarshaling code as you requested (with the exception that the Tokenizer accepts a reader as well)
  • I did NOT add doc comments to the decoder and encoder in the parser package, because none of the other functions have them

What changed with the latest version of the PR:

I added an encoder and a decoder type to the root package.

Since they follow standard Go practices, I think their API is stable enough to skip the exp package (though I'm happy to move them if you give me guidance where exactly they should be placed)

They call into the parser encoder/decoder (which is still necessary due to how parsing works internally)

The public encoder/decoder have doc comments as requested.

Sorry for the confusion. But I think this PR is now solid.

Contributor

@philhassey philhassey left a comment


Okay, I think the shape looks good to me now. I'll get one more maintainer to give it a look just to be sure. Thanks a ton for the work on this! There's a little bit of coverage still missing:

[screenshot: coverage report showing the uncovered lines]

I do wonder if it'd be better if that function used OnceValue to cache the error, so that the same error is returned on all future calls? Otherwise I think it might panic at line 67.

So maybe a test to call it a second time to prove it correctly emits the same error again in that case?

Collaborator

@patjakdev patjakdev left a comment


LGTM! Thank you so much for your contribution.

The one significant piece of feedback I have is around making the tokenizer stream-oriented as well so that Decode() doesn't block until EOF. It doesn't have to be done in this PR, but I'm imagining we're going to want that functionality at some point.

@@ -74,9 +74,13 @@ func (t Token) intValue() (int64, error) {
}

func Tokenize(src []byte) ([]Token, error) {
Collaborator


I'd be fine just changing the signature of Tokenize() and forcing callers who have a byte slice to call bytes.NewBuffer() to convert it to an io.Reader. I think there are only two non-test callers.

Contributor Author


See my comment about a new PR.

type Decoder struct {
r io.Reader

once sync.Once
Collaborator


Let's not use a synchronization primitive for what's basically just a boolean. Just check for p.parser == nil in Decode().

Contributor Author


It's not really there for synchronization, but to make sure that part of the code is called exactly once.

The Tokenizer reads the entire stream and executing that code again would yield a different error on subsequent calls.

(But as @philhassey pointed out, even this code has a bug in it: subsequent calls would indeed lead to panic at the moment.)

I'm happy to remove the once variable @patjakdev if you tell me what you would like to see instead. I don't think a nil check is enough, because that would still run the tokenizer again.

If the tokenizer becomes stream-aware, this becomes a non-issue though and we can get rid of it.

var err error

d.once.Do(func() {
tokens, e := TokenizeReader(d.r)
Collaborator


I wonder a bit about this...

As it stands, the tokenizer is going to read until EOF, so you can't really use the Decoder to accept a stream of policies like you might hope without changing the tokenizer to also be stream-oriented.

For example, say you had a TCP connection open to a remote server which was sending you policies but the protocol you're using requires that the client acknowledge each policy before it will send the next one. If you passed the receiver stream into the Decoder and then called Decode(), it would just hang forever waiting for EOF.

Maybe we don't have to tackle that in this PR, but we should think about it...

Contributor Author


My thoughts exactly. I didn't want to go down that rabbit hole in this PR though.

From an API perspective, the current PR solves my problem, but I'm happy to take a stab at making the tokenizer stream-oriented in a followup PR.

Signed-off-by: Mark Sagi-Kazar <mark.sagikazar@gmail.com>
@sagikazarmark
Contributor Author

Good catch @philhassey !

Fixed coverage and added another call to Decode.

I believe I replied to both of your comments. If I left anything unanswered, let me know.

createParser func() (*parser, error)
}

func NewDecoder(r io.Reader) *Decoder {
Contributor

@philhassey philhassey Sep 30, 2025


@patjakdev @sagikazarmark How's this for a variation, I think it handles the errors how I want, does what @sagikazarmark needs and avoids using sync like @patjakdev wants:

type Decoder struct {
	r         io.Reader
	parser    *parser
	parserErr error
}

func (d *Decoder) getParser() (*parser, error) {
	if d.parser != nil || d.parserErr != nil {
		return d.parser, d.parserErr
	}
	tokens, err := TokenizeReader(d.r)
	if err != nil {
		d.parserErr = err
		return nil, err
	}
	p := newParser(tokens)
	d.parser = &p
	return d.parser, nil
}

func NewDecoder(r io.Reader) *Decoder {
	return &Decoder{
		r: r,
	}
}

func (d *Decoder) Decode(p *Policy) error {
	parser, err := d.getParser()
	if err != nil {
		return err
	}

	if parser.peek().isEOF() {
		return io.EOF
	}

	return p.fromCedar(parser)
}

Contributor Author

@sagikazarmark sagikazarmark Sep 30, 2025


If the tokenizer gets rewritten anyway, I don't think it makes much of a difference (except maybe that the current solution is more idiomatic IMO).

I can make the change if you want me to, but my vote goes to the current version.

(I wouldn't mind getting this one merged and iterate later now that I managed to get the CI green)

Collaborator


If we're going to be serious about ensuring the same error is always returned from Decode() once an error is encountered, how about something like this?

type Decoder struct {
	r       io.Reader
	parser  *parser
	err     error
}

func NewDecoder(r io.Reader) *Decoder {
	return &Decoder{
		r: r,
	}
}

func (d *Decoder) decode(p *Policy) error {
	if d.parser == nil {
		tokens, err := TokenizeReader(d.r)
		if err != nil {
			return err
		}

		parser := newParser(tokens)
		d.parser = &parser
	}

	if d.parser.peek().isEOF() {
		return io.EOF
	}

	return p.fromCedar(d.parser)
}

func (d *Decoder) Decode(p *Policy) error {
	if d.err != nil {
		return d.err
	}

	d.err = d.decode(p)
	return d.err
}

You could also inline decode() into Decode(), but I thought that ended up with a bit too much indentation for the bulk of the code in the function.

Collaborator


That said, I'm fine with what you've got right now and happy to merge it.

@sagikazarmark
Contributor Author

I started working on making the tokenizer stream-oriented, branching off from this one.

If you are good with this PR, could you merge it? I can provide a PR for the tokenizer shortly.

@sagikazarmark
Contributor Author

Friendly ping :)


@philhassey philhassey dismissed their stale review October 20, 2025 17:59

Patrick OK'd it.

@philhassey
Contributor

Sorry for the delay here, there was some trouble with our notifications! I'm merging this, I might do a tiny follow-up PR to remove the sync dependency, then I can cut a release.

@philhassey philhassey merged commit 960e041 into cedar-policy:main Oct 20, 2025
2 checks passed
@sagikazarmark sagikazarmark deleted the marshal-io branch November 21, 2025 18:37