Progress on Datalog v3.3 Support#178

Open
polygloton wants to merge 13 commits into eclipse-biscuit:main from polygloton:datalog-v3.3-work

Conversation

@polygloton

@polygloton polygloton commented Mar 3, 2026

Summary

This PR is for work that I have done to implement Datalog v3.3 support in the biscuit-go code.

Sample Test Progress

I am driving most of my feature work using the sample tests. Here is my progress on making them pass.

  • 23: Execution Scope
  • 24: Third Party
  • 25: Check All
  • 26: Public Keys Interning
  • 27: Integer Wraparound
  • 28: Expression V4
  • 29: Reject If
  • 30: Null
  • 31: Heterogeneous Equal
  • 32: Laziness Closures
  • 33: Typeof
  • 34: Array Map
  • 35: FFI
  • 36: Secp256r1

Refactoring

I have made some large code refactors along the way; I hope they are helpful. I certainly welcome feedback and discussion about these changes. My general approach has been to follow the rust library as closely as reasonably possible, refactoring when it was needed to enable that goal. I also believe in refactoring while reading and learning code. I believe that these changes will make it easier to maintain and improve the code in the future, especially since I plan to keep working on this.

  • I refactored the Datalog evaluator, extracting out an evaluator abstraction and splitting it into three different layers.
  • I added a generic HashSet type. Sets are everywhere in this code and I wanted to follow the Rust library.
    • Golang is not very helpful here (it seems the community sentiment is to "roll your own", which I have done).
  • I added three interfaces: Ordered, Equals, and Hashable.
    • These interfaces are implemented on syntax types liberally.
    • These are needed for HashSet to work with generic types.
  • I made changes to the public API.
    • There is an AuthorizerBuilder extracted from the Authorizer.
    • The biscuit Append method was renamed to AppendBlock and it takes a BlockBuilder.

There are more; see the commit comments for details.

Conclusion

I plan to continue my progress for as long as I have time and would love to contribute. Please let me know if/how I can continue to work on this and get it approved. Maybe a new branch would be best, or maybe I can keep working in my fork.

This commit establishes the build and code generation foundation for the v3.3 specification upgrade. It introduces Buf tooling for protocol buffer management, updates the Makefile, and regenerates the protobuf schema from the official v3.3 specification. The changes replace the previous protobuf generation approach with a standardized Buf-based workflow.
…algorithm

Building on the infrastructure updates, this commit introduces a multi-algorithm cryptographic abstraction layer. It adds support for SECP256r1 signatures alongside the existing Ed25519 implementation, providing Signer and Verifier interfaces that abstract over both algorithms. This enables third-party blocks to use different signature algorithms as required by the v3.3 specification.
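A minimal sketch of what such an abstraction could look like, using only the Go standard library. The method sets, the type names (`ed25519Signer`, `p256Signer`), and the choice of SHA-256 with ASN.1 DER encoding for the SECP256r1 path are assumptions made for illustration; the interfaces in this PR may differ.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/ed25519"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// Verifier and Signer are illustrative; the PR's real method sets may differ.
type Verifier interface {
	Verify(msg, sig []byte) bool
}

type Signer interface {
	Sign(msg []byte) ([]byte, error)
	Verifier() Verifier
}

// Ed25519 signs the message directly.
type ed25519Verifier struct{ pub ed25519.PublicKey }

func (v ed25519Verifier) Verify(msg, sig []byte) bool { return ed25519.Verify(v.pub, msg, sig) }

type ed25519Signer struct{ priv ed25519.PrivateKey }

func (s ed25519Signer) Sign(msg []byte) ([]byte, error) { return ed25519.Sign(s.priv, msg), nil }
func (s ed25519Signer) Verifier() Verifier {
	return ed25519Verifier{pub: s.priv.Public().(ed25519.PublicKey)}
}

// ECDSA over P-256 signs a digest; SHA-256 and DER encoding are assumed here.
type p256Verifier struct{ pub *ecdsa.PublicKey }

func (v p256Verifier) Verify(msg, sig []byte) bool {
	digest := sha256.Sum256(msg)
	return ecdsa.VerifyASN1(v.pub, digest[:], sig)
}

type p256Signer struct{ priv *ecdsa.PrivateKey }

func (s p256Signer) Sign(msg []byte) ([]byte, error) {
	digest := sha256.Sum256(msg)
	return ecdsa.SignASN1(rand.Reader, s.priv, digest[:])
}
func (s p256Signer) Verifier() Verifier { return p256Verifier{pub: &s.priv.PublicKey} }

func newEd25519Signer() (Signer, error) {
	_, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	return ed25519Signer{priv: priv}, nil
}

func newP256Signer() (Signer, error) {
	priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	return p256Signer{priv: priv}, nil
}

func main() {
	// Callers only see the interfaces, so both algorithms share one code path.
	for _, mk := range []func() (Signer, error){newEd25519Signer, newP256Signer} {
		s, err := mk()
		if err != nil {
			panic(err)
		}
		sig, err := s.Sign([]byte("block payload"))
		if err != nil {
			panic(err)
		}
		fmt.Printf("%T verified: %v\n", s, s.Verifier().Verify([]byte("block payload"), sig))
	}
}
```

Code that appends or verifies a block can then accept a `Signer` or `Verifier` without knowing which algorithm the key uses.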

This commit introduces generic set data structures and helper utilities needed for the enhanced datalog evaluation engine. It implements HashSet[T] with support for ordered, hashable, and equality-comparable types, providing the foundation for efficient fact and origin tracking in subsequent commits.

The abstraction provides union, intersection, and superset operations needed for origin tracking and scope-based fact filtering. Helper functions support sorting and comparison of iterators and slices, which become essential when evaluating datalog rules with facts and rules filtered by origin.

The code uses sets in several locations, including for Datalog syntax. Since golang does not provide a handy built-in Set type, we create a generic type to use in this project.
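To make the idea concrete, here is a small sketch of a generic hash set built on user-supplied hashing and equality. The interface names `Hashable` and `Equals` come from the PR description, but their method signatures, the `Element` constraint, the bucket layout, and the `Name` element type are all assumptions for illustration and are not taken from the PR's code.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Hashable and Equals are named after the PR's interfaces; signatures are assumed.
type Hashable interface{ Hash() uint64 }

type Equals[T any] interface{ Equal(other T) bool }

// Element is a hypothetical constraint combining both requirements.
type Element[T any] interface {
	Hashable
	Equals[T]
}

// HashSet groups items by hash and resolves collisions with Equal.
type HashSet[T Element[T]] struct {
	buckets map[uint64][]T
	size    int
}

func NewHashSet[T Element[T]](items ...T) *HashSet[T] {
	s := &HashSet[T]{buckets: make(map[uint64][]T)}
	for _, it := range items {
		s.Insert(it)
	}
	return s
}

func (s *HashSet[T]) Contains(item T) bool {
	for _, existing := range s.buckets[item.Hash()] {
		if existing.Equal(item) {
			return true
		}
	}
	return false
}

// Insert reports whether the item was newly added.
func (s *HashSet[T]) Insert(item T) bool {
	if s.Contains(item) {
		return false
	}
	h := item.Hash()
	s.buckets[h] = append(s.buckets[h], item)
	s.size++
	return true
}

func (s *HashSet[T]) Len() int { return s.size }

// IsSuperset reports whether every item of other is also in s.
func (s *HashSet[T]) IsSuperset(other *HashSet[T]) bool {
	for _, bucket := range other.buckets {
		for _, item := range bucket {
			if !s.Contains(item) {
				return false
			}
		}
	}
	return true
}

// Union returns a new set holding the items of both sets.
func (s *HashSet[T]) Union(other *HashSet[T]) *HashSet[T] {
	out := NewHashSet[T]()
	for _, src := range []*HashSet[T]{s, other} {
		for _, bucket := range src.buckets {
			for _, item := range bucket {
				out.Insert(item)
			}
		}
	}
	return out
}

// Name is a toy element type showing the per-type boilerplate.
type Name string

func (n Name) Hash() uint64 {
	h := fnv.New64a()
	_, _ = h.Write([]byte(n))
	return h.Sum64()
}

func (n Name) Equal(o Name) bool { return n == o }

func main() {
	a := NewHashSet[Name]("read", "write", "read")
	b := NewHashSet[Name]("read")
	fmt.Println(a.Len(), a.IsSuperset(b), a.Union(NewHashSet[Name]("admin")).Len())
}
```

The `Hash` and `Equal` methods on `Name` are the kind of per-type boilerplate the commit message refers to; every type stored in a set needs its own pair.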

There are helpers for sorting and comparing slices (and seqs; we make deliberate and liberal use of iterators). The helpers also introduce a hash builder type because we need to manually construct hashes for all of the types that we want to put into sets (this is another feature that golang leaves for us to implement). There will be a lot of boilerplate added to our syntax types and I tried to make it as easy/clean as possible to manage.

The parser is extended to support v3.3's new authorization primitives. It adds grammar rules for scope annotations (trusting authority, trusting previous, trusting <pubkey>), alternative check quantifiers (check all, reject if), and public key literals in datalog expressions. These syntax additions enable fine-grained control over fact visibility and verification semantics.

I'm more inclined to like `nom` than `participle`, but I tried to make this as idiomatic as I could. I added several tests to improve my confidence.

This commit updates the core type system to represent the new authorization primitives parsed in the previous commit. It extends Block, Check, and Rule types with scope annotations, check kinds, and public key references, and adds comprehensive conversion functions between the Go types and v3.3 protobuf representations.

To integrate with the generic set abstraction introduced earlier, this commit implements the three required interfaces (Ordered, Equality, Hashable) on key biscuit types: Fact, PublicKey, Scope, Rule, Expression, and the expression operation types (Value, UnaryOp, BinaryOp). These implementations enable facts and scopes to be stored in hash sets with deterministic ordering. The datalog evaluator (which mimics the Rust implementation) requires this, and so do the sample tests, which only pass when results are reproducible (ordering in particular).

You will start to see the boilerplate required to support the ordering, hashing, and equality interfaces. I tried to make it easy to write this code and I really feel like it is worth the maintenance headache. This allows us to mimic the Rust reference code closely when implementing Datalog evaluation logic. That feels like the bigger headache to me, and I didn't want to reinvent the wheel. Golang philosophy seems to be "just write your own set type" (seriously), so I suppose this is idiomatic. I iterated on the code multiple times to make it as DRY and clean as I could (without stooping to using reflection).
…kinds of checks

The Datalog evaluation engine is refactored to implement the semantic changes required by v3.3. This introduces origin tracking for all facts, scope-based fact filtering (where rules can only "see" facts from trusted origins), and support for the three check quantifiers (existential, universal, and rejection). The evaluator is extracted into a new three-layer architecture separating predicate matching from expression evaluation. A new `datalog/evaluator.go` file was added to house the refactored code.

I found the previous evaluator code difficult to understand and even more difficult to change. Also, it felt like there was a missing abstraction, so I extracted a new `Evaluator` type. The three-layer architecture allowed me to write new tests for the bottom two layers, which hopefully will help new developers come to terms with the code more quickly than I did.

Building on the set interfaces from earlier commits, this change implements Compare, Equal, and Hash methods on core datalog types: Term, Predicate, Set, Origin, and BlockID. These implementations enable the new FactSet type (mapping Origin → HashSet[Fact]) and TrustedOrigin type (a HashSet[BlockID]), which provide the foundation for scope-based fact visibility. The deterministic ordering ensures that authorization results are reproducible across implementations.
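The origin-based visibility rule can be illustrated with a deliberately simplified sketch. It uses plain maps and string facts instead of the PR's `HashSet[Fact]`, and the names `NewOrigin`, `Trusts`, and `Visible` are hypothetical. The rule modelled here, that a fact is visible only when every block in its origin is trusted, is my reading of the scope semantics and should be checked against the specification.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

type BlockID int

// Origin is the set of blocks that contributed to a fact.
type Origin map[BlockID]struct{}

func NewOrigin(ids ...BlockID) Origin {
	o := make(Origin, len(ids))
	for _, id := range ids {
		o[id] = struct{}{}
	}
	return o
}

// key builds a canonical string so an Origin can be used as a map key.
func (o Origin) key() string {
	ids := make([]int, 0, len(o))
	for id := range o {
		ids = append(ids, int(id))
	}
	sort.Ints(ids)
	parts := make([]string, len(ids))
	for i, id := range ids {
		parts[i] = strconv.Itoa(id)
	}
	return strings.Join(parts, ",")
}

// TrustedOrigins is the set of block IDs a rule or check is allowed to see.
type TrustedOrigins map[BlockID]struct{}

// Trusts reports whether every block in the origin is trusted (subset test).
func (t TrustedOrigins) Trusts(o Origin) bool {
	for id := range o {
		if _, ok := t[id]; !ok {
			return false
		}
	}
	return true
}

// FactSet groups facts by the origin that produced them.
type FactSet struct {
	origins map[string]Origin
	facts   map[string][]string
}

func NewFactSet() *FactSet {
	return &FactSet{origins: map[string]Origin{}, facts: map[string][]string{}}
}

func (fs *FactSet) Insert(o Origin, fact string) {
	k := o.key()
	fs.origins[k] = o
	fs.facts[k] = append(fs.facts[k], fact)
}

// Visible returns the facts whose origin is trusted, sorted for determinism.
func (fs *FactSet) Visible(t TrustedOrigins) []string {
	var out []string
	for k, o := range fs.origins {
		if t.Trusts(o) {
			out = append(out, fs.facts[k]...)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	fs := NewFactSet()
	fs.Insert(NewOrigin(0), "user(1)")           // authority block
	fs.Insert(NewOrigin(1), "attacker_claim(1)") // appended block
	fs.Insert(NewOrigin(0, 1), "derived(1)")     // rule in block 1 using block 0 facts

	fmt.Println(fs.Visible(TrustedOrigins{0: {}}))        // [user(1)]
	fmt.Println(fs.Visible(TrustedOrigins{0: {}, 1: {}})) // [attacker_claim(1) derived(1) user(1)]
}
```

Sorting the result stands in for the deterministic ordering that the `Compare` implementations provide in the PR.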

More boilerplate to support putting syntax elements into sets is added here as well. Again, I hope that the cost of maintaining this code is worth it by making the evaluator easier to understand, especially since it allows us to roughly follow the Rust library's logic for filtering facts and rules by origin.

I also extracted some of the verification logic for `Check` into a `Verify` method (out of the `World`). This moves switching logic for check kind into the `Check` type, which felt more appropriate to me.
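As a rough illustration of keeping the check-kind switch on the `Check` type, consider the sketch below. It abstracts query evaluation away entirely: `bindings` holds one boolean per candidate variable binding, saying whether the check's expressions held for it. The constant names, the `Verify` signature, and the handling of the empty case for `check all` are assumptions, not the PR's code.

```go
package main

import "fmt"

// CheckKind distinguishes the three quantifiers; names are hypothetical.
type CheckKind int

const (
	CheckOne CheckKind = iota // "check if": at least one binding must hold
	CheckAll                  // "check all": every binding must hold
	Reject                    // "reject if": no binding may hold
)

type Check struct{ Kind CheckKind }

// Verify decides the check from per-binding expression results.
// Treating an empty binding list as a failed "check all" is an assumption.
func (c Check) Verify(bindings []bool) bool {
	anyTrue, allTrue := false, true
	for _, ok := range bindings {
		anyTrue = anyTrue || ok
		allTrue = allTrue && ok
	}
	switch c.Kind {
	case CheckOne:
		return anyTrue
	case CheckAll:
		return len(bindings) > 0 && allTrue
	case Reject:
		return !anyTrue
	default:
		return false
	}
}

func main() {
	results := []bool{true, false}
	fmt.Println(Check{CheckOne}.Verify(results)) // true
	fmt.Println(Check{CheckAll}.Verify(results)) // false
	fmt.Println(Check{Reject}.Verify(results))   // false
}
```

With the switch living on `Check`, a caller such as the world only has to gather the bindings and ask the check for a verdict.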

I like iterators and, now that golang has them, I use them gratuitously. I think it improves the code and allows for some nice performance characteristics (like when we are traversing the contents of sets, for example). Lots of logic that might have leaked into other types is now contained in special purpose iterator methods on types that use sets internally.

There are some awkward types in the code (like `RuleInBlock`) that exist because golang does not provide a tuple type. I did add some methods onto those types as needed (mostly for syntactic sugar). I followed the principle of pushing logic close to the types when convenient. This reduced the amount of awkward looping logic spread around the code. One example is moving the custom block ID sorting logic into the `BlockID` new-type.

The public API is restructured to expose the new capabilities. Some of the public API was changed, and there are a lot of internal changes as well. The Authorizer is split into AuthorizerBuilder and Authorizer to separate construction from execution, the Builder API is updated to support scoped blocks, and the Biscuit token type is extended to handle third-party blocks with external signatures.

I made some opinionated API changes in this commit. I am willing to change things back, but I hope that we can consider keeping these changes for the v3.3 update.

- A new `AuthorizerBuilder` type has been extracted from `Authorizer`.
  - The previous code combined mutating the authorizer (along with resetting the internally stored evaluation state) with evaluating the world. This refactor splits them into two separate concerns.
  - The code was complex, hard to read, and scary to change. I feel like this commit improves the readability of the code.
  - Mutating the configuration of the authorizer can be done by reusing the builder and creating a new authorizer. This way there is no need to reset the authorizer. I find this simpler to understand and maintain.

- Note: The new authorizer currently assumes that `Authorize()` will only be called once. (I will make it idempotent in a future PR and I will add tests).

- I added a method to inspect the content of the authorizer which is meant to only be used by tests. The only way to access the inspector type is through a visitor function that is well documented as being meant for tests. This is idiomatic golang (we don't get `pub(crate)` visibility).
  - This allows us to do structural diffing in the sample tests, which is introduced in another commit.

- The authorizer builder is returned by the biscuit's `Authorizer()` method and the authorizer options are replaced by methods on the builder. This feels more natural to me and worked well with the newly extracted builder.

- Another user facing API change is that the biscuit's `CreateBlock` method now takes a block ID. The block builder gets a new setter for the block ID, but I felt like it would be easy for the user to forget (including me in other parts of the library code), so I made it explicitly required.

- The biscuit's `Append` method was renamed to `AppendBlock`. The function signature changed to take in a block builder instead of a block (because the biscuit owns the inputs to the `Build()` method, it seemed more natural), therefore I felt like it was a good time to change the name of the method. Append is an overloaded term in golang code.

- Options are simplified and feel more natural to me.

These are a lot of API changes, but we will never get a better opportunity to change the API. The update to Datalog v3.3 is the right time, IMO, and I feel like these changes improve the code.

This commit introduces the complete third-party blocks workflow. It implements ThirdPartyBlockRequest and ThirdPartyBlockContents types with serialization, signature verification, and integration into the token append flow. Third-party blocks enable external parties to sign and contribute blocks using their own keys, which are verified using the scope system established in earlier commits.

The official Biscuit test suite samples are regenerated to include v3.3 test cases. This adds test vectors for the new features including scope annotations, check quantifiers, public key interning, third-party blocks, and SECP256r1 signatures. The test suite now validates compatibility with other v3.3 implementations.

More sample tests can be made to pass easily (like "test036_secp256r1"), and I will add them in a follow-up PR soon.

I changed the world comparison in the tests to use structural comparisons with diff instead of string comparisons. It felt easier to work with, closer to what we are testing, and it produces test failure messages that are nicer to read. This required me to add the `extractWorld` helper method, which converts the expected and actual worlds into a structure that is easier to diff. It uses the new authorizer inspection visitor function added previously.

This might come at the cost of de-emphasizing the string conversion code, so I will think about adding some additional tests there.

The library's example and integration tests are updated to reflect the API changes. This includes updating example code to use the new AuthorizerBuilder pattern and adjusting test assertions to match the refactored API surface.

The README examples are updated to reflect the v3.3 API changes. Code samples now demonstrate the AuthorizerBuilder pattern and other API refinements introduced in the previous commits, ensuring the documentation accurately represents the current library interface.

@Geal
Contributor

Geal commented Mar 3, 2026

hi @polygloton, thank you for looking into this. Unfortunately, and while I appreciate the enthusiasm to get this done, there is just no way I will review a 15k lines PR that has not been discussed prior to implementation.
There is already ongoing work to get the 3rd party blocks and other features in biscuit-go. This is long, careful work that I barely have the time to look at these days, and I cannot spend that precious time elsewhere.

@Geal
Contributor

Geal commented Mar 3, 2026

I have to ask though, was all of that generated with a LLM?

@polygloton
Author

I have to ask though, was all of that generated with a LLM?

No, this was all written by hand. I did use AI to help me learn the problem domain, review my code, discuss features, etc., but all the code is mine.

2 participants