Organize docs: front and back end; custom predicates. #96
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1 +1,41 @@ | ||
| # Backend types | ||
|
|
||
| On the backend, there is only a single type: `Value`. | ||
|
|
||
| A `Value` is a tuple of field elements. With the Plonky2 backend, a `Value` is a tuple of 4 field elements. In general, the backend exposes a constant `VALUE_SIZE`, and a `Value` is a tuple of `VALUE_SIZE` field elements. | ||
|
|
||
| ## Integers and booleans | ||
|
|
||
| The backend encoding stores integers in such a way that arithmetic operations (addition, multiplication, comparison) are inexpensive to verify in-circuit. | ||
|
|
||
| In the case of the Plonky2 backend, an integer $x$ is decomposed as | ||
| $$x = x_0 + x_1 \cdot 2^{32}$$ | ||
| with $0 \leq x_0, x_1 < 2^{32}$ and represented as | ||
| $$\texttt{map}\ \iota\ [x_0, x_1, 0, 0],$$ | ||
| where $\iota:\mathbb{N}\cup\{0\}\rightarrow\texttt{GoldilocksField}$ is the canonical projection. | ||
|
|
||
| On the backend, a boolean is stored as an integer, either 0 or 1; so logical operations on booleans are also inexpensive. | ||
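The integer and boolean encodings above can be sketched in a few lines of Python. This is only an illustration, not the POD2 implementation: the names `encode_int`, `encode_bool` and `GOLDILOCKS_P` are hypothetical, field elements are modelled as plain Python integers, and the sketch assumes non-negative integers below $2^{64}$, since the decomposition above does not say how other values are handled.

```python
from typing import Tuple

# Order of the Goldilocks field used by Plonky2.
GOLDILOCKS_P = 2**64 - 2**32 + 1
# Number of field elements in a Value (4 for the Plonky2 backend).
VALUE_SIZE = 4


def encode_int(x: int) -> Tuple[int, ...]:
    """Split x into two 32-bit limbs and pad with zeros to VALUE_SIZE."""
    # Assumption: only 0 <= x < 2**64 is covered by this sketch.
    if not 0 <= x < 2**64:
        raise ValueError("this sketch only covers 0 <= x < 2**64")
    x0 = x & 0xFFFFFFFF  # low limb:  x mod 2**32
    x1 = x >> 32         # high limb: x div 2**32
    limbs = [x0, x1] + [0] * (VALUE_SIZE - 2)
    # iota: reduce each limb into the field. Limbs are below 2**32,
    # so the reduction never changes them.
    return tuple(limb % GOLDILOCKS_P for limb in limbs)


def encode_bool(b: bool) -> Tuple[int, ...]:
    """A boolean shares the encoding of the integer 0 or 1."""
    return encode_int(1 if b else 0)
```

For example, `encode_int(2**32 + 5)` yields the limbs `(5, 1, 0, 0)`, and `encode_bool(True)` yields the same tuple as `encode_int(1)`.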
|
|
||
| ## Strings | ||
|
|
||
| The backend encoding stores strings as hashes, using a hash function that might not be zk-friendly. For this reason, string operations (substrings, accessing individual characters) are hard to verify in-circuit. The POD2 system does not provide methods for manipulating strings. | ||
|
|
||
| In other words: As POD2 sees it, two strings are either equal or not equal. There are no other relationships between strings. | ||
|
|
||
| In the case of the Plonky2 backend, a string is first converted to its UTF-8 bytes, and the byte `0x01` is appended as padding. The bytes are then split into 7-byte chunks, starting from the left (the last chunk may be shorter). Each chunk is interpreted as an integer in little-endian form; since such an integer is less than $2^{56}$, it is naturally an element of `GoldilocksField`. The resulting sequence of field elements is then hashed with the Poseidon hash function. Symbolically, given a string $s$, its hash is defined by | ||
|
|
||
| $$\texttt{poseidon}(\texttt{map}\ (\iota\circ\jmath_\texttt{le-bytes->int})\ \texttt{chunks}_7(\jmath_\texttt{string->bytes}(s)\ \texttt{++}\ [\texttt{0x01}])),$$ | ||
|
|
||
| where `poseidon` is the Poseidon instance used by Plonky2, $\iota$ is as above, $\texttt{chunks}_{n}:[\texttt{u8}]\rightarrow [[\texttt{u8}]]$ is defined such that[^aux] | ||
|
|
||
| $$\texttt{chunks}_n(v) = \textup{if}\ v = [\ ]\ \textup{then}\ [\ ]\ \textup{else}\ [\texttt{take}_n v]\ \texttt{++}\ \texttt{chunks}_n(\texttt{drop}_n v),$$ | ||
|
|
||
| the mapping $\jmath_\texttt{le-bytes->int}: [\texttt{u8}] \rightarrow \mathbb{N}\cup\{0\}$ is given by | ||
|
|
||
| $$[b_0,\dots,b_{N-1}]\mapsto \sum_{i=0}^{N-1} b_i \cdot 2^{8i},$$ | ||
|
|
||
| and $\jmath_\texttt{string->bytes}$ is the canonical mapping of a string to its UTF-8 representation. | ||
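The preprocessing steps that precede the Poseidon call can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `string_to_field_elements` is hypothetical, field elements are modelled as plain integers, and the final Poseidon hash is deliberately left out because it depends on the Plonky2 instance.

```python
from typing import List

# Order of the Goldilocks field used by Plonky2.
GOLDILOCKS_P = 2**64 - 2**32 + 1
# Chunk length in bytes; 7 bytes = 56 bits, which always fits in the field.
CHUNK_LEN = 7


def string_to_field_elements(s: str) -> List[int]:
    """Map a string to the field elements that would be fed to Poseidon."""
    # string -> bytes (UTF-8), then append the 0x01 padding byte.
    data = s.encode("utf-8") + b"\x01"
    # chunks_7: split from the left; the last chunk may be shorter.
    chunks = [data[i:i + CHUNK_LEN] for i in range(0, len(data), CHUNK_LEN)]
    # le-bytes -> int, then iota (a no-op here, as each value is below 2**56).
    return [int.from_bytes(chunk, "little") % GOLDILOCKS_P for chunk in chunks]
```

Because of the padding byte, the empty string maps to the single element `1`, and a 7-byte string maps to two elements, the second of which is `1`.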
|
|
||
| ## Compound types | ||
|
|
||
| The three front-end compound types (`Dictionary`, `Array`, `Set`) are all represented as Merkle roots on the backend. The details of the representation are explained on a separate [Merkle tree](./merkletree.md) page. | ||
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # How to hash a custom predicate | ||
|
|
||
| Every predicate, native or custom, is identified on the backend by a predicate ID. | ||
|
|
||
| The native predicates are numbered with small integers, sequentially. The ID of a custom predicate is a hash of its definition; this guarantees that two different predicates cannot have the same ID (aside from the minuscule probability of a hash collision). | ||
|
|
||
| This document explains in some detail how the definition of a custom predicate is serialized and hashed. | ||
|
|
||
| Custom predicates are defined in _groups_ (also known as _batches_); see an [example](./customexample.md). The definition of a custom predicate in a group involves other predicates, which may include: | ||
| - native predicates | ||
| - previously-defined custom predicates | ||
| - other predicates in the same group. | ||
|
|
||
| Predicate hashing is recursive: in order to hash a group of custom predicates, we need to know IDs for all the previously-defined custom predicates it depends on. | ||
|
|
||
| The definition of the whole group of custom predicates is serialized (as explained below), and that serialization is hashed (using a zk-friendly hash; in the case of the Plonky2 backend, Poseidon) to give a _group ID_. Each predicate in the group is then referenced by | ||
| ``` | ||
| predicate_ID = (group_ID, idx) | ||
| ``` | ||
| (here `idx` is simply the index of the predicate in the group). |
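The relationship between a group ID and the predicate IDs derived from it can be sketched as below. Everything here is a stand-in chosen for illustration: SHA-256 replaces the zk-friendly hash (Poseidon in the Plonky2 backend), the byte strings are placeholder serializations rather than the real format, and the names `group_id` and `predicate_ids` are hypothetical.

```python
import hashlib
from typing import List, Tuple


def group_id(serialized_group: bytes) -> bytes:
    """Hash the serialized group definition to obtain its group ID."""
    # Assumption: SHA-256 stands in for the backend's zk-friendly hash.
    return hashlib.sha256(serialized_group).digest()


def predicate_ids(serialized_group: bytes,
                  num_predicates: int) -> List[Tuple[bytes, int]]:
    """Return predicate_ID = (group_ID, idx) for each predicate in the group."""
    gid = group_id(serialized_group)
    return [(gid, idx) for idx in range(num_predicates)]


# Group A: two predicates, placeholder serialization.
group_a = predicate_ids(b"placeholder serialization of group A", 2)

# Group B depends on predicate 1 of group A, so its serialization must
# embed that predicate's ID. This is why hashing is recursive.
dep_gid, dep_idx = group_a[1]
group_b_bytes = b"placeholder serialization of group B:" + dep_gid + bytes([dep_idx])
group_b = predicate_ids(group_b_bytes, 1)
```

All predicates in a group share the same group ID and differ only in `idx`; any change to a dependency's definition changes its ID, and therefore the ID of every group that refers to it.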
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,17 @@ | ||
| # Frontend and backend | ||
|
|
||
| The frontend is what we want the user to see. | ||
| The backend is what we want the circuit to see. | ||
| The POD2 system consists of a frontend and a backend, connected by a middleware. This page outlines some design principles for deciding which components go where. | ||
|
|
||
| ``` | ||
| user -- frontend -- middleware -- backend -- ZK circuit | ||
| ``` | ||
|
|
||
| The frontend is what we want the user to see; the backend is what we want the circuit to see. | ||
|
|
||
| ## Circuit and proving system | ||
|
|
||
| The first implementation of POD2 uses Plonky2 as its proving system. In principle, a future implementation could use some other proving system. The frontend and middleware should not be aware of what proving system is in use: anything specific to the proving system belongs to the backend. | ||
|
|
||
| ## User-facing types versus in-circuit types | ||
|
|
||
| The frontend type system exposes human-readable types to POD developers: strings, ints, bools, and so forth. On the backend, all types are built out of field elements. The middleware should handle the conversion. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,7 +1,9 @@ | ||
| # POD value types | ||
| From the frontend perspective, POD values may be one of the following[^type] types: four atomic types | ||
ax0 marked this conversation as resolved.
|
||
| - `Integer` | ||
| - `Bool` | ||
| - `String` | ||
| - `Raw` | ||
|
|
||
| and three compound types | ||
| - `Dictionary` | ||
|
|
@@ -24,6 +26,8 @@ with $0 \leq x_0, x_1 < 2^{32}$ and representing it as | |
| $$\texttt{map}\ \iota\ [x_0, x_1, 0, 0],$$ | ||
| where $\iota:\mathbb{N}\cup\{0\}\rightarrow\texttt{GoldilocksField}$ is the canonical projection. | ||
|
|
||
| ## `Bool` | ||
| In the frontend, this is a simple bool. In the backend, it will have the same encoding as an `Integer` `0` (for `false`) or `1` (for `true`). | ||
|
|
||
| ## `String` | ||
| In the frontend, this type corresponds to the usual `String`. In the backend, the string will be mapped to a sequence of field elements and hashed with the hash function employed there, thus being represented by its hash. | ||
|
|
@@ -42,7 +46,10 @@ $$[b_0,\dots,b_{N-1}]\mapsto \sum_{i=0}^{N-1} b_i \cdot 2^{8i},$$ | |
|
|
||
| and $\jmath_\texttt{string->bytes}$ is the canonical mapping of a string to its UTF-8 representation. | ||
|
|
||
| ## `Raw` | ||
| "Raw" is short for "raw value". A `Raw` exposes a backend value on the frontend. | ||
|
|
||
| With the Plonky2 backend, a `Raw` is a tuple of 4 elements of the Goldilocks field. | ||
|
Comment on lines
+29
to
+32
Collaborator
is the |
||
|
|
||
| ## Dictionary, array, set | ||
|
|
||
|
|
||
We will need to define (not necessarily now in this PR) where the `Dictionary`, `Array`, `Set` belong: here they appear in the Backend section referencing the `merkletree.md` file, at the same time they are described in the Frontend subsection's `values.md` file, and in the code they are implemented in the middleware.