You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
On the backend, there is only a single type: `Value`.
4
+
5
+
A `Value` is simply a tuple of field elements. With the plonky2 backend, a `Value` is a tuple of 4 field elements. In general, the backend will expose a constant `VALUE_SIZE`, and a `Value` will be a tuple of `VALUE_SIZE` field elements.
6
+
7
+
## Integers and booleans
8
+
9
+
The backend encoding stores integers in such a way that arithmetic operations (addition, multiplication, comparison) are inexpensive to verify in-circuit.
10
+
11
+
In the case of the Plonky2 backend, an integer $x$ is decomposed as
12
+
$$x = x_0 + x_1 \cdot 2^{32}$$
13
+
with $0 \leq x_0, x_1 < 2^{32}$ and represented as
14
+
$$\texttt{map}\ \iota\ [x_0, x_1, 0, 0],$$
15
+
where $\iota:\mathbb{N}\cup\{0\}\rightarrow\texttt{GoldilocksField}$ is the canonical projection.
16
+
17
+
On the backend, a boolean is stored as an integer, either 0 or 1; so logical operations on booleans are also inexpensive.
18
+
19
+
## Strings
20
+
21
+
The backend encoding stores strings as hashes, using a hash function that might not be zk-friendly. For this reason, string operations (substrings, accessing individual characters) are hard to verify in-circuit. The POD2 system does not provide methods for manipulating strings.
22
+
23
+
In other words: As POD2 sees it, two strings are either equal or not equal. There are no other relationships between strings.
24
+
25
+
In the case of the Plonky2 backend, a string is converted to a sequence of bytes with the byte `0x01` appended as padding, then the bytes are split into 7-byte chunks starting from the left, these chunks then being interpreted as integers in little-endian form, each of which is naturally an element of `GoldilocksField`, whence the resulting sequence may be hashed via the Poseidon hash function. Symbolically, given a string $s$, its hash is defined by
where `poseidon` is the Poseidon instance used by Plonky2, $\iota$ is as above, $\texttt{chunks}_{n}:[\texttt{u8}]\rightarrow [[\texttt{u8}]]$ is defined such that[^aux]
and $\jmath_\texttt{string->bytes}$ is the canonical mapping of a string to its UTF-8 representation.
38
+
39
+
## Compound types
40
+
41
+
The three front-end compound types (`Dictionary`, `Array`, `Set`) are all represented as Merkle roots on the backend. The details of the representation are explained on a separate [Merkle tree](./merkletree.md) page.
Copy file name to clipboardExpand all lines: book/src/custom.md
+26-2Lines changed: 26 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -10,9 +10,33 @@ At the backend level, every definition of a predicate is either a conjunction or
10
10
11
11
On the backend, custom predicates are defined in _groups_. A group can contain one or more custom predicates and their associated sub-predicates. Recursive definition is only possible within a group: the definition of a predicate in a group can only depend on previously existing predicates, itself, and other predicates in the same group.
12
12
13
-
## Custom predicates and their IDs
13
+
## Arguments of custom predicates
14
+
15
+
The definition of a custom predicate might also be called an _operation_ or _deduction rule_. It includes two (or, potentially, say, five) statement templates as conditions. The arguments to the statement templates are decomposed as (origin, key) pairs: if statements are allowed to have arity at most 4, then the statement templates in a deduction rule will have at most 8 arguments (4 origins and 4 keys). The same holds for the output statement.
16
+
17
+
Each argument (origin or key) to an statement template is either a wildcard or a literal. In the backend, the wildcard arguments will be identified as *1, *2, *3, ....
18
+
19
+
## Examples
20
+
21
+
See [examples](./customexample.md)
22
+
23
+
## Hashing and predicate IDs
24
+
25
+
Each custom predicate is defined as part of a _group_ of predicates. The definitions of all statements in the group are laid out consecutively (see [examples](./customexample.md)) and hashed. For more details, see the pages on [hashing custom statements](./customhash.md) and [custom predicates](./custompred.md).
26
+
27
+
## How to prove an application of an operation
28
+
29
+
The POD proof format is inspired by "two-column proofs" (for an example, see [Wikipedia](https://en.wikipedia.org/wiki/Mathematical_proof)). A POD contains a "tabular proof", in which each row includes a "statement" and an "operation". The "operation" is the "reason" that justifies the statement: it is everything the circuit needs as a witness to verify the statement.
30
+
31
+
For a custom statement, the "reason" includes the following witnesses and verifications:
32
+
- the definition of the statement, serialized (see [examples](./customexample.md))
33
+
- if the statement is part of a group, the definition of the full group, serialized
34
+
- verify that the hash of the definition is the statement ID
35
+
- the definition will have some number of "wildcards" (*1, *2, ...) as arguments to statement templates; a value for each wildcard must be provided as a witness (each will be either an origin ID or key)
36
+
- the circuit must substitute the claimed values for the wildcards, and the resulting statements (true statements with origins and keys) will appear as witnesses
37
+
- the circuit must verify that all the input statement templates (with origins and keys) appear in the previous statements (in higher rows of the table)
38
+
- the circuit also substitutes the claimed values for the wildcards in the output statement, and verifies that it matches the claimed output statement
14
39
15
-
A custom predicate, like a built-in predicate, is identified by a _name_ on the front end and an _identifier_ on the back end. The identifier is a cryptographic hash of the definition of the group.
16
40
17
41
18
42
[^inherit]: What to call this? One POD "inherits" from another?
Every predicate, native or custom, is identified on the backend by a predicate ID.
4
+
5
+
The native predicates are numbered with small integers, sequentially. The ID of a custom predicate is a hash of its definition; this guarantees that two different predicates cannot have the same ID (aside from the miniscule probability of a hash collision).
6
+
7
+
This document explains in some detail how the definition of a custom predicate is serialized and hashed.
8
+
9
+
Custom predicates are defined in _groups_ (also known as _batches_); see an [example](./customexample.md). The definition of a custom predicate in a group involves other predicates, which may include:
10
+
- native predicates
11
+
- previously-defined custom predicates
12
+
- other predicates in the same group.
13
+
14
+
Predicate hashing is recursive: in order to hash a group of custom predicates, we need to know IDs for all the previously-defined custom predicates it depends on.
15
+
16
+
The definition of the whole group of custom predicates is serialized (as explained below), and that serialization is hashed (using a zk-friendly hash -- in the case of the plonky2 backend, Poseidon) to give a _group ID_. Each predicate in the group is then referenced by
17
+
```
18
+
predicate_ID = (group_ID, idx)
19
+
```
20
+
(here `idx` is simply the index of the predicate in the group).
The POD2 system consists of a frontend and a backend, connected by a middleware. This page outlines some design principles for deciding which components go where.
4
+
5
+
```
6
+
user -- frontend -- middleware -- backend -- ZK circuit
7
+
```
8
+
9
+
The frontend is what we want the user to see; the backend is what we want the circuit to see.
10
+
11
+
## Circuit and proving system
12
+
13
+
The first implementation of POD2 uses Plonky2 as its proving system. In principle, a future implementation could use some other proving system. The frontend and middleware should not be aware of what proving system is in use: anything specific to the proving system belongs to the backend.
14
+
15
+
## User-facing types versus in-circuit types
16
+
17
+
The frontend type system exposes human-readable types to POD developers: strings, ints, bools, and so forth. On the backend, all types are build out of field elements. The middleware should handle the conversion.
In the frontend, this type is none other than `u64`[^i64]. In the backend, it will be appropriately embedded into the codomain of the canonical hash function.
20
22
21
-
In the case of the Plonky2 backend, this is done by decomposing such an integer $x$ as
22
-
$$x = x_0 + x_1 \cdot 2^{32}$$
23
-
with $0 \leq x_0, x_1 < 2^{32}$ and representing it as
24
-
$$\texttt{map}\ \iota\ [x_0, x_1, 0, 0],$$
25
-
where $\iota:\mathbb{N}\cup\{0\}\rightarrow\texttt{GoldilocksField}$ is the canonical projection.
26
-
23
+
## `Bool`
24
+
In the frontend, this is a simple bool. In the backend, it will have the same encoding as an `Integer``0` (for `false`) or `1` (for `true`).
27
25
28
26
## `String`
29
27
In the frontend, this type corresponds to the usual `String`. In the backend, the string will be mapped to a sequence of field elements and hashed with the hash function employed there, thus being represented by its hash.
30
28
31
-
In the case of the Plonky2 backend, the string is converted to a sequence of bytes with the byte `0x01` appended as padding, then the bytes are split into 7-byte chunks starting from the left, these chunks then being interpreted as integers in little-endian form, each of which is naturally an element of `GoldilocksField`, whence the resulting sequence may be hashed via the Poseidon hash function. Symbolically, given a string $s$, its hash is defined by
where `poseidon` is the Poseidon instance used by Plonky2, $\iota$ is as above, $\texttt{chunks}_{n}:[\texttt{u8}]\rightarrow [[\texttt{u8}]]$ is defined such that[^aux]
0 commit comments