Conversation
|
Some thoughts after reading the PR description (I haven't reviewed the code yet)
This would apply in generals to literal values that appear in templates. Another example would be a request that involves a statement to show that someone is over 18 years old. To do that we need to calculate a value that is 18 years in the past from now, and this value will change every time we do a request; so being able to pass this value externally would be useful. Questions
|
ed255
left a comment
There was a problem hiding this comment.
I've finished reviewing this PR and I'm marking request changes because there are significant changes to the approach of this PR that I want to discuss.
src/lang/processor.rs
Outdated
| let hex_str = hex::encode(&bytes); | ||
| let padded_hex_str = format!("{:0>64}", hex_str); | ||
| parse_hex_to_raw_value(&padded_hex_str).map(Value::from) |
There was a problem hiding this comment.
This looks over complicated. You don't need to encode the bytes into a hex string to do that.
A RawValue is 4 goldiclocks field elements, which can be represented with 64 bits each.
You just need to build each goldilock field element with 64/8 = 8 bytes from the bytes vector.
Also you need to make a decision on the endianness of the bytes vector (and please document the endianness in the grammar).
Also to simplify things I would change the grammar to force the literal_raw to be 32 bytes (or 64 hex characters).
Finally: the goldilocks field is smaller than 2^64-1. This means that some values have aliases, for example 1 and p-1 fit into 64 bits and are the same value but look different when encoded into bytes. I suggest that the parser checks that each array of bytes corresponding to a field element represents an integer that is smaller than p. The method GoldilocksField::from_canonical_u64 performs this check but only in debug mode, not in release. We should check n < Self::GoldilocksField and return an error if that's not the case.
b070e73 to
6757d8d
Compare
src/lang/processor.rs
Outdated
| let name_only = | ||
| full_name | ||
| .strip_prefix("?") | ||
| .ok_or_else(|| ProcessorError::Semantic { | ||
| message: format!("Invalid wildcard format for BuilderArg: {}", full_name), | ||
| span: Some(get_span(arg_content_pair)), | ||
| })?; | ||
| if name_only.is_empty() { | ||
| return Err(ProcessorError::Semantic { | ||
| message: "Wildcard name cannot be empty after '?' for BuilderArg".to_string(), | ||
| span: Some(get_span(arg_content_pair)), | ||
| }); | ||
| } |
There was a problem hiding this comment.
Here you're doing two checks:
Make sure there's a ? prefix and making sure the identifier is not empty. But the grammar already enforces this!
So by the definition of the grammar, these two errors will never occur.
More over, if they were to happen, it would be a bug in the code because either pest is not returning a parsing error, or the grammar is not well defined. For those cases I strongly suggest that we panic instead, and we can do this cleanly with expect.
Please take a look at this commit where I've made some changes to simplify this following this idea:
b806322
I think there are a few more places where this changes can be applied. The only place where we need to do some checking and do proper error reporting is when handling native predicates because their argument types are not validated by the grammar.
But any time we find a rule defined in the grammar, we know that the tokens defined by the grammar will be there. For example, every time we find the Rule::anchored_key we know for sure that we'll encounter 2 Pairs. The first one will either be self_keyword or wildcard; and the second one will either be wildcard or literal_string. Any other case is not expected to happen on any kind of user input.
| Ok(result) | ||
| } | ||
|
|
||
| fn parse_hex_str_to_raw_value(hex_str: &str) -> Result<middleware::RawValue, ProcessorError> { |
There was a problem hiding this comment.
I just realized we already had this functionality here:
pod2/src/middleware/basetypes.rs
Lines 209 to 218 in d3fef83
It's used to deserialize a Hash from a hex string, and internally it creates the 4 field elements, which is the same we need to get a RawValue. Then you can do RawValue::from(h: Hash).
The encoding is different though, and it doesn't use a 0x prefix.
The encoding uses little endian, and taking a little endian hex encoded value and adding a 0x prefix is confusing from my point of view.
ed255
left a comment
There was a problem hiding this comment.
LGTM! Thanks for addressing all the requests!
This adds a simple DSL for specifying custom predicates and proof requests. It is similar to the language we have been using informally in our docs. The code includes tests, and should be capable of supporting all of our current predicates, with the exception of
SetContains/SetNotContains. A language reference is included below.Example
A proof request for ZuKYC:
Ethdos custom predicates:
Code structure
The grammar is implemented using Pest, which is able to derive a basic parser implementation from a grammar file. We have tests for this in
parser.rs. The parsed input is then turned into anASTby the code inast.rs. This closely mirrors the internal POD2 data structures, but may diverge in some ways (for example, by allowing certain pieces of syntactic sugar). Finally the AST is turned into POD2 internal data structures inprocessor.rs. There are end-to-end tests inmod.rs.Outstanding issues
Currently
SetContainsandSetNotContainsare unsupported, as de-sugaring these to lower-level templates might require some changes inMainPodBuilderor elsewhere. See #224Next steps
If we're happy with the grammar here, I will update the code examples in the book to use this syntax.
Possible future features
A few features have been considered and are currently out-of-scope, but could be included in future:
Multiple custom predicate batches
For a given Podlog document, all custom predicates will be included in the same batch. We might want to give the user more control by allowing them to specify separate batches, e.g.:
In this case, the call to
eth_friendwould be made via aCustomPredicateRefrather than aBatchSelf.Import syntax
It might be helpful to be able to import custom predicates from different files. For example:
Given that each custom predicate has a unique hash, we could also imagine a content-addressable store of custom predicates:
Language reference
Language reference
Podlog Language Syntax
Podlog is a declarative language used for defining custom proof logic and requests within the POD2 ecosystem. It allows for the specification of statement templates that can be evaluated by a prover.
Table of Contents
Overview
A Podlog document consists of zero or more top-level definitions, which can be either custom predicate definitions or a single request definition. The language is designed to be human-readable while providing precise instructions for the POD2 prover.
Comments
Single-line comments begin with
//. All text from//to the end of the line is ignored.Identifiers
Identifiers are used for naming custom predicates and for the non-prefixed part of variable names.
_).private,REQUEST,AND,OR, and native predicate names (e.g.,Equal,ValueOf) cannot be used as predicate identifiers.privatecannot be used as an identifier (e.g., for variable names).Examples:
my_predicate_internal_checkisValidUser1Invalid:
123starts_with_digithas-hyphenprivate(as a predicate or variable name component)Equal(as a predicate name)Variables
Variables represent placeholders for values that the prover will attempt to bind.
?).?must adhere to the rules for Identifiers.Examples:
?UserPod?publicKey?_tempValueLiterals
Literals represent fixed values.
Integer
A sequence of digits, optionally prefixed with a
-for negative numbers.Maps to
middleware::Value::Int.Examples:
123,-45,0String
Characters enclosed in double quotes (
").Standard JSON-like escaping rules apply for special characters:
\"for double quote\\for backslash/for forward slash (optional,/can be unescaped)\bfor backspace\ffor form feed\nfor newline\rfor carriage return\tfor tab\uXXXXfor a Unicode character specified by 4 hexadecimal digits.Maps to
middleware::Value::String.Examples:
"hello","user:name","escaped \\\" quote","\u0041"Raw (Bytes)
A hexadecimal string prefixed with
0x. Must contain an even number of hex digits (0-9, a-f, A-F) and at least two digits.Maps to
middleware::Value::Raw.Examples:
0xdeadbeef,0x00,0x0102030405060708090a0b0c0d0e0f10Invalid:
0xabc(odd length),0x(empty),0xGG(invalid hex chars)Boolean
The keywords
trueorfalse.Maps to
middleware::Value::Bool.Examples:
true,falseArray
A comma-separated list of literals enclosed in square brackets (
[]). Elements can be of mixed types.Maps to
middleware::Value::Array.Examples:
[],[1, "two", true],[ [10, 20], #["a", "b"] ]Set
A comma-separated list of literals enclosed in square brackets prefixed with a hash (
#[]). Elements can be of mixed types. The processor will validate uniqueness if required by the underlying middleware.Maps to
middleware::Value::Set.Examples:
#[],#[1, 2, "apple"],#[ 0x01, 0x02 ]Dictionary
A comma-separated list of
string_key: literal_valuepairs enclosed in curly braces ({}). Keys must be string literals.Maps to
middleware::Value::Dictionary.Examples:
{},{ "name": "Alice", "age": 30 },{ "data": [1,2,3], "metadata": { "version": "1.0" } }Anchored Keys
Anchored keys are used to refer to values within a POD (represented by a variable).
Syntax:
?PodVariable["key_literal"]or?PodVariable[?KeyVariable]?PodVariable: A Variable representing a POD."key_literal": A String Literal representing a fixed key.?KeyVariable: A Variable representing a key to be resolved at runtime.Examples:
?UserPod["name"]?DataPod[?fieldName]Top-Level Definitions
A Podlog document can contain multiple custom predicate definitions and at most one request definition.
Custom Predicate Definition
Defines a reusable piece of logic.
Syntax:
predicate_name(public_arg1, public_arg2, ..., private: private_arg1, private_arg2, ...) = CONJUNCTION_TYPE( statement1 statement2 ... )predicate_name: An Identifier for the predicate., private:and then a comma-separated list of private argument names (identifiers).pred(private: PrivA) = ...), or no arguments (e.g.,pred() = ...).?public_arg1) within the predicate's scope.CONJUNCTION_TYPE: EitherANDorOR.AND: All statements in the body must be true.OR: At least one statement in the body must be true.statement_list: One or more Statements enclosed in parentheses.Examples:
Request Definition
Defines the main set of statements to be proven by the prover. There can be at most one
REQUESTblock in a document.Syntax:
REQUEST( statement1 statement2 ... )statement_list: Optional. If present, one or more Statements enclosed in parentheses. Variables used here are implicitly defined and their bindings are sought by the prover.Example:
Statements
Statements are the core assertions or operations within custom predicate definitions and request blocks.
Native Predicate Calls
These calls invoke built-in operations provided by the POD2 middleware. Arguments to native predicates are typically Anchored Keys, except for
ValueOfwhich takes an Anchored Key and a Literal.Core Predicates (Directly map to middleware operations):
ValueOf(?PodVar["key"], LiteralValue)?PodVar["key"]is equal toLiteralValue.ValueOf(?Config["port"], 8080)Equal(?PodVar1["key1"], ?PodVar2["key2"])Equal(?User["id"], ?Order["userId"])NotEqual(?PodVar1["key1"], ?PodVar2["key2"])NotEqual(?Account["status"], "suspended")Lt(?PodVar1["key1"], ?PodVar2["key2"])value(?PodVar1["key1"]) < value(?PodVar2["key2"]).Lt(?Event["timestamp"], ?CutoffTime["value"])LtEq(?PodVar1["key1"], ?PodVar2["key2"])value(?PodVar1["key1"]) <= value(?PodVar2["key2"]).LtEq(?User["age"], ?MaxAge["limit"])Contains(?PodVar[?Container], ?PodVar2[?Key], ?PodVar2[?Value])?Valueat a location specified by?Key. In most cases, users should useDictContains,SetContains, orArrayContainsbased on the expected type of the container.Contains(?Data["record"], ?User["nameKey"], ?User["expectedName"])NotContains(?PodVar[?Container], ?PodVar2[?Key])?Keyis not present in?Container.NotContains(?Banlist["users"], ?User["id"])SumOf(?Target["keyT"], ?Source1["keyS1"], ?Source2["keyS2"])?Target["keyT"]equals the sum of the values at?Source1["keyS1"]and?Source2["keyS2"].SumOf(?Totals["all"], ?SubTotals["A"], ?SubTotals["B"])ProductOf(?Target["keyT"], ?Source1["keyS1"], ?Source2["keyS2"])?Target["keyT"]equals the product of the values at?Source1["keyS1"]and?Source2["keyS2"].ProductOf(?Area["rect"], ?Dimensions["width"], ?Dimensions["height"])MaxOf(?Target["keyT"], ?Source1["keyS1"], ?Source2["keyS2"])?Target["keyT"]is equal to either of the values at?Source1["keyS1"]or?Source2["keyS2"], whichever is higher.MaxOf(?Score["high"], ?Player1["score"], ?Player2["score"])HashOf(?Target["keyT"], ?Source1["keyS1"], ?Source2["keyS2"])value(?Target["keyT"])is the hash ofvalue(?Source1["keyS1"])andvalue(?Source2["keyS2"]).HashOf(?User["computedId"], ?User["email"], ?User["salt"])Syntactic Sugar Predicates (Translated by the processor):
These provide more convenient syntax for common operations and are translated into the core middleware predicates.
Gt(?PodVar1["key1"], ?PodVar2["key2"])value(?PodVar1["key1"]) > value(?PodVar2["key2"]).Lt(?PodVar2["key2"], ?PodVar1["key1"])(arguments are swapped).Gt(?User["score"], 1000)GtEq(?PodVar1["key1"], ?PodVar2["key2"])value(?PodVar1["key1"]) >= value(?PodVar2["key2"]).LtEq(?PodVar2["key2"], ?PodVar1["key1"])(arguments are swapped).GtEq(?Balance["amount"], ?Minimum["required"])DictContains(?DictVar["dict_key"], ?KeyToFind["key_name"], ?ExpectedValue["val_name"])value(?DictVar["dict_key"][?KeyToFind["key_name"]]) == value(?ExpectedValue["val_name"]).Contains(?DictVar["dict_key"], ?KeyToFind["key_name"], ?ExpectedValue["val_name"])(or similar, depending on exact mapping to generic Contains).DictContains(?UserPod["attributes"], ?_KeyNamePod["name"], ?_ExpectedValuePod["admin"])?_KeyNamePod["name"]might resolve to a literal key like "role" and?_ExpectedValuePod["admin"]to "admin".DictNotContains(?DictVar["dict_key"], ?KeyToFind["key_name"])?KeyToFind["key_name"]is not a key invalue(?DictVar["dict_key"]).NotContains(?DictVar["dict_key"], ?KeyToFind["key_name"]).DictNotContains(?UserPod["flags"], ?_FlagNamePod["is_banned"])ArrayContains(?ArrayVar["array_key"], ?IndexVar["idx_name"], ?ExpectedValue["val_name"])value(?ArrayVar["array_key"][?IndexVar["idx_name"]]) == value(?ExpectedValue["val_name"]).Contains(?ArrayVar["array_key"], ?IndexVar["idx_name"], ?ExpectedValue["val_name"]).ArrayContains(?LogPod["entries"], ?_IndexPod["target_idx"], ?_ValuePod["error_code_X"])(Note:
SetContainsandSetNotContainsare currently missing, pending an implementation for de-sugaring.)Custom Predicate Calls
Calls a previously defined custom predicate.
Syntax:
predicate_name(arg1, arg2, ...)predicate_name: The Identifier of the custom predicate to call.arg_list: A comma-separated list of arguments. Arguments can be Variables or Literals. The number and type of arguments must match the public argument definition of the called predicate.Example: