always replace SELF when copying statements #345

Merged
dgulotta merged 3 commits into main from self-id on Jul 22, 2025

@dgulotta
Collaborator

Closes #342.

@robknight
Collaborator

robknight commented Jul 17, 2025

Are there any security implications to this? If I send someone a POD Request containing this magic value as a RawValue, will it get replaced with the POD ID? Or does this only apply when copying statements from an existing POD - so the statement starts out referring to SELF, but then refers to the POD ID it was copied from when it was copied? That feels pretty spooky to me.

I think this means that clients, including the solver, would need to become aware of the effect of this constant. As it stands, I think this breaks the use-case of "run the request through the solver again, but with the response as the sole input", because now the request might stipulate a different value to the one returned in the response.

I'm still pretty skeptical of the use-case that motivates this, and of adding magic. In isolation each bit of magic seems reasonable, until there's enough magic that it becomes hard to reason about what's going on.

@dgulotta
Collaborator Author

dgulotta commented Jul 17, 2025

I don't think of #342 as being tied to a specific use case of self-referential pods. A custom statement could take a pod ID, and be agnostic as to whether that ID is SELF. Currently, the system will produce an incorrect statement if the ID happens to be SELF.

@robknight
Collaborator

robknight commented Jul 17, 2025

I don't think it's really incorrect - if I make a POD Request which says REQUEST(some_custom_pred(1)) then the resulting statement will contain the number 1. This is what I would expect. Having a resulting statement which contains a different value to the one that was in the request is what would be surprising to me.

@robknight
Collaborator

Having thought a bit more about it: would it be possible to go back to the earlier solution of having SELF as a special-case wildcard, rather than a special-case literal?

Representing placeholder values as wildcards just seems much cleaner, and "wildcards in requests become literal values in responses" is an existing pattern, whereas "literal values can change to become other literal values" isn't.

use plonky2::{
field::types::{Field, PrimeField64},
field::{
goldilocks_field::GoldilocksField,
Collaborator

We rename this type as F here

pub type F = GoldilocksField;

And then re-export it in the middleware here

pub use crate::backends::plonky2::basetypes::*;

To be consistent, could you remove this import and rewrite the SELF_ID_HASH as [F(0x5), F(0xe), ...?

  let is_self = builder.is_equal_flattenable(&self_value, &first);
- let normalize = builder.and(is_ak, is_self);
- let first_normalized = builder.select_flattenable(params, normalize, self_id, &first);
+ let first_normalized = builder.select_flattenable(params, is_self, self_id, &first);
Collaborator

You should also update the documentation and implementation of the native version of this function at

pod2/src/middleware/mod.rs

Lines 769 to 783 in 143a8c9

/// Replace references to SELF by `self_id` in anchored keys of the statement.
pub fn normalize_statement(statement: &Statement, self_id: PodId) -> Statement {
    let predicate = statement.predicate();
    let args = statement
        .args()
        .iter()
        .map(|sa| match &sa {
            StatementArg::Key(AnchoredKey { pod_id, key }) if *pod_id == SELF => {
                StatementArg::Key(AnchoredKey::new(self_id, key.clone()))
            }
            _ => sa.clone(),
        })
        .collect();
    Statement::from_args(predicate, args).expect("statement was valid before normalization")
}
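
A minimal sketch of what an "always replace" native normalization could look like, to mirror the circuit change above. It assumes simplified stand-in types: `PodId`, `SELF`, `StatementArg` and `normalize_args` here are hypothetical reductions for illustration, not the real pod2 middleware definitions, and the value chosen for `SELF` is a placeholder rather than the actual constant.

```rust
// Simplified stand-ins for the middleware types; the real ones differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PodId(pub [u64; 4]);

/// Placeholder for the SELF marker (assumption: not the real constant).
pub const SELF: PodId = PodId([1, 0, 0, 0]);

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum StatementArg {
    None,
    /// A literal value, modeled as four raw field elements.
    Literal([u64; 4]),
    /// An anchored key: (pod id, key name).
    Key(PodId, String),
}

/// Replace every occurrence of SELF by `self_id`, whether it appears as the
/// pod id of an anchored key or as a literal value.
pub fn normalize_args(args: &[StatementArg], self_id: PodId) -> Vec<StatementArg> {
    args.iter()
        .map(|arg| match arg {
            StatementArg::Key(pod_id, key) if *pod_id == SELF => {
                StatementArg::Key(self_id, key.clone())
            }
            StatementArg::Literal(raw) if *raw == SELF.0 => StatementArg::Literal(self_id.0),
            other => other.clone(),
        })
        .collect()
}
```

The only difference from the anchored-key-only version is the extra `Literal` arm; every other argument is cloned unchanged.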

@ed255
Collaborator

ed255 commented Jul 17, 2025

> Are there any security implications to this? If I send someone a POD Request containing this magic value as a RawValue, will it get replaced with the POD ID? Or does this only apply when copying statements from an existing POD - so the statement starts out referring to SELF, but then refers to the POD ID it was copied from when it was copied? That feels pretty spooky to me.

If someone sends a POD Request that contains a statement with a literal RawValue equal to this new magic value, you'll create a pod that exposes a public statement in which that magic value is replaced by the pod_id.
But this replacement happens in a controlled manner. The MainPod that generates the statement lays it out with the magic value (no replacement) and checks all the operations against the magic value.
Then the API method that returns the pod's statements replaces the magic value with the pod's own ID.

At the same time, when you use that pod as input to another pod, the circuit will apply this replacement before operating on those statements.

> I think this means that clients, including the solver, would need to become aware of the effect of this constant. As it stands, I think this breaks the use-case of "run the request through the solver again, but with the response as the sole input", because now the request might stipulate a different value to the one returned in the response.

Could you elaborate on that?

One way to think about this is that the SELF_ID_HASH is a special value that means: put the pod_id here.

> I'm still pretty skeptical of the use-case that motivates this, and of adding magic. In isolation each bit of magic seems reasonable, until there's enough magic that it becomes hard to reason about what's going on.

I don't disagree about this being a little bit of magic. I just hope we discuss and figure out if this breaks anything that can't be solved nicely!

To me this PR is fixing a bug described in #342

The bug is the following. Imagine the following custom predicates

puzzle_ok(pod_id) = AND(
  ProductOf(123456789, ?pod_id["a"], ?pod_id["b"])
)

know_sk(pk, pod_id) = AND(
  PublicKeyOf(?pk, ?pod_id["sk"])
)

my_pred(pk, private: pod_id) = AND(
  puzzle_ok(?pod_id)
  know_sk(?pk, ?pod_id)
)

The intention of those is that if I can create a my_pred statement, then a single pod with id=pod_id exists such that it fulfills both predicates puzzle_ok and know_sk.

But in the current implementation I'm able to generate the statements puzzle_ok and know_sk in 2 different pods (with different pod_ids) and then combine them to generate my_pred in a third pod. This works because every pod_id that appears as an argument to the custom predicates stays at 1, since I originally built them using SELF.

After writing this example I realize it conveys the idea, but it's weird. But I believe that once we have the PublicKeyOf statement, the Entries of a MainPod have new semantics. Because proving the knowledge of a public key in a MainPod can be seen as a way of signing with that key all the Entries of the pod. And then it's a bug if I can say that several statements are proved together in the same pod when in reality those statements have been proven independently.
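
The scenario above can be modeled in a few lines. This is a toy sketch under stated assumptions: `CustomStatement`, `copy_statement` and `my_pred_holds` are hypothetical names, a pod id is reduced to a single integer, and `SELF` is a placeholder value. It only illustrates why leaving SELF untouched on copy lets statements from different pods satisfy a predicate that is meant to tie them to one pod.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PodId(pub u64);

/// Placeholder for the SELF marker (assumption, not the real constant).
pub const SELF: PodId = PodId(1);

/// A custom-predicate statement reduced to its name and its pod id argument.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct CustomStatement {
    pub predicate: &'static str,
    pub pod_id: PodId,
}

/// Copy a statement out of the pod `source`. When `replace_self` is true,
/// a SELF argument is rewritten to the id of the pod it was copied from.
pub fn copy_statement(st: &CustomStatement, source: PodId, replace_self: bool) -> CustomStatement {
    let pod_id = if replace_self && st.pod_id == SELF {
        source
    } else {
        st.pod_id
    };
    CustomStatement { predicate: st.predicate, pod_id }
}

/// my_pred holds only if puzzle_ok and know_sk refer to the same pod id.
pub fn my_pred_holds(puzzle_ok: &CustomStatement, know_sk: &CustomStatement) -> bool {
    puzzle_ok.predicate == "puzzle_ok"
        && know_sk.predicate == "know_sk"
        && puzzle_ok.pod_id == know_sk.pod_id
}
```

With `replace_self = false`, statements copied from two different pods both still carry SELF, so `my_pred_holds` accepts them. With `replace_self = true`, they carry two distinct ids and the check fails unless both came from the same pod.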

@ed255
Collaborator

ed255 commented Jul 17, 2025

> Having thought a bit more about it: would it be possible to go back to the earlier solution of having SELF as a special-case wildcard, rather than a special-case literal?

I think that implies some kind of type checking in-circuit. You mean that a value used as pod_id to be placed as a wildcard for the pod_id of an anchored key is different than a value to be placed as a wildcard in a value slot, right?

Currently the argument to the statement doesn't distinguish those cases. We'd need to add some typing information to do that (that's the solution 1 in #342 (comment)).

Basically in the StatementArg we'd have None, Literal(Value), PodId(PodId), Key(AnchoredKey).
Each argument is represented as 2 x 4 field elements. We'd need to distinguish the case PodId(PodId) from the rest and only apply the replacement in such a case.

Is this what you mean?

The current implementation is simple because we don't distinguish Literal(Value) and PodId(PodId). Right now I don't know how big a change that would be.
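
A rough sketch of the typed alternative described above, assuming simplified stand-in types. The enum mirrors the four cases named in the comment (`None`, `Literal`, `PodId`, `Key`), but the field layouts, the `SELF` placeholder value and the `normalize_typed` helper are hypothetical. Anchored keys are still normalized here, as `normalize_statement` already does.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PodId(pub [u64; 4]);

/// Placeholder for the SELF marker (assumption, not the real constant).
pub const SELF: PodId = PodId([1, 0, 0, 0]);

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum StatementArg {
    None,
    /// A plain value: never rewritten, even if its bits equal SELF.
    Literal([u64; 4]),
    /// A value explicitly typed as a pod id: SELF is rewritten here.
    PodId(PodId),
    /// An anchored key: (pod id, key name).
    Key(PodId, String),
}

/// Replace SELF only where the argument is typed as a pod id.
pub fn normalize_typed(arg: &StatementArg, self_id: PodId) -> StatementArg {
    match arg {
        StatementArg::PodId(id) if *id == SELF => StatementArg::PodId(self_id),
        StatementArg::Key(id, key) if *id == SELF => StatementArg::Key(self_id, key.clone()),
        other => other.clone(),
    }
}
```

The trade-off raised in the comment is visible here: the `Literal` case stays stable, at the cost of carrying a type tag that the in-circuit representation would also have to encode and check.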

@robknight
Collaborator

robknight commented Jul 17, 2025

> I think this means that clients, including the solver, would need to become aware of the effect of this constant. As it stands, I think this breaks the use-case of "run the request through the solver again, but with the response as the sole input", because now the request might stipulate a different value to the one returned in the response.

> Could you elaborate on that?

If I write REQUEST(some_pred(SOME_LITERAL)) and send this to someone, then I want to be able to validate the response by running the same request locally, with the response POD as the sole input. But if the response contains a different literal, then it won't work.

We already have a system for dealing with prover-supplied values, which is wildcards. REQUEST(some_pred(?some_wildcard)) can come back with a prover-supplied value in the some_pred statement without causing a problem. This is a really neat division that is easy to reason about - literals are literals and never change, and wildcards are always "replaced" by literals during proving.
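
The literal/wildcard division can be illustrated with a toy matcher. This is a sketch under stated assumptions, not the actual solver: `TemplateArg` and `match_request` are hypothetical names, and values are reduced to plain integers.

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum TemplateArg {
    /// Must appear unchanged in the response.
    Literal(u64),
    /// Filled in by the prover; the same name must always bind the same value.
    Wildcard(&'static str),
}

/// Match a request template against the concrete arguments of a response
/// statement. Returns the wildcard bindings, or None if the response does
/// not satisfy the request.
pub fn match_request(
    template: &[TemplateArg],
    response: &[u64],
) -> Option<Vec<(&'static str, u64)>> {
    if template.len() != response.len() {
        return None;
    }
    let mut bindings: Vec<(&'static str, u64)> = Vec::new();
    for (t, &value) in template.iter().zip(response) {
        match t {
            TemplateArg::Literal(expected) if *expected != value => return None,
            TemplateArg::Literal(_) => {}
            TemplateArg::Wildcard(name) => {
                // Look up any earlier binding for this wildcard name.
                let existing = bindings.iter().find(|(n, _)| n == name).map(|(_, v)| *v);
                match existing {
                    Some(bound) if bound != value => return None,
                    Some(_) => {}
                    None => bindings.push((*name, value)),
                }
            }
        }
    }
    Some(bindings)
}
```

If the request contains `Literal(1)` standing for the magic value and the response comes back with the pod id (say 42) in that position, the match fails. The same request written with a `Wildcard` binds 42 and succeeds, which is the asymmetry the comment points at.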

> After writing this example I realize it conveys the idea, but it's weird. But I believe that once we have the PublicKeyOf statement, the Entries of a MainPod have new semantics. Because proving the knowledge of a public key in a MainPod can be seen as a way of signing with that key all the Entries of the pod. And then it's a bug if I can say that several statements are proved together in the same pod when in reality those statements have been proven independently.

This seems wrong to me. PublicKeyOf is not equivalent to a signature, for the reason that it doesn't say anything about the data that it "signs". You can freely copy the PublicKeyOf statement to any other MainPod, unless you only ever use it as a private statement. Aside from this, I'm not sure why we would want to do this.

We already have a system for signing things - SignedPod! If you want to make a signature over some data, you can create a SignedPod to do it! If you want to sign a message saying something like "I endorse the contents of this MainPod", then you can make a SignedPod with an entry whose value is the hash of the MainPod. This also has the advantage of being much cheaper than making a new MainPod if you just want to prove that you control the keypair of a public key mentioned in the MainPod whilst also, say, signing a nonce to prevent credential reuse.

> Having thought a bit more about it: would it be possible to go back to the earlier solution of having SELF as a special-case wildcard, rather than a special-case literal?

> I think that implies some kind of type checking in-circuit. You mean that a value used as pod_id to be placed as a wildcard for the pod_id of an anchored key is different than a value to be placed as a wildcard in a value slot right?

I'm just thinking of the special SELF wildcard, which is what we had before the literals-in-statements change. That wildcard can become the real POD ID during proving.

@artwyman
Collaborator

I think I'd need to have a deeper in-person conversation to fully understand the issue under discussion here, so take my thoughts as only loosely-held opinions. But I've been following along and have some knee-jerk reactions.

  1. Replacing a magic literal does feel risky to me, though I don't have a specific problematic case in mind. If podlang already has a notion of wildcards which will be filled in by the prover that seems like a better conceptual fit here. At least enough to be worth exploring.

  2. I do think that it's useful for the prover of a MainPOD to have some way to identify themselves and "attest" to the contents of that MainPOD, without signing another SignedPOD to do it. That's specifically so that copied statements don't have the same value as the original. This is something we can discuss at the use case level, but I think it's an important option when taking PODs from "statement of fact" to "statement of intent".

PublicKeyOf is one way to get there. Rob and I talked through some ways to get there using the RSAPOD or a known secret entry in some other POD. I think the construction right now requires a custom predicate in order to ensure the known value stays private, while confirming it came from the same MainPOD rather than being copied from another one. A part of me wants this concept of "prover identity" to be more of a first-class notion in the language.

@ed255
Collaborator

ed255 commented Jul 18, 2025

> I think this means that clients, including the solver, would need to become aware of the effect of this constant. As it stands, I think this breaks the use-case of "run the request through the solver again, but with the response as the sole input", because now the request might stipulate a different value to the one returned in the response.

> Could you elaborate on that?

> If I write REQUEST(some_pred(SOME_LITERAL)) and send this to someone, then I want to be able to validate the response by running the same request locally, with the response POD as the sole input. But if the response contains a different literal, then it won't work.

I see, I understand that now.

This may be a bit of a stretch, but currently if you do REQUEST(Equal(SELF["foo"], bar)) you'll encounter the same issue, as SELF will be replaced by the pod_id value. Do you see this as problematic?

Technically you could simulate the environment of the prover (where you still have SELF instead of the pod_id) if you use pod.pub_self_statements() instead of pod.pub_statements. So perhaps there's no technical blocker here, but more a design that is confusing / not intuitive?
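
A toy model of the two views mentioned above. The method names `pub_self_statements` and `pub_statements` come from the comment; everything else (`SketchPod`, the reduced `Statement`, the `SELF` placeholder value) is an assumption made for illustration and does not reflect the real signatures.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PodId(pub u64);

/// Placeholder for the SELF marker (assumption, not the real constant).
pub const SELF: PodId = PodId(1);

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Statement {
    pub predicate: &'static str,
    pub pod_id: PodId,
}

/// Hypothetical pod holding its id and its public statements as the prover
/// laid them out, i.e. still referring to SELF.
pub struct SketchPod {
    id: PodId,
    statements: Vec<Statement>,
}

impl SketchPod {
    pub fn new(id: PodId, statements: Vec<Statement>) -> Self {
        SketchPod { id, statements }
    }

    /// Prover-side view: statements exactly as proved, SELF left in place.
    pub fn pub_self_statements(&self) -> Vec<Statement> {
        self.statements.clone()
    }

    /// External view: SELF replaced by this pod's id.
    pub fn pub_statements(&self) -> Vec<Statement> {
        self.statements
            .iter()
            .map(|st| Statement {
                predicate: st.predicate,
                pod_id: if st.pod_id == SELF { self.id } else { st.pod_id },
            })
            .collect()
    }
}
```

Under this model, a verifier who wants to re-run the original request would compare against the prover-side view, while anyone combining the pod with others would use the external view.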

> We already have a system for dealing with prover-supplied values, which is wildcards. REQUEST(some_pred(?some_wildcard)) can come back with a prover-supplied value in the some_pred statement without causing a problem. This is a really neat division that is easy to reason about - literals are literals and never change, and wildcards are always "replaced" by literals during proving.

I don't see the SELF as competing against wildcards. I see the SELF as a way to internally refer to the pod_id which you don't know yet. And in order for pod_ids to be unique, this SELF must become unique after it appears in conjunction with statements from other pods (hence the replacement).

> After writing this example I realize it conveys the idea, but it's weird. But I believe that once we have the PublicKeyOf statement, the Entries of a MainPod have new semantics. Because proving the knowledge of a public key in a MainPod can be seen as a way of signing with that key all the Entries of the pod. And then it's a bug if I can say that several statements are proved together in the same pod when in reality those statements have been proven independently.

> This seems wrong to me. PublicKeyOf is not equivalent to a signature, for the reason that it doesn't say anything about the data that it "signs". You can freely copy the PublicKeyOf statement to any other MainPod, unless you only ever use it as a private statement. Aside from this, I'm not sure why we would want to do this.

> We already have a system for signing things - SignedPod! If you want to make a signature over some data, you can create a SignedPod to do it! If you want to sign a message saying something like "I endorse the contents of this MainPod", then you can make a SignedPod with an entry whose value is the hash of the MainPod. This also has the advantage of being much cheaper than making a new MainPod if you just want to prove that you control the keypair of a public key mentioned in the MainPod whilst also, say, signing a nonce to prevent credential reuse.

I understand what you say. I also wonder what use-cases PublicKeyOf uncovers that can't be realized with SignedPods.
But I disagree with saying that "PublicKeyOf is not equivalent to a signature". I think there's a way to use PublicKeyOf such that the only way to make a MainPod is by knowing the secret key corresponding to a public key, without revealing the secret key. Whether this construction is useful or not, I'm not sure, because we already have SignedPods. I would prefer moving the discussion about the usefulness of PublicKeyOf to another thread if possible. I used it in my example for simplicity. Something similar could be achieved with an introduction pod that derives a public key, or a statement that shows knowledge of a hash pre-image without revealing it. And of course this needs to be properly constructed such that the knowledge proof can't be isolated and reused.

> Having thought a bit more about it: would it be possible to go back to the earlier solution of having SELF as a special-case wildcard, rather than a special-case literal?

> I think that implies some kind of type checking in-circuit. You mean that a value used as pod_id to be placed as a wildcard for the pod_id of an anchored key is different than a value to be placed as a wildcard in a value slot right?

> I'm just thinking of the special SELF wildcard, which is what we had before the literals-in-statements change. That wildcard can become the real POD ID during proving.

Ah, you mean when an anchored key in a statement template argument was something like (SelfOrWildcard, KeyOrWildcard)?

Collaborator

@ed255 ed255 left a comment

LGTM!

@dgulotta dgulotta merged commit 89dfc4e into main Jul 22, 2025
6 checks passed

Development

Successfully merging this pull request may close these issues.

Pod IDs in custom statements

4 participants