-
Notifications
You must be signed in to change notification settings - Fork 21
feat: asm rlpUtils #846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat: asm rlpUtils #846
Conversation
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
rlputils/rlputils.zkasm
Outdated
| bytestring: | ||
| ;; check for empty bytestring: | ||
| if DATA_4 !=0 goto bytestring_non_empty_case | ||
| if DATA_1 !=0 goto exit_fail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Inverted condition for empty bytestring check
The condition if DATA_4 !=0 goto bytestring_non_empty_case appears inverted based on the documentation. Line 92 states DATA_4 = empty_byte_string = DATA_1 == 0, meaning DATA_4 = 1 when the string is empty. However, the code jumps to bytestring_non_empty_case when DATA_4 is non-zero (empty), and falls through to fail on line 102 when DATA_4 is zero (non-empty). This causes empty byte strings to be processed as non-empty and non-empty byte strings to trigger exit_fail.
rlputils/rlputils.zkasm
Outdated
| ;; case 128 <= integer <= 2 ^128 | ||
| if DATA_5 != 1 goto exit_fail ;; sanity check that rlp_prefix_required = true | ||
| DATA_8 = log256_u128(DATA_2) | ||
| DATA_6 = (RLP_PREFIX_INT_SHORT + DATA_8) * TWO_FIVE_SIX_POWER_SIXTEEN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Off-by-one error in integer RLP prefix calculation
The log256_u128 function returns a 0-indexed byte position (byte_size - 1), but DATA_8 is documented as leading_limb_byte_size. For example, log256_u128(128) returns 0, but 128 requires 1 byte. This causes the RLP prefix at line 67 to compute (0x80 + 0) instead of (0x80 + 1), producing prefix 0x80 when it should be 0x81. The same issue affects line 81-82 for two-limb integers. The assignments need + 1 to convert from byte position to byte size.
Additional Locations (1)
| var size u4 | ||
| size = log256_u128(DATA_1) | ||
| DATA_8 = size + 1 | ||
| rlp_prefix_tmp = rpad_128(RLP_PREFIX_INT_LONG + (DATA_3 * RLP_FROM_INT_TO_LIST) + size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Bytestring long prefix uses wrong byte size variable
Line 131 correctly computes DATA_8 = size + 1 to convert log256 output to byte count, but line 132 uses size instead of DATA_8 (or size + 1) in the RLP long prefix calculation. This causes the first prefix byte to be RLP_PREFIX_INT_LONG + size instead of RLP_PREFIX_INT_LONG + size + 1, resulting in an off-by-one error in the encoded prefix for long byte strings.
b125d7f to
8518042
Compare
DavePearce
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, looking good. I would test the subfunctions count_nz_128 and rpad_128 etc in isolation. Also be good to have more detailed descriptions for "top-level" functions (e.g. for count_nz_128 but not needed for count_nz_64, etc).
| @@ -0,0 +1,42 @@ | |||
| ;; Right or Left Pad an input to a given size. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have some more informative description of what it does. Since our current constraints are almost entirely uncommented, I think it would be good to start with a strong culture of commenting in zkasm.
| if remaining != 0 goto exit_shift | ||
| goto exit_no_change | ||
| exit_shift: | ||
| res = rpad_128(256 * remaining) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, using recursion here seems good to me. It means you end up with upto 16 rows for each input (though some can be reused between different inputs of course). I was wondering whether there is a decomposition approach which could work, but didn't figure it out yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think it can be done using a recursive decomposition (i.e. rather than the iterative style you have here). You split it into two pieces: (1) count number of leading zeros; (2) shift using existing approach. It might reduce the overall number of rows. But anyway, perhaps that's for another day. What you have here works well!
| ;; The interface with RLP* modules is as follow | ||
| ;; TODO, insert interface | ||
|
|
||
| ;; The RLP_UTILS module is a helper module to compute some RLP-ifications: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, what you have here seems reasonable enough to me. It starts to feel like a long function, so we could maybe break it up. But, I don't think its actually necessary (my oob implementation is longer than this!)
The only other thing I noticed is that you declared var variables in certain specific places, like this:
bytestring_prefix_required:
DATA_7 = 0
var input_size_lt56 u1
This is totally fine. BUT, it is worth noting that go-corset does not "reuse" registers at all. For example, say this:
fn f(x u8) -> (r u8) {
if x == 0 goto exit_0
var c u1
c, r = x + 1
return
exit_0:
var b u1
b,r = x - 1
return
}
Here, we end up with two columns b and c in the final constraints, when actually we could have just had one by doing this:
n f(x u8) -> (r u8) {
var cb u1
if x == 0 goto exit_0
var c u1
cb, r = x + 1
return
exit_0:
var b u1
cb, r = x - 1
return
}
I hope that, in the future, go-corset will do this kind of simplification for us (kinda like perspectives I suppose).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope that, in the future, go-corset will do this kind of simplification for us (kinda like perspectives I suppose).
That would be nice yes :)
util/byte_counting.zkasm
Outdated
| var low_useless u1 | ||
| var size_next u4 | ||
| high, low = limb | ||
| low_useless, size_next = size - 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I normally use either c (for carry) when adding or b (for borrow) when subtracting. It comes from hardware "flags registers".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see also this line low_useless, size_next = size - 8 doesn't compile because it considers all cases for size, plus requires a sign bit for the subtraction. Hence, you need enough bits to encode 31 - 8 plus the sign bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The encoding you want I think is this:
low_useless, tmp, size_next = size - 8
Where tmp is an unused u1 column (which in fact is always 0).
| return_high: | ||
| var b u1 | ||
| var size_trm u4 | ||
| b, size_trm = size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, its a lot annoying to do this kind of trimming. In the future, we could have a cast operator (since in this case you already know its < 16); and/or we could support a shorthand syntax like _, size_trm = size which just adds the needed column for us.
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Signed-off-by: F Bojarski <[email protected]>
Note
Introduce a zkasm-based
rlputilsmodule (integer, bytestring, bytes32, data_pricing), wire it into RLP TXN, add needed constants/utilities, and remove the old Lisp implementation and related lookups/tables.rlputils/rlputils.zkasmimplementingRLP_UTILS_INST_INTEGER,RLP_UTILS_INST_BYTE_STRING_PREFIX,RLP_UTILS_INST_BYTES32,RLP_UTILS_INST_DATA_PRICING.util/byte_counting.zkasm,util/byte_size.zkasm,util/padding.zkasm.util/bit_shl.zkasmcomments.Makefileto userlputils/rlputils.zkasm(Cancun/Prague/Osaka builds).rlptxn/cancun/lookups/rlptxn_into_rlputils.lispto new flatrlputilsinterface (INST,DATA_1..8).constants/evm.zkasmwith RLP opcodes/prefixes andTWO_FIVE_SIX_POWER_FIFTEEN.rlputilsLisp columns/constants/constraints/lookups andrlputils/cancun/maxlog.zkasm.reftables/*/power.lispand related lookups intowcp/power.Written by Cursor Bugbot for commit cc38b34. This will update automatically on new commits. Configure here.