-
Notifications
You must be signed in to change notification settings - Fork 473
Add the transaction-level account index of erroring programs to TransactionError::InstructionError
#6083
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
If this PR represents a change to the public RPC API:
Thank you for keeping the RPC clients in sync with the server API @steveluscher. |
@@ -645,6 +645,106 @@ zstd = "0.13.3" | |||
opt-level = 3 | |||
|
|||
[patch.crates-io] | |||
# The following entries are auto-generated by /bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ignore this. This will disappear when anza-xyz/solana-sdk#74 is released.
let invalid_instruction_data_error = TransactionError::InstructionError( | ||
index, | ||
InstructionError::InvalidInstructionData, | ||
Some(instruction.program_id_index), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the SVMInstruction
already knows the transaction-wide index of its program, but please double check me on this. cc/ @buffalojoec
macro_rules! test { | ||
($instructions:expr, $expected_result:expr $(,)?) => { | ||
for feature_set in [FeatureSet::default(), FeatureSet::all_enabled()] { | ||
test!($instructions, $expected_result, &feature_set); | ||
} | ||
}; | ||
($instructions:expr, $expected_result:expr, $feature_set:expr $(,)?) => { | ||
__test_inner!($instructions, $feature_set, |result| { | ||
assert_eq!(result, $expected_result); | ||
}); | ||
}; | ||
($instructions:expr, $expected_result:pat $(,)?) => { | ||
for feature_set in [FeatureSet::default(), FeatureSet::all_enabled()] { | ||
test!($instructions, $expected_result, &feature_set); | ||
} | ||
}; | ||
($instructions:expr, $expected_result:pat if $guard:expr $(,)?) => { | ||
for feature_set in [FeatureSet::default(), FeatureSet::all_enabled()] { | ||
test!($instructions, $expected_result if $guard, &feature_set); | ||
} | ||
}; | ||
($instructions:expr, $expected_result:pat, $feature_set:expr $(,)?) => { | ||
__test_inner!($instructions, $feature_set, |result| { | ||
assert_matches!(result, $expected_result); | ||
}); | ||
}; | ||
($instructions:expr, $expected_result:pat if $guard:expr, $feature_set:expr $(,)?) => { | ||
__test_inner!($instructions, $feature_set, |result| { | ||
assert_matches!(result, $expected_result if $guard); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just me expanding these macros to support patterns.
In general, throughout these tests, you'll see me use assert_matches
and a pattern for the program account index because:
- It ensures that the tests will not break if the index of the program changes
- The absolute index of the responsible program is tested in
transaction_processor.rs
so we don't need to keep testing it over and over everywhere else.
@@ -129,6 +129,7 @@ impl RpcSender for MockSender { | |||
Err(TransactionError::InstructionError( | |||
0, | |||
InstructionError::UninitializedAccount, | |||
Some(42), // Mock responsible program account index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
svm/src/message_processor.rs
Outdated
let transaction_context: &TransactionContext = invoke_context.transaction_context; | ||
let responsible_program_account_index = transaction_context | ||
// By definition the last instruction (outer or inner) in the trace before the trace | ||
// stopped being appended to is the one that encountered an error. | ||
.get_instruction_trace_length() | ||
.checked_sub(1) | ||
.and_then(|index_in_trace| { | ||
transaction_context | ||
.get_instruction_context_at_index_in_trace(index_in_trace) | ||
.ok() | ||
}) | ||
// The last program address in the instruction is that of the program being called. | ||
.and_then(|ctx| ctx.get_last_program_key(transaction_context).ok()) | ||
// The order of program accounts in the `TransactionContext` has no relation to the | ||
// order of the program accounts in the original message. It's the index in the | ||
// message that we need. | ||
.and_then(|errored_program_pubkey| { | ||
message | ||
.account_keys() | ||
.iter() | ||
.position(|message_pubkey| message_pubkey.eq(errored_program_pubkey)) | ||
}); | ||
TransactionError::InstructionError( | ||
top_level_instruction_index as u8, | ||
err, | ||
responsible_program_account_index.map(|i| i as u8), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the code that actually captures the account index of the erroring program, and only if there appeared an error.
The key insight is here:
By definition the last instruction (outer or inner) in the trace before the trace stopped being appended to is the one that encountered an error.
// This mock program takes in a list of programs and two bytes of data: | ||
// (1) the index of the program that should throw an error | ||
// (2) the index of the program being called | ||
// | ||
// The programs get executed - two at each CPI call depth - like this: | ||
// | ||
// Program 0 | ||
// -- Program 1 | ||
// -- Program 2 | ||
// ---- Program 3 | ||
// ---- Program 4 | ||
// ------ Program 5 | ||
// ------ Program 6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the meat of the test. Essentially I created a program that calls itself over and over as diagrammed in the comment. You can pass it the index at which you want it to throw an error. The test then makes sure that the TransactionError::InstructionError
indicates that the error was thrown from the program whose account index is that one exactly.
accounts: (0..PROGRAM_ADDRESSES.len()) | ||
.map(|index| (PROGRAM_ADDRESSES[index], mock_program_account.clone())) | ||
.collect_vec(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how we know that the program accounts are in the order we expect them to be from the perspective of the transaction.
&[Instruction::new_with_bytes( | ||
PROGRAM_ADDRESSES[base_program_index], | ||
&[ | ||
index_of_program_that_should_throw_exception, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how you tell the program ‘I want you to throw an error when you get to the program with this index.’
@@ -70,6 +70,7 @@ message InstructionError { | |||
uint32 index = 1; | |||
InstructionErrorType error = 2; | |||
CustomError custom = 3; | |||
optional uint32 responsible_program_account_index = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this require a feature gate before we start writing to it?
{index_of_program_that_should_throw_exception}." | ||
); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should probably write two additional tests:
- Test that the kind of instruction error that would blow up without a program at all (eg.
NotEnoughAccountKeys
or whatever) returns aNone
for the program account index. - Test that it still gets the right index when the program is in an address lookup table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds exactly right to me, along with what I suggested in another comment, about multiple top-level instructions
Can't they look at the inner ix metadata to figure this out? |
Yes! But not without a round trip to |
Well they had to get the status anyways, right? |
Two cases where it's not correct to also fetch the entire transaction:
Speculatively fetching |
is this not breaking several public interfaces? |
In any case, we'll go to 3.0.0 of |
f7a6f80
to
eebeab3
Compare
Updated the PR to include the index of the inner instruction, if applicable. |
…oring program is added to `TransactionError::InstructionError`, patch up all of the tests
eebeab3
to
0f963cf
Compare
uint32 index = 1; | ||
InstructionErrorType error = 2; | ||
CustomError custom = 3; | ||
optional uint32 inner_instruction_index = 4; | ||
optional uint32 responsible_program_account_index = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@steviez is there something fancy I should be doing here? Creating an option (1 byte) and a uint32
(4 bytes) just to store one byte of data sort of stinks. Could I instead hijack the remaining 24 bytes of uint32 index = 1
to do this?
xxxxxxxx abbbbbbb bcdddddd dd------
x - existing 8 bytes for `index`
a - option flag for `inner_instruction_index`
b - new 8 bytes for `inner_instruction_index`
c - option flag for `responsible_program_account_index`
d - new 8 bytes for `responsible_program_account_index`
…then just figure it all out in convert.rs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! Mostly small questions and comments.
I'll defer to Joe about the SVM message bit and to Steve about the proto bit since they'll know better. For what it's worth, the current change seems reasonable to me in both cases.
@@ -29,6 +29,7 @@ crate-type = ["lib"] | |||
name = "solana_compute_budget_instruction" | |||
|
|||
[dev-dependencies] | |||
assert_matches = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this change intended?
@@ -35,6 +35,7 @@ solana-instruction = { workspace = true } | |||
solana-message = { workspace = true } | |||
solana-pubkey = { workspace = true } | |||
solana-rpc-client-api = { workspace = true } | |||
solana-sdk-ids = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this change intended?
@@ -12,6 +12,7 @@ edition = { workspace = true } | |||
[dependencies] | |||
ahash = { workspace = true } | |||
itertools = { workspace = true } | |||
lazy_static = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: there's work to remove this everywhere #6049 in favor of LazyLock
https://doc.rust-lang.org/std/sync/struct.LazyLock.html, so we probably shouldn't add it
// By definition the last instruction (outer or inner) in the trace before the trace | ||
// stopped being appended to is the one that encountered an error. | ||
.get_instruction_trace_length() | ||
.checked_sub(1) | ||
.and_then(|index_in_trace| { | ||
transaction_context | ||
.get_instruction_context_at_index_in_trace(index_in_trace) | ||
.ok() | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my lack of understanding, but would it be simpler to call get_current_instruction_context
? On the flip side, it looks like it boils down to almost exactly the same code
enum InnerInstructionIndexSearchState { | ||
SearchingForTopLevelInstruction( | ||
usize, // Index of top-level instruction being sought next | ||
), | ||
InTopLevelInstruction( | ||
Option<u8>, // Inner instruction index | ||
), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't understand the point of the first variant of this enum. It's only ever used for 0
, and never incremented, which makes me think there's either a bug, or it doesn't need an index, and we can just initialize to InTopLevelInstruction(None)
.
It might be simpler and more legible to just have an Option<u8>
with the current inner instruction index, which is reset to None
whenever we get to a top-level ix, and otherwise incremented. What do you think?
Would it be worth also adding a test with multiple top-level instructions to make sure this logic works?
{index_of_program_that_should_throw_exception}." | ||
); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds exactly right to me, along with what I suggested in another comment, about multiple top-level instructions
tbh i care zero about either of those (they're leaving validator/monorepo soon enough). this breaks the rust sdk and all binary consumers. we can't "just bump major" because the validator is imposing the change on all consumers |
Problem
Consider a transaction error that originates from a cross-program invocation (ie. an inner instruction). Currently
TransactionError
returns you the index of the outer instruction, which in no way helps you to correlate the content of the error message with the actual program from whence it came.Consider this failed transaction. The index of the failure points to instruction
#4
(ie. the instruction at index 3).This is a bit disingenuous because the error didn't actually emanate from instruction
#4
(programJUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4
), it came from instruction#4.5
(programwhirLbMiicVdio4qvUfM5KAg6Ct8VwpYzGff3uctyCc
). If you tried to decode this as a program error ofJUP6Lk...
…you will get the wrong result. That method will return
false
instead oftrue
because of the wrong choice ofprogramAddress
.Given the existing situation (ie. the server does not vend the address of the program that threw the error) developers have no choice but to parse logs to try to figure out the program from which the custom error came.
See also anza-xyz/kit#149
Summary of Changes
This PR adds the transaction-level index of the program from which the error came to the
TransactionError::InstructionError
. This is added to storage as anOption<u32>
since it will not be possible to backfill old transactions' erroring program account indices.The account index of the program is designed to be from the perspective of the transaction and not the perspective of the instruction to make this change safe from SIMD-163.
It also adds the index of the inner instruction, in cases where the error originated from a CPI.
Until v3.0.0 of
solana-transaction-error
is released, you can test this PR locally by checking anza-xyz/solana-sdk#74 out locally at the path../solana-sdk/
relative toagave
.Despite CI results in GitHub, if you link in the not-yet-released version of
solana-transaction-error
as described above, the tests all pass.Notes to reviewers
You will really want to go through the ‘Commits’ tab in GitHub, because I separated out the interesting change from the ‘now I have to massage all the tests’ change.
Questions
Depends on the release of anza-xyz/solana-sdk#74.
Implements #5152.
Addresses anza-xyz/kit#149.