Add the inner instruction index and address of erroring programs to `TransactionError::InstructionError` #6083

steveluscher · 2025-05-03T00:42:54Z

Problem

Consider a transaction error that originates from a cross-program invocation (ie. an inner instruction). Currently TransactionError returns you the index of the outer instruction, which in no way helps you to correlate the content of the error message with the actual program from whence it came.

Consider this failed transaction. The index of the failure points to instruction #4 (ie. the instruction at index 3).

{
    "value": [{
        "confirmationStatus": "finalized",
        "confirmations": null,
        "err": {
            "InstructionError": [ 3, { "Custom": 6038 } ]
        },
        "slot": 323282082,
        "status": {
            "Err": {
                "InstructionError": [ 3, { "Custom": 6038 } ]
            }
        }
    }]
}

This is a bit disingenuous because the error didn't actually emanate from instruction #4 (program JUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4), it came from instruction #4.5 (program whirLbMiicVdio4qvUfM5KAg6Ct8VwpYzGff3uctyCc). If you tried to decode this as a program error of JUP6Lk...

if (isProgramError(
    error,
    transactionMessage,
    address('JUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4'),
    TickArraySequenceInvalidIndex.code,
)) {
    // ...
}

…you will get the wrong result. That method will return false instead of true because of the wrong choice of programAddress.

Given the existing situation (ie. the server does not vend the address of the program that threw the error) developers have no choice but to parse logs to try to figure out the program from which the custom error came.

Summary of Changes

This PR adds:

the inner instruction index of the erroring program.
the transaction-level index of the program from which the error came to the TransactionError::InstructionError.

The account index of the program is designed to be from the perspective of the transaction and not the perspective of the instruction to make this change safe from SIMD-163.

Until v3.0.0 of solana-transaction-error is released, you can test this PR locally by checking anza-xyz/solana-sdk#74 out locally at the path ../solana-sdk/ relative to agave.

Despite CI results in GitHub, if you link in the not-yet-released version of solana-transaction-error as described above, the tests all pass.

Notes to reviewers

You will really want to go through the ‘Commits’ tab in GitHub, because I separated out the interesting change from the ‘now I have to massage all the tests’ change.

Questions

Does this whole thing require a feature gate before we start including the program account index and writing it into storage?

Depends on the release of anza-xyz/solana-sdk#74.
Implements #5152.
Addresses anza-xyz/kit#149.

mergify · 2025-05-03T00:43:29Z

If this PR represents a change to the public RPC API:

Make sure it includes a complementary update to rpc-client/ (example)
Open a follow-up PR to update the JavaScript client @solana/kit (example)

Thank you for keeping the RPC clients in sync with the server API @steveluscher.

steveluscher · 2025-05-03T00:46:31Z

Cargo.toml

@@ -645,6 +645,106 @@ zstd = "0.13.3"
 opt-level = 3

 [patch.crates-io]
+# The following entries are auto-generated by /bin/bash


Ignore this. This will disappear when anza-xyz/solana-sdk#74 is released.

steveluscher · 2025-05-03T00:47:59Z

compute-budget-instruction/src/compute_budget_instruction_details.rs

+        let invalid_instruction_data_error = TransactionError::InstructionError(
+            index,
+            InstructionError::InvalidInstructionData,
+            Some(instruction.program_id_index),


It seems that the SVMInstruction already knows the transaction-wide index of its program, but please double check me on this. cc/ @buffalojoec

https://github.com/steveluscher/agave/blob/f7a6f80ff632216ceb82bcbd95a9a8c1d9855987/svm-transaction/src/instruction.rs#L8-L9

Yes! It's a part of sanitization.
https://github.com/steveluscher/agave/blob/f7a6f80ff632216ceb82bcbd95a9a8c1d9855987/svm-transaction/src/svm_message/sanitized_message.rs#L29
https://github.com/anza-xyz/solana-sdk/blob/cd5095ff4531a15508577f216dddfad156e7fd8b/message/src/versions/sanitized.rs#L38-L41
https://github.com/anza-xyz/solana-sdk/blob/cd5095ff4531a15508577f216dddfad156e7fd8b/message/src/legacy.rs#L227-L233

steveluscher · 2025-05-03T00:49:42Z

compute-budget-instruction/src/instructions_processor.rs

+    macro_rules! test {
+        ($instructions:expr, $expected_result:expr $(,)?) => {
+            for feature_set in [FeatureSet::default(), FeatureSet::all_enabled()] {
+                test!($instructions, $expected_result, &feature_set);
+            }
+        };
+        ($instructions:expr, $expected_result:expr, $feature_set:expr $(,)?) => {
+            __test_inner!($instructions, $feature_set, |result| {
+                assert_eq!(result, $expected_result);
+            });
+        };
+        ($instructions:expr, $expected_result:pat $(,)?) => {
+            for feature_set in [FeatureSet::default(), FeatureSet::all_enabled()] {
+                test!($instructions, $expected_result, &feature_set);
+            }
+        };
+        ($instructions:expr, $expected_result:pat if $guard:expr $(,)?) => {
+            for feature_set in [FeatureSet::default(), FeatureSet::all_enabled()] {
+                test!($instructions, $expected_result if $guard, &feature_set);
+            }
+        };
+        ($instructions:expr, $expected_result:pat, $feature_set:expr $(,)?) => {
+            __test_inner!($instructions, $feature_set, |result| {
+                assert_matches!(result, $expected_result);
+            });
+        };
+        ($instructions:expr, $expected_result:pat if $guard:expr, $feature_set:expr $(,)?) => {
+            __test_inner!($instructions, $feature_set, |result| {
+                assert_matches!(result, $expected_result if $guard);
+            });


This is just me expanding these macros to support patterns.

In general, throughout these tests, you'll see me use assert_matches and a pattern for the program account index because:

It ensures that the tests will not break if the index of the program changes

The absolute index of the responsible program is tested in transaction_processor.rs so we don't need to keep testing it over and over everywhere else.
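To illustrate why a pattern keeps these tests stable, here is a minimal sketch. The enums are simplified, hypothetical stand-ins for the real types, and `process` and `is_expected_error` are names invented for this example; it uses the standard `matches!` macro rather than the `assert_matches` crate so that it compiles on its own.

```rust
// Hypothetical, simplified stand-ins for the real error types.
#[derive(Debug, PartialEq)]
enum InstructionError {
    InvalidInstructionData,
}

#[derive(Debug, PartialEq)]
enum TransactionError {
    // (outer instruction index, error, responsible program account index)
    InstructionError(u8, InstructionError, Option<u8>),
}

// Pretend processor: the program's account index depends on how the test
// message happened to be built.
fn process(program_account_index: u8) -> Result<(), TransactionError> {
    Err(TransactionError::InstructionError(
        0,
        InstructionError::InvalidInstructionData,
        Some(program_account_index),
    ))
}

// True when the result is the expected error, whatever the program's
// account index turns out to be (`Some(_)` matches any index).
fn is_expected_error(result: &Result<(), TransactionError>) -> bool {
    matches!(
        result,
        Err(TransactionError::InstructionError(
            0,
            InstructionError::InvalidInstructionData,
            Some(_),
        ))
    )
}
```

An equality assertion would have to hard-code the index and would break whenever the account ordering of the test message changed; the pattern only pins down the parts the test cares about.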

steveluscher · 2025-05-03T00:50:44Z

rpc-client/src/mock_sender.rs

@@ -129,6 +129,7 @@ impl RpcSender for MockSender {
                    Err(TransactionError::InstructionError(
                        0,
                        InstructionError::UninitializedAccount,
+                        Some(42), // Mock responsible program account index.


Total missed opportunity to Sum 41.

steveluscher · 2025-05-03T00:52:46Z

svm/src/message_processor.rs

+            let transaction_context: &TransactionContext = invoke_context.transaction_context;
+            let responsible_program_account_index = transaction_context
+                // By definition the last instruction (outer or inner) in the trace before the trace
+                // stopped being appended to is the one that encountered an error.
+                .get_instruction_trace_length()
+                .checked_sub(1)
+                .and_then(|index_in_trace| {
+                    transaction_context
+                        .get_instruction_context_at_index_in_trace(index_in_trace)
+                        .ok()
+                })
+                // The last program address in the instruction is that of the program being called.
+                .and_then(|ctx| ctx.get_last_program_key(transaction_context).ok())
+                // The order of program accounts in the `TransactionContext` has no relation to the
+                // order of the program accounts in the original message. It's the index in the
+                // message that we need.
+                .and_then(|errored_program_pubkey| {
+                    message
+                        .account_keys()
+                        .iter()
+                        .position(|message_pubkey| message_pubkey.eq(errored_program_pubkey))
+                });
+            TransactionError::InstructionError(
+                top_level_instruction_index as u8,
+                err,
+                responsible_program_account_index.map(|i| i as u8),
+            )


This is the code that actually captures the account index of the erroring program, and it only runs if an error occurred.

The key insight is here:

By definition the last instruction (outer or inner) in the trace before the trace stopped being appended to is the one that encountered an error.
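The lookup can be sketched in isolation, assuming the trace has been reduced to a list of program keys in execution order. `Pubkey` here is a hypothetical alias for a 32-byte array, and the function name is invented for this example; unlike the diff, which casts with `as u8`, the sketch uses a checked conversion.

```rust
// Hypothetical stand-in for a 32-byte public key.
type Pubkey = [u8; 32];

/// Returns the index, within the message's account keys, of the program that
/// ran last in the instruction trace. `trace` holds the program key of each
/// instruction (outer or inner) in the order it was executed.
fn responsible_program_account_index(
    trace: &[Pubkey],
    message_account_keys: &[Pubkey],
) -> Option<u8> {
    // The last entry in the trace is the instruction that hit the error.
    let errored_program = trace.last()?;
    // Translate the key back into its position in the original message.
    message_account_keys
        .iter()
        .position(|key| key == errored_program)
        .and_then(|index| u8::try_from(index).ok())
}
```

An empty trace, or a program key that does not appear in the message, yields `None` rather than a wrong index.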

steveluscher · 2025-05-03T00:54:31Z

svm/src/transaction_processor.rs

+        // This mock program takes in a list of programs and two bytes of data:
+        //   (1) the index of the program that should throw an error
+        //   (2) the index of the program being called
+        //
+        // The programs get executed - two at each CPI call depth - like this:
+        //
+        // Program 0
+        // -- Program 1
+        // -- Program 2
+        // ---- Program 3
+        // ---- Program 4
+        // ------ Program 5
+        // ------ Program 6


This is the meat of the test. Essentially I created a program that calls itself over and over as diagrammed in the comment. You can pass it the index at which you want it to throw an error. The test then makes sure that the TransactionError::InstructionError indicates that the error was thrown from the program whose account index is that one exactly.
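As a rough model of that call tree (not the mock program itself), the sketch below simulates the diagrammed execution order in plain Rust. The `invoke` function, the `PROGRAM_COUNT` constant, and the rule that even-indexed programs call the next two are assumptions made to reproduce the diagram.

```rust
// Number of programs in the diagram (Program 0 through Program 6).
const PROGRAM_COUNT: u8 = 7;

/// Simulates one program invocation. Records the program in `trace`, fails if
/// this is the program that was told to throw, and otherwise makes the nested
/// calls from the diagram: programs 0, 2, and 4 each call the next two.
fn invoke(program_index: u8, index_that_throws: u8, trace: &mut Vec<u8>) -> Result<(), u8> {
    trace.push(program_index);
    if program_index == index_that_throws {
        return Err(program_index);
    }
    if program_index % 2 == 0 && program_index + 2 < PROGRAM_COUNT {
        invoke(program_index + 1, index_that_throws, trace)?;
        invoke(program_index + 2, index_that_throws, trace)?;
    }
    Ok(())
}
```

In this model, whichever program throws is always the last entry in the trace, which is the property the real test relies on.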

steveluscher · 2025-05-03T00:55:48Z

svm/src/transaction_processor.rs

+            accounts: (0..PROGRAM_ADDRESSES.len())
+                .map(|index| (PROGRAM_ADDRESSES[index], mock_program_account.clone()))
+                .collect_vec(),


This is how we know that the program accounts are in the order we expect them to be from the perspective of the transaction.

steveluscher · 2025-05-03T00:56:26Z

svm/src/transaction_processor.rs

+            &[Instruction::new_with_bytes(
+                PROGRAM_ADDRESSES[base_program_index],
+                &[
+                    index_of_program_that_should_throw_exception,


This is how you tell the program ‘I want you to throw an error when you get to the program with this index.’

steveluscher · 2025-05-03T00:58:08Z

storage-proto/proto/transaction_by_addr.proto

@@ -70,6 +70,7 @@ message InstructionError {
    uint32 index = 1;
    InstructionErrorType error = 2;
    CustomError custom = 3;
+    optional uint32 responsible_program_account_index = 4;


Does this require a feature gate before we start writing to it?

steveluscher · 2025-05-03T01:16:39Z

@Lichtso I made this the account index of the program from the perspective of the transaction and not the perspective of the instruction to make it safe from SIMD-163.

steveluscher · 2025-05-03T01:27:01Z

svm/src/transaction_processor.rs

+            {index_of_program_that_should_throw_exception}."
+        );
+    }
+


I should probably write two additional tests:

Test that the kind of instruction error that would blow up without a program at all (eg. NotEnoughAccountKeys or whatever) returns a None for the program account index.

Test that it still gets the right index when the program is in an address lookup table

That sounds exactly right to me, along with what I suggested in another comment, about multiple top-level instructions

jstarry · 2025-05-05T01:39:49Z

Given the existing situation (ie. the server does not vend the address of the program that threw the error) developers have no choice but to parse logs to try to figure out the program from which the custom error came.

Can't they look at the inner ix metadata to figure this out?

steveluscher · 2025-05-05T04:33:41Z

Can't they look at the inner ix metadata to figure this out?

Yes! But not without a round trip to getTransaction() to fetch that metadata, which is definitely not what we want to do if we want to keep apps performant and reliable.

jstarry · 2025-05-05T05:28:33Z

Yes! But not without a round trip to getTransaction() to fetch that metadata, which is definitely not what we want to do if we want to keep apps performant and reliable.

Well they had to get the status anyways, right?

steveluscher · 2025-05-05T06:07:21Z

Well they had to get the status anyways, right?

Two cases where it's not correct to also fetch the entire transaction:

You're fetching getSignatureStatuses and you don't know that the status is an error yet.
You're running one of several default transaction confirmation routines.

Speculatively fetching getTransaction() in these cases would be wasteful and would increase global load on RPCs if all apps did it.

t-nelson · 2025-05-05T14:24:12Z

is this not breaking several public interfaces?

steveluscher · 2025-05-05T19:49:58Z

is this not breaking several public interfaces?

RPC API: Anything fetching a transaction status where that status contains an err property currently expects a tuple of (instructionIndex: number, err: InstructionError). This change would add a third element to the tuple. Existing clients would continue to expect 2, use 2, and ignore the third. Non breaking.
Code: After this change, any type position that uses the old type will generate a type error at that position (eg. TypeScript), but will not produce a runtime error. This is because the change is additive and does not modify the types of the values in position 0 and 1 of the tuple.

In any case, we'll go to 3.0.0 of solana-transaction-error to make this clear via semver.

steveluscher · 2025-05-06T02:23:40Z

Updated the PR to include the index of the inner instruction, if applicable.

steveluscher · 2025-05-07T16:14:46Z

storage-proto/proto/transaction_by_addr.proto

    uint32 index = 1;
    InstructionErrorType error = 2;
    CustomError custom = 3;
+    optional uint32 inner_instruction_index = 4;
+    optional uint32 responsible_program_account_index = 5;


@steviez is there something fancy I should be doing here? Creating an option (1 byte) and a uint32 (4 bytes) just to store one byte of data sort of stinks. Could I instead hijack the remaining 24 bits of uint32 index = 1 to do this?

xxxxxxxx abbbbbbb bcdddddd dd------
x - existing 8 bits for `index`
a - option flag for `inner_instruction_index`
b - new 8 bits for `inner_instruction_index`
c - option flag for `responsible_program_account_index`
d - new 8 bits for `responsible_program_account_index`

…then just figure it all out in convert.rs?
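A minimal sketch of that kind of packing, assuming `index` stays in the low 8 bits and the new fields follow it upward (flag at bit 8, value at bits 9-16, flag at bit 17, value at bits 18-25). The exact bit positions and the `pack`/`unpack` names are assumptions for illustration only.

```rust
/// Packs the outer index plus two optional bytes into a single u32.
fn pack(index: u8, inner: Option<u8>, program: Option<u8>) -> u32 {
    let mut packed = u32::from(index);
    if let Some(inner) = inner {
        packed |= 1 << 8; // option flag for the inner instruction index
        packed |= u32::from(inner) << 9;
    }
    if let Some(program) = program {
        packed |= 1 << 17; // option flag for the program account index
        packed |= u32::from(program) << 18;
    }
    packed
}

/// Reverses `pack`. A value written before the change has no flag bits set,
/// so it decodes to `(index, None, None)`.
fn unpack(packed: u32) -> (u8, Option<u8>, Option<u8>) {
    let index = (packed & 0xFF) as u8;
    let inner = (((packed >> 8) & 1) == 1).then(|| ((packed >> 9) & 0xFF) as u8);
    let program = (((packed >> 17) & 1) == 1).then(|| ((packed >> 18) & 0xFF) as u8);
    (index, inner, program)
}
```

Because old records only ever used the low byte, they round-trip unchanged under this scheme.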

joncinque

Great work! Mostly small questions and comments.

I'll defer to Joe about the SVM message bit and to Steve about the proto bit since they'll know better. For what it's worth, the current change seems reasonable to me in both cases.

joncinque · 2025-05-07T22:51:48Z

compute-budget-instruction/Cargo.toml

@@ -29,6 +29,7 @@ crate-type = ["lib"]
 name = "solana_compute_budget_instruction"

 [dev-dependencies]
+assert_matches = { workspace = true }


Was this change intended?

joncinque · 2025-05-07T23:01:15Z

rpc-client/Cargo.toml

@@ -35,6 +35,7 @@ solana-instruction = { workspace = true }
 solana-message = { workspace = true }
 solana-pubkey = { workspace = true }
 solana-rpc-client-api = { workspace = true }
+solana-sdk-ids = { workspace = true }


Was this change intended?

joncinque · 2025-05-08T14:33:33Z

svm/Cargo.toml

@@ -12,6 +12,7 @@ edition = { workspace = true }
 [dependencies]
 ahash = { workspace = true }
 itertools = { workspace = true }
+lazy_static = { workspace = true }


nit: there's work to remove this everywhere #6049 in favor of LazyLock https://doc.rust-lang.org/std/sync/struct.LazyLock.html, so we probably shouldn't add it

joncinque · 2025-05-08T15:01:11Z

svm/src/message_processor.rs

+                // By definition the last instruction (outer or inner) in the trace before the trace
+                // stopped being appended to is the one that encountered an error.
+                .get_instruction_trace_length()
+                .checked_sub(1)
+                .and_then(|index_in_trace| {
+                    transaction_context
+                        .get_instruction_context_at_index_in_trace(index_in_trace)
+                        .ok()
+                })


This is my lack of understanding, but would it be simpler to call get_current_instruction_context? On the flip side, it looks like it boils down to almost exactly the same code

joncinque · 2025-05-08T15:12:30Z

svm/src/message_processor.rs

+            enum InnerInstructionIndexSearchState {
+                SearchingForTopLevelInstruction(
+                    usize, // Index of top-level instruction being sought next
+                ),
+                InTopLevelInstruction(
+                    Option<u8>, // Inner instruction index
+                ),
+            }


Sorry, I don't understand the point of the first variant of this enum. It's only ever used for 0, and never incremented, which makes me think there's either a bug, or it doesn't need an index, and we can just initialize to InTopLevelInstruction(None).

It might be simpler and more legible to just have an Option<u8> with the current inner instruction index, which is reset to None whenever we get to a top-level ix, and otherwise incremented. What do you think?

Would it be worth also adding a test with multiple top-level instructions to make sure this logic works?
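The simpler `Option<u8>` bookkeeping suggested here might look roughly like the sketch below. It assumes the trace can be summarized as a list of stack heights in execution order, with 1 meaning a top-level instruction; the function name and that representation are hypothetical.

```rust
/// For each instruction in the trace, returns its inner instruction index:
/// `None` for a top-level instruction, otherwise a zero-based counter that
/// restarts after every top-level instruction.
fn inner_instruction_indexes(stack_heights: &[usize]) -> Vec<Option<u8>> {
    let mut current: Option<u8> = None;
    stack_heights
        .iter()
        .map(|&height| {
            current = if height == 1 {
                // A new top-level instruction resets the counter.
                None
            } else {
                // First inner instruction gets 0; later ones increment.
                Some(current.map_or(0, |i| i.saturating_add(1)))
            };
            current
        })
        .collect()
}
```

Feeding it a trace with more than one top-level instruction exercises the reset, which is the case the multiple-top-level-instruction test would cover.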

joncinque · 2025-05-08T15:24:55Z

svm/src/transaction_processor.rs

+            {index_of_program_that_should_throw_exception}."
+        );
+    }
+


That sounds exactly right to me, along with what I suggested in another comment, about multiple top-level instructions

t-nelson · 2025-05-08T16:21:10Z

is this not breaking several public interfaces?

* **RPC API**: Anything fetching a transaction status where that status contains an `err` property currently expects a tuple of `(instructionIndex: number, err: InstructionError)`. This change would add a third element to the tuple. Existing clients would continue to expect 2, use 2, and ignore the third. Non breaking.

* **Code**: After this change, any type position that uses the old type will generate a _type_ error at that position (eg. [TypeScript](https://www.typescriptlang.org/play/?#code/C4TwDgpgBAkgdgZ2AJwK4GNgEsD2cCiyyOyUAvFAiALYBGOANgNwBQLokUAQhAGYnQKAbTio6EZABpYiFBmx5CxZAF1WHaAEFewCeSgixtCdPhI0mXASIlpo8atYteqOJbxR+OABQA3AIYMqBAAXNx8AgCUUADebAC+bF7eQgCM0gDKNPQM3tH+CDLm8lZKJCqRrAD0VVAAAsAIALQQAB6QmC02pADMUBAMENQQcI1QWLIQ-gAmUDi8UABMzjg+aZnZjHlQBUVy7tbK0osVTEA)), but will not produce a _runtime_ error. This is because the change is additive and does not modify the types of the values in position 0 and 1 of the tuple.

In any case, we'll go to 3.0.0 of solana-transaction-error to make this clear via semver.

tbh i care zero about either of those (they're leaving validator/monorepo soon enough). this breaks the rust sdk and all binary consumers. we can't "just bump major" because the validator is imposing the change on all consumers

steveluscher · 2025-05-10T00:45:27Z

Moving to draft status; please don't review at the moment.

buffalojoec · 2025-05-17T04:11:56Z

Moving to draft status; please don't review at the moment.

Let me know whenever it's ready again. I combed through and for the most part I think it makes sense. I'll add my detailed review once you're ready. 🫡

steveluscher

Alright. Pending the release of this PR in solana-sdk, I think this is ready to go.

The approach

Make a breaking change to the TransactionError::InstructionError type to add the address of the program responsible for the error and its inner instruction index (anza-xyz/solana-sdk#74).
‘Airgap’ this type from the stored representation in blockstore by creating StoredTransactionError. Implement a custom serializer on that type to make the serialization backward and forward compatible with the old one. (e207230)
‘Airgap’ this type from the RPC API by creating UiTransactionError and UiTransactionResult. Implement a custom serializer on that type to make the serialization backward and forward compatible with the old one. (0d1fd35)
Teach TransactionContext to be able to pull the current inner instruction index, as would be displayed on an Explorer. (cb19cdf)
Teach InvokeContext how to keep track of the program responsible for any first-thrown error, as well as the inner instruction index of that program in the transaction (431fc75)
Teach SVM to use InvokeContext to blame programs for throwing errors (33966d7).

The outcome

The end result should be that we have a new, structured TransactionError::InstructionError with named fields that transaction processing code can use, that we maintain backward compatibility with old apps that expect instruction errors to appear in the form { "InstructionError": [0, { "Custom": 1 }] } through the RPC, and that new clients get to use additional data in that sequence as the next version of Agave starts to produce errors of the form { "InstructionError": [0, { "Custom": 1 }, "11111111111111111111111111111111", null] }.
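To make the two wire shapes concrete, here is a small sketch that renders both forms by hand. It is not the serializer from this PR; the function name is invented, and the element order (program address, then inner instruction index) is read off the example above.

```rust
/// Renders an instruction error as the JSON text an RPC client would see.
/// With `attribution` set to `None` it produces the legacy two-element tuple;
/// otherwise it appends the program address and the inner instruction index.
fn render_instruction_error(
    outer_index: u8,
    custom_code: u32,
    attribution: Option<(&str, Option<u8>)>,
) -> String {
    let mut elements = vec![
        outer_index.to_string(),
        format!("{{ \"Custom\": {custom_code} }}"),
    ];
    if let Some((program_address, inner_index)) = attribution {
        elements.push(format!("\"{program_address}\""));
        elements.push(inner_index.map_or("null".to_string(), |i| i.to_string()));
    }
    format!("{{ \"InstructionError\": [{}] }}", elements.join(", "))
}
```

A client that reads only positions 0 and 1 sees the same values in both shapes, which is the backward-compatibility argument being made.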

How to review this PR

I recommend stepping through the ‘commits’ tab, ignoring the first commit where I patch in a local copy of solana-sdk with the change in anza-xyz/solana-sdk#74. CI will not be able to run on this PR until 74 is landed and released, so we're flying blind for now. The only way to run the tests is to check out this PR, and 74 of solana-sdk as a sibling of agave/ and compile the whole shebang with RUSTFLAGS="-Adeprecated"

Here's what I need from you folks

@steviez, can you review the changes in e207230 from the perspective of how the new TransactionError::InstructionError gets stored in blockstore? I believe these to be backward and forward compatible, and have several tests that attempt to prove that.
@Lichtso, can you review my changes to TransactionContext (cb19cdf), InvokeContext (431fc75), and SVM (33966d7)?
- I'm very unhappy with this code. It exists because I haven't found all of the places where errors are thrown from SVM, so I've had to accept that, sometimes, first_seen_error_attribution will be None, despite an error having been encountered. What I really want is to make that a true invariant that throws, because there should always be error attribution if you've entered that block.
@apfitzge, can you review this from the perspective of the fees reviewer?
@buffalojoec, can you review (0d1fd35) and (c964064) with the goal of making sure that I've ensured that RPC responses get serialized in the old tuple-variant style everywhere, to maintain compatibility with the client RPC API.
@joncinque, can you sort of skim the whole thing, and also double check my assumption in 152f001?

steveluscher · 2025-05-30T18:33:28Z

transaction-context/src/lib.rs

+    /// Observe how the index resets every time a new top-level instruction is called.
+    ///
+    /// * #1 Program A (no index)
+    ///   * #1.1 CPI to Program B (index 0)


Explorers typically 1-index inner instruction indexes, but in the code we zero index them, None meaning ‘this is not an inner instruction.’
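The mapping between the two conventions is small enough to sketch. `explorer_label` is a hypothetical helper, not something in the PR; it takes the zero-based indexes used in the code and produces the 1-based label an explorer would typically show.

```rust
/// Converts zero-based (outer, inner) indexes into an explorer-style label.
/// `inner_index` of `None` means the instruction is a top-level one.
fn explorer_label(outer_index: u8, inner_index: Option<u8>) -> String {
    // Widen before adding 1 so that index 255 cannot overflow.
    let outer = u32::from(outer_index) + 1;
    match inner_index {
        None => format!("#{outer}"),
        Some(inner) => format!("#{outer}.{}", u32::from(inner) + 1),
    }
}
```

For example, outer index 3 with inner index 4 corresponds to the "#4.5" instruction mentioned in the PR description.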

steveluscher · 2025-05-30T18:40:01Z

storage-proto/src/convert.rs

+                // Index values are 32-bit integers of the form:
+                // TTTTTTTT IIIIIIII xxxxxxxx xxxxxxxx


This is already a uint32, so we can fit more u8 data in here rather than creating a new column.

steveluscher · 2025-05-30T18:41:35Z

storage-proto/src/convert.rs

+                // * I - The inner index of the instruction that errored; 0 if None, 1-indexed otherwise
+                let [outer_instruction_index, maybe_inner_instruction_index, _unused1, _unused2] =
+                    instruction_error.index.to_le_bytes();
+                let inner_instruction_index = maybe_inner_instruction_index.checked_sub(1);


I've implemented this as a single-byte option, where 0 means None and 1 means 0. The implication is that this can only store an inner instruction index of at most 254. If we can CPI that many instructions, it probably means that Solana has undergone major changes, and all of this code is already gone.
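A round-trip sketch of this encoding, following the `TTTTTTTT IIIIIIII xxxxxxxx xxxxxxxx` layout and the `to_le_bytes()` call in the diff. The `encode_index` and `decode_index` names are made up for this example; the real logic lives in convert.rs.

```rust
/// Packs the outer index and optional inner index into the stored u32.
/// Byte 0 is the outer index; byte 1 is 0 for `None`, otherwise inner + 1.
/// Returns `None` if the inner index is too large to shift up by one.
fn encode_index(outer: u8, inner: Option<u8>) -> Option<u32> {
    let inner_byte = match inner {
        None => 0,
        Some(i) => i.checked_add(1)?,
    };
    Some(u32::from_le_bytes([outer, inner_byte, 0, 0]))
}

/// Reverses `encode_index`. Values written before the change only used the
/// first byte, so they decode with an inner index of `None`.
fn decode_index(index: u32) -> (u8, Option<u8>) {
    let [outer, maybe_inner, _unused1, _unused2] = index.to_le_bytes();
    (outer, maybe_inner.checked_sub(1))
}
```

Since the second byte of every previously stored value is zero, old records read back exactly as before.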

steveluscher · 2025-05-30T18:42:31Z

storage-proto/src/lib.rs

+    fn from(value: StoredTransactionError) -> Self {
+        let bytes = value.0;
+        match &bytes.as_slice() {
+            [8, 0, 0, 0, ..] => {


When reading the bytes of a transaction error from storage, this header implies that it's a TransactionError::InstructionError (ie. the variant at index 8 of the TransactionError enum).
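The header check can be sketched on its own, assuming the stored bytes begin with the enum's variant index encoded as a 4-byte little-endian integer, which is what the `[8, 0, 0, 0, ..]` pattern suggests. The constant and function names are invented for this example.

```rust
// Assumed variant index of `InstructionError`, taken from the `[8, 0, 0, 0]`
// header in the diff.
const INSTRUCTION_ERROR_VARIANT: u32 = 8;

/// Returns true if the serialized error starts with the variant tag for
/// `InstructionError`. Inputs shorter than four bytes never match.
fn is_instruction_error(bytes: &[u8]) -> bool {
    matches!(
        bytes,
        [a, b, c, d, ..]
            if u32::from_le_bytes([*a, *b, *c, *d]) == INSTRUCTION_ERROR_VARIANT
    )
}
```

Any other variant tag would fall through to whatever default deserialization path applies.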

steveluscher · 2025-05-30T18:44:24Z

svm/src/message_processor.rs

+                    attr.inner_instruction_index.map(|i| i as u8),
+                    Some(attr.responsible_program_address),
+                ),
+                None => {


What I wanted to do is to make this an invariant; you should never be able to reach this code without something in the SVM having blamed a program for the error. I don't see a way to provably make this so, so instead I've opted for a ‘soft fail’ that sets the attribution data to None. cc/ @Lichtso.

steveluscher · 2025-05-30T18:47:10Z

transaction-status-client-types/src/lib.rs

@@ -226,12 +234,122 @@ impl From<&MessageAddressTableLookup> for UiAddressTableLookup {
    }
 }

+#[derive(Clone, Debug, PartialEq, Eq)]
+pub struct UiTransactionError(pub TransactionError);


Start here for the RPC UiTransactionError and UiTransactionResult airgap change. These are essentially shims that ensure that the RPC serialization for transaction errors won't change its structure despite TransactionError::InstructionError having changed its structure.

steveluscher · 2025-05-30T18:54:10Z

runtime/src/bank.rs

@@ -244,7 +244,7 @@ struct RentMetrics {
 pub type BankStatusCache = StatusCache<Result<()>>;
 #[cfg_attr(
    feature = "frozen-abi",
-    frozen_abi(digest = "5dfDCRGWPV7thfoZtLpTJAV8cC93vQUXgTm6BnrfeUsN")
+    frozen_abi(digest = "5UmYzdMvTDkFBKsqddQ43mSikgEA6s2bTvRZUq78YPQ2")


I believe that this is actually bad, because Result contains TransactionError, which has the updated version of TransactionError::InstructionError, which means that snapshots that are serialized with this version of BankSlotDelta will be incompatible with Agave <2.3.

This probably means that I have work here to do to make sure that the snapshots themselves are backward/forward compatible, but please let me know if I'm wrong about that. Maybe it's the case that we don't have an expectation that you can boot Agave 2.2 from a snapshot produced by 2.3? You tell me. cc/ @joncinque.

Maybe @brooksprumo?

Hi 👋

Without looking at the PR/changes, the way to determine if the digests are safe/right to change is to generate the abi files locally and compare. You'll want to run the test-abi.sh script, or the command it runs, on both master and this PR. Then you can diff the output to see which fields changed. Sometimes it is adding fields, and sometimes it is just renaming a module.

This probably means that I have work here to do to make sure that the snapshots themselves are backward/forward compatible, but please let me know if I'm wrong about that. Maybe it's the case that we don't have an expectation that you can boot Agave 2.2 from a snapshot produced by 2.3? You tell me.

Snapshots must be compatible between adjacent versions. For example, a snapshot created by v2.2 must be loadable by v2.1 and v2.3. A snapshot created by v2.3 must be loadable by v2.2, but does not need to be loadable by v2.1. To change the snapshot serialization format, a SIMD is required, as it impacts the other validator clients (e.g. firedancer).

Thanks for the guidance!

If the diff was this (ie. adding two fields to the end of the tuple variant InstructionError) would that be backward compatible? That is to say if v2.3 wrote those two extra elements would v2.2 ignore them and carry on, or would it panic?

I'm pretty sure that I can get BankSlotDeltas to serialize like that, if that would ensure backward/forward compat, but what I'm hearing is that adding any fields at all will require a SIMD no matter what.

If the diff was this (ie. adding two fields to the end of the tuple variant InstructionError) would that be backward compatible? That is to say if v2.3 wrote those two extra elements would v2.2 ignore them and carry on, or would it panic?

In this case I'm not sure. Luckily it's an easy thing to try out! Create a snapshot with this PR and then try to load that snapshot with v2.2.

I'm pretty sure that I can get BankSlotDeltas to serialize like that, if that would ensure backward/forward compat, but what I'm hearing is that adding any fields at all will require a SIMD no matter what.

I don't have personal experience with adding tuple variants on something that was marked for frozen abi, so I'm not actually sure what'll happen here. I think a SIMD will depend on the result of the experiment above.

Luckily, if a SIMD is required, they are pretty simple. At least as far as SIMDs are concerned :)

steveluscher · 2025-06-12T22:30:31Z

Giving up on this; wrote a giant redux of everything I tried, split this PR into smaller landable PRs that will make the next person's job easier, and linked everything together in #6546.

steveluscher requested review from a team as code owners May 3, 2025 00:42

steveluscher mentioned this pull request May 3, 2025

Add the responsible program's account index and inner instruction index to each InstructionError anza-xyz/solana-sdk#74

Open

steveluscher commented May 3, 2025

steveluscher requested a review from apfitzge May 3, 2025 01:07

steveluscher commented May 3, 2025

steveluscher force-pushed the responsible-program-of-transaction-error branch from f7a6f80 to eebeab3 Compare May 6, 2025 02:21

steveluscher force-pushed the responsible-program-of-transaction-error branch from eebeab3 to 0f963cf Compare May 6, 2025 16:55

joncinque self-requested a review May 6, 2025 20:11

steveluscher commented May 7, 2025

joncinque reviewed May 8, 2025

steveluscher marked this pull request as draft May 10, 2025 00:44

steveluscher force-pushed the responsible-program-of-transaction-error branch from 0f963cf to 071d83c Compare May 16, 2025 05:37

steveluscher changed the title ~~Add the transaction-level account index of erroring programs to TransactionError::InstructionError~~ Add the inner instruction index and transaction-level account index of erroring programs to TransactionError::InstructionError May 16, 2025

steveluscher force-pushed the responsible-program-of-transaction-error branch from 071d83c to 7ca6c8b Compare May 16, 2025 17:56

steveluscher mentioned this pull request May 16, 2025

InstructionContext now carries the index of the instruction from the perspective of the top level instruction from which it was spawned #6255

Closed

steveluscher force-pushed the responsible-program-of-transaction-error branch from 7ca6c8b to 68cf8e6 Compare May 28, 2025 20:13

steveluscher force-pushed the responsible-program-of-transaction-error branch 8 times, most recently from 7c542ea to 338cf2b Compare May 29, 2025 22:51

[NOT FOR COMMIT] Patch in local copy of solana-sdk

51c405b

steveluscher force-pushed the responsible-program-of-transaction-error branch from 338cf2b to 72b3748 Compare May 30, 2025 17:38

steveluscher added 8 commits May 30, 2025 18:32

A method that returns the inner index of an instruction, in the conte…

cb19cdf

…xt of its top-level outer instruction

A method you can use to store context about the first-seen error on `InvokeContext`

431fc75

Add two additional pieces of information (inner index and responsible program account index) into the existing storage for `TransactionError`, in a way that's space-efficient and backward compatible

e207230

When TransactionError::InstructionError is thrown, ensure that it carries the index of the inner instruction from which the error was thrown (if applicable) and the address of the program responsible

33966d7

Add tests for errors in RPC subscriptions

c964064

Add UiTransactionError and UiTransactionResult so that custom serialization can be defined for the RPC

0d1fd35

Patch up tests, types, and clippy at the end of the PR stack

2e7f4d1

Bump ABI

152f001

steveluscher force-pushed the responsible-program-of-transaction-error branch from 72b3748 to 152f001 Compare May 30, 2025 18:32

steveluscher commented May 30, 2025

steveluscher marked this pull request as ready for review May 30, 2025 18:54

steveluscher changed the title ~~Add the inner instruction index and transaction-level account index of erroring programs to TransactionError::InstructionError~~ Add the inner instruction index and address of erroring programs to TransactionError::InstructionError May 30, 2025

This was referenced Jun 6, 2025

Stop writing TransactionError into snapshots from status cache #6460

Closed

Eliminate Signature-to-TransactionResult cache from Bank #6546

Open

steveluscher closed this Jun 12, 2025

		// Index values are 32-bit integers of the form:
		// TTTTTTTT IIIIIIII xxxxxxxx xxxxxxxx
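
The bit layout in that comment can be sketched as a pair of pack/unpack helpers. This is only an illustration, not code from the PR: it assumes the pattern is written most-significant bit first, that the `T` bits hold the top-level instruction index and the `I` bits hold the inner instruction index (a guess based on the PR title), and it leaves the 16 `x` bits uninterpreted. The names `PackedIndex`, `unpack_index`, and `pack_index` are hypothetical.

```rust
/// Hypothetical decoded view of a packed 32-bit index value.
/// Field meanings are assumptions, not taken from the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PackedIndex {
    /// The eight `T` bits (bits 31..=24); assumed to be the top-level instruction index.
    pub top: u8,
    /// The eight `I` bits (bits 23..=16); assumed to be the inner instruction index.
    pub inner: u8,
    /// The sixteen `x` bits (bits 15..=0); left uninterpreted here.
    pub rest: u16,
}

/// Split a raw 32-bit value into its three fields.
pub fn unpack_index(raw: u32) -> PackedIndex {
    PackedIndex {
        // `as u8` / `as u16` keep only the low bits, which is what we want after shifting.
        top: (raw >> 24) as u8,
        inner: (raw >> 16) as u8,
        rest: raw as u16,
    }
}

/// Reassemble the three fields into a raw 32-bit value.
pub fn pack_index(index: PackedIndex) -> u32 {
    (u32::from(index.top) << 24) | (u32::from(index.inner) << 16) | u32::from(index.rest)
}
```

Under these assumptions, `unpack_index(0x0305_0000)` yields `top = 3` and `inner = 5`, and packing the result returns the original value.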

mergify bot commented May 3, 2025

buffalojoec May 17, 2025 • edited

steveluscher commented May 3, 2025

jstarry commented May 5, 2025

steveluscher commented May 5, 2025

jstarry commented May 5, 2025

steveluscher commented May 5, 2025

t-nelson commented May 5, 2025

steveluscher commented May 5, 2025

steveluscher commented May 6, 2025

joncinque left a comment

t-nelson commented May 8, 2025

steveluscher commented May 10, 2025

buffalojoec commented May 17, 2025

steveluscher left a comment
