|
| 1 | +--- |
| 2 | +simd: 'XXXX' |
| 3 | +title: Slashable event verification |
| 4 | +authors: |
| 5 | + - Ashwin Sekar |
| 6 | +category: Standard |
| 7 | +type: Core |
| 8 | +status: Draft |
| 9 | +created: (fill me in with today's date, YYYY-MM-DD) |
| 10 | +feature: (fill in with feature tracking issues once accepted) |
| 11 | +--- |
| 12 | + |
| 13 | +## Summary |
| 14 | + |
| 15 | +This proposal describes an enshrined on-chain program to verify proofs that a |
| 16 | +validator committed a slashable infraction. This program creates reports on chain |
| 17 | +for use in future SIMDs. |
| 18 | + |
| 19 | +**This proposal does not modify any stakes or rewards; the program will |
| 20 | +only verify and log infractions.** |
| 21 | + |
| 22 | +## Motivation |
| 23 | + |
| 24 | +There exists a class of protocol violations that are difficult to detect synchronously, |
| 25 | +but are simple to detect after the fact. In order to penalize violators we provide |
| 26 | +a means to record these violations on chain. |
| 27 | + |
| 28 | +This also serves as a starting point for observability and discussions around the |
| 29 | +economics of penalizing these violators. This is a necessary step to implement |
| 30 | +slashing in the Solana Protocol. |
| 31 | + |
| 32 | +## New Terminology |
| 33 | + |
| 34 | +None |
| 35 | + |
| 36 | +## Feature flags |
| 37 | + |
| 38 | +`create_slashing_program`: |
| 39 | + |
| 40 | +- `sProgVaNWkYdP2eTRAy1CPrgb3b9p8yXCASrPEqo6VJ` |
| 41 | + |
| 42 | +## Prerequisites |
| 43 | + |
| 44 | +None |
| 45 | + |
| 46 | +## Detailed Design |
| 47 | + |
| 48 | +On the epoch boundary where the `create_slashing_program` feature flag is first |
| 49 | +activated the following behavior will be executed in the first block for the new |
| 50 | +epoch: |
| 51 | + |
| 52 | +1. Create a new program account at `S1ashing11111111111111111111111111111111111` |
| 53 | + with an upgrade authority set to the system program |
| 54 | + `11111111111111111111111111111111` |
| 55 | + |
| 56 | +2. Verify that the program account |
| 57 | + `8sT74BE7sanh4iT84EyVUL8b77cVruLHXGjvTyJ4GwCe` has a verified build hash of |
| 58 | + `<FILL IN AFTER IMPLEMENTATION>` [\[1\]](#notes) |
| 59 | + |
| 60 | +3. Copy the contents of `8sT74BE7sanh4iT84EyVUL8b77cVruLHXGjvTyJ4GwCe` into |
| 61 | + `S1ashing11111111111111111111111111111111111` |
| 62 | + |
| 63 | +This program (hereafter referred to as the slashing program) supports two |
| 64 | +instructions: `DuplicateBlockProof` and `CloseProofReport`. |
| 65 | + |
| 66 | +`DuplicateBlockProof` requires 1 account: |
| 67 | + |
| 68 | +0. `proof_account`, expected to be previously initialized with the proof data. |
| 69 | + |
| 70 | +`DuplicateBlockProof` has an instruction data of 49 bytes, containing: |
| 71 | + |
| 72 | +- `0x00`, a fixed-value byte acting as the instruction discriminator |
| 73 | +- `offset`, an unaligned eight-byte little-endian unsigned integer indicating |
| 74 | + the offset from which to read the proof |
| 75 | +- `slot`, an unaligned eight-byte little-endian unsigned integer indicating the |
| 76 | + slot in which the violation occurred |
| 77 | +- `node_pubkey`, an unaligned 32 byte array representing the public key of the |
| 78 | + node which committed the violation |
| 79 | + |
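The layout above can be sketched as a small parser. This is a minimal illustration rather than the program's actual code: the struct name, the function name, and the choice to reject any input whose length differs from the sum of the listed field sizes are assumptions made for the example.

```rust
use std::convert::TryInto;

/// Hypothetical in-memory form of the `DuplicateBlockProof` instruction data.
#[derive(Debug, PartialEq)]
pub struct DuplicateBlockProofInstruction {
    pub offset: u64,
    pub slot: u64,
    pub node_pubkey: [u8; 32],
}

/// Discriminator byte for `DuplicateBlockProof`.
const DUPLICATE_BLOCK_PROOF: u8 = 0x00;

/// Parses the fields in the order listed above. Returns `None` if the
/// discriminator is wrong or the length is not 1 + 8 + 8 + 32 bytes
/// (the strict length check is an assumption of this sketch).
pub fn parse_duplicate_block_proof(data: &[u8]) -> Option<DuplicateBlockProofInstruction> {
    const LEN: usize = 1 + 8 + 8 + 32;
    if data.len() != LEN || data[0] != DUPLICATE_BLOCK_PROOF {
        return None;
    }
    // All integers are unaligned little-endian, so we copy them out byte-wise.
    let offset = u64::from_le_bytes(data[1..9].try_into().ok()?);
    let slot = u64::from_le_bytes(data[9..17].try_into().ok()?);
    let node_pubkey: [u8; 32] = data[17..49].try_into().ok()?;
    Some(DuplicateBlockProofInstruction { offset, slot, node_pubkey })
}
```

Copying the integers out with `from_le_bytes` avoids any alignment requirement on the instruction data buffer.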
| 80 | +We expect the contents of the `proof_account` when read from `offset` to |
| 81 | +deserialize to a struct of two byte arrays representing the duplicate shreds. |
| 82 | +The first 4 bytes correspond to the length of the first shred, and the 4 bytes |
| 83 | +after that shred correspond to the length of the second shred. |
| 84 | + |
| 85 | +```rust |
| 86 | +struct DuplicateBlockProofData<'a> { |
| 87 | +    shred1_length: u32, // length of `shred1` in bytes |
| 88 | +    shred1: &'a [u8], |
| 89 | +    shred2_length: u32, // length of `shred2` in bytes |
| 90 | +    shred2: &'a [u8], |
| 91 | +} |
| 92 | +``` |
| 93 | + |
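As an illustration of reading this layout from `proof_account[offset..]`, the following sketch borrows the two shreds out of the account data. It assumes the 4-byte length prefixes are little-endian and that bytes after the second shred are ignored; neither is stated in this proposal. The function names are hypothetical, and the length fields are implied by the slice lengths rather than stored.

```rust
use std::convert::{TryFrom, TryInto};

/// Borrowed view of the two shreds stored in the proof account.
#[derive(Debug, PartialEq)]
pub struct DuplicateBlockProofData<'a> {
    pub shred1: &'a [u8],
    pub shred2: &'a [u8],
}

/// Reads one length-prefixed byte array and returns it with the remaining input.
/// Assumption: the 4-byte length prefix is little-endian.
fn read_prefixed(input: &[u8]) -> Option<(&[u8], &[u8])> {
    let len_bytes: [u8; 4] = input.get(..4)?.try_into().ok()?;
    let len = u32::from_le_bytes(len_bytes) as usize;
    let rest = &input[4..];
    if rest.len() < len {
        return None; // the shred is truncated
    }
    Some(rest.split_at(len))
}

/// Parses `proof_account[offset..]`. Returns `None` when `offset` is past the
/// end of the account or either shred is truncated.
pub fn parse_proof(proof_account: &[u8], offset: u64) -> Option<DuplicateBlockProofData<'_>> {
    let offset = usize::try_from(offset).ok()?;
    let input = proof_account.get(offset..)?;
    let (shred1, rest) = read_prefixed(input)?;
    let (shred2, _trailing) = read_prefixed(rest)?;
    Some(DuplicateBlockProofData { shred1, shred2 })
}
```

Returning borrowed slices keeps the sketch allocation-free; the shreds still have to pass the format checks listed in this section before they are used.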
| 94 | +`DuplicateBlockProof` aborts if: |
| 95 | + |
| 96 | +- The difference between the current slot and `slot` is greater than 1 epoch's |
| 97 | + worth of slots as reported by the `Clock` sysvar |
| 98 | +- `offset` is larger than the length of `proof_account` |
| 99 | +- `proof_account[offset..]` does not deserialize cleanly to a |
| 100 | + `DuplicateBlockProofData`. |
| 101 | +- The resulting shreds do not adhere to the Solana shred format [\[2\]](#notes) |
| 102 | + or are legacy shred variants. |
| 103 | +- The resulting shreds specify a slot that is different from `slot`. |
| 104 | +- The resulting shreds specify different shred versions. |
| 105 | + |
| 106 | +After deserialization the slashing program will attempt to verify the proof, by |
| 107 | +checking that `shred1` and `shred2` constitute a valid duplicate proof for |
| 108 | +`slot` and are correctly signed by `node_pubkey`. This is similar to logic used |
| 109 | +in Solana's gossip protocol to verify duplicate proofs for use in fork choice. |
| 110 | + |
| 111 | +### Proof verification |
| 112 | + |
| 113 | +`shred1` and `shred2` constitute a valid duplicate proof if any of the following |
| 114 | +conditions are met: |
| 115 | + |
| 116 | +- Both shreds specify the same index and shred type, however their payloads |
| 117 | + differ |
| 118 | +- Both shreds specify the same FEC set, however their merkle roots differ |
| 119 | +- Both shreds specify the same FEC set and are coding shreds, however their |
| 120 | + erasure configs conflict |
| 121 | +- At least one shred is a coding shred, and its erasure meta indicates an FEC set |
| 122 | + overlap. |
| 123 | +- The shreds are data shreds with different indices and the shred with the lower |
| 124 | + index has the `LAST_SHRED_IN_SLOT` flag set |
| 125 | + |
| 126 | +Note: We do not verify that `node_pubkey` was the leader for `slot`. Any node that |
| 127 | +willingly signs duplicate shreds for a slot that they are not a leader for is |
| 128 | +eligible for slashing. |
| 129 | + |
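To make the shape of these checks concrete, here is a simplified sketch covering three of the five conditions: conflicting payloads at the same index, conflicting merkle roots within an FEC set, and a data shred following one flagged `LAST_SHRED_IN_SLOT`. The `ShredView` struct is a hypothetical, already-parsed stand-in for a real shred, and the two erasure-related conditions are deliberately left out.

```rust
/// Hypothetical, already-parsed view of a shred. The real layout is defined by
/// the shred format referenced in the notes.
#[derive(Clone, Debug, PartialEq)]
pub struct ShredView {
    pub is_data: bool, // true for data shreds, false for coding shreds
    pub index: u32,
    pub fec_set_index: u32,
    pub merkle_root: [u8; 32],
    pub last_in_slot: bool, // `LAST_SHRED_IN_SLOT` flag, meaningful for data shreds
    pub payload: Vec<u8>,
}

/// Returns true if any of the three modelled conditions holds.
/// The erasure config and FEC set overlap conditions are not modelled here.
pub fn is_duplicate_proof(a: &ShredView, b: &ShredView) -> bool {
    // Same index and shred type, but the payloads differ.
    let same_index_conflict =
        a.index == b.index && a.is_data == b.is_data && a.payload != b.payload;
    // Same FEC set, but the merkle roots differ.
    let merkle_root_conflict =
        a.fec_set_index == b.fec_set_index && a.merkle_root != b.merkle_root;
    // Two data shreds where the lower index claims to be the last in the slot.
    let last_shred_conflict = a.is_data && b.is_data && a.index != b.index && {
        let lower = if a.index < b.index { a } else { b };
        lower.last_in_slot
    };
    same_index_conflict || merkle_root_conflict || last_shred_conflict
}
```

The check is symmetric in its arguments, so the order in which the two shreds appear in the proof does not matter in this sketch.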
| 130 | +--- |
| 131 | + |
| 132 | +### Signature verification |
| 133 | + |
| 134 | +In order to verify that `shred1` and `shred2` were correctly signed by |
| 135 | +`node_pubkey` we use instruction introspection. |
| 136 | + |
| 137 | +Using the `Instructions` sysvar we verify that the previous two instructions of |
| 138 | +this transaction are for the program ID |
| 139 | +`Ed25519SigVerify111111111111111111111111111` |
| 140 | + |
| 141 | +For each of these instructions, verify the instruction data: |
| 142 | + |
| 143 | +- The first byte (the number of signatures) is `0x01` |
| 144 | +- The second byte (padding) is `0x00` |
| 145 | + |
| 146 | +We then deserialize the remaining instruction data as a sequence of two-byte |
| 147 | +little-endian unsigned integers: |
| 148 | + |
| 149 | +```rust |
| 150 | +struct Ed25519SignatureOffsets { |
| 151 | + signature_offset: u16, // offset to ed25519 signature of 64 bytes |
| 152 | + signature_instruction_index: u16, // instruction index to find signature |
| 153 | + public_key_offset: u16, // offset to public key of 32 bytes |
| 154 | + public_key_instruction_index: u16, // instruction index to find public key |
| 155 | + message_data_offset: u16, // offset to start of message data |
| 156 | + message_data_size: u16, // size of message data |
| 157 | + message_instruction_index: u16, // index of instruction data to get message |
| 158 | + // data |
| 159 | +} |
| 160 | +``` |
| 161 | + |
| 162 | +We wish to verify that these instructions correspond to |
| 163 | + |
| 164 | +``` |
| 165 | +verify(pubkey = node_pubkey, message = shred1.merkle_root, signature = shred1.signature) |
| 166 | +verify(pubkey = node_pubkey, message = shred2.merkle_root, signature = shred2.signature) |
| 167 | +``` |
| 168 | + |
| 169 | +We use the deserialized offsets to calculate [\[3\]](#notes) the `pubkey`, |
| 170 | +`message`, and `signature` of each instruction and verify that they correspond |
| 171 | +to the `node_pubkey`, `merkle_root`, and `signature` specified by the shred payload. |
| 172 | + |
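The header check and offset lookup can be sketched as follows. This is a minimal illustration under one simplifying assumption: every `*_instruction_index` field is `u16::MAX`, which the Ed25519 program treats as "the data lives in this same instruction". The function names are hypothetical, and the comparison against `node_pubkey` and the shred's merkle root and signature is left to the caller.

```rust
/// Offsets header of an Ed25519 instruction, mirroring the struct above.
#[derive(Debug, PartialEq)]
pub struct Ed25519SignatureOffsets {
    pub signature_offset: u16,
    pub signature_instruction_index: u16,
    pub public_key_offset: u16,
    pub public_key_instruction_index: u16,
    pub message_data_offset: u16,
    pub message_data_size: u16,
    pub message_instruction_index: u16,
}

/// Marker meaning "this same instruction" in the `*_instruction_index` fields.
const CURRENT_INSTRUCTION: u16 = u16::MAX;

/// Checks the two leading bytes and reads the seven little-endian `u16` fields.
pub fn parse_offsets(data: &[u8]) -> Option<Ed25519SignatureOffsets> {
    if data.len() < 16 || data[0] != 0x01 || data[1] != 0x00 {
        return None;
    }
    let mut fields = data[2..16]
        .chunks_exact(2)
        .map(|c| u16::from_le_bytes([c[0], c[1]]));
    Some(Ed25519SignatureOffsets {
        signature_offset: fields.next()?,
        signature_instruction_index: fields.next()?,
        public_key_offset: fields.next()?,
        public_key_instruction_index: fields.next()?,
        message_data_offset: fields.next()?,
        message_data_size: fields.next()?,
        message_instruction_index: fields.next()?,
    })
}

/// Bounds-checked slice of `len` bytes starting at `start`.
fn slice_at(data: &[u8], start: u16, len: usize) -> Option<&[u8]> {
    let start = start as usize;
    data.get(start..start + len)
}

/// Returns `(pubkey, message, signature)`, assuming all three are stored in
/// `data` itself. Returns `None` for any other instruction index.
pub fn extract<'a>(
    data: &'a [u8],
    o: &Ed25519SignatureOffsets,
) -> Option<(&'a [u8], &'a [u8], &'a [u8])> {
    let indices = [
        o.signature_instruction_index,
        o.public_key_instruction_index,
        o.message_instruction_index,
    ];
    if indices.iter().any(|&i| i != CURRENT_INSTRUCTION) {
        return None;
    }
    let pubkey = slice_at(data, o.public_key_offset, 32)?;
    let message = slice_at(data, o.message_data_offset, o.message_data_size as usize)?;
    let signature = slice_at(data, o.signature_offset, 64)?;
    Some((pubkey, message, signature))
}
```

A caller would then compare the three returned slices with `node_pubkey`, the shred's merkle root, and the shred's signature, and abort on any mismatch.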
| 173 | +If both proof and signer verification succeed, we continue on to store the incident. |
| 174 | + |
| 175 | +--- |
| 176 | + |
| 177 | +### Incident reporting |
| 178 | + |
| 179 | +After verifying a successful proof we store the results in a program derived |
| 180 | +address for future use. The PDA is derived using the `node_pubkey`, `slot`, and |
| 181 | +the violation type: |
| 182 | + |
| 183 | +```rust |
| 184 | +let (pda, _) = Pubkey::find_program_address(&[ |
| 185 | +    node_pubkey.as_ref(), |
| 186 | +    &slot.to_le_bytes(), |
| 187 | +    &[ViolationType::DuplicateBlock.to_u8()], |
| 188 | +], &slashing_program_id); // assumed name for the slashing program's address |
| 189 | +``` |
| 190 | + |
| 191 | +At the moment `DuplicateBlock` is the only violation type but future work will |
| 192 | +add additional slashing types. |
| 193 | + |
| 194 | +If the `pda` account has non-zero lamports, then we abort as the violation has |
| 195 | +already been reported. Otherwise we create the account, with the slashing program |
| 196 | +as the owner. In this account we store the following: |
| 197 | + |
| 198 | +```rust |
| 199 | +struct ProofReport { |
| 200 | + reporter: Pubkey, // Fee payer, to allow the account to be closed |
| 201 | + epoch: Epoch, // Epoch in which this report was created |
| 202 | + pubkey: Pubkey, // The pubkey of the node that committed the violation |
| 203 | + slot: Slot, // Slot in which the violation occurred |
| 204 | + violation_type: u8, // The violation type |
| 205 | + proof: Vec<u8>, // The serialized proof |
| 206 | + proof_account: Option<Pubkey>, // Optional account where proof is stored instead |
| 207 | +} |
| 208 | +``` |
| 209 | + |
| 210 | +The `DuplicateBlockProofData` is serialized into the `proof` field. This provides |
| 211 | +an on-chain trail of the reporting process, since the `proof_account` supplied to |
| 212 | +the `DuplicateBlockProof` instruction could later be modified. |
| 213 | + |
| 214 | +The `pubkey` is populated with the `node_pubkey`. For future violation types that |
| 215 | +involve votes, this will instead be populated with the vote account's pubkey. |
| 216 | +The work in SIMD-0180 will allow the `node_pubkey` to be translated to a vote account |
| 217 | +if needed. |
| 218 | + |
| 219 | +Note that PDAs can only be created with an initial size of up to 10 KiB. |
| 220 | +Although not a problem for `DuplicateBlockProofData`, if future proof types require |
| 221 | +more space, we allow the proof to be stored in a separate account, and linked back |
| 222 | +to the PDA using the `proof_account` field. |
| 223 | + |
| 224 | +--- |
| 225 | + |
| 226 | +### Closing the incident report |
| 227 | + |
| 228 | +After the slashing violation has been processed by the runtime, the initial fee |
| 229 | +payer may wish to close their `ProofReport` account to reclaim the lamports. |
| 230 | + |
| 231 | +They can accomplish this via the `CloseProofReport` instruction which requires |
| 232 | +2 accounts: |
| 233 | + |
| 234 | +0. `report_account`: The PDA account storing the report: Writable, owned by the |
| 235 | + slashing program |
| 236 | +1. `destination`: Writable account to reclaim the lamports |
| 237 | + |
| 238 | +`CloseProofReport` has an instruction data of 42 bytes, containing: |
| 239 | + |
| 240 | +- `0x01`, a fixed-value byte acting as the instruction discriminator |
| 241 | +- `violation_type`, a one byte value acting as the violation type discriminator |
| 242 | +- `slot`, an unaligned eight-byte little-endian unsigned integer indicating the |
| 243 | + slot which was reported |
| 244 | +- `pubkey`, an unaligned 32 byte array representing the public key of the node |
| 245 | + which was reported |
| 246 | + |
| 247 | +We abort if: |
| 248 | + |
| 249 | +- `violation_type` is not `0x00` (corresponds to `DuplicateBlock` violation) |
| 250 | +- Deriving the pda using `pubkey`, `slot`, and `ViolationType::DuplicateBlock` |
| 251 | + as outlined above does not result in the address of `report_account` |
| 252 | +- `report_account` is not writable |
| 253 | +- `report_account` does not deserialize cleanly to `ProofReport` |
| 254 | +- `report_account.reporter` is not a signer |
| 255 | +- `report_account.epoch + 3` is greater than the current epoch reported from |
| 256 | + the `Clock` sysvar. We want to ensure that these accounts do not get closed before |
| 257 | + they are observed by indexers and dashboards. |
| 258 | + |
| 259 | +Otherwise we close the `report_account` and credit the `lamports` to `destination`. |
| 260 | + |
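The epoch-based retention rule in the abort conditions can be expressed as a single predicate. The sketch below is illustrative only: the function and constant names are hypothetical, and treating an overflowing `epoch + 3` as "not closable" is an assumption of the example.

```rust
/// Number of epochs a report must remain on chain before it may be closed.
const RETENTION_EPOCHS: u64 = 3;

/// Returns true when the report may be closed, that is, when
/// `report_epoch + 3` is not greater than `current_epoch`.
pub fn may_close(report_epoch: u64, current_epoch: u64) -> bool {
    match report_epoch.checked_add(RETENTION_EPOCHS) {
        Some(earliest_close_epoch) => earliest_close_epoch <= current_epoch,
        // Assumption: an overflowing sum is treated as "never closable".
        None => false,
    }
}
```

For example, a report created in epoch 10 stays open through epoch 12 and can first be closed in epoch 13.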
| 261 | +--- |
| 262 | + |
| 263 | +## Impact |
| 264 | + |
| 265 | +A new program will be enshrined at `S1ashing11111111111111111111111111111111111`. |
| 266 | + |
| 267 | +Reports stored in PDAs of this program might be queried for dashboards which could |
| 268 | +incur additional indexing overhead for RPC providers. |
| 269 | + |
| 270 | +## Security Considerations |
| 271 | + |
| 272 | +None |
| 273 | + |
| 274 | +## Drawbacks |
| 275 | + |
| 276 | +None |
| 277 | + |
| 278 | +## Backwards Compatibility |
| 279 | + |
| 280 | +The feature is not backwards compatible. |
| 281 | + |
| 282 | +## Notes |
| 283 | + |
| 284 | +\[1\]: Sha256 of program data, see |
| 285 | + https://github.com/Ellipsis-Labs/solana-verifiable-build/blob/214ba849946be0f7ec6a13d860f43afe125beea3/src/main.rs#L331 |
| 286 | + for details. |
| 287 | + |
| 288 | +\[2\]: The slashing program will support any combination of merkle shreds, chained |
| 289 | + merkle shreds, and retransmitter signed chained merkle shreds, see https://github.com/anza-xyz/agave/blob/4e7f7f76f453e126b171c800bbaca2cb28637535/ledger/src/shred.rs#L6 |
| 290 | + for the full specification. |
| 291 | + |
| 292 | +\[3\]: Example of offset calculation can be found here https://docs.solanalabs.com/runtime/programs#ed25519-program |