|
| 1 | +``` |
| 2 | + BIP: tbd |
| 3 | + Layer: Consensus (soft fork) |
| 4 | + Title: OP_TXHASH and OP_CHECKTXHASHVERIFY |
| 5 | + Author: Steven Roose <[email protected]> |
| 6 | + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-tbd |
| 7 | + Status: Draft |
| 8 | + Type: Standards Track |
| 9 | + Created: 2023-09-03 |
| 10 | + License: BSD-3-Clause |
| 11 | +``` |
| 12 | + |
| 13 | +# Abstract |
| 14 | + |
| 15 | +This BIP proposes two new opcodes: `OP_CHECKTXHASHVERIFY`, to be activated |
| 16 | +as a change to the semantics of `OP_NOP4` in legacy script, segwit and tapscript; |
| 17 | +and `OP_TXHASH`, to be activated as a change to the semantics of `OP_SUCCESS189` |
| 18 | +in tapscript only. |
| 19 | + |
| 20 | +These opcodes provide a generalized method for introspecting certain details of |
| 21 | +the spending transaction, which enables non-interactive enforcement of certain |
| 22 | +properties of the transaction spending a certain UTXO. |
| 23 | + |
| 24 | +The constructions specified in this BIP also open up the way for other |
| 25 | +potential updates; see the Motivation section for more details. |
| 26 | + |
| 27 | + |
| 28 | +# Summary |
| 29 | + |
| 30 | +## OP_CHECKTXHASHVERIFY |
| 31 | + |
| 32 | +The first new opcode, `OP_CHECKTXHASHVERIFY`, redefines the `OP_NOP4` opcode (`0xb3`) as a soft fork upgrade. |
| 33 | + |
| 34 | +It has the following semantics: |
| 35 | + |
| 36 | +* There is at least one element on the stack, fail otherwise. |
| 37 | +* The element on the stack is at least 32 bytes long, fail otherwise. |
| 38 | +* The first 32 bytes are interpreted as the TxHash and the remaining suffix bytes specify the TxFieldSelector. |
| 39 | +* If the TxFieldSelector is invalid, fail. |
| 40 | +* The actual TxHash of the transaction at the current input index, calculated |
| 41 | + using the given TxFieldSelector, must be equal to the first 32 bytes of the |
| 42 | + element on the stack, fail otherwise. |
| 43 | + |
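The `OP_CHECKTXHASHVERIFY` steps above can be sketched as follows. This is a minimal illustration, not the reference implementation: `calculate_tx_hash` is a hypothetical stand-in for the CalculateTxHash function (assumed to raise on an invalid TxFieldSelector), and the function names are made up for this example. Since the text does not say the element is popped, the sketch leaves it on the stack.

```python
from typing import Callable, List, Tuple


def split_ctxhv_element(element: bytes) -> Tuple[bytes, bytes]:
    """Split the stack element into (TxHash, TxFieldSelector)."""
    if len(element) < 32:
        raise ValueError("element is shorter than 32 bytes")
    # First 32 bytes: expected TxHash. Remaining suffix: TxFieldSelector,
    # which may be empty (the default selector).
    return element[:32], element[32:]


def check_tx_hash_verify(
    stack: List[bytes],
    calculate_tx_hash: Callable[[bytes], bytes],  # hypothetical helper
) -> None:
    """Fail (raise) unless the top stack element commits to the transaction."""
    if not stack:
        raise ValueError("stack is empty")
    expected, selector = split_ctxhv_element(stack[-1])
    if calculate_tx_hash(selector) != expected:
        raise ValueError("TxHash mismatch")
    # Assumption: the element stays on the stack, as the text does not pop it.
```

`OP_TXHASH` differs in that it pops the selector and pushes the computed hash instead of comparing it.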
| 44 | + |
| 45 | +## OP_TXHASH |
| 46 | + |
| 47 | +The second new opcode, `OP_TXHASH`, redefines the `OP_SUCCESS189` tapscript opcode (`0xbd`) as a soft fork upgrade. |
| 48 | + |
| 49 | +It has the following semantics: |
| 50 | + |
| 51 | +* There is at least one element on the stack, fail otherwise. |
| 52 | +* The element is interpreted as the TxFieldSelector and is popped off the stack. |
| 53 | +* If the TxFieldSelector is invalid, fail. |
| 54 | +* The 32-byte TxHash of the transaction at the current input index, calculated |
| 55 | + using the given TxFieldSelector, is pushed onto the stack. |
| 56 | + |
| 57 | +## TxFieldSelector |
| 58 | + |
| 59 | +The TxFieldSelector has the following semantics. We will give a brief conceptual |
| 60 | +summary, followed by a reference implementation of the CalculateTxHash function. |
| 61 | + |
| 62 | +* There are two special cases for the TxFieldSelector: |
| 63 | + * the empty value, zero bytes long: it is set equal to `TXFS_SPECIAL_TEMPLATE`, |
| 64 | + the de facto default value, which selects everything except the prevouts and the prevout |
| 65 | + scriptPubkeys.<br>Special case `TXFS_SPECIAL_TEMPLATE` is 4 bytes long, as follows: |
| 66 | + * 1. `TXFS_ALL` |
| 67 | + * 2. `TXFS_INPUTS_TEMPLATE | TXFS_OUTPUTS_ALL` |
| 68 | + * 3. `TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL` |
| 69 | + * 4. `TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL` |
| 70 | + * the `0x00` byte: it is set equal to `TXFS_SPECIAL_ALL`, which means "ALL" and is primarily |
| 71 | + useful to emulate `SIGHASH_ALL` when `OP_TXHASH` is used in combination |
| 72 | + with `OP_CHECKSIGFROMSTACK`.<br>Special case `TXFS_SPECIAL_ALL` is 4 |
| 73 | + bytes long, as follows: |
| 74 | + * 1. `TXFS_ALL` |
| 75 | + * 2. `TXFS_INPUTS_ALL | TXFS_OUTPUTS_ALL` |
| 76 | + * 3. `TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL` |
| 77 | + * 4. `TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL` |
| 78 | + |
| 79 | +* The first byte of the TxFieldSelector has its 8 bits assigned as follows, from lowest to highest: |
| 80 | + * 1. version (`TXFS_VERSION`) |
| 81 | + * 2. locktime (`TXFS_LOCKTIME`) |
| 82 | + * 3. current input index (`TXFS_CURRENT_INPUT_IDX`) |
| 83 | + * 4. current input control block (or empty) (`TXFS_CURRENT_INPUT_CONTROL_BLOCK`) |
| 84 | + * 5. current script last `OP_CODESEPARATOR` position (or 0xffffffff) |
| 85 | + (`TXFS_CURRENT_INPUT_LAST_CODESEPARATOR_POS`) |
| 86 | + * 6. inputs (`TXFS_INPUTS`) |
| 87 | + * 7. outputs (`TXFS_OUTPUTS`) |
| 88 | + |
| 89 | +* The last (highest) bit of the first byte (`TXFS_CONTROL`), we will call the |
| 90 | + "control bit", and it can be used to control the behavior of the opcode. For |
| 91 | + `OP_TXHASH` and `OP_CHECKTXHASHVERIFY`, the control bit is used to determine |
| 92 | + whether the TxFieldSelector itself has to be included in the resulting hash. |
| 93 | + (For potential other uses of the TxFieldSelector (like a hypothetical |
| 94 | + `OP_TX`), this bit can be repurposed.) |
| 95 | + |
| 96 | +* If either "inputs" or "outputs" is set to 1, expect another byte with its 8 |
| 97 | + bits assigning the following variables, from lowest to highest: |
| 98 | + * Specifying which fields of the inputs will be selected: |
| 99 | + * 1. prevouts (`TXFS_INPUTS_PREVOUTS`) |
| 100 | + * 2. sequences (`TXFS_INPUTS_SEQUENCES`) |
| 101 | + * 3. scriptSigs (`TXFS_INPUTS_SCRIPTSIGS`) |
| 102 | + * 4. prevout scriptPubkeys (`TXFS_INPUTS_PREV_SCRIPTPUBKEYS`) |
| 103 | + * 5. prevout values (`TXFS_INPUTS_PREV_VALUES`) |
| 104 | + * 6. taproot annexes (`TXFS_INPUTS_TAPROOT_ANNEXES`) |
| 105 | + |
| 106 | + * Specifying which fields of the outputs will be selected: |
| 107 | + * 7. scriptPubkeys (`TXFS_OUTPUTS_SCRIPTPUBKEYS`) |
| 108 | + * 8. values (`TXFS_OUTPUTS_VALUES`) |
| 109 | + |
| 110 | +* We define the following constants: |
| 111 | + * `TXFS_ALL = TXFS_VERSION | TXFS_LOCKTIME | TXFS_CURRENT_INPUT_IDX | TXFS_CURRENT_INPUT_CONTROL_BLOCK | TXFS_CURRENT_INPUT_LAST_CODESEPARATOR_POS | TXFS_INPUTS | TXFS_OUTPUTS | TXFS_CONTROL` |
| 112 | + * `TXFS_INPUTS_ALL = TXFS_INPUTS_PREVOUTS | TXFS_INPUTS_SEQUENCES | TXFS_INPUTS_SCRIPTSIGS | TXFS_INPUTS_PREV_SCRIPTPUBKEYS | TXFS_INPUTS_PREV_VALUES | TXFS_INPUTS_TAPROOT_ANNEXES` |
| 113 | + * `TXFS_INPUTS_TEMPLATE = TXFS_INPUTS_SEQUENCES | TXFS_INPUTS_SCRIPTSIGS | TXFS_INPUTS_PREV_VALUES | TXFS_INPUTS_TAPROOT_ANNEXES` |
| 114 | + * `TXFS_OUTPUTS_ALL = TXFS_OUTPUTS_SCRIPTPUBKEYS | TXFS_OUTPUTS_VALUES` |
| 115 | + |
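As a rough illustration, the flag definitions above can be written out as bit masks. The numeric values below are derived from the "lowest to highest" bit ordering given in the text and are an assumption of this sketch, not a copy of the reference implementation; the `TXFS_INOUT_*` values are the ones the text gives for the in/output selection byte.

```python
# First byte: global fields, lowest bit first.
TXFS_VERSION = 1 << 0
TXFS_LOCKTIME = 1 << 1
TXFS_CURRENT_INPUT_IDX = 1 << 2
TXFS_CURRENT_INPUT_CONTROL_BLOCK = 1 << 3
TXFS_CURRENT_INPUT_LAST_CODESEPARATOR_POS = 1 << 4
TXFS_INPUTS = 1 << 5
TXFS_OUTPUTS = 1 << 6
TXFS_CONTROL = 1 << 7

# Second byte: input fields (bits 1-6) and output fields (bits 7-8).
TXFS_INPUTS_PREVOUTS = 1 << 0
TXFS_INPUTS_SEQUENCES = 1 << 1
TXFS_INPUTS_SCRIPTSIGS = 1 << 2
TXFS_INPUTS_PREV_SCRIPTPUBKEYS = 1 << 3
TXFS_INPUTS_PREV_VALUES = 1 << 4
TXFS_INPUTS_TAPROOT_ANNEXES = 1 << 5
TXFS_OUTPUTS_SCRIPTPUBKEYS = 1 << 6
TXFS_OUTPUTS_VALUES = 1 << 7

# In/output selection byte values as stated in the text.
TXFS_INOUT_NUMBER = 0x80
TXFS_INOUT_SELECTION_ALL = 0x3F

TXFS_ALL = (
    TXFS_VERSION | TXFS_LOCKTIME | TXFS_CURRENT_INPUT_IDX
    | TXFS_CURRENT_INPUT_CONTROL_BLOCK
    | TXFS_CURRENT_INPUT_LAST_CODESEPARATOR_POS
    | TXFS_INPUTS | TXFS_OUTPUTS | TXFS_CONTROL
)
TXFS_INPUTS_ALL = (
    TXFS_INPUTS_PREVOUTS | TXFS_INPUTS_SEQUENCES | TXFS_INPUTS_SCRIPTSIGS
    | TXFS_INPUTS_PREV_SCRIPTPUBKEYS | TXFS_INPUTS_PREV_VALUES
    | TXFS_INPUTS_TAPROOT_ANNEXES
)
TXFS_INPUTS_TEMPLATE = (
    TXFS_INPUTS_SEQUENCES | TXFS_INPUTS_SCRIPTSIGS
    | TXFS_INPUTS_PREV_VALUES | TXFS_INPUTS_TAPROOT_ANNEXES
)
TXFS_OUTPUTS_ALL = TXFS_OUTPUTS_SCRIPTPUBKEYS | TXFS_OUTPUTS_VALUES

# The two 4-byte special cases: the empty selector and the single 0x00 byte.
TXFS_SPECIAL_TEMPLATE = bytes([
    TXFS_ALL,
    TXFS_INPUTS_TEMPLATE | TXFS_OUTPUTS_ALL,
    TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL,
    TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL,
])
TXFS_SPECIAL_ALL = bytes([
    TXFS_ALL,
    TXFS_INPUTS_ALL | TXFS_OUTPUTS_ALL,
    TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL,
    TXFS_INOUT_NUMBER | TXFS_INOUT_SELECTION_ALL,
])
```

Under these assumed bit positions, the two special selectors differ only in their second byte, where the template variant leaves out the prevouts and the prevout scriptPubkeys.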
| 116 | +For both inputs and then outputs, do the following: |
| 117 | + |
| 118 | +* If the "in/outputs" field is set to 1, another additional byte is expected: |
| 119 | + * The highest bit (`TXFS_INOUT_NUMBER`) indicates whether the "number of in-/outputs" |
| 120 | + should be committed to. |
| 121 | + * For the remaining bits, there are three exceptional values: |
| 122 | + * `0x00` (`TXFS_INOUT_SELECTION_NONE`) means "no in/outputs" |
| 123 | + (hence only the number of them as `0x80` (`TXFS_INOUT_NUMBER`)). |
| 124 | + * `0x40` (`TXFS_INOUT_SELECTION_CURRENT`) means "select only the in/output of the current input index" |
| 125 | + (for outputs, it is invalid when there is no output at the current input index). |
| 126 | + * `0x3f` (`TXFS_INOUT_SELECTION_ALL`) means "select all in/outputs". |
| 127 | + |
| 128 | + * The second highest bit (`TXFS_INOUT_SELECTION_MODE`) is the "specification mode": |
| 129 | + * Set to 0 it means "leading mode". |
| 130 | + * Set to 1 it means "individual mode". |
| 131 | + * In "leading mode", the third highest bit (`TXFS_INOUT_SELECTION_SIZE`) is |
| 132 | + used to indicate the "count size", i.e. the number of bytes used to |
| 133 | + represent the number of in/outputs. |
| 134 | + * With "count size" set to 0, the remaining lowest 5 bits of the first byte will |
| 135 | + be interpreted as the number of leading in/outputs to select. |
| 136 | + * With "count size" set to 1, the remaining lowest 5 bits of the first byte together with the |
| 137 | + 8 bits of the next byte will be interpreted as the number of leading in/outputs to select. |
| 138 | + * In "individual mode", the remaining lowest 6 bits of the first byte will be |
| 139 | + interpreted as `n`, the number of individual in/outputs to select. For each |
| 140 | + individual in/output, (at least) one byte is expected. Of this byte, the |
| 141 | + highest bit is used to indicate whether the index is absolute or relative. |
| 142 | + * If the highest bit is set to 0, it is an absolute index. The second |
| 143 | + highest bit is used to indicate the number of bytes used to represent |
| 144 | + the index. |
| 145 | + * If the second-highest bit is 0, the remaining 6 bits represent the index to be selected. |
| 146 | + * If the second-highest bit is 1, the remaining 6 bits, together with the 8 bits of the next |
| 147 | + byte, represent the index to be selected. |
| 148 | + * If the highest bit is set to 1, it is a relative index. The second highest bit is used to |
| 149 | + indicate the sign of the index. |
| 150 | + * If the second-highest bit is set to 0, the remaining 6 bits represent the positive relative |
| 151 | + index to be selected. |
| 152 | + * If the second-highest bit is set to 1, the remaining 6 bits represent the negative relative |
| 153 | + index to be selected. |
| 154 | + |
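The in/output selection byte described above can be decoded roughly as follows. This is a sketch under stated assumptions: the function name and return shape are made up for illustration, multi-byte counts and indices are assumed to be big-endian (first-byte bits are the high-order bits), which the text does not specify, and the validity rules (minimal encoding, ordering, bounds) are not enforced here.

```python
from typing import List, Tuple

TXFS_INOUT_NUMBER = 0x80
TXFS_INOUT_SELECTION_NONE = 0x00
TXFS_INOUT_SELECTION_CURRENT = 0x40
TXFS_INOUT_SELECTION_ALL = 0x3F
TXFS_INOUT_SELECTION_MODE = 0x40
TXFS_INOUT_SELECTION_SIZE = 0x20


def _byte_at(data: bytes, pos: int) -> int:
    if pos >= len(data):
        raise ValueError("a byte is expected but missing")
    return data[pos]


def parse_inout_selector(data: bytes) -> Tuple[bool, tuple, int]:
    """Return (commit_to_number, selection, bytes_consumed)."""
    first = _byte_at(data, 0)
    commit_number = bool(first & TXFS_INOUT_NUMBER)
    rest = first & 0x7F

    # The three exceptional values are checked before the general modes.
    if rest == TXFS_INOUT_SELECTION_NONE:
        return commit_number, ("none",), 1
    if rest == TXFS_INOUT_SELECTION_CURRENT:
        return commit_number, ("current",), 1
    if rest == TXFS_INOUT_SELECTION_ALL:
        return commit_number, ("all",), 1

    if not rest & TXFS_INOUT_SELECTION_MODE:
        # Leading mode: 5-bit count, optionally extended by one more byte.
        count = rest & 0x1F
        if rest & TXFS_INOUT_SELECTION_SIZE:
            count = (count << 8) | _byte_at(data, 1)  # assumed big-endian
            return commit_number, ("leading", count), 2
        return commit_number, ("leading", count), 1

    # Individual mode: the low 6 bits give n, followed by n encoded indices.
    n = rest & 0x3F
    pos = 1
    indices: List[Tuple[str, int]] = []
    for _ in range(n):
        b = _byte_at(data, pos)
        pos += 1
        low = b & 0x3F
        if b & 0x80:
            # Relative index; the second-highest bit is the sign.
            indices.append(("rel", -low if b & 0x40 else low))
        elif b & 0x40:
            # Absolute index spanning two bytes (assumed big-endian).
            indices.append(("abs", (low << 8) | _byte_at(data, pos)))
            pos += 1
        else:
            indices.append(("abs", low))
    return commit_number, ("individual", indices), pos
```

For example, `parse_inout_selector(bytes([0x42, 0x01, 0x81]))` selects two individual in/outputs: absolute index 1 and the one directly after the current input index.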
| 155 | +Effectively, this allows a user to select: |
| 156 | +* all in/outputs |
| 157 | +* the current input index |
| 158 | +* the leading in/outputs up to 8192 |
| 159 | +* up to 64 individually selected in/outputs |
| 160 | +  * using absolute indices up to 16384 |
| 161 | +  * using indices relative to the current input index from -64 to +64. |
| 162 | + |
| 163 | +The TxFieldSelector is invalid when |
| 164 | +* a byte is expected but missing |
| 165 | +* additional unexpected bytes are present |
| 166 | +* the count size or index size is set to 1 while not being necessary |
| 167 | +* a leading number or individual index is selected out of bounds of the in/outputs |
| 168 | +* individual indices are duplicated or not in increasing order |
| 169 | + |
| 170 | +These limitations are to avoid potential TxFieldSelector malleability. It is |
| 171 | +however allowed to use leading mode where it could be "all". This |
| 172 | +is important to allow for optional addition of extra inputs or outputs. |
| 173 | + |
| 174 | +//TODO(stevenroose) should we disallow individual that could be leading? |
| 175 | + |
| 176 | + |
| 177 | +## Resource limits |
| 178 | + |
| 179 | +* For legacy scripts and segwit, we don't add any extra resource limitations, |
| 180 | + on the grounds that `OP_CHECKTXHASHVERIFY` already requires the user |
| 181 | + to provide at least 32 bytes of extra transaction size, either in the input |
| 182 | + scriptSig, or the witness. Additional more complex hashes require additional |
| 183 | + witness bytes. Given that `OP_CAT` is not available in this context, if a |
| 184 | + malicious user tries to increase the number of TransactionHashes being |
| 185 | + calculated by using opcodes like `OP_DUP`, the TxFieldSelector for all these |
| 186 | + calculations is identical, so the calculation can be cached within the same |
| 187 | + transaction. |
| 188 | + |
| 189 | +* For tapscript, primarily motivated by the cheaper opcode `OP_TXHASH` (it |
| 190 | + doesn't require an additional 32 witness bytes be provided) and the potential |
| 191 | + future addition of byte manipulation opcodes like `OP_CAT`, an additional |
| 192 | + cost is specified per TransactionHash execution. Using the same validation |
| 193 | + budget ("sigops budget") introduced in BIP-0342, each TransactionHash |
| 194 | + decreases the validation budget by 10. If this brings the budget below zero, |
| 195 | + the script fails immediately.<br>The following considerations should be made: |
| 196 | + * All fields that can be of arbitrary size are cacheable as TransactionHash always hashes their hashed values. |
| 197 | + * In "individual mode", a user can at most commit 32 inputs or outputs, |
| 198 | + which we don't consider excessive for potential repeated use. |
| 199 | + * In "leading mode", a caching strategy can be used where the SHA256 context |
| 200 | + is stored every N in/outputs so that multiple executions of the |
| 201 | + TransactionHash function can use the caches and only have to hash an |
| 202 | + additional N-1 items at most. |
| 203 | + |
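The tapscript cost rule and the "leading mode" caching strategy above can be sketched as follows. This is only an illustration: the class and function names are hypothetical, the items are hashed by plain concatenation rather than with the real TxHash serialization, and the actual implementations may cache differently.

```python
import hashlib
from typing import List

TXHASH_BUDGET_COST = 10  # per TransactionHash execution, as stated in the text


def charge_txhash(budget: int) -> int:
    """Deduct one TransactionHash from the validation budget; fail below zero."""
    budget -= TXHASH_BUDGET_COST
    if budget < 0:
        raise ValueError("validation budget exceeded")
    return budget


class LeadingHashCache:
    """Stores a SHA256 context every `step` items so that hashing any number
    of leading items needs at most `step - 1` additional updates."""

    def __init__(self, items: List[bytes], step: int) -> None:
        self.items = items
        self.step = step
        ctx = hashlib.sha256()
        # checkpoints[k] holds the context after k * step items.
        self.checkpoints = [ctx.copy()]
        for i, item in enumerate(items, start=1):
            ctx.update(item)
            if i % step == 0:
                self.checkpoints.append(ctx.copy())

    def hash_leading(self, count: int) -> bytes:
        """Hash of the first `count` items, resumed from the nearest checkpoint."""
        if count > len(self.items):
            raise ValueError("leading count out of bounds")
        k = count // self.step
        ctx = self.checkpoints[k].copy()
        for item in self.items[k * self.step:count]:
            ctx.update(item)
        return ctx.digest()
```

Copying a stored context with `hashlib`'s `copy()` leaves the checkpoint untouched, so repeated executions with different leading counts can all resume from the same cache.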
| 204 | + |
| 205 | +# Motivation |
| 206 | + |
| 207 | +This BIP specifies a basic transaction introspection primitive that is useful |
| 208 | +to either reduce interactivity in multi-user protocols or to enforce some basic |
| 209 | +constraints on transactions. |
| 210 | + |
| 211 | +Additionally, the constructions specified in this BIP can lay the groundwork for |
| 212 | +some potential future upgrades: |
| 213 | +* The TxFieldSelector construction would work well with a hypothetical opcode |
| 214 | + `OP_TX` that allows for directly introspecting the transaction by putting the |
| 215 | + fields selected on the stack instead of hashing them together. |
| 216 | +* The TransactionHash obtained by `OP_TXHASH` can be combined with a |
| 217 | + hypothetical opcode `OP_CHECKSIGFROMSTACK` to effectively create an |
| 218 | + incredibly flexible signature hash, which would enable constructions like |
| 219 | + `SIGHASH_ANYPREVOUT`. |
| 220 | + |
| 221 | +## Comparing with some alternative proposals |
| 222 | + |
| 223 | +* This proposal strictly generalizes BIP-119's `OP_CHECKTEMPLATEVERIFY`, as the |
| 224 | + default mode of our TxFieldSelector is effectively the same (though not |
| 225 | + byte-for-byte identical) as what `OP_CTV` accomplishes, without costing any |
| 226 | + additional bytes. Additionally, using `OP_CHECKTXHASHVERIFY` allows for more |
| 227 | + flexibility, which can help with |
| 228 | + * enabling adding fees to a transaction without breaking a multi-tx protocol; |
| 229 | + * multi-user protocols where users are only concerned about their own inputs and outputs. |
| 230 | + |
| 231 | +* Constructions like `OP_IN_OUT_VALUE` used with `OP_EQUALVERIFY` can be |
| 232 | + emulated by two `OP_TXHASH` instances by using the TxFieldSelector to select |
| 233 | + a single input value first and a single output value second and enforcing |
| 234 | + equality on the hashes. Neither of these alternatives can be used to enforce |
| 235 | + small value differentials without the availability of 64-bit arithmetic in |
| 236 | + Script. |
| 237 | + |
| 238 | +* As mentioned above, `SIGHASH_ANYPREVOUT` can be emulated using `OP_TXHASH` |
| 239 | + when combined with `OP_CHECKSIGFROMSTACK`: |
| 240 | + `<txfs> OP_TXHASH <pubkey> OP_CHECKSIGFROMSTACK` effectively emulates `SIGHASH_ANYPREVOUT`. |
| 241 | + |
| 242 | + |
| 243 | +# Detailed Specification |
| 244 | + |
| 245 | +A reference implementation in Rust is attached as part of this BIP |
| 246 | +together with a JSON file of test vectors generated using the reference |
| 247 | +implementation. |
| 248 | + |
| 249 | + |
| 250 | +# Implementation |
| 251 | + |
| 252 | +* A proposed implementation for Bitcoin Core is available here: |
| 253 | + https://github.com/bitcoin/bitcoin/pull/29050 |
| 254 | +* A proposed implementation for rust-bitcoin is available here: |
| 255 | + https://github.com/rust-bitcoin/rust-bitcoin/pull/2275 |
| 256 | + |
| 257 | +Both of the above implementations perform effective caching to avoid potential |
| 258 | +denial-of-service attack vectors. |
| 259 | + |
| 260 | + |
| 261 | +# Acknowledgement |
| 262 | + |
| 263 | +Credit for this proposal mostly goes to Jeremy Rubin for his work on BIP-119's |
| 264 | +`OP_CHECKTEMPLATEVERIFY` and to Russell O'Connor for the original idea of |
| 265 | +generalizing `OP_CHECKTEMPLATEVERIFY` into `OP_TXHASH`. |
| 266 | + |
| 267 | +Additional thanks to Andrew Poelstra, Greg Sanders, Rearden Code, Rusty Russell |
| 268 | +and others for their feedback on the specification. |
| 269 | + |