From c5f1a3cd899b5ea9f07d46a47c09037a379cedb5 Mon Sep 17 00:00:00 2001 From: "Andreas M. Antonopoulos" Date: Sat, 26 Nov 2016 11:12:58 -0300 Subject: [PATCH] rewriteoutputs and inputs, reorg flow --- ch06.asciidoc | 161 +++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 140 insertions(+), 21 deletions(-) diff --git a/ch06.asciidoc b/ch06.asciidoc index 299569ce..e74512ef 100644 --- a/ch06.asciidoc +++ b/ch06.asciidoc @@ -25,6 +25,8 @@ Behind the scenes, an actual transaction looks very different from a transaction We can use Bitcoin Core's command-line interface (+getrawtransaction+ and +decoderawtransaction+) to retrieve Alice's "raw" transaction, decode it and see what it contains. The result looks like this: +[[alice_tx]] +.Alice's transaction decoded [source,json] ---- { @@ -58,15 +60,15 @@ You may also notice a lot of strange and indecipherable fields and hexadecimal s [[tx_inputs_outputs]] === Transaction Outputs and Inputs -((("transactions","unspent transaction output (UTXO)")))((("unspent transaction output (UTXO)")))The fundamental building block of a bitcoin transaction is a _transaction output_. Transaction outputs are indivisible chunks of bitcoin currency, recorded on the blockchain, and recognized as valid by the entire network. Bitcoin full nodes track all available and spendable outputs, known as _Unspent Transaction Outputs_ or _UTXO_. The collection of all UTXO is known as the UTXO Set and currently numbers in the millions of UTXO. +((("transactions","unspent transaction output (UTXO)")))((("unspent transaction output (UTXO)")))The fundamental building block of a bitcoin transaction is a _transaction output_. Transaction outputs are indivisible chunks of bitcoin currency, recorded on the blockchain, and recognized as valid by the entire network. Bitcoin full nodes track all available and spendable outputs, known as _Unspent Transaction Outputs_ or _UTXO_. 
The collection of all UTXO is known as the _UTXO set_ and currently numbers in the millions of UTXO. The UTXO set grows as new UTXO is created and shrinks when UTXO is consumed. Every transaction represents a change (state transition) in the UTXO set. When we say that a user's wallet has "received" bitcoin, what we mean is that the wallet has detected an unspent transaction output (UTXO) which can be spent with one of the keys controlled by that wallet. Thus, a user's bitcoin "balance" is the sum of all UTXO that user's wallet can spend and which may be scattered amongst hundreds of transactions and hundreds of blocks. The concept of a balance is a derived construct created by the wallet application. The wallet calculates the user's balance by scanning the blockchain and aggregating the value of any UTXO that the wallet can spend with the keys it controls. A transaction output can have an arbitrary value denominated as a multiple of((("satoshis"))) satoshis. Just like dollars can be divided down to two decimal places as cents, bitcoins can be divided down to eight decimal places as satoshis. Although an output can have any arbitrary value, once created it is indivisible. This is an important characteristic of outputs that needs to be emphasized: outputs are *discrete* and *indivisible* units of value, denominated in satoshis. An unspent output can only be consumed in its entirety by a transaction. -If an unspent transaction output is larger than the desired value of a transaction, it must still be consumed in its entirety and change must be generated in the transaction. ((("change, making")))In other words, if you have a UTXO worth 20 bitcoin and want to pay only 1 bitcoin, your transaction must consume the entire 20-bitcoin UTXO and produce two outputs: one paying 1 bitcoin to your desired recipient and another paying 19 bitcoin in change back to your wallet. 
As a result of the indivisibe nature of transaction outputs, most bitcoin transactions will have to generate change. +If an unspent transaction output is larger than the desired value of a transaction, it must still be consumed in its entirety and change must be generated in the transaction. ((("change, making")))In other words, if you have a UTXO worth 20 bitcoin and want to pay only 1 bitcoin, your transaction must consume the entire 20-bitcoin UTXO and produce two outputs: one paying 1 bitcoin to your desired recipient and another paying 19 bitcoin in change back to your wallet. As a result of the indivisible nature of transaction outputs, most bitcoin transactions will have to generate change. -Imagine a shopper buying a $1.50 beverage, reaching into her wallet and trying to find a combination of coins and bank notes to cover the $1.50 cost. The shopper will choose exact change if available (a dollar bill and two quarters), or a combination of smaller denominations (six quarters), or if necessary, a larger unit such as a five dollar bank note. If she hands too much money, say $5, to the shop owner, she will expect $3.50 change, which she will return to her wallet and have available for future transactions. +Imagine a shopper buying a $1.50 beverage, reaching into her wallet and trying to find a combination of coins and bank notes to cover the $1.50 cost. The shopper will choose exact change if available (for example, a dollar bill and two quarters), or a combination of smaller denominations (six quarters), or if necessary, a larger unit such as a five dollar bank note. If she hands too much money, say $5, to the shop owner, she will expect $3.50 change, which she will return to her wallet and have available for future transactions. Similarly, a bitcoin transaction must be created from a user's UTXO in whatever denominations that user has available. Users cannot cut a UTXO in half any more than they can cut a dollar bill in half and use it as currency. 
The user's wallet application will typically select from the user's available UTXO to compose an amount greater than or equal to the desired transaction amount. @@ -81,14 +83,12 @@ The exception to the output and input chain is a special type of transaction cal What comes first? Inputs or outputs, the chicken or the egg? Strictly speaking, outputs come first because coinbase transactions, which generate new bitcoin, have no inputs and create outputs from nothing. ==== -Since outputs come first, we will examine the - [[tx_outs]] ==== Transaction Outputs -((("bitcoin ledger, outputs in", id="ix_ch06-asciidoc2", range="startofrange")))((("transactions","outputs", id="ix_ch06-asciidoc3", range="startofrange")))((("unspent transaction output (UTXO)", id="ix_ch06-asciidoc4", range="startofrange")))Every bitcoin transaction creates outputs, which are recorded on the bitcoin ledger. Almost all of these outputs, with one exception (see <>) create spendable chunks of bitcoin called _unspent transaction outputs_ or UTXO, which are then recognized by the whole network and available for the owner to spend in a future transaction. Sending someone bitcoin is creating an unspent transaction output (UTXO) for them to spend. +((("bitcoin ledger, outputs in", id="ix_ch06-asciidoc2", range="startofrange")))((("transactions","outputs", id="ix_ch06-asciidoc3", range="startofrange")))((("unspent transaction output (UTXO)", id="ix_ch06-asciidoc4", range="startofrange")))Every bitcoin transaction creates outputs, which are recorded on the bitcoin ledger. Almost all of these outputs, with one exception (see <>) create spendable chunks of bitcoin called UTXO, which are then recognized by the whole network and available for the owner to spend in a future transaction. -UTXO are tracked by every full-node bitcoin client as a data set called the((("UTXO pool")))((("UTXO set"))) _UTXO set_ or _UTXO pool_, held in a database. 
New transactions consume (spend) one or more of these outputs from the UTXO set. +UTXO are tracked by every full-node bitcoin client in the UTXO set. New transactions consume (spend) one or more of these outputs from the UTXO set. Transaction outputs consist of two parts: @@ -99,14 +99,15 @@ The cryptographic puzzle, is also known as a ((("locking scripts"))) _locking sc The transaction scripting language, used in the locking script mentioned previously, is discussed in detail in <>. -Now, let's look at Alice's transaction and see if we can identify the outputs. In the JSON encoding, the outputs are in an array (list) named +vout+: +Now, let's look at Alice's transaction (shown previously in <>) and see if we can identify the outputs. In the JSON encoding, the outputs are in an array (list) named +vout+: [source,json] ---- "vout": [ { "value": 0.01500000, - "scriptPubKey": "OP_DUP OP_HASH160 ab68025513c3dbd2f7b92a94e0581f5d50f654e7 OP_EQUALVERIFY OP_CHECKSIG" + "scriptPubKey": "OP_DUP OP_HASH160 ab68025513c3dbd2f7b92a94e0581f5d50f654e7 OP_EQUALVERIFY + OP_CHECKSIG" }, { "value": 0.08450000, @@ -115,35 +116,103 @@ Now, let's look at Alice's transaction and see if we can identify the outputs. I ] ---- -As you can see, the transaction contains two outputs. Each output is defined by a value and a cryptographic puzzle. In the encoding shown by Bitcoin Core above, the value is shown in bitcoin. The second part of each output is the cryptographic puzzle that sets the conditions for spending. Bitcoin Core shows this as +scriptPubKey+ and shows us a human-readable representation of the script (see <>). +As you can see, the transaction contains two outputs. Each output is defined by a value and a cryptographic puzzle. In the encoding shown by Bitcoin Core above, the value is shown in bitcoin. The second part of each output is the cryptographic puzzle that sets the conditions for spending. 
Bitcoin Core shows this as +scriptPubKey+ and shows us a human-readable representation of the script. -When transactions are transmitted over the network or exchanged between applications, they are _serialized_ and represented as a byte-stream, with a standardized variable-length encoding for each of the fields in the transaction. The serialization format of a transaction output is show in <>: +The topic of locking and unlocking UTXO will be discussed later, in <>. The scripting language that is used for the script in +scriptPubKey+ is discussed in <>. Before we delve into those topics, we need to understand the overall structure of transaction inputs and outputs. + +When transactions are transmitted over the network or exchanged between applications, they are _serialized_. (((serialization)))Serialization is the process of converting the internal representation of a data structure into a format that can be transmitted one byte at a time, also known as a byte-stream. Serialization is most commonly used for encoding data structures for transmission over a network or for storage in a file. The serialization format of a transaction output is shown in <>: [[tx_out_structure]] -.The structure of a transaction output +.Transaction output serialization [options="header"] |======= |Size| Field | Description -| 8 bytes | Amount | Bitcoin value in satoshis (10^-8^ bitcoin) +| 8 bytes (little-endian) | Amount | Bitcoin value in satoshis (10^-8^ bitcoin) | 1-9 bytes (VarInt) | Locking-Script Size | Locking-Script length in bytes, to follow | Variable | Locking-Script | A script defining the conditions needed to spend the output |======= -==== Locking and Unlocking UTXO - Cryptographic Puzzles and Witnesses +The process of converting from the byte-stream representation of a transaction to whatever data structure (e.g. a transaction object) is used to store transactions internally in your program, is called _de-serialization_ or _transaction parsing_. 
(((de-serialization)))Most bitcoin libraries have functions for transaction serialization and de-serialization. +See if you can manually decode Alice's transaction from the serialized hexadecimal form, finding some of the elements we saw above. The section containing the two outputs is highlighted to help you: +==== ++0100000001186f9f998a5aa6f048e51dd8419a14d8a0f1a8a2836dd73+ ++4d2804fe65fa35779000000008b483045022100884d142d86652a3f47+ ++ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039+ ++ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813+ ++01410484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade84+ ++16ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc1+ ++7b4a10fa336a8d752adfffffffff02+*+60e31600000000001976a914ab6+* +*+8025513c3dbd2f7b92a94e0581f5d50f654e788acd0ef800000000000+* +*+1976a9147f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a888ac+* ++00000000+ +==== + +Here are some hints: + +* There are two outputs in the highlighted section, each serialized as shown in the table <> +* The value of 0.015 bitcoin is 1,500,000 satoshis. That's +16 e3 60+ in hexadecimal. +* In the serialized transaction, the value +16 e3 60+ is encoded in little-endian (least-significant-byte-first) byte order, so it looks like +60 e3 16+ +* The +scriptPubKey+ length is 25 bytes, which is +19+ in hexadecimal [[tx_inputs]] ==== Transaction Inputs -((("transactions","inputs", id="ix_ch06-asciidoc5", range="startofrange")))In simple terms, transaction inputs are pointers to UTXO. +((("transactions","inputs", id="ix_ch06-asciidoc5", range="startofrange")))In simple terms, transaction inputs are pointers to UTXO. 
They point to a specific UTXO by reference to the transaction hash and output index where the UTXO is recorded in the blockchain. To spend UTXO, a transaction input also includes an unlocking script, also known as a _witness_, that satisfies the spending conditions set by the UTXO locking script. Most often, the unlocking script is a digital signature and public key proving ownership of the bitcoin. However, not all unlocking scripts contain signatures. -When users make a payment, their wallet constructs a transaction by selecting from the available UTXO. For example, to make a 0.015 bitcoin payment, the wallet app may select a 0.01 UTXO and a 0.005 UTXO, using them both to add up to the desired payment amount. +Let's look back at our example in <>. The transaction inputs are an array (list) called +vin+: -Once the UTXO is selected, the wallet then produces unlocking scripts containing signatures for each of the UTXO, thereby making them spendable by satisfying their locking script conditions. The wallet adds these UTXO references and unlocking scripts as inputs to the transaction. <> shows the structure of a transaction input. +[[vin]] +.The transaction inputs in Alice's transaction +[source,json] +---- +"vin": [ + { + "txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18", + "vout": 0, + "scriptSig" : "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813[ALL] 0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf", + "sequence": 4294967295 + } +] +---- + +As you can see, there is only one input in the list. 
It contains four elements: + +* A transaction ID, referencing the transaction which contains the UTXO being spent +* An output index (+vout+), identifying which UTXO from that transaction is referenced (first one is zero) +* A scriptSig, which satisfies the conditions placed on the UTXO, unlocking it for spending +* A sequence number (to be discussed later) + +Together, the transaction ID and output index uniquely identify a previously created UTXO: the transaction ID references the transaction that contains it, and the index number, which starts at zero, selects the output within that transaction. In Alice's transaction, the input points to transaction ID +7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18+ and output index +0+ (i.e., the first UTXO created by that transaction). + +At this point you may have noticed that we don't know anything about this UTXO, other than a reference to the transaction containing it. We don't know its value (amount in satoshis), and we don't know the locking script that sets the conditions for spending it. To find this information, we must retrieve the transaction from the blockchain and look at the specific UTXO. + +We can use the same sequence of commands with Bitcoin Core as we used when retrieving Alice's transaction (+getrawtransaction+ and +decoderawtransaction+). With that we can get the outputs and take a look: + +[[alice_input_tx]] +.Alice's UTXO from the previous transaction, used as an input +[source,json] +---- +"vout": [ + { + "value": 0.10000000, + "scriptPubKey": "OP_DUP OP_HASH160 7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8 OP_EQUALVERIFY OP_CHECKSIG" + } + ] +---- + +When we retrieve the previous transaction we find it only has one output (vout index +0+). We see that it has a value of 0.1 BTC and that it has a locking script (+scriptPubKey+) which contains "OP_DUP OP_HASH160...". This UTXO of 0.1 BTC is the input that is spent (indivisibly and entirely) by Alice's transaction. 
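The lookup just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Bitcoin Core or any library stores outputs: the +utxo_set+ dictionary and the +lookup_utxo+ helper are hypothetical names, and the single entry is copied from the previous transaction shown above.

```python
from decimal import Decimal

SATOSHIS_PER_BITCOIN = 100_000_000

# Hypothetical in-memory stand-in for a UTXO set, keyed by
# (transaction ID, output index). The one entry is the output of the
# previous transaction that Alice's input refers to.
utxo_set = {
    ("7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18", 0): {
        "value": Decimal("0.10000000"),
        "scriptPubKey": "OP_DUP OP_HASH160 "
                        "7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8 "
                        "OP_EQUALVERIFY OP_CHECKSIG",
    },
}


def lookup_utxo(tx_input, utxos):
    """Return the output referenced by an input's txid and vout, or None."""
    return utxos.get((tx_input["txid"], tx_input["vout"]))


# The reference part of Alice's input (scriptSig and sequence omitted).
alice_input = {
    "txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18",
    "vout": 0,
}

previous_output = lookup_utxo(alice_input, utxo_set)
value_satoshis = int(previous_output["value"] * SATOSHIS_PER_BITCOIN)
print(value_satoshis, previous_output["scriptPubKey"])
```

The input itself carries only the reference; the value and locking script come from the output it points to, which is why the previous transaction has to be retrieved.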
+ +[TIP] +==== +To fully understand Alice's transaction we had to retrieve the previous transaction(s) referenced as inputs. A function that retrieves previous transactions and unspent transaction outputs is very common and exists in almost every bitcoin library and API. +==== + +When transactions are serialized for transmission on the network, their inputs are encoded into a byte-stream as follows: [[tx_in_structure]] -.The structure of a transaction input +.Transaction input serialization [options="header"] |======= |Size| Field | Description @@ -151,14 +220,47 @@ Once the UTXO is selected, the wallet then produces unlocking scripts containing | 4 bytes | Output Index | The index number of the UTXO to be spent; first one is 0 | 1-9 bytes (VarInt) | Unlocking-Script Size | Unlocking-Script length in bytes, to follow | Variable | Unlocking-Script | A script that fulfills the conditions of the UTXO locking script. -| 4 bytes | Sequence Number | Currently disabled Tx-replacement feature, set to 0xFFFFFFFF +| 4 bytes | Sequence Number | Used for locktime or disabled (0xFFFFFFFF) |======= -[NOTE] +As with the outputs, let's see if we can find the inputs from Alice's transaction in the serialized format. First, the inputs decoded: + +[source,json] +---- +"vin": [ + { + "txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18", + "vout": 0, + "scriptSig" : "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813[ALL] 0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf", + "sequence": 4294967295 + } +], +---- + +Now, let's see if we can identify these fields in the serialized hex encoding: + ==== -The sequence number is used to override a transaction prior to the expiration of the transaction locktime, which is a feature that is currently disabled in bitcoin. 
Most transactions set this value to the maximum integer value (0xFFFFFFFF) and it is ignored by the bitcoin network. If the transaction has a nonzero locktime, at least one of its inputs must have a sequence number below 0xFFFFFFFF in order to enable locktime.(((range="endofrange", startref="ix_ch06-asciidoc5"))) + ++0100000001+*+186f9f998a5aa6f048e51dd8419a14d8a0f1a8a2836dd73+* +*+4d2804fe65fa35779000000008b483045022100884d142d86652a3f47+* +*+ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039+* +*+ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813+* +*+01410484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade84+* +*+16ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc1+* +*+7b4a10fa336a8d752adfffffffff+*+0260e31600000000001976a914ab6+ ++8025513c3dbd2f7b92a94e0581f5d50f654e788acd0ef800000000000+ ++1976a9147f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a888ac00000+ ++000+ + ==== +Hints: + +* The transaction ID is serialized in reversed byte order, so it starts with (hex) +18+ and ends with +79+ +* The output index is a 4-byte group of zeroes, easy to identify +* The length of the scriptSig is 139 bytes, or +8b+ in hex. +* The sequence number is set to +FFFFFFFF+, again easy to identify + [[tx_fees]] ==== Transaction Fees @@ -174,6 +276,11 @@ The current algorithm used by miners to prioritize transactions for inclusion in ==== Adding Fees to Transactions +//// + +Fees market - bitcoin fees - confirmation time and competition +//// + ((("fees, transaction","adding", id="ix_ch06-asciidoc7", range="startofrange")))((("transactions","fees", id="ix_ch06-asciidoc8", range="startofrange")))The data structure of transactions does not have a field for fees. Instead, fees are implied as the difference between the sum of inputs and the sum of outputs. Any excess amount that remains after all outputs have been deducted from all inputs is the fee that is collected by the miners. 
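As a worked example, the implied fee of Alice's transaction can be computed from the values shown earlier in the chapter: one input of 0.1 BTC and two outputs of 0.015 BTC and 0.0845 BTC. The sketch below is a minimal illustration; +implied_fee+ is a hypothetical helper, and all amounts are expressed in satoshis to avoid floating-point rounding.

```python
def implied_fee(input_values, output_values):
    """Return the fee implied by a transaction, in satoshis.

    The fee is the sum of the input values minus the sum of the output
    values. A negative result means the outputs spend more than the
    inputs provide, which is not a valid transaction.
    """
    fee = sum(input_values) - sum(output_values)
    if fee < 0:
        raise ValueError("outputs exceed inputs")
    return fee


# Alice's transaction, in satoshis.
alice_inputs = [10_000_000]             # the 0.1 BTC UTXO being spent
alice_outputs = [1_500_000, 8_450_000]  # 0.015 BTC and 0.0845 BTC

fee = implied_fee(alice_inputs, alice_outputs)
print(fee)  # 50000 satoshis, i.e. 0.0005 BTC
```

If the second output were left out, the same calculation would treat the remaining 8,500,000 satoshis as the fee.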
@@ -200,15 +307,27 @@ As Eugenia's wallet application tries to construct a single larger payment trans Eugenia's wallet application will calculate the appropriate fee by measuring the size of the transaction and multiplying that by the per-kilobyte fee. Many wallets will overpay fees for larger transactions to ensure the transaction is processed promptly. The higher fee is not because Eugenia is spending more money, but because her transaction is more complex and larger in size—the fee is independent of the transaction's bitcoin value.(((range="endofrange", startref="ix_ch06-asciidoc8")))(((range="endofrange", startref="ix_ch06-asciidoc7"))) + + + + [[tx_chains]] === Transaction Chaining and Orphan Transactions +//// +CPFP +//// + ((("chaining transactions")))((("orphan transactions")))((("transactions","chaining")))((("transactions","orphan")))As we have seen, transactions form a chain, whereby one transaction spends the outputs of the previous transaction (known as the parent) and creates outputs for a subsequent transaction (known as the child). Sometimes an entire chain of transactions depending on each other—say a parent, child, and grandchild transaction—are created at the same time, to fulfill a complex transactional workflow that requires valid children to be signed before the parent is signed. For example, this is a technique used in((("CoinJoin"))) CoinJoin transactions where multiple parties join transactions together to protect their privacy. When a chain of transactions is transmitted across the network, they don't always arrive in the same order. Sometimes, the child might arrive before the parent. In that case, the nodes that see a child first can see that it references a parent transaction that is not yet known. Rather than reject the child, they put it in a temporary pool to await the arrival of its parent and propagate it to every other node. 
The pool of transactions without parents is known as the((("orphan transaction pool"))) _orphan transaction pool_. Once the parent arrives, any orphans that reference the UTXO created by the parent are released from the pool, revalidated recursively, and then the entire chain of transactions can be included in the transaction pool, ready to be mined in a block. Transaction chains can be arbitrarily long, with any number of generations transmitted simultaneously. The mechanism of holding orphans in the orphan pool ensures that otherwise valid transactions will not be rejected just because their parent has been delayed and that eventually the chain they belong to is reconstructed in the correct order, regardless of the order of arrival. There is a limit to the number of orphan transactions stored in memory, to prevent a denial-of-service attack against bitcoin nodes. The limit is defined as((("MAX_ORPHAN_TRANSACTIONS constant"))) +MAX_ORPHAN_TRANSACTIONS+ in the source code of the bitcoin reference client. If the number of orphan transactions in the pool exceeds +MAX_ORPHAN_TRANSACTIONS+, one or more randomly selected orphan transactions are evicted from the pool, until the pool size is back within limits. +[[tx_lock_unlock]] +==== Locking and Unlocking UTXO - Cryptographic Puzzles and Witnesses + + [[tx_script]] === Transaction Scripts and Script Language