[[transactions]] == Transactions [[ch06_intro]] === Introduction ((("transactions", "defined")))((("warnings and cautions", see="also security")))Transactions are the most important part of the Bitcoin system. Everything else in bitcoin is designed to ensure that transactions can be created, propagated on the network, validated, and finally added to the global ledger of transactions (the blockchain). Transactions are data structures that encode the transfer of value between participants in the Bitcoin system. Each transaction is a public entry in bitcoin's blockchain, the global double-entry bookkeeping ledger. In this chapter we will examine all the various forms of transactions, what they contain, how to create them, how they are verified, and how they become part of the permanent record of all transactions. When we use the term "wallet" in this chapter, we are referring to the software that constructs transactions, not just the database of keys. [[tx_structure]] === Transactions in Detail ((("use cases", "buying coffee", id="alicesix")))In <>, we looked at the transaction Alice used to pay for coffee at Bob's coffee shop using a block explorer (<>). The block explorer application shows a transaction from Alice's "address" to Bob's "address." This is a much simplified view of what is contained in a transaction. In fact, as we will see in this chapter, much of the information shown is constructed by the block explorer and is not actually in the transaction. [[alices_transactions_to_bobs_cafe]] .Alice's transaction to Bob's Cafe image::images/mbc2_0208.png["Alice Coffee Transaction"] [[transactions_behind_the_scenes]] ==== Transactions—Behind the Scenes ((("transactions", "behind the scenes details of")))Behind the scenes, an actual transaction looks very different from a transaction provided by a typical block explorer. In fact, most of the high-level constructs we see in the various bitcoin application user interfaces _do not actually exist_ in the Bitcoin system. We can use Bitcoin Core's command-line interface (+getrawtransaction+ and +decoderawtransaction+) to retrieve Alice's "raw" transaction, decode it, and see what it contains. The result looks like this: [[alice_tx]] .Alice's transaction decoded [source,json] ---- { "version": 1, "locktime": 0, "vin": [ { "txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18", "vout": 0, "scriptSig" : "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813[ALL] 0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf", "sequence": 4294967295 } ], "vout": [ { "value": 0.01500000, "scriptPubKey": "OP_DUP OP_HASH160 ab68025513c3dbd2f7b92a94e0581f5d50f654e7 OP_EQUALVERIFY OP_CHECKSIG" }, { "value": 0.08450000, "scriptPubKey": "OP_DUP OP_HASH160 7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8 OP_EQUALVERIFY OP_CHECKSIG", } ] } ---- You may notice a few things about this transaction, mostly the things that are missing! Where is Alice's address? Where is Bob's address? Where is the 0.1 input "sent" by Alice? In bitcoin, there are no coins, no senders, no recipients, no balances, no accounts, and no addresses. All those things are constructed at a higher level for the benefit of the user, to make things easier to understand. You may also notice a lot of strange and indecipherable fields and hexadecimal strings. Don't worry, we will explain each field shown here in detail in this chapter. [[tx_inputs_outputs]] === Transaction Outputs and Inputs ((("transactions", "outputs and inputs", id="Tout06")))((("outputs and inputs", "outputs defined")))((("unspent transaction outputs (UTXO)")))((("UTXO sets")))((("transactions", "outputs and inputs", "output characteristics")))((("outputs and inputs", "output characteristics")))The fundamental building block of a bitcoin transaction is a _transaction output_. Transaction outputs are indivisible chunks of bitcoin currency, recorded on the blockchain, and recognized as valid by the entire network. Bitcoin full nodes track all available and spendable outputs, known as _unspent transaction outputs_, or _UTXO_. The collection of all UTXO is known as the _UTXO set_ and currently numbers in the millions of UTXO. The UTXO set grows as new UTXO is created and shrinks when UTXO is consumed. Every transaction represents a change (state transition) in the UTXO set. ((("balances")))When we say that a user's wallet has "received" bitcoin, what we mean is that the wallet has detected an UTXO that can be spent with one of the keys controlled by that wallet. Thus, a user's bitcoin "balance" is the sum of all UTXO that user's wallet can spend and which may be scattered among hundreds of transactions and hundreds of blocks. The concept of a balance is created by the wallet application. The wallet calculates the user's balance by scanning the blockchain and aggregating the value of any UTXO the wallet can spend with the keys it controls. Most wallets maintain a database or use a database service to store a quick reference set of all the UTXO they can spend with the keys they control. ((("satoshis")))A transaction output can have an arbitrary (integer) value denominated as a multiple of satoshis. Just as dollars can be divided down to two decimal places as cents, bitcoin can be divided down to eight decimal places as satoshis. Although an output can have any arbitrary value, once created it is indivisible. This is an important characteristic of outputs that needs to be emphasized: outputs are _discrete_ and _indivisible_ units of value, denominated in integer satoshis. An unspent output can only be consumed in its entirety by a transaction. ((("change, making")))If an UTXO is larger than the desired value of a transaction, it must still be consumed in its entirety and change must be generated in the transaction. In other words, if you have an UTXO worth 20 bitcoin and want to pay only 1 bitcoin, your transaction must consume the entire 20-bitcoin UTXO and produce two outputs: one paying 1 bitcoin to your desired recipient and another paying 19 bitcoin in change back to your wallet. As a result of the indivisible nature of transaction outputs, most bitcoin transactions will have to generate change. Imagine a shopper buying a $1.50 beverage, reaching into her wallet and trying to find a combination of coins and bank notes to cover the $1.50 cost. The shopper will choose exact change if available e.g. a dollar bill and two quarters (a quarter is $0.25), or a combination of smaller denominations (six quarters), or if necessary, a larger unit such as a $5 note. If she hands too much money, say $5, to the shop owner, she will expect $3.50 change, which she will return to her wallet and have available for future transactions. Similarly, a bitcoin transaction must be created from a user's UTXO in whatever denominations that user has available. Users cannot cut an UTXO in half any more than they can cut a dollar bill in half and use it as currency. The user's wallet application will typically select from the user's available UTXO to compose an amount greater than or equal to the desired transaction amount. As with real life, the bitcoin application can use several strategies to satisfy the purchase amount: combining several smaller units, finding exact change, or using a single unit larger than the transaction value and making change. All of this complex assembly of spendable UTXO is done by the user's wallet automatically and is invisible to users. It is only relevant if you are programmatically constructing raw transactions from UTXO. A transaction consumes previously recorded unspent transaction outputs and creates new transaction outputs that can be consumed by a future transaction. This way, chunks of bitcoin value move forward from owner to owner in a chain of transactions consuming and creating UTXO. ((("transactions", "coinbase transactions")))((("coinbase transactions")))((("mining and consensus", "coinbase transactions")))The exception to the output and input chain is a special type of transaction called the _coinbase_ transaction, which is the first transaction in each block. This transaction is placed there by the "winning" miner and creates brand-new bitcoin payable to that miner as a reward for mining. This special coinbase transaction does not consume UTXO; instead, it has a special type of input called the "coinbase." This is how bitcoin's money supply is created during the mining process, as we will see in <>. [TIP] ==== What comes first? Inputs or outputs, the chicken or the egg? Strictly speaking, outputs come first because coinbase transactions, which generate new bitcoin, have no inputs and create outputs from nothing. ==== [[tx_outs]] ==== Transaction Outputs ((("transactions", "outputs and inputs", "output components")))((("outputs and inputs", "output parts")))Every bitcoin transaction creates outputs, which are recorded on the bitcoin ledger. Almost all of these outputs, with one exception (see <>) create spendable chunks of bitcoin called UTXO, which are then recognized by the whole network and available for the owner to spend in a future transaction. UTXO are tracked by every full-node Bitcoin client in the UTXO set. New transactions consume (spend) one or more of these outputs from the UTXO set. Transaction outputs consist of two parts: - An amount of bitcoin, denominated in _satoshis_, the smallest bitcoin unit - A cryptographic puzzle that determines the conditions required to spend the output ((("locking scripts")))((("scripting", "locking scripts")))((("witnesses")))((("scriptPubKey")))The cryptographic puzzle is also known as a _locking script_, a _witness script_, or a +scriptPubKey+. The transaction scripting language, used in the locking script mentioned previously, is discussed in detail in <>. Now, let's look at Alice's transaction (shown previously in <>) and see if we can identify the outputs. In the JSON encoding, the outputs are in an array (list) named +vout+: [source,json] ---- "vout": [ { "value": 0.01500000, "scriptPubKey": "OP_DUP OP_HASH160 ab68025513c3dbd2f7b92a94e0581f5d50f654e7 OP_EQUALVERIFY OP_CHECKSIG" }, { "value": 0.08450000, "scriptPubKey": "OP_DUP OP_HASH160 7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8 OP_EQUALVERIFY OP_CHECKSIG", } ] ---- As you can see, the transaction contains two outputs. Each output is defined by a value and a cryptographic puzzle. In the encoding shown by Bitcoin Core, the value is shown in bitcoin, but in the transaction itself it is recorded as an integer denominated in satoshis. The second part of each output is the cryptographic puzzle that sets the conditions for spending. Bitcoin Core shows this as +scriptPubKey+ and shows us a human-readable representation of the script. The topic of locking and unlocking UTXO will be discussed later, in <>. The scripting language that is used for the script in +scriptPubKey+ is discussed in <>. But before we delve into those topics, we need to understand the overall structure of transaction inputs and outputs. ===== Transaction serialization—outputs ((("transactions", "outputs and inputs", "structure of")))((("outputs and inputs", "structure of")))((("serialization", "outputs")))When transactions are transmitted over the network or exchanged between applications, they are _serialized_. Serialization is the process of converting the internal representation of a data structure into a format that can be transmitted one byte at a time, also known as a byte stream. Serialization is most commonly used for encoding data structures for transmission over a network or for storage in a file. The serialization format of a transaction output is shown in <>. [[tx_out_structure]] .Transaction output serialization [options="header"] |======= |Size| Field | Description | 8 bytes (little-endian) | Amount | Bitcoin value in satoshis (10^-8^ bitcoin) | 1–9 bytes (VarInt) | Locking-Script Size | Locking-Script length in bytes, to follow | Variable | Locking-Script | A script defining the conditions needed to spend the output |======= Most bitcoin libraries and frameworks do not store transactions internally as byte-streams, as that would require complex parsing every time you needed to access a single field. For convenience and readability, bitcoin libraries store transactions internally in data structures (usually object-oriented structures). ((("deserialization")))((("parsing")))((("transactions", "parsing")))The process of converting from the byte-stream representation of a transaction to a library's internal representation data structure is called _deserialization_ or _transaction parsing_. The process of converting back to a byte-stream for transmission over the network, for hashing, or for storage on disk is called _serialization_. Most bitcoin libraries have built-in functions for transaction serialization and deserialization. See if you can manually decode Alice's transaction from the serialized hexadecimal form, finding some of the elements we saw previously. The section containing the two outputs is highlighted in <> to help you: [[example_6_1]] .Alice's transaction, serialized and presented in hexadecimal notation ==== +0100000001186f9f998a5aa6f048e51dd8419a14d8a0f1a8a2836dd73+ +4d2804fe65fa35779000000008b483045022100884d142d86652a3f47+ +ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039+ +ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813+ +01410484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade84+ +16ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc1+ +7b4a10fa336a8d752adfffffffff02+*+60e31600000000001976a914ab6+* *+8025513c3dbd2f7b92a94e0581f5d50f654e788acd0ef800000000000+* *+1976a9147f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a888ac+* +00000000+ ==== Here are some hints: - There are two outputs in the highlighted section, each serialized as shown in <>. - The value of 0.015 bitcoin is 1,500,000 satoshis. That's +16 e3 60+ in hexadecimal. - In the serialized transaction, the value +16 e3 60+ is encoded in little-endian (least-significant-byte-first) byte order, so it looks like +60 e3 16+. - The +scriptPubKey+ length is 25 bytes, which is +19+ in hexadecimal. [[tx_inputs]] ==== Transaction Inputs ((("transactions", "outputs and inputs", "input components")))((("outputs and inputs", "input components")))((("unspent transaction outputs (UTXO)")))((("UTXO sets")))Transaction inputs identify (by reference) which UTXO will be consumed and provide proof of ownership through an unlocking script. To build a transaction, a wallet selects from the UTXO it controls, UTXO with enough value to make the requested payment. Sometimes one UTXO is enough, other times more than one is needed. For each UTXO that will be consumed to make this payment, the wallet creates one input pointing to the UTXO and unlocks it with an unlocking script. Let's look at the components of an input in greater detail. The first part of an input is a pointer to an UTXO by reference to the transaction hash and an output index, which identifies the specific UTXO in that transaction. The second part is an unlocking script, which the wallet constructs in order to satisfy the spending conditions set in the UTXO. Most often, the unlocking script is a digital signature and public key proving ownership of the bitcoin. However, not all unlocking scripts contain signatures. The third part is a sequence number, which will be discussed later. Consider our example in <>. The transaction inputs are an array (list) called +vin+: [[vin]] .The transaction inputs in Alice's transaction [source,json] ---- "vin": [ { "txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18", "vout": 0, "scriptSig" : "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813[ALL] 0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf", "sequence": 4294967295 } ] ---- As you can see, there is only one input in the list (because one UTXO contained sufficient value to make this payment). The input contains four elements: - A ((("transaction IDs (txd)")))transaction ID, referencing the transaction that contains the UTXO being spent - An output index (+vout+), identifying which UTXO from that transaction is referenced (first one is zero) - A +scriptSig+, which satisfies the conditions placed on the UTXO, unlocking it for spending - A sequence number (to be discussed later) In Alice's transaction, the input points to the transaction ID: ---- 7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18 ---- and output index +0+ (i.e., the first UTXO created by that transaction). The unlocking script is constructed by Alice's wallet by first retrieving the referenced UTXO, examining its locking script, and then using it to build the necessary unlocking script to satisfy it. Looking just at the input you may have noticed that we don't know anything about this UTXO, other than a reference to the transaction containing it. We don't know its value (amount in satoshi), and we don't know the locking script that sets the conditions for spending it. To find this information, we must retrieve the referenced UTXO by retrieving the underlying transaction. Notice that because the value of the input is not explicitly stated, we must also use the referenced UTXO in order to calculate the fees that will be paid in this transaction (see <>). It's not just Alice's wallet that needs to retrieve UTXO referenced in the inputs. Once this transaction is broadcast to the network, every validating node will also need to retrieve the UTXO referenced in the transaction inputs in order to validate the transaction. Transactions on their own seem incomplete because they lack context. They reference UTXO in their inputs but without retrieving that UTXO we cannot know the value of the inputs or their locking conditions. When writing bitcoin software, anytime you decode a transaction with the intent of validating it or counting the fees or checking the unlocking script, your code will first have to retrieve the referenced UTXO from the blockchain in order to build the context implied but not present in the UTXO references of the inputs. For example, to calculate the amount paid in fees, you must know the sum of the values of inputs and outputs. But without retrieving the UTXO referenced in the inputs, you do not know their value. So a seemingly simple operation like counting fees in a single transaction in fact involves multiple steps and data from multiple transactions. We can use the same sequence of commands with Bitcoin Core as we used when retrieving Alice's transaction (+getrawtransaction+ and +decoderawtransaction+). With that we can get the UTXO referenced in the preceding input and take a look: [[alice_input_tx]] .Alice's UTXO from the previous transaction, referenced in the input [source,json] ---- "vout": [ { "value": 0.10000000, "scriptPubKey": "OP_DUP OP_HASH160 7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8 OP_EQUALVERIFY OP_CHECKSIG" } ] ---- We see that this UTXO has a value of 0.1 BTC and that it has a locking script (+scriptPubKey+) that contains "OP_DUP OP_HASH160...". [TIP] ==== To fully understand Alice's transaction we had to retrieve the previous transaction(s) referenced as inputs. A function that retrieves previous transactions and unspent transaction outputs is very common and exists in almost every bitcoin library and API. ==== ===== Transaction serialization—inputs ((("serialization", "inputs")))((("transactions", "outputs and inputs", "input serialization")))((("outputs and inputs", "input serialization")))When transactions are serialized for transmission on the network, their inputs are encoded into a byte stream as shown in <>. [[tx_in_structure]] .Transaction input serialization [options="header"] |======= |Size| Field | Description | 32 bytes | Transaction Hash | Pointer to the transaction containing the UTXO to be spent | 4 bytes | Output Index | The index number of the UTXO to be spent; first one is 0 | 1–9 bytes (VarInt) | Unlocking-Script Size | Unlocking-Script length in bytes, to follow | Variable | Unlocking-Script | A script that fulfills the conditions of the UTXO locking script | 4 bytes | Sequence Number | Used for locktime or disabled (0xFFFFFFFF) |======= As with the outputs, let's see if we can find the inputs from Alice's transaction in the serialized format. First, the inputs decoded: [source,json] ---- "vin": [ { "txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18", "vout": 0, "scriptSig" : "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813[ALL] 0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf", "sequence": 4294967295 } ], ---- Now, let's see if we can identify these fields in the serialized hex encoding in <>: [[example_6_2]] .Alice's transaction, serialized and presented in hexadecimal notation ==== +0100000001+*+186f9f998a5aa6f048e51dd8419a14d8a0f1a8a2836dd73+* *+4d2804fe65fa35779000000008b483045022100884d142d86652a3f47+* *+ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039+* *+ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813+* *+01410484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade84+* *+16ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc1+* *+7b4a10fa336a8d752adfffffffff+*+0260e31600000000001976a914ab6+ +8025513c3dbd2f7b92a94e0581f5d50f654e788acd0ef800000000000+ +1976a9147f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a888ac00000+ +000+ ==== Hints: - The transaction ID is serialized in reversed byte order, so it starts with (hex) +18+ and ends with +79+ - The output index is a 4-byte group of zeros, easy to identify - The length of the +scriptSig+ is 139 bytes, or +8b+ in hex - The sequence number is set to +FFFFFFFF+, again easy to identify((("", startref="alicesix"))) === Bitcoin Addresses, Balances, and Other Abstractions ((("transactions", "higher-level abstractions", id="Thigher06")))We began this chapter with the discovery that transactions look very different "behind the scenes" than how they are presented in wallets, blockchain explorers, and other user-facing applications. Many of the simplistic and familiar concepts from the earlier chapters, such as Bitcoin addresses and balances, seem to be absent from the transaction structure. We saw that transactions don't contain Bitcoin addresses, per se, but instead operate through scripts that lock and unlock discrete values of bitcoin. Balances are not present anywhere in this system and yet every wallet application prominently displays the balance of the user's wallet. Now that we have explored what is actually included in a bitcoin transaction, we can examine how the higher-level abstractions are derived from the seemingly primitive components of the transaction. Let's look again at how Alice's transaction was presented on a popular block explorer (<>). [[alice_transaction_to_bobs_cafe]] .Alice's transaction to Bob's Cafe image::images/mbc2_0208.png["Alice Coffee Transaction"] On the left side of the transaction, the blockchain explorer shows Alice's Bitcoin address as the "sender." In fact, this information is not in the transaction itself. When the blockchain explorer retrieved the transaction it also retrieved the previous transaction referenced in the input and extracted the first output from that older transaction. Within that output is a locking script that locks the UTXO to Alice's public key hash (a P2PKH script). The blockchain explorer extracted the public key hash and encoded it using Base58Check encoding to produce and display the Bitcoin address that represents that public key. Similarly, on the right side, the blockchain explorer shows the two outputs; the first to Bob's Bitcoin address and the second to Alice's Bitcoin address (as change). Once again, to create these Bitcoin addresses, the blockchain explorer extracted the locking script from each output, recognized it as a P2PKH script, and extracted the public-key-hash from within. Finally, the blockchain explorer reencoded that public key hash with Base58Check to produce and display the Bitcoin addresses. If you were to click on Bob's Bitcoin address, the blockchain explorer would show you the view in <>. [[the_balance_of_bobs_bitcoin_address]] .The balance of Bob's Bitcoin address image::images/mbc2_0608.png["The balance of Bob's Bitcoin address"] The blockchain explorer displays the balance of Bob's Bitcoin address. But nowhere in the Bitcoin system is there a concept of a "balance." Rather, the values displayed here are constructed by the blockchain explorer as follows. To construct the "Total Received" amount, the blockchain explorer first will decode the Base58Check encoding of the Bitcoin address to retrieve the 160-bit hash of Bob's public key that is encoded within the address. Then, the blockchain explorer will search through the database of transactions, looking for outputs with P2PKH locking scripts that contain Bob's public key hash. By summing up the value of all the outputs, the blockchain explorer can produce the total value received. Constructing the current balance (displayed as "Final Balance") requires a bit more work. The blockchain explorer keeps a separate database of the outputs that are currently unspent, the UTXO set. To maintain this database, the blockchain explorer must monitor the Bitcoin network, add newly created UTXO, and remove spent UTXO, in real time, as they appear in unconfirmed transactions. This is a complicated process that depends on keeping track of transactions as they propagate, as well as maintaining consensus with the Bitcoin network to ensure that the correct chain is followed. Sometimes, the blockchain explorer goes out of sync and its perspective of the UTXO set is incomplete or incorrect. From the UTXO set, the blockchain explorer sums up the value of all unspent outputs referencing Bob's public key hash and produces the "Final Balance" number shown to the user. In order to produce this one image, with these two "balances," the blockchain explorer has to index and search through dozens, hundreds, or even hundreds of thousands of transactions. In summary, the information presented to users through wallet applications, blockchain explorers, and other bitcoin user interfaces is often composed of higher-level abstractions that are derived by searching many different transactions, inspecting their content, and manipulating the data contained within them. By presenting this simplistic view of bitcoin transactions that resemble bank checks from one sender to one recipient, these applications have to abstract a lot of underlying detail. They mostly focus on the common types of transactions: P2PKH with SIGHASH_ALL signatures on every input. Thus, while bitcoin applications can present more than 80% of all transactions in an easy-to-read manner, they are sometimes stumped by transactions that deviate from the norm. Transactions that contain more complex locking scripts, or different SIGHASH flags, or many inputs and outputs, demonstrate the simplicity and weakness of these abstractions. Every day, hundreds of transactions that do not contain P2PKH outputs are confirmed on the blockchain. The blockchain explorers often present these with red warning messages saying they cannot decode an address. The following link contains the most recent "strange transactions" that were not fully decoded: https://blockchain.info/strange-transactions[]. As we will see in the next chapter, these are not necessarily strange transactions. They are transactions that contain more complex locking scripts than the common P2PKH. We will learn how to decode and understand more complex scripts and the applications they support next.((("", startref="Thigher06")))((("", startref="alicesixtwo"))) === Timelocks ((("transactions", "advanced", "timelocks")))((("scripting", "timelocks", id="Stimelock07")))((("nLocktime field")))((("scripting", "timelocks", "uses for")))((("timelocks", "uses for")))Timelocks are restrictions on transactions or outputs that only allow spending after a point in time. Bitcoin has had a transaction-level timelock feature from the beginning. It is implemented by the +nLocktime+ field in a transaction. Two new timelock features were introduced in late 2015 and mid-2016 that offer UTXO-level timelocks. These are +CHECKLOCKTIMEVERIFY+ and +CHECKSEQUENCEVERIFY+. Timelocks are useful for postdating transactions and locking funds to a date in the future. More importantly, timelocks extend bitcoin scripting into the dimension of time, opening the door for complex multistep smart contracts. [[transaction_locktime_nlocktime]] ==== Transaction Locktime (nLocktime) ((("scripting", "timelocks", "nLocktime")))((("timelocks", "nLocktime")))From the beginning, Bitcoin has had a transaction-level timelock feature. Transaction locktime is a transaction-level setting (a field in the transaction data structure) that defines the earliest time that a transaction is valid and can be relayed on the network or added to the blockchain. Locktime is also known as +nLocktime+ from the variable name used in the Bitcoin Core codebase. It is set to zero in most transactions to indicate immediate propagation and execution. If +nLocktime+ is nonzero and below 500 million, it is interpreted as a block height, meaning the transaction is not valid and is not relayed or included in the blockchain prior to the specified block height. If it is above 500 million, it is interpreted as a Unix Epoch timestamp (seconds since Jan-1-1970) and the transaction is not valid prior to the specified time. Transactions with +nLocktime+ specifying a future block or time must be held by the originating system and transmitted to the Bitcoin network only after they become valid. If a transaction is transmitted to the network before the specified +nLocktime+, the transaction will be rejected by the first node as invalid and will not be relayed to other nodes. The use of +nLocktime+ is equivalent to postdating a paper check. [[locktime_limitations]] ===== Transaction locktime limitations +nLocktime+ has the limitation that while it makes it possible to spend some outputs in the future, it does not make it impossible to spend them until that time. Let's explain that with the following example. ((("use cases", "buying coffee", id="alicesseven")))Alice signs a transaction spending one of her outputs to Bob's address, and sets the transaction +nLocktime+ to 3 months in the future. Alice sends that transaction to Bob to hold. With this transaction Alice and Bob know that: - Bob cannot transmit the transaction to redeem the funds until 3 months have elapsed. - Bob may transmit the transaction after 3 months. However: - Alice can create another transaction, double-spending the same inputs without a locktime. Thus, Alice can spend the same UTXO before the 3 months have elapsed. - Bob has no guarantee that Alice won't do that. It is important to understand the limitations of transaction +nLocktime+. The only guarantee is that Bob will not be able to redeem it before 3 months have elapsed. There is no guarantee that Bob will get the funds. To achieve such a guarantee, the timelock restriction must be placed on the UTXO itself and be part of the locking script, rather than on the transaction. This is achieved by the next form of timelock, called Check Lock Time Verify. ==== Relative Timelocks with nSequence ((("nSequence field")))((("scripting", "timelocks", "relative timelocks with nSequence")))Relative timelocks can be set on each input of a transaction, by setting the +nSequence+ field in each input. ===== Original meaning of nSequence The +nSequence+ field was originally intended (but never properly implemented) to allow modification of transactions in the mempool. In that use, a transaction containing inputs with +nSequence+ value below 2^32^ - 1 (0xFFFFFFFF) indicated a transaction that was not yet "finalized." Such a transaction would be held in the mempool until it was replaced by another transaction spending the same inputs with a higher +nSequence+ value. Once a transaction was received whose inputs had an +nSequence+ value of 0xFFFFFFFF it would be considered "finalized" and mined. The original meaning of +nSequence+ was never properly implemented and the value of +nSequence+ is customarily set to 0xFFFFFFFF in transactions that do not utilize timelocks. For transactions with nLocktime or +CHECKLOCKTIMEVERIFY+, the +nSequence+ value must be set to less than 2^31^ for the timelock guards to have an effect, as explained below. ===== nSequence as a consensus-enforced relative timelock Since the activation of BIP-68, new consensus rules apply for any transaction containing an input whose +nSequence+ value is less than 2^31^ (bit 1<<31 is not set). Programmatically, that means that if the most significant (bit 1<<31) is not set, it is a flag that means "relative locktime." Otherwise (bit 1<<31 set), the +nSequence+ value is reserved for other uses such as enabling +CHECKLOCKTIMEVERIFY+, +nLocktime+, Opt-In-Replace-By-Fee, and other future developments. Transaction inputs with +nSequence+ values less than 2^31^ are interpreted as having a relative timelock. Such a transaction is only valid once the input has aged by the relative timelock amount. For example, a transaction with one input with an +nSequence+ relative timelock of 30 blocks is only valid when at least 30 blocks have elapsed from the time the UTXO referenced in the input was mined. Since +nSequence+ is a per-input field, a transaction may contain any number of timelocked inputs, all of which must have sufficiently aged for the transaction to be valid. A transaction can include both timelocked inputs (+nSequence+ < 2^31^) and inputs without a relative timelock (+nSequence+ >= 2^31^). The +nSequence+ value is specified in either blocks or seconds, but in a slightly different format than we saw used in +nLocktime+. A type-flag is used to differentiate between values counting blocks and values counting time in seconds. The type-flag is set in the 23rd least-significant bit (i.e., value 1<<22). If the type-flag is set, then the +nSequence+ value is interpreted as a multiple of 512 seconds. If the type-flag is not set, the +nSequence+ value is interpreted as a number of blocks. When interpreting +nSequence+ as a relative timelock, only the 16 least significant bits are considered. Once the flags (bits 32 and 23) are evaluated, the +nSequence+ value is usually "masked" with a 16-bit mask (e.g., +nSequence+ & 0x0000FFFF). <> shows the binary layout of the +nSequence+ value, as defined by BIP-68. [[bip_68_def_of_nseq]] .BIP-68 definition of nSequence encoding (Source: BIP-68) image::images/mbc2_0701.png["BIP-68 definition of nSequence encoding"] Relative timelocks based on consensus enforcement of the +nSequence+ value are defined in BIP-68. The standard is defined in https://github.com/bitcoin/bips/blob/master/bip-0068.mediawiki[BIP-68, Relative lock-time using consensus-enforced sequence numbers]. [[segwit]] === Segregated Witness ((("segwit (Segregated Witness)", id="Ssegwit07")))Segregated Witness (segwit) is an upgrade to the bitcoin consensus rules and network protocol, proposed and implemented as a BIP-9 soft-fork that was activated on bitcoin's mainnet on August 1st, 2017. In cryptography, the term "witness" is used to describe a solution to a cryptographic puzzle. In bitcoin terms, the witness satisfies a cryptographic condition placed on a unspent transaction output (UTXO). In the context of bitcoin, a digital signature is _one type of witness_, but a witness is more broadly any solution that can satisfy the conditions imposed on an UTXO and unlock that UTXO for spending. The term “witness” is a more general term for an “unlocking script” or “scriptSig.” Before segwit’s introduction, every input in a transaction was followed by the witness data that unlocked it. The witness data was embedded in the transaction as part of each input. The term _segregated witness_, or _segwit_ for short, simply means separating the signature or unlocking script of a specific output. Think "separate scriptSig," or “separate signature” in the simplest form. Segregated Witness therefore is an architectural change to bitcoin that aims to move the witness data from the +scriptSig+ (unlocking script) field of a transaction into a separate _witness_ data structure that accompanies a transaction. Clients may request transaction data with or without the accompanying witness data. In this section we will look at some of the benefits of Segregated Witness, describe the mechanism used to deploy and implement this architecture change, and demonstrate the use of Segregated Witness in transactions and addresses. Segregated Witness is defined by the following BIPs: https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki[BIP-141] :: The main definition of Segregated Witness. https://github.com/bitcoin/bips/blob/master/bip-0143.mediawiki[BIP-143] :: Transaction Signature Verification for Version 0 Witness Program https://github.com/bitcoin/bips/blob/master/bip-0144.mediawiki[BIP-144] :: Peer Services—New network messages and serialization formats https://github.com/bitcoin/bips/blob/master/bip-0145.mediawiki[BIP-145] :: getblocktemplate Updates for Segregated Witness (for mining) https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki[BIP-173]:: Base32 address format for native v0-16 witness outputs ==== Why Segregated Witness? Segregated Witness is an architectural change that has several effects on the scalability, security, economic incentives, and performance of bitcoin: Transaction Malleability :: By moving the witness outside the transaction, the transaction hash used as an identifier no longer includes the witness data. Since the witness data is the only part of the transaction that can be modified by a third party (see <>), removing it also removes the opportunity for transaction malleability attacks. With Segregated Witness, transaction hashes become immutable by anyone other than the creator of the transaction, which greatly improves the implementation of many other protocols that rely on advanced bitcoin transaction construction, such as payment channels, chained transactions, and lightning networks. Script Versioning :: With the introduction of Segregated Witness scripts, every locking script is preceded by a _script version_ number, similar to how transactions and blocks have version numbers. The addition of a script version number allows the scripting language to be upgraded in a backward-compatible way (i.e., using soft fork upgrades) to introduce new script operands, syntax, or semantics. The ability to upgrade the scripting language in a nondisruptive way will greatly accelerate the rate of innovation in bitcoin. Network and Storage Scaling :: The witness data is often a big contributor to the total size of a transaction. More complex scripts such as those used for multisig or payment channels are very large. In some cases these scripts account for the majority (more than 75%) of the data in a transaction. By moving the witness data outside the transaction, Segregated Witness improves bitcoin’s scalability. Nodes can prune the witness data after validating the signatures, or ignore it altogether when doing simplified payment verification. The witness data doesn’t need to be transmitted to all nodes and does not need to be stored on disk by all nodes. Signature Verification Optimization :: Segregated Witness upgrades the signature functions (+CHECKSIG+, +CHECKMULTISIG+, etc.) to reduce the algorithm's computational complexity. Before segwit, the algorithm used to produce a signature required a number of hash operations that was proportional to the size of the transaction. Data-hashing computations increased in O(n^2^) with respect to the number of signature operations, introducing a substantial computational burden on all nodes verifying the signature. With segwit, the algorithm is changed to reduce the complexity to O(n). Offline Signing Improvement :: Segregated Witness signatures incorporate the value (amount) referenced by each input in the hash that is signed. Previously, an offline signing device, such as a hardware wallet, would have to verify the amount of each input before signing a transaction. This was usually accomplished by streaming a large amount of data about the previous transactions referenced as inputs. Since the amount is now part of the commitment hash that is signed, an offline device does not need the previous transactions. If the amounts do not match (are misrepresented by a compromised online system), the signature will be invalid. ==== How Segregated Witness Works At first glance, Segregated Witness appears to be a change to how transactions are constructed and therefore a transaction-level feature, but it is not. Rather, Segregated Witness is a change to how individual UTXO are spent and therefore is a per-output feature. A transaction can spend Segregated Witness outputs or traditional (inline-witness) outputs or both. Therefore, it does not make much sense to refer to a transaction as a “Segregated Witness transaction.” Rather we should refer to specific transaction outputs as “Segregated Witness outputs." When a transaction spends an UTXO, it must provide a witness. In a traditional UTXO, the locking script requires that witness data be provided _inline_ in the input part of the transaction that spends the UTXO. A Segregated Witness UTXO, however, specifies a locking script that can be satisfied with witness data outside of the input (segregated). ==== Soft Fork (Backward Compatibility) Segregated Witness is a significant change to the way outputs and transactions are architected. Such a change would normally require a simultaneous change in every Bitcoin node and wallet to change the consensus rules—what is known as a hard fork. Instead, segregated witness is introduced with a much less disruptive change, which is backward compatible, known as a soft fork. This type of upgrade allows nonupgraded software to ignore the changes and continue to operate without any disruption. Segregated Witness outputs are constructed so that older systems that are not segwit-aware can still validate them. To an old wallet or node, a Segregated Witness output looks like an output that _anyone can spend_. Such outputs can be spent with an empty signature, therefore the fact that there is no signature inside the transaction (it is segregated) does not invalidate the transaction. Newer wallets and mining nodes, however, see the Segregated Witness output and expect to find a valid witness for it in the transaction’s witness data. [[segwit_txid]] ===== Transaction identifiers ((("transaction IDs (txid)")))One of the greatest benefits of Segregated Witness is that it eliminates third-party transaction malleability. Before segwit, transactions could have their signatures subtly modified by third parties, changing their transaction ID (hash) without changing any fundamental properties (inputs, outputs, amounts). This created opportunities for denial-of-service attacks as well as attacks against poorly written wallet software that assumed unconfirmed transaction hashes were immutable. With the introduction of Segregated Witness, transactions have two identifiers, +txid+ and +wtxid+. The traditional transaction ID +txid+ is the double-SHA256 hash of the serialized transaction, without the witness data. A transaction +wtxid+ is the double-SHA256 hash of the new serialization format of the transaction with witness data. The traditional +txid+ is calculated in exactly the same way as with a nonsegwit transaction. However, since the segwit transaction has empty ++scriptSig++s in every input, there is no part of the transaction that can be modified by a third party. Therefore, in a segwit transaction, the +txid+ is immutable by a third party, even when the transaction is unconfirmed. The +wtxid+ is like an "extended" ID, in that the hash also incorporates the witness data. If a transaction is transmitted without witness data, then the +wtxid+ and +txid+ are identical. Note than since the +wtxid+ includes witness data (signatures) and since witness data may be malleable, the +wtxid+ should be considered malleable until the transaction is confirmed. Only the +txid+ of a segwit transaction can be considered immutable by third parties and only if _all_ the inputs of the transaction are segwit inputs. [TIP] ==== Segregated Witness transactions have two IDs: +txid+ and +wtxid+. The +txid+ is the hash of the transaction without the witness data and the +wtxid+ is the hash inclusive of witness data. The +txid+ of a transaction where all inputs are segwit inputs is not susceptible to third-party transaction malleability. ==== ==== Economic Incentives for Segregated Witness Bitcoin mining nodes and full nodes incur costs for the resources used to support the Bitcoin network and the blockchain. As the volume of bitcoin transactions increases, so does the cost of resources (CPU, network bandwidth, disk space, memory). Miners are compensated for these costs through fees that are proportional to the size (in bytes) of each transaction. Nonmining full nodes are not compensated, so they incur these costs because they have a need to run an authoritative fully validating full-index node, perhaps because they use the node to operate a bitcoin business. Without transaction fees, the growth in bitcoin data would arguably increase dramatically. Fees are intended to align the needs of bitcoin users with the burden their transactions impose on the network, through a market-based price discovery mechanism. The calculation of fees based on transaction size treats all the data in the transaction as equal in cost. But from the perspective of full nodes and miners, some parts of a transaction carry much higher costs. Every transaction added to the Bitcoin network affects the consumption of four resources on nodes: Disk Space :: Every transaction is stored in the blockchain, adding to the total size of the blockchain. The blockchain is stored on disk, but the storage can be optimized by “pruning” older transactions. CPU :: Every transaction must be validated, which requires CPU time. Bandwidth :: Every transaction is transmitted (through flood propagation) across the network at least once. Without any optimization in the block propagation protocol, transactions are transmitted again as part of a block, doubling the impact on network capacity. Memory :: Nodes that validate transactions keep the UTXO index or the entire UTXO set in memory to speed up validation. Because memory is at least one order of magnitude more expensive than disk, growth of the UTXO set contributes disproportionately to the cost of running a node. As you can see from the list, not every part of a transaction has an equal impact on the cost of running a node or on the ability of bitcoin to scale to support more transactions. The most expensive part of a transaction are the newly created outputs, as they are added to the in-memory UTXO set. By comparison, signatures (aka witness data) add the least burden to the network and the cost of running a node, because witness data are only validated once and then never used again. Furthermore, immediately after receiving a new transaction and validating witness data, nodes can discard that witness data. If fees are calculated on transaction size, without discriminating between these two types of data, then the market incentives of fees are not aligned with the actual costs imposed by a transaction. In fact, the current fee structure actually encourages the opposite behavior, because witness data is the largest part of a transaction. The incentives created by fees matter because they affect the behavior of wallets. All wallets must implement some strategy for assembling transactions that takes into consideration a number of factors, such as privacy (reducing address reuse), fragmentation (making lots of loose change), and fees. If the fees are overwhelmingly motivating wallets to use as few inputs as possible in transactions, this can lead to UTXO picking and change address strategies that inadvertently bloat the UTXO set. Transactions consume UTXO in their inputs and create new UTXO with their outputs. A transaction, therefore, that has more inputs than outputs will result in a decrease in the UTXO set, whereas a transaction that has more outputs than inputs will result in an increase in the UTXO set. Let’s consider the _difference_ between inputs and outputs and call that the “Net-new-UTXO.” That’s an important metric, as it tells us what impact a transaction will have on the most expensive network-wide resource, the in-memory UTXO set. A transaction with positive Net-new-UTXO adds to that burden. A transaction with a negative Net-new-UTXO reduces the burden. We would therefore want to encourage transactions that are either negative Net-new-UTXO or neutral with zero Net-new-UTXO. Let’s look at an example of what incentives are created by the transaction fee calculation, with and without Segregated Witness. We will look at two different transactions. Transaction A is a 3-input, 2-output transaction, which has a Net-new-UTXO metric of –1, meaning it consumes one more UTXO than it creates, reducing the UTXO set by one. Transaction B is a 2-input, 3-output transaction, which has a Net-new-UTXO metric of 1, meaning it adds one UTXO to the UTXO set, imposing additional cost on the entire Bitcoin network. Both transactions use multisignature (2-of-3) scripts to demonstrate how complex scripts increase the impact of segregated witness on fees. Let’s assume a transaction fee of 30 satoshi per byte and a 75% fee discount on witness data: ++++
Without Segregated Witness

Transaction A fee: 25,710 satoshi

Transaction B fee: 18,990 satoshi

With Segregated Witness

Transaction A fee: 8,130 satoshi

Transaction B fee: 12,045 satoshi

++++ Both transactions are less expensive when segregated witness is implemented. But comparing the costs between the two transactions, we see that before Segregated Witness, the fee is higher for the transaction that has a negative Net-new-UTXO. After Segregated Witness, the transaction fees align with the incentive to minimize new UTXO creation by not inadvertently penalizing transactions with many inputs. Segregated Witness therefore has two main effects on the fees paid by Bitcoin users. Firstly, segwit reduces the overall cost of transactions by discounting witness data and increasing the capacity of the Bitcoin blockchain. Secondly, segwit’s discount on witness data corrects a misalignment of incentives that may have inadvertently created more bloat in the UTXO set.((("", startref="Tadv07")))((("", startref="Ssegwit07")))