mirror of https://github.com/bitcoinbook/bitcoinbook synced 2024-12-01 20:38:39 +00:00
bitcoinbook/chapters/transactions.adoc


[[transactions]]
== Transactions
[[ch06_intro]]
=== Introduction
((("transactions", "defined")))((("warnings and cautions", see="also
security")))Transactions are the most important part of the Bitcoin
system. Everything else in bitcoin is designed to ensure that
transactions can be created, propagated on the network, validated, and
finally added to the global ledger of transactions (the blockchain).
Transactions are data structures that encode the transfer of value
between participants in the Bitcoin system. Each transaction is a public
entry in bitcoin's blockchain, the global double-entry bookkeeping
ledger.
In this chapter we will examine all the various forms of transactions,
what they contain, how to create them, how they are verified, and how
they become part of the permanent record of all transactions. When we
use the term "wallet" in this chapter, we are referring to the software
that constructs transactions, not just the database of keys.
[[tx_structure]]
=== Transactions in Detail
((("use cases", "buying coffee", id="alicesix")))In
<<ch02_bitcoin_overview>>, we used a block explorer to look at the
transaction Alice used to pay for coffee at Bob's coffee shop
(<<alices_transactions_to_bobs_cafe>>).
The block explorer application shows a transaction from Alice's
"address" to Bob's "address." This is a much simplified view of what is
contained in a transaction. In fact, as we will see in this chapter,
much of the information shown is constructed by the block explorer and
is not actually in the transaction.
[[alices_transactions_to_bobs_cafe]]
.Alice's transaction to Bob's Cafe
image::images/mbc2_0208.png["Alice Coffee Transaction"]
[[transactions_behind_the_scenes]]
==== Transactions&#x2014;Behind the Scenes
((("transactions", "behind the scenes details of")))Behind the scenes,
an actual transaction looks very different from a transaction provided
by a typical block explorer. In fact, most of the high-level constructs
we see in the various bitcoin application user interfaces _do not
actually exist_ in the Bitcoin system.
We can use Bitcoin Core's command-line interface (+getrawtransaction+
and +decoderawtransaction+) to retrieve Alice's "raw" transaction,
decode it, and see what it contains. The result looks like this:
[[alice_tx]]
.Alice's transaction decoded
[source,json]
----
{
"version": 1,
"locktime": 0,
"vin": [
{
"txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18",
"vout": 0,
"scriptSig" : "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813[ALL] 0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf",
"sequence": 4294967295
}
],
"vout": [
{
"value": 0.01500000,
"scriptPubKey": "OP_DUP OP_HASH160 ab68025513c3dbd2f7b92a94e0581f5d50f654e7 OP_EQUALVERIFY OP_CHECKSIG"
},
{
"value": 0.08450000,
"scriptPubKey": "OP_DUP OP_HASH160 7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8 OP_EQUALVERIFY OP_CHECKSIG"
}
]
}
----
You may notice a few things about this transaction, mostly the things
that are missing! Where is Alice's address? Where is Bob's address?
Where is the 0.1 input "sent" by Alice? In bitcoin, there are no coins,
no senders, no recipients, no balances, no accounts, and no addresses.
All those things are constructed at a higher level for the benefit of
the user, to make things easier to understand.
You may also notice a lot of strange and indecipherable fields and
hexadecimal strings. Don't worry, we will explain each field shown here
in detail in this chapter.
[[tx_inputs_outputs]]
=== Transaction Outputs and Inputs
((("transactions", "outputs and inputs", id="Tout06")))((("outputs and
inputs", "outputs defined")))((("unspent transaction outputs
(UTXO)")))((("UTXO sets")))((("transactions", "outputs and inputs",
"output characteristics")))((("outputs and inputs", "output
characteristics")))The fundamental building block of a bitcoin
transaction is a _transaction output_. Transaction outputs are
indivisible chunks of bitcoin currency, recorded on the blockchain, and
recognized as valid by the entire network. Bitcoin full nodes track all
available and spendable outputs, known as _unspent transaction outputs_,
or _UTXO_. The collection of all UTXO is known as the _UTXO set_ and
currently numbers in the millions of UTXO. The UTXO set grows as new
UTXO is created and shrinks when UTXO is consumed. Every transaction
represents a change (state transition) in the UTXO set.
((("balances")))When we say that a user's wallet has "received" bitcoin,
what we mean is that the wallet has detected an UTXO that can be spent
with one of the keys controlled by that wallet. Thus, a user's bitcoin
"balance" is the sum of all UTXO that user's wallet can spend and which
may be scattered among hundreds of transactions and hundreds of blocks.
The concept of a balance is created by the wallet application. The
wallet calculates the user's balance by scanning the blockchain and
aggregating the value of any UTXO the wallet can spend with the keys it
controls. Most wallets maintain a database or use a database service to
store a quick reference set of all the UTXO they can spend with the keys
they control.
((("satoshis")))A transaction output can have an arbitrary (integer)
value denominated as a multiple of satoshis. Just as dollars can be
divided down to two decimal places as cents, bitcoin can be divided down
to eight decimal places as satoshis. Although an output can have any
arbitrary value, once created it is indivisible. This is an important
characteristic of outputs that needs to be emphasized: outputs are
_discrete_ and _indivisible_ units of value, denominated in integer
satoshis. An unspent output can only be consumed in its entirety by a
transaction.
((("change, making")))If an UTXO is larger than the desired value of a
transaction, it must still be consumed in its entirety and change must
be generated in the transaction. In other words, if you have an UTXO
worth 20 bitcoin and want to pay only 1 bitcoin, your transaction must
consume the entire 20-bitcoin UTXO and produce two outputs: one paying 1
bitcoin to your desired recipient and another paying 19 bitcoin in
change back to your wallet. As a result of the indivisible nature of
transaction outputs, most bitcoin transactions will have to generate
change.
Imagine a shopper buying a $1.50 beverage, reaching into her wallet and
trying to find a combination of coins and bank notes to cover the $1.50
cost. The shopper will choose exact change if available (e.g., a dollar
bill and two quarters, a quarter being $0.25), or a combination of smaller
denominations (six quarters), or, if necessary, a larger unit such as a
$5 note. If she hands too much money, say $5, to the shop owner, she
will expect $3.50 in change, which she will return to her wallet and have
available for future transactions.
Similarly, a bitcoin transaction must be created from a user's UTXO in
whatever denominations that user has available. Users cannot cut an UTXO
in half any more than they can cut a dollar bill in half and use it as
currency. The user's wallet application will typically select from the
user's available UTXO to compose an amount greater than or equal to the
desired transaction amount.
As with real life, the bitcoin application can use several strategies to
satisfy the purchase amount: combining several smaller units, finding
exact change, or using a single unit larger than the transaction value
and making change. All of this complex assembly of spendable UTXO is
done by the user's wallet automatically and is invisible to users. It is
only relevant if you are programmatically constructing raw transactions
from UTXO.
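
As a rough illustration of the strategies just described, the following Python sketch selects UTXO for a payment and computes the change. The function name +select_utxo+ and the largest-first ordering are assumptions made for this example, not how any particular wallet works, and transaction fees are ignored for simplicity.

```python
from typing import List, Tuple

SATOSHIS_PER_BTC = 100_000_000


def select_utxo(utxo_values: List[int], target: int) -> Tuple[List[int], int]:
    """Return (selected UTXO values, change), all in satoshis.

    Hypothetical helper for illustration only; fees are not considered.
    """
    # Strategy 1: a single UTXO that matches the target exactly (no change).
    if target in utxo_values:
        return [target], 0
    # Strategy 2: add UTXO, largest first, until the target is covered.
    selected: List[int] = []
    total = 0
    for value in sorted(utxo_values, reverse=True):
        selected.append(value)
        total += value
        if total >= target:
            # Every selected UTXO is consumed whole; the surplus becomes change.
            return selected, total - target
    raise ValueError("insufficient funds")


# The example from the text: a 20-bitcoin UTXO used to pay 1 bitcoin.
selected, change = select_utxo([20 * SATOSHIS_PER_BTC], 1 * SATOSHIS_PER_BTC)
print(selected, change)
```

With the single 20-bitcoin UTXO from the earlier example, the sketch consumes the whole UTXO and reports 19 bitcoin (in satoshis) as change.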
A transaction consumes previously recorded unspent transaction outputs
and creates new transaction outputs that can be consumed by a future
transaction. This way, chunks of bitcoin value move forward from owner
to owner in a chain of transactions consuming and creating UTXO.
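
The consume-and-create step can be pictured as an update to a dictionary that stands in for the UTXO set. This is only a sketch: the +apply_transaction+ helper and the +"alice_tx"+ placeholder ID are made up for this example, and a real node performs many more checks (scripts, signatures, amounts) before accepting a transaction.

```python
from typing import Dict, List, Tuple

OutPoint = Tuple[str, int]  # (transaction ID, output index)


def apply_transaction(
    utxo_set: Dict[OutPoint, int],
    txid: str,
    inputs: List[OutPoint],
    output_values: List[int],
) -> None:
    """Spend the referenced outputs and add the newly created ones (satoshis)."""
    # Every input must reference an output that is still unspent.
    for outpoint in inputs:
        if outpoint not in utxo_set:
            raise KeyError(f"output {outpoint} is missing or already spent")
    # Consumed outputs leave the UTXO set...
    for outpoint in inputs:
        del utxo_set[outpoint]
    # ...and the new outputs enter it, indexed from zero.
    for index, value in enumerate(output_values):
        utxo_set[(txid, index)] = value


# Alice's 0.1 BTC UTXO, then her payment ("alice_tx" is a placeholder ID).
prev_txid = "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18"
utxo_set = {(prev_txid, 0): 10_000_000}
apply_transaction(utxo_set, "alice_tx", [(prev_txid, 0)], [1_500_000, 8_450_000])
print(utxo_set)
```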
((("transactions", "coinbase transactions")))((("coinbase
transactions")))((("mining and consensus", "coinbase transactions")))The
exception to the output and input chain is a special type of transaction
called the _coinbase_ transaction, which is the first transaction in
each block. This transaction is placed there by the "winning" miner and
creates brand-new bitcoin payable to that miner as a reward for mining.
This special coinbase transaction does not consume UTXO; instead, it has
a special type of input called the "coinbase." This is how bitcoin's
money supply is created during the mining process, as we will see in
<<mining>>.
[TIP]
====
What comes first? Inputs or outputs, the chicken or the egg? Strictly
speaking, outputs come first because coinbase transactions, which
generate new bitcoin, spend no existing outputs and create outputs from
nothing.
====
[[tx_outs]]
==== Transaction Outputs
((("transactions", "outputs and inputs", "output
components")))((("outputs and inputs", "output parts")))Every bitcoin
transaction creates outputs, which are recorded on the bitcoin ledger.
Almost all of these outputs, with one exception (see <<op_return>>),
create spendable chunks of bitcoin called UTXO, which are then
recognized by the whole network and available for the owner to spend in
a future transaction.
UTXO are tracked by every full-node Bitcoin client in the UTXO set. New
transactions consume (spend) one or more of these outputs from the UTXO
set.
Transaction outputs consist of two parts:
- An amount of bitcoin, denominated in _satoshis_, the smallest bitcoin
unit
- A cryptographic puzzle that determines the conditions required to
spend the output
((("locking scripts")))((("scripting", "locking
scripts")))((("witnesses")))((("scriptPubKey")))The cryptographic puzzle
is also known as a _locking script_, a _witness script_, or a
+scriptPubKey+.
The transaction scripting language, used in the locking script mentioned
previously, is discussed in detail in <<tx_script>>.
Now, let's look at Alice's transaction (shown previously in
<<transactions_behind_the_scenes>>) and see if we can identify the
outputs. In the JSON encoding, the outputs are in an array (list) named
+vout+:
[source,json]
----
"vout": [
{
"value": 0.01500000,
"scriptPubKey": "OP_DUP OP_HASH160 ab68025513c3dbd2f7b92a94e0581f5d50f654e7 OP_EQUALVERIFY OP_CHECKSIG"
},
{
"value": 0.08450000,
"scriptPubKey": "OP_DUP OP_HASH160 7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8 OP_EQUALVERIFY OP_CHECKSIG"
}
]
----
As you can see, the transaction contains two outputs. Each output is
defined by a value and a cryptographic puzzle. In the encoding shown by
Bitcoin Core, the value is shown in bitcoin, but in the transaction
itself it is recorded as an integer denominated in satoshis. The second
part of each output is the cryptographic puzzle that sets the conditions
for spending. Bitcoin Core shows this as +scriptPubKey+ and shows us a
human-readable representation of the script.
The topic of locking and unlocking UTXO will be discussed later, in
<<tx_lock_unlock>>. The scripting language that is used for the script
in +scriptPubKey+ is discussed in <<tx_script>>. But before we delve
into those topics, we need to understand the overall structure of
transaction inputs and outputs.
===== Transaction serialization&#x2014;outputs
((("transactions", "outputs and inputs", "structure of")))((("outputs
and inputs", "structure of")))((("serialization", "outputs")))When
transactions are transmitted over the network or exchanged between
applications, they are _serialized_. Serialization is the process of
converting the internal representation of a data structure into a format
that can be transmitted one byte at a time, also known as a byte stream.
Serialization is most commonly used for encoding data structures for
transmission over a network or for storage in a file. The serialization
format of a transaction output is shown in <<tx_out_structure>>.
[[tx_out_structure]]
.Transaction output serialization
[options="header"]
|=======
|Size| Field | Description
| 8 bytes (little-endian) | Amount | Bitcoin value in satoshis (10^-8^ bitcoin)
| 1&#x2013;9 bytes (VarInt) | Locking-Script Size | Locking-Script length in bytes, to follow
| Variable | Locking-Script | A script defining the conditions needed to spend the output
|=======
Most bitcoin libraries and frameworks do not store transactions
internally as byte-streams, as that would require complex parsing every
time you needed to access a single field. For convenience and
readability, bitcoin libraries store transactions internally in data
structures (usually object-oriented structures).
((("deserialization")))((("parsing")))((("transactions", "parsing")))The
process of converting from the byte-stream representation of a
transaction to a library's internal representation data structure is
called _deserialization_ or _transaction parsing_. The process of
converting back to a byte-stream for transmission over the network, for
hashing, or for storage on disk is called _serialization_. Most bitcoin
libraries have built-in functions for transaction serialization and
deserialization.
See if you can manually decode Alice's transaction from the serialized
hexadecimal form, finding some of the elements we saw previously. The
section containing the two outputs is highlighted in <<example_6_1>> to
help you:
[[example_6_1]]
.Alice's transaction, serialized and presented in hexadecimal notation
====
+0100000001186f9f998a5aa6f048e51dd8419a14d8a0f1a8a2836dd73+
+4d2804fe65fa35779000000008b483045022100884d142d86652a3f47+
+ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039+
+ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813+
+01410484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade84+
+16ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc1+
+7b4a10fa336a8d752adfffffffff02+*+60e31600000000001976a914ab6+*
*+8025513c3dbd2f7b92a94e0581f5d50f654e788acd0ef800000000000+*
*+1976a9147f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a888ac+*
+00000000+
====
Here are some hints:
- There are two outputs in the highlighted section, each serialized as
shown in <<tx_out_structure>>.
- The value of 0.015 bitcoin is 1,500,000 satoshis. That's +16 e3 60+ in
hexadecimal.
- In the serialized transaction, the value +16 e3 60+ is encoded in
little-endian (least-significant-byte-first) byte order, so it looks
like +60 e3 16+.
- The +scriptPubKey+ length is 25 bytes, which is +19+ in hexadecimal.
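
The hints above can be checked with a short Python sketch that walks the output section byte by byte, following the layout in <<tx_out_structure>>. The helper names +read_varint+ and +parse_outputs+ are chosen for this example, and the hex string starts one byte before the highlighted section so that it includes the output count (+02+).

```python
from typing import List, Tuple

# Output count, then the two serialized outputs of Alice's transaction.
OUTPUTS_HEX = (
    "02"                                                  # number of outputs
    "60e3160000000000"                                    # amount, little-endian
    "19"                                                  # script length (25)
    "76a914ab68025513c3dbd2f7b92a94e0581f5d50f654e788ac"  # locking script
    "d0ef800000000000"                                    # amount, little-endian
    "19"                                                  # script length (25)
    "76a9147f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a888ac"  # locking script
)


def read_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a VarInt starting at pos; return (value, next position)."""
    prefix = data[pos]
    if prefix < 0xFD:
        return prefix, pos + 1
    size = {0xFD: 2, 0xFE: 4, 0xFF: 8}[prefix]
    value = int.from_bytes(data[pos + 1:pos + 1 + size], "little")
    return value, pos + 1 + size


def parse_outputs(data: bytes) -> List[Tuple[int, bytes]]:
    """Return a list of (amount in satoshis, locking script) pairs."""
    count, pos = read_varint(data, 0)
    outputs = []
    for _ in range(count):
        amount = int.from_bytes(data[pos:pos + 8], "little")
        pos += 8
        script_len, pos = read_varint(data, pos)
        outputs.append((amount, data[pos:pos + script_len]))
        pos += script_len
    return outputs


outputs = parse_outputs(bytes.fromhex(OUTPUTS_HEX))
for amount, script in outputs:
    print(amount, len(script), script.hex())
```

The first amount decodes to 1,500,000 satoshis and the second to 8,450,000 satoshis, matching the values 0.015 and 0.0845 shown in the decoded JSON.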
[[tx_inputs]]
==== Transaction Inputs
((("transactions", "outputs and inputs", "input
components")))((("outputs and inputs", "input components")))((("unspent
transaction outputs (UTXO)")))((("UTXO sets")))Transaction inputs
identify (by reference) which UTXO will be consumed and provide proof of
ownership through an unlocking script.
To build a transaction, a wallet selects from the UTXO it controls a set
with enough value to make the requested payment. Sometimes one UTXO is
enough; other times more than one is needed. For each UTXO that will be
consumed to make this payment, the wallet creates one input pointing to
the UTXO and unlocks it with an unlocking script.
Let's look at the components of an input in greater detail. The first
part of an input is a pointer to an UTXO by reference to the transaction
hash and an output index, which identifies the specific UTXO in that
transaction. The second part is an unlocking script, which the wallet
constructs in order to satisfy the spending conditions set in the UTXO.
Most often, the unlocking script is a digital signature and public key
proving ownership of the bitcoin. However, not all unlocking scripts
contain signatures. The third part is a sequence number, which will be
discussed later.
Consider our example in <<transactions_behind_the_scenes>>. The
transaction inputs are an array (list) called +vin+:
[[vin]]
.The transaction inputs in Alice's transaction
[source,json]
----
"vin": [
{
"txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18",
"vout": 0,
"scriptSig" : "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813[ALL] 0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf",
"sequence": 4294967295
}
]
----
As you can see, there is only one input in the list (because one UTXO
contained sufficient value to make this payment). The input contains
four elements:
- A ((("transaction IDs (txd)")))transaction ID, referencing the
transaction that contains the UTXO being spent
- An output index (+vout+), identifying which UTXO from that transaction
is referenced (first one is zero)
- A +scriptSig+, which satisfies the conditions placed on the UTXO,
unlocking it for spending
- A sequence number (to be discussed later)
In Alice's transaction, the input points to the transaction ID:
----
7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18
----
and output index +0+ (i.e., the first UTXO created by that transaction).
The unlocking script is constructed by Alice's wallet by first
retrieving the referenced UTXO, examining its locking script, and then
using it to build the necessary unlocking script to satisfy it.
Looking just at the input, you may have noticed that we don't know
anything about this UTXO, other than a reference to the transaction
containing it. We don't know its value (amount in satoshis), and we don't
know the locking script that sets the conditions for spending it. To
find this information, we must retrieve the referenced UTXO by
retrieving the underlying transaction. Notice that because the value of
the input is not explicitly stated, we must also use the referenced UTXO
in order to calculate the fees that will be paid in this transaction
(see <<tx_fees>>).
It's not just Alice's wallet that needs to retrieve UTXO referenced in
the inputs. Once this transaction is broadcast to the network, every
validating node will also need to retrieve the UTXO referenced in the
transaction inputs in order to validate the transaction.
Transactions on their own seem incomplete because they lack context.
They reference UTXO in their inputs but without retrieving that UTXO we
cannot know the value of the inputs or their locking conditions. When
writing bitcoin software, anytime you decode a transaction with the
intent of validating it or counting the fees or checking the unlocking
script, your code will first have to retrieve the referenced UTXO from
the blockchain in order to build the context implied but not present in
the UTXO references of the inputs. For example, to calculate the amount
paid in fees, you must know the sum of the values of inputs and outputs.
But without retrieving the UTXO referenced in the inputs, you do not
know their value. So a seemingly simple operation like counting fees in
a single transaction in fact involves multiple steps and data from
multiple transactions.
We can use the same sequence of commands with Bitcoin Core as we used
when retrieving Alice's transaction (+getrawtransaction+ and
+decoderawtransaction+). With that we can get the UTXO referenced in the
preceding input and take a look:
[[alice_input_tx]]
.Alice's UTXO from the previous transaction, referenced in the input
[source,json]
----
"vout": [
{
"value": 0.10000000,
"scriptPubKey": "OP_DUP OP_HASH160 7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8 OP_EQUALVERIFY OP_CHECKSIG"
}
]
----
We see that this UTXO has a value of 0.1 BTC and that it has a locking
script (+scriptPubKey+) that contains "OP_DUP OP_HASH160...".
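
With the value of the referenced UTXO in hand, the fee can be computed as the difference between the sum of the inputs and the sum of the outputs. The following Python sketch does that arithmetic for Alice's transaction; the helper name +to_satoshis+ is an assumption for this example, and +Decimal+ is used to avoid floating-point rounding when converting from bitcoin to satoshis.

```python
from decimal import Decimal

SATOSHIS_PER_BTC = 100_000_000


def to_satoshis(btc: str) -> int:
    """Convert a decimal bitcoin amount (as text) to integer satoshis."""
    return int(Decimal(btc) * SATOSHIS_PER_BTC)


# Value of the UTXO referenced by Alice's single input.
input_values = [to_satoshis("0.10000000")]
# Values of the two outputs created by Alice's transaction.
output_values = [to_satoshis("0.01500000"), to_satoshis("0.08450000")]

# The fee is implicit: whatever the inputs hold that the outputs do not claim.
fee = sum(input_values) - sum(output_values)
print(fee)  # 50000 satoshis, i.e., 0.0005 BTC
```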
[TIP]
====
To fully understand Alice's transaction we had to retrieve the previous
transaction(s) referenced as inputs. A function that retrieves previous
transactions and unspent transaction outputs is very common and exists
in almost every bitcoin library and API.
====
===== Transaction serialization&#x2014;inputs
((("serialization", "inputs")))((("transactions", "outputs and inputs",
"input serialization")))((("outputs and inputs", "input
serialization")))When transactions are serialized for transmission on
the network, their inputs are encoded into a byte stream as shown in
<<tx_in_structure>>.
[[tx_in_structure]]
.Transaction input serialization
[options="header"]
|=======
|Size| Field | Description
| 32 bytes | Transaction Hash | Pointer to the transaction containing the UTXO to be spent
| 4 bytes | Output Index | The index number of the UTXO to be spent; first one is 0
| 1&#x2013;9 bytes (VarInt) | Unlocking-Script Size | Unlocking-Script length in bytes, to follow
| Variable | Unlocking-Script | A script that fulfills the conditions of the UTXO locking script
| 4 bytes | Sequence Number | Used for locktime or disabled (0xFFFFFFFF)
|=======
As with the outputs, let's see if we can find the inputs from Alice's
transaction in the serialized format. First, the inputs decoded:
[source,json]
----
"vin": [
{
"txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18",
"vout": 0,
"scriptSig" : "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813[ALL] 0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf",
"sequence": 4294967295
}
],
----
Now, let's see if we can identify these fields in the serialized hex
encoding in <<example_6_2>>:
[[example_6_2]]
.Alice's transaction, serialized and presented in hexadecimal notation
====
+0100000001+*+186f9f998a5aa6f048e51dd8419a14d8a0f1a8a2836dd73+*
*+4d2804fe65fa35779000000008b483045022100884d142d86652a3f47+*
*+ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039+*
*+ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813+*
*+01410484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade84+*
*+16ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc1+*
*+7b4a10fa336a8d752adfffffffff+*+0260e31600000000001976a914ab6+
+8025513c3dbd2f7b92a94e0581f5d50f654e788acd0ef800000000000+
+1976a9147f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a888ac00000+
+000+
====
Hints:
- The transaction ID is serialized in reversed byte order, so it starts
with (hex) +18+ and ends with +79+
- The output index is a 4-byte group of zeros, easy to identify
- The length of the +scriptSig+ is 139 bytes, or +8b+ in hex
- The sequence number is set to +FFFFFFFF+, again easy to identify((("",
startref="alicesix")))
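
The same exercise can be done in code. The Python sketch below parses the single input of Alice's transaction according to <<tx_in_structure>>; the names +TxInput+, +read_varint+, and +parse_input+ are chosen for this example, and the hex string is the highlighted input section split into its fields.

```python
from typing import NamedTuple, Tuple

INPUT_HEX = (
    # Transaction ID in reversed byte order
    "186f9f998a5aa6f048e51dd8419a14d8a0f1a8a2836dd734d2804fe65fa35779"
    "00000000"  # output index
    "8b"        # scriptSig length (139 bytes)
    "48"        # push 72 bytes: DER signature plus SIGHASH_ALL byte
    "3045"
    "022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb"
    "02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813"
    "01"
    "41"        # push 65 bytes: uncompressed public key
    "0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5"
    "412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf"
    "ffffffff"  # sequence number
)


class TxInput(NamedTuple):
    txid: str
    vout: int
    script_sig: bytes
    sequence: int


def read_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a VarInt starting at pos; return (value, next position)."""
    prefix = data[pos]
    if prefix < 0xFD:
        return prefix, pos + 1
    size = {0xFD: 2, 0xFE: 4, 0xFF: 8}[prefix]
    value = int.from_bytes(data[pos + 1:pos + 1 + size], "little")
    return value, pos + 1 + size


def parse_input(data: bytes, pos: int = 0) -> Tuple[TxInput, int]:
    """Parse one serialized input; return it with the next read position."""
    # The hash is stored reversed, so flip it to get the familiar txid.
    txid = data[pos:pos + 32][::-1].hex()
    pos += 32
    vout = int.from_bytes(data[pos:pos + 4], "little")
    pos += 4
    script_len, pos = read_varint(data, pos)
    script_sig = data[pos:pos + script_len]
    pos += script_len
    sequence = int.from_bytes(data[pos:pos + 4], "little")
    return TxInput(txid, vout, script_sig, sequence), pos + 4


tx_input, consumed = parse_input(bytes.fromhex(INPUT_HEX))
print(tx_input.txid, tx_input.vout, len(tx_input.script_sig), hex(tx_input.sequence))
```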
=== Bitcoin Addresses, Balances, and Other Abstractions
((("transactions", "higher-level abstractions", id="Thigher06")))We
began this chapter with the discovery that transactions look very
different "behind the scenes" than how they are presented in wallets,
blockchain explorers, and other user-facing applications. Many of the
simplistic and familiar concepts from the earlier chapters, such as
Bitcoin addresses and balances, seem to be absent from the transaction
structure. We saw that transactions don't contain Bitcoin addresses, per
se, but instead operate through scripts that lock and unlock discrete
values of bitcoin. Balances are not present anywhere in this system and
yet every wallet application prominently displays the balance of the
user's wallet.
Now that we have explored what is actually included in a bitcoin
transaction, we can examine how the higher-level abstractions are
derived from the seemingly primitive components of the transaction.
Let's look again at how Alice's transaction was presented on a popular
block explorer (<<alice_transaction_to_bobs_cafe>>).
[[alice_transaction_to_bobs_cafe]]
.Alice's transaction to Bob's Cafe
image::images/mbc2_0208.png["Alice Coffee Transaction"]
On the left side of the transaction, the blockchain explorer shows
Alice's Bitcoin address as the "sender." In fact, this information is
not in the transaction itself. When the blockchain explorer retrieved
the transaction it also retrieved the previous transaction referenced in
the input and extracted the first output from that older transaction.
Within that output is a locking script that locks the UTXO to Alice's
public key hash (a P2PKH script). The blockchain explorer extracted the
public key hash and encoded it using Base58Check encoding to produce and
display the Bitcoin address that represents that public key.
Similarly, on the right side, the blockchain explorer shows the two
outputs; the first to Bob's Bitcoin address and the second to Alice's
Bitcoin address (as change). Once again, to create these Bitcoin
addresses, the blockchain explorer extracted the locking script from
each output, recognized it as a P2PKH script, and extracted the
public-key-hash from within. Finally, the blockchain explorer reencoded
that public key hash with Base58Check to produce and display the Bitcoin
addresses.
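
To make the block explorer's steps concrete, here is a minimal Python sketch that recognizes a P2PKH locking script, extracts the public key hash, and encodes it with Base58Check. The function names are assumptions for this example; the version byte +0x00+ is the prefix used for P2PKH addresses on the main Bitcoin network.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(version: bytes, payload: bytes) -> str:
    """Encode version + payload with a 4-byte double-SHA256 checksum."""
    data = version + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def p2pkh_script_to_address(script: bytes) -> str:
    """Derive the address for a P2PKH locking script, or raise ValueError."""
    # OP_DUP OP_HASH160 <push 20 bytes> <hash> OP_EQUALVERIFY OP_CHECKSIG
    is_p2pkh = (
        len(script) == 25
        and script[:3] == b"\x76\xa9\x14"
        and script[23:] == b"\x88\xac"
    )
    if not is_p2pkh:
        raise ValueError("not a P2PKH locking script")
    return base58check_encode(b"\x00", script[3:23])


# Locking script of the first output in Alice's transaction (paying Bob).
script = bytes.fromhex("76a914ab68025513c3dbd2f7b92a94e0581f5d50f654e788ac")
address = p2pkh_script_to_address(script)
print(address)
```

A script that does not match the P2PKH pattern raises an error in this sketch, which mirrors how an explorer may fail to show an address for unfamiliar script types.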
If you were to click on Bob's Bitcoin address, the blockchain explorer
would show you the view in <<the_balance_of_bobs_bitcoin_address>>.
[[the_balance_of_bobs_bitcoin_address]]
.The balance of Bob's Bitcoin address
image::images/mbc2_0608.png["The balance of Bob's Bitcoin address"]
The blockchain explorer displays the balance of Bob's Bitcoin address.
But nowhere in the Bitcoin system is there a concept of a "balance."
Rather, the values displayed here are constructed by the blockchain
explorer as follows.
To construct the "Total Received" amount, the blockchain explorer first
will decode the Base58Check encoding of the Bitcoin address to retrieve
the 160-bit hash of Bob's public key that is encoded within the address.
Then, the blockchain explorer will search through the database of
transactions, looking for outputs with P2PKH locking scripts that
contain Bob's public key hash. By summing up the value of all the
outputs, the blockchain explorer can produce the total value received.
Constructing the current balance (displayed as "Final Balance") requires
a bit more work. The blockchain explorer keeps a separate database of
the outputs that are currently unspent, the UTXO set. To maintain this
database, the blockchain explorer must monitor the Bitcoin network, add
newly created UTXO, and remove spent UTXO, in real time, as they appear
in unconfirmed transactions. This is a complicated process that depends
on keeping track of transactions as they propagate, as well as
maintaining consensus with the Bitcoin network to ensure that the
correct chain is followed. Sometimes, the blockchain explorer goes out
of sync and its perspective of the UTXO set is incomplete or incorrect.
From the UTXO set, the blockchain explorer sums up the value of all
unspent outputs referencing Bob's public key hash and produces the
"Final Balance" number shown to the user.
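
The two figures can be sketched in Python as sums over an index of outputs. Everything here is illustrative: the +IndexedOutput+ record, the function names, and the second (spent) output paying Bob are made up for this example, and a real explorer works from a database rather than an in-memory list.

```python
from typing import List, NamedTuple

BOB = "ab68025513c3dbd2f7b92a94e0581f5d50f654e7"
ALICE = "7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8"


class IndexedOutput(NamedTuple):
    pubkey_hash: str  # hex of the hash found in the P2PKH locking script
    value: int        # satoshis
    spent: bool       # True once a later transaction consumed this output


def total_received(outputs: List[IndexedOutput], pubkey_hash: str) -> int:
    """Sum every output ever locked to the hash, spent or not."""
    return sum(o.value for o in outputs if o.pubkey_hash == pubkey_hash)


def final_balance(outputs: List[IndexedOutput], pubkey_hash: str) -> int:
    """Sum only the outputs locked to the hash that are still unspent."""
    return sum(
        o.value for o in outputs if o.pubkey_hash == pubkey_hash and not o.spent
    )


index = [
    IndexedOutput(BOB, 1_500_000, spent=False),    # Alice's coffee payment
    IndexedOutput(BOB, 2_000_000, spent=True),     # hypothetical earlier payment
    IndexedOutput(ALICE, 8_450_000, spent=False),  # Alice's change
]
print(total_received(index, BOB), final_balance(index, BOB))
```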
In order to produce this one image, with these two "balances," the
blockchain explorer has to index and search through dozens, hundreds, or
even hundreds of thousands of transactions.
In summary, the information presented to users through wallet
applications, blockchain explorers, and other bitcoin user interfaces is
often composed of higher-level abstractions that are derived by
searching many different transactions, inspecting their content, and
manipulating the data contained within them. To present this
simplistic view of bitcoin transactions that resemble bank checks from
one sender to one recipient, these applications have to abstract away a lot
of underlying detail. They mostly focus on the common types of
transactions: P2PKH with SIGHASH_ALL signatures on every input. Thus,
while bitcoin applications can present more than 80% of all transactions
in an easy-to-read manner, they are sometimes stumped by transactions
that deviate from the norm. Transactions that contain more complex
locking scripts, or different SIGHASH flags, or many inputs and outputs,
demonstrate the simplicity and weakness of these abstractions.
Every day, hundreds of transactions that do not contain P2PKH outputs
are confirmed on the blockchain. The blockchain explorers often present
these with red warning messages saying they cannot decode an address.
The following link contains the most recent "strange transactions" that
were not fully decoded: https://blockchain.info/strange-transactions[].
As we will see in the next chapter, these are not necessarily strange
transactions. They are transactions that contain more complex locking
scripts than the common P2PKH. We will learn how to decode and
understand more complex scripts and the applications they support
next.((("", startref="Thigher06")))((("", startref="alicesixtwo")))
=== Timelocks
((("transactions", "advanced", "timelocks")))((("scripting",
"timelocks", id="Stimelock07")))((("nLocktime field")))((("scripting",
"timelocks", "uses for")))((("timelocks", "uses for")))Timelocks are
restrictions on transactions or outputs that only allow spending after a
point in time. Bitcoin has had a transaction-level timelock feature from
the beginning. It is implemented by the +nLocktime+ field in a
transaction. Two new timelock features were introduced in late 2015 and
mid-2016 that offer UTXO-level timelocks. These are
+CHECKLOCKTIMEVERIFY+ and +CHECKSEQUENCEVERIFY+.
Timelocks are useful for postdating transactions and locking funds to a
date in the future. More importantly, timelocks extend bitcoin scripting
into the dimension of time, opening the door for complex multistep smart
contracts.
[[transaction_locktime_nlocktime]]
==== Transaction Locktime (nLocktime)
((("scripting", "timelocks", "nLocktime")))((("timelocks",
"nLocktime")))From the beginning, Bitcoin has had a transaction-level
timelock feature. Transaction locktime is a transaction-level setting (a
field in the transaction data structure) that defines the earliest time
that a transaction is valid and can be relayed on the network or added
to the blockchain. Locktime is also known as +nLocktime+ from the
variable name used in the Bitcoin Core codebase. It is set to zero in
most transactions to indicate immediate propagation and execution. If
+nLocktime+ is nonzero and below 500 million, it is interpreted as a
block height, meaning the transaction is not valid and is not relayed or
included in the blockchain prior to the specified block height. If it is
500 million or above, it is interpreted as a Unix Epoch timestamp (seconds
since January 1, 1970) and the transaction is not valid prior to the
specified time. Transactions with +nLocktime+ specifying a future block
or time must be held by the originating system and transmitted to the
Bitcoin network only after they become valid. If a transaction is
transmitted to the network before the specified +nLocktime+, the
transaction will be rejected by the first node as invalid and will not
be relayed to other nodes. The use of +nLocktime+ is equivalent to
postdating a paper check.
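
The interpretation rule can be written as a small Python function. This is a sketch for illustration: the function name and the returned labels are assumptions, and a value of exactly 500 million is treated as a timestamp.

```python
from typing import Tuple

LOCKTIME_THRESHOLD = 500_000_000  # below: block height; at or above: Unix time


def interpret_locktime(nlocktime: int) -> Tuple[str, int]:
    """Classify an nLocktime value as (kind, value)."""
    if not 0 <= nlocktime <= 0xFFFFFFFF:
        raise ValueError("nLocktime is a 4-byte unsigned integer")
    if nlocktime == 0:
        return ("none", 0)            # no timelock: valid immediately
    if nlocktime < LOCKTIME_THRESHOLD:
        return ("block_height", nlocktime)
    return ("unix_time", nlocktime)   # seconds since January 1, 1970


print(interpret_locktime(0))
print(interpret_locktime(800_000))
print(interpret_locktime(1_700_000_000))
```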
[[locktime_limitations]]
===== Transaction locktime limitations
+nLocktime+ has the limitation that while it makes it possible to spend
some outputs in the future, it does not make it impossible to spend them
until that time. Let's explain that with the following example.
((("use cases", "buying coffee", id="alicesseven")))Alice signs a
transaction spending one of her outputs to Bob's address, and sets the
transaction +nLocktime+ to 3 months in the future. Alice sends that
transaction to Bob to hold. With this transaction Alice and Bob know
that:
- Bob cannot transmit the transaction to redeem the funds until 3 months
have elapsed.
- Bob may transmit the transaction after 3 months.
However:
- Alice can create another transaction, double-spending the same inputs
without a locktime. Thus, Alice can spend the same UTXO before the 3
months have elapsed.
- Bob has no guarantee that Alice won't do that.
It is important to understand the limitations of transaction
+nLocktime+. The only guarantee is that Bob will not be able to redeem
it before 3 months have elapsed. There is no guarantee that Bob will get
the funds. To achieve such a guarantee, the timelock restriction must be
placed on the UTXO itself and be part of the locking script, rather than
on the transaction. This is achieved by the next form of timelock,
called Check Lock Time Verify.
==== Relative Timelocks with nSequence
((("nSequence field")))((("scripting", "timelocks", "relative timelocks
with nSequence")))Relative timelocks can be set on each input of a
transaction, by setting the +nSequence+ field in each input.
===== Original meaning of nSequence
The +nSequence+ field was originally intended (but never properly
implemented) to allow modification of transactions in the mempool. In
that use, a transaction containing inputs with +nSequence+ value below
2^32^ - 1 (0xFFFFFFFF) indicated a transaction that was not yet
"finalized." Such a transaction would be held in the mempool until it
was replaced by another transaction spending the same inputs with a
higher +nSequence+ value. Once a transaction was received whose inputs
had an +nSequence+ value of 0xFFFFFFFF it would be considered
"finalized" and mined.
The original meaning of +nSequence+ was never properly implemented and
the value of +nSequence+ is customarily set to 0xFFFFFFFF in
transactions that do not utilize timelocks. For transactions with
+nLocktime+ or +CHECKLOCKTIMEVERIFY+, the +nSequence+ value must be set
to less than 0xFFFFFFFF for the timelock guards to have an effect: a
transaction whose inputs all carry 0xFFFFFFFF is treated as final and
its +nLocktime+ is ignored.
===== nSequence as a consensus-enforced relative timelock
Since the activation of BIP-68, new consensus rules apply for any
transaction with version 2 or higher that contains an input whose
+nSequence+ value is less than 2^31^ (bit 1<<31 is not set).
Programmatically, that means that if the most significant bit (1<<31) is
not set, it is a flag that means "relative locktime." Otherwise (bit
1<<31 set), the +nSequence+ value is reserved for other uses such as
enabling +CHECKLOCKTIMEVERIFY+, +nLocktime+, Opt-In-Replace-By-Fee, and
other future developments.
Transaction inputs with +nSequence+ values less than 2^31^ are
interpreted as having a relative timelock. Such a transaction is only
valid once the input has aged by the relative timelock amount. For
example, a transaction with one input with an +nSequence+ relative
timelock of 30 blocks is only valid when at least 30 blocks have elapsed
from the time the UTXO referenced in the input was mined. Since
+nSequence+ is a per-input field, a transaction may contain any number
of timelocked inputs, all of which must have sufficiently aged for the
transaction to be valid. A transaction can include both timelocked
inputs (+nSequence+ < 2^31^) and inputs without a relative timelock
(+nSequence+ >= 2^31^).
The +nSequence+ value is specified in either blocks or seconds, but in a
slightly different format than we saw used in +nLocktime+. A type-flag
is used to differentiate between values counting blocks and values
counting time in seconds. The type-flag is set in the 23rd
least-significant bit (i.e., value 1<<22). If the type-flag is set, then
the +nSequence+ value is interpreted as a multiple of 512 seconds. If
the type-flag is not set, the +nSequence+ value is interpreted as a
number of blocks.
When interpreting +nSequence+ as a relative timelock, only the 16 least
significant bits are considered. Once the flags (bits 32 and 23) are
evaluated, the +nSequence+ value is usually "masked" with a 16-bit mask
(e.g., +nSequence+ & 0x0000FFFF).
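The decoding steps described above can be sketched as follows. The constant names mirror those used in the BIP-68 specification, but the function +decode_sequence+ and its return format are assumptions made for this example, not part of any wallet API.

```python
# Flag and mask positions as described in the text (names follow BIP-68).
SEQUENCE_LOCKTIME_DISABLE_FLAG = 1 << 31  # set: no relative timelock
SEQUENCE_LOCKTIME_TYPE_FLAG = 1 << 22     # set: time-based, else block-based
SEQUENCE_LOCKTIME_MASK = 0x0000FFFF       # only the low 16 bits carry the value
SECONDS_PER_UNIT = 512                    # time-based locks count 512-second units


def decode_sequence(n_sequence: int):
    """Return None if no relative timelock applies, else (unit, amount)."""
    if n_sequence & SEQUENCE_LOCKTIME_DISABLE_FLAG:
        return None
    value = n_sequence & SEQUENCE_LOCKTIME_MASK
    if n_sequence & SEQUENCE_LOCKTIME_TYPE_FLAG:
        return ("seconds", value * SECONDS_PER_UNIT)
    return ("blocks", value)


if __name__ == "__main__":
    print(decode_sequence(0xFFFFFFFF))     # timelock disabled
    print(decode_sequence(30))             # 30 blocks
    print(decode_sequence((1 << 22) | 2))  # 2 units of 512 seconds
```

Bits outside the two flags and the 16-bit mask are ignored by this sketch, which matches the masking step described in the text.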
<<bip_68_def_of_nseq>> shows the binary layout of the +nSequence+ value,
as defined by BIP-68.
[[bip_68_def_of_nseq]]
.BIP-68 definition of nSequence encoding (Source: BIP-68)
image::images/mbc2_0701.png["BIP-68 definition of nSequence encoding"]
Relative timelocks based on consensus enforcement of the +nSequence+
value are defined in
https://github.com/bitcoin/bips/blob/master/bip-0068.mediawiki[BIP-68,
Relative lock-time using consensus-enforced sequence numbers].
[[segwit]]
=== Segregated Witness
((("segwit (Segregated Witness)", id="Ssegwit07")))Segregated Witness
(segwit) is an upgrade to the bitcoin consensus rules and network
protocol, proposed and implemented as a BIP-9 soft-fork that was
activated on bitcoin's mainnet on August 24th, 2017.
In cryptography, the term "witness" is used to describe a solution to a
cryptographic puzzle. In bitcoin terms, the witness satisfies a
cryptographic condition placed on an unspent transaction output (UTXO).
In the context of bitcoin, a digital signature is _one type of witness_,
but a witness is more broadly any solution that can satisfy the
conditions imposed on an UTXO and unlock that UTXO for spending. The
term “witness” is a more general term for an “unlocking script” or
“scriptSig.”
Before segwits introduction, every input in a transaction was followed
by the witness data that unlocked it. The witness data was embedded in
the transaction as part of each input. The term _segregated witness_, or
_segwit_ for short, simply means separating the signature or unlocking
script from the input that spends a specific output. Think “separate
scriptSig,” or “separate signature” in the simplest form.
Segregated Witness therefore is an architectural change to bitcoin that
aims to move the witness data from the +scriptSig+ (unlocking script)
field of a transaction into a separate _witness_ data structure that
accompanies a transaction. Clients may request transaction data with or
without the accompanying witness data.
In this section we will look at some of the benefits of Segregated
Witness, describe the mechanism used to deploy and implement this
architecture change, and demonstrate the use of Segregated Witness in
transactions and addresses.
Segregated Witness is defined by the following BIPs:
https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki[BIP-141] :: The main definition of Segregated Witness.
https://github.com/bitcoin/bips/blob/master/bip-0143.mediawiki[BIP-143] :: Transaction Signature Verification for Version 0 Witness Program
https://github.com/bitcoin/bips/blob/master/bip-0144.mediawiki[BIP-144] :: Peer Services&#x2014;New network messages and serialization formats
https://github.com/bitcoin/bips/blob/master/bip-0145.mediawiki[BIP-145] :: getblocktemplate Updates for Segregated Witness (for mining)
https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki[BIP-173] :: Base32 address format for native v0-16 witness outputs
==== Why Segregated Witness?
Segregated Witness is an architectural change that has several effects
on the scalability, security, economic incentives, and performance of
bitcoin:
Transaction Malleability :: By moving the witness outside the
transaction, the transaction hash used as an identifier no longer
includes the witness data. Since the witness data is the only part of
the transaction that can be modified by a third party (see
<<segwit_txid>>), removing it also removes the opportunity for
transaction malleability attacks. With Segregated Witness, transaction
hashes become immutable by anyone other than the creator of the
transaction, which greatly improves the implementation of many other
protocols that rely on advanced bitcoin transaction construction, such
as payment channels, chained transactions, and the Lightning Network.
Script Versioning :: With the introduction of Segregated Witness
scripts, every locking script is preceded by a _script version_ number,
similar to how transactions and blocks have version numbers. The
addition of a script version number allows the scripting language to be
upgraded in a backward-compatible way (i.e., using soft fork upgrades)
to introduce new script operators, syntax, or semantics. The ability to
upgrade the scripting language in a nondisruptive way will greatly
accelerate the rate of innovation in bitcoin.
Network and Storage Scaling :: The witness data is often a big
contributor to the total size of a transaction. More complex scripts
such as those used for multisig or payment channels are very large. In
some cases these scripts account for the majority (more than 75%) of the
data in a transaction. By moving the witness data outside the
transaction, Segregated Witness improves bitcoins scalability. Nodes
can prune the witness data after validating the signatures, or ignore it
altogether when doing simplified payment verification. The witness data
doesnt need to be transmitted to all nodes and does not need to be
stored on disk by all nodes.
Signature Verification Optimization :: Segregated Witness upgrades the
signature functions (+CHECKSIG+, +CHECKMULTISIG+, etc.) to reduce the
algorithm's computational complexity. Before segwit, the algorithm used
to produce a signature required a number of hash operations that was
proportional to the size of the transaction. Data-hashing computations
increased in O(n^2^) with respect to the number of signature operations,
introducing a substantial computational burden on all nodes verifying
the signature. With segwit, the algorithm is changed to reduce the
complexity to O(n).
Offline Signing Improvement :: Segregated Witness signatures incorporate
the value (amount) referenced by each input in the hash that is signed.
Previously, an offline signing device, such as a hardware wallet, would
have to verify the amount of each input before signing a transaction.
This was usually accomplished by streaming a large amount of data about
the previous transactions referenced as inputs. Since the amount is now
part of the commitment hash that is signed, an offline device does not
need the previous transactions. If the amounts do not match (are
misrepresented by a compromised online system), the signature will be
invalid.
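The signature verification point in the list above can be illustrated with a toy cost model. This sketch only counts bytes fed to the hash function. All byte sizes are hypothetical round numbers, and the function names are invented for this example. It assumes the pre-segwit algorithm hashes roughly the whole transaction once per input, while the segwit algorithm hashes a fixed-size message per input plus one shared pass over the inputs and outputs.

```python
# Hypothetical sizes in bytes, chosen only to show the growth rates.
OVERHEAD = 10            # version, counts, locktime
INPUT_BYTES = 41         # outpoint, empty script length, sequence
OUTPUT_BYTES = 34        # one typical output
SEGWIT_PREIMAGE = 182    # assumed fixed-size message hashed per segwit input


def legacy_hashed_bytes(n_inputs: int, n_outputs: int = 2) -> int:
    """Each input signature hashes (roughly) the entire transaction."""
    tx_size = OVERHEAD + n_inputs * INPUT_BYTES + n_outputs * OUTPUT_BYTES
    return n_inputs * tx_size


def segwit_hashed_bytes(n_inputs: int, n_outputs: int = 2) -> int:
    """Shared data is hashed once; each input adds a fixed-size message."""
    shared = n_inputs * INPUT_BYTES + n_outputs * OUTPUT_BYTES
    return shared + n_inputs * SEGWIT_PREIMAGE


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, legacy_hashed_bytes(n), segwit_hashed_bytes(n))
```

Doubling the number of inputs roughly quadruples the legacy figure but only doubles the segwit figure, which is the O(n^2^) versus O(n) behavior described in the text.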
==== How Segregated Witness Works
At first glance, Segregated Witness appears to be a change to how
transactions are constructed and therefore a transaction-level feature,
but it is not. Rather, Segregated Witness is a change to how individual
UTXO are spent and therefore is a per-output feature.
A transaction can spend Segregated Witness outputs or traditional
(inline-witness) outputs or both. Therefore, it does not make much sense
to refer to a transaction as a “Segregated Witness transaction.” Rather,
we should refer to specific transaction outputs as “Segregated Witness
outputs.”
When a transaction spends an UTXO, it must provide a witness. In a
traditional UTXO, the locking script requires that witness data be
provided _inline_ in the input part of the transaction that spends the
UTXO. A Segregated Witness UTXO, however, specifies a locking script
that can be satisfied with witness data outside of the input
(segregated).
==== Soft Fork (Backward Compatibility)
Segregated Witness is a significant change to the way outputs and
transactions are architected. Such a change would normally require a
simultaneous change in every Bitcoin node and wallet to change the
consensus rules&#x2014;what is known as a hard fork. Instead, Segregated
Witness was introduced with a much less disruptive change, which is
backward compatible, known as a soft fork. This type of upgrade allows
nonupgraded software to ignore the changes and continue to operate
without any disruption.
Segregated Witness outputs are constructed so that older systems that
are not segwit-aware can still validate them. To an old wallet or node,
a Segregated Witness output looks like an output that _anyone can
spend_. Such outputs can be spent with an empty signature, therefore the
fact that there is no signature inside the transaction (it is
segregated) does not invalidate the transaction. Newer wallets and
mining nodes, however, see the Segregated Witness output and expect to
find a valid witness for it in the transactions witness data.
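To make the "anyone can spend" appearance concrete, the following sketch recognizes the locking-script pattern that segwit-aware nodes treat as a witness output. The byte layout (a one-byte version opcode followed by a single data push of 2 to 40 bytes) is taken from BIP-141 rather than from this section, and the function name +parse_witness_program+ is an assumption made for the example.

```python
def parse_witness_program(script_pubkey: bytes):
    """Return (version, program) if the script is a witness program, else None."""
    if not 4 <= len(script_pubkey) <= 42:
        return None
    version_op, push_len = script_pubkey[0], script_pubkey[1]
    if version_op == 0x00:               # OP_0 encodes version 0
        version = 0
    elif 0x51 <= version_op <= 0x60:     # OP_1..OP_16 encode versions 1..16
        version = version_op - 0x50
    else:
        return None
    if push_len + 2 != len(script_pubkey):  # exactly one direct data push
        return None
    return version, script_pubkey[2:]


if __name__ == "__main__":
    # Version 0 program with a 20-byte placeholder hash (all zeros, not a real key hash).
    segwit_script = bytes([0x00, 0x14]) + bytes(20)
    print(parse_witness_program(segwit_script))
```

A node that does not know this pattern simply sees a script that pushes data onto the stack and requires no signature in the input, which is why the output remains valid under the old rules.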
[[segwit_txid]]
===== Transaction identifiers
((("transaction IDs (txid)")))One of the greatest benefits of Segregated
Witness is that it eliminates third-party transaction malleability.
Before segwit, transactions could have their signatures subtly modified
by third parties, changing their transaction ID (hash) without changing
any fundamental properties (inputs, outputs, amounts). This created
opportunities for denial-of-service attacks as well as attacks against
poorly written wallet software that assumed unconfirmed transaction
hashes were immutable.
With the introduction of Segregated Witness, transactions have two
identifiers, +txid+ and +wtxid+. The traditional transaction ID +txid+
is the double-SHA256 hash of the serialized transaction, without the
witness data. A transaction +wtxid+ is the double-SHA256 hash of the new
serialization format of the transaction with witness data.
The traditional +txid+ is calculated in exactly the same way as with a
nonsegwit transaction. However, since the segwit transaction has empty
++scriptSig++s in every input, there is no part of the transaction that
can be modified by a third party. Therefore, in a segwit transaction,
the +txid+ is immutable by a third party, even when the transaction is
unconfirmed.
The +wtxid+ is like an "extended" ID, in that the hash also incorporates
the witness data. If a transaction is transmitted without witness data,
then the +wtxid+ and +txid+ are identical. Note that since the +wtxid+
includes witness data (signatures) and since witness data may be
malleable, the +wtxid+ should be considered malleable until the
transaction is confirmed. Only the +txid+ of a segwit transaction can be
considered immutable by third parties and only if _all_ the inputs of
the transaction are segwit inputs.
[TIP]
====
Segregated Witness transactions have two IDs: +txid+ and +wtxid+. The
+txid+ is the hash of the transaction without the witness data and the
+wtxid+ is the hash inclusive of witness data. The +txid+ of a
transaction where all inputs are segwit inputs is not susceptible to
third-party transaction malleability.
====
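The relationship between the two identifiers can be sketched with Python's standard +hashlib+ module. The byte strings below are placeholders rather than real serialized transactions, and the helper names are invented for this example. The sketch only shows that both IDs are double-SHA256 hashes computed over different serializations of the same transaction.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA256 applied twice, as used for transaction identifiers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def transaction_ids(without_witness: bytes, with_witness: bytes):
    """Return (txid, wtxid) as hex strings.

    The hash bytes are reversed for display, following the usual
    convention for showing transaction IDs.
    """
    txid = double_sha256(without_witness)[::-1].hex()
    wtxid = double_sha256(with_witness)[::-1].hex()
    return txid, wtxid


if __name__ == "__main__":
    # Placeholder bytes only: these are not valid transaction serializations.
    stripped = b"serialization without witness data"
    full = stripped + b" + witness data"
    txid, wtxid = transaction_ids(stripped, full)
    print("txid: ", txid)
    print("wtxid:", wtxid)
```

Changing only the witness portion changes the +wtxid+ but leaves the +txid+ untouched, which is the property that removes third-party malleability of the +txid+.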
==== Economic Incentives for Segregated Witness
Bitcoin mining nodes and full nodes incur costs for the resources used
to support the Bitcoin network and the blockchain. As the volume of
bitcoin transactions increases, so does the cost of resources (CPU,
network bandwidth, disk space, memory). Miners are compensated for these
costs through fees that are proportional to the size (in bytes) of each
transaction. Nonmining full nodes are not compensated; their operators
bear these costs because they need an authoritative, fully validating,
full-index node, perhaps because they use the node to operate a bitcoin
business.
Without transaction fees, the growth in bitcoin data would arguably
increase dramatically. Fees are intended to align the needs of bitcoin
users with the burden their transactions impose on the network, through
a market-based price discovery mechanism.
The calculation of fees based on transaction size treats all the data in
the transaction as equal in cost. But from the perspective of full nodes
and miners, some parts of a transaction carry much higher costs. Every
transaction added to the Bitcoin network affects the consumption of four
resources on nodes:
Disk Space :: Every transaction is stored in the blockchain, adding to
the total size of the blockchain. The blockchain is stored on disk, but
the storage can be optimized by “pruning” older transactions.
CPU :: Every transaction must be validated, which requires CPU time.
Bandwidth :: Every transaction is transmitted (through flood
propagation) across the network at least once. Without any optimization
in the block propagation protocol, transactions are transmitted again as
part of a block, doubling the impact on network capacity.
Memory :: Nodes that validate transactions keep the UTXO index or the
entire UTXO set in memory to speed up validation. Because memory is at
least one order of magnitude more expensive than disk, growth of the
UTXO set contributes disproportionately to the cost of running a node.
As you can see from the list, not every part of a transaction has an
equal impact on the cost of running a node or on the ability of bitcoin
to scale to support more transactions. The most expensive part of a
transaction is the set of newly created outputs, as they are added to
the in-memory UTXO set. By comparison, signatures (aka witness data) add
the least burden to the network and the cost of running a node, because
witness data is validated only once and then never used again.
Furthermore, immediately after receiving a new transaction and
validating witness data, nodes can discard that witness data. If fees
are calculated on transaction size, without discriminating between these
two types of data, then the market incentives of fees are not aligned
with the actual costs imposed by a transaction. In fact, a fee
structure based purely on size encourages the opposite behavior, because
witness data is the largest part of a transaction.
The incentives created by fees matter because they affect the behavior
of wallets. All wallets must implement some strategy for assembling
transactions that takes into consideration a number of factors, such as
privacy (reducing address reuse), fragmentation (making lots of loose
change), and fees. If the fees are overwhelmingly motivating wallets to
use as few inputs as possible in transactions, this can lead to UTXO
picking and change address strategies that inadvertently bloat the UTXO
set.
Transactions consume UTXO in their inputs and create new UTXO with their
outputs. A transaction, therefore, that has more inputs than outputs
will result in a decrease in the UTXO set, whereas a transaction that
has more outputs than inputs will result in an increase in the UTXO set.
Lets consider the _difference_ between inputs and outputs and call that
the “Net-new-UTXO.” Thats an important metric, as it tells us what
impact a transaction will have on the most expensive network-wide
resource, the in-memory UTXO set. A transaction with positive
Net-new-UTXO adds to that burden. A transaction with a negative
Net-new-UTXO reduces the burden. We would therefore want to encourage
transactions that are either negative Net-new-UTXO or neutral with zero
Net-new-UTXO.
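As a small illustration, the metric is simply the number of outputs minus the number of inputs. The function name +net_new_utxo+ is invented for this sketch.

```python
def net_new_utxo(num_inputs: int, num_outputs: int) -> int:
    """Outputs created minus UTXO consumed by a transaction."""
    return num_outputs - num_inputs


if __name__ == "__main__":
    print(net_new_utxo(3, 2))  # consumes more than it creates: shrinks the UTXO set
    print(net_new_utxo(2, 3))  # creates more than it consumes: grows the UTXO set
```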
Lets look at an example of what incentives are created by the
transaction fee calculation, with and without Segregated Witness. We
will look at two different transactions. Transaction A is a 3-input,
2-output transaction, which has a Net-new-UTXO metric of &#x2013;1,
meaning it consumes one more UTXO than it creates, reducing the UTXO set
by one. Transaction B is a 2-input, 3-output transaction, which has a
Net-new-UTXO metric of 1, meaning it adds one UTXO to the UTXO set,
imposing additional cost on the entire Bitcoin network. Both
transactions use multisignature (2-of-3) scripts to demonstrate how
complex scripts increase the impact of segregated witness on fees. Lets
assume a transaction fee of 30 satoshi per byte and a 75% fee discount
on witness data:
++++
<dl>
<dt>Without Segregated Witness</dt>
<dd>
<p>Transaction A fee: 25,710 satoshi</p>
<p>Transaction B fee: 18,990 satoshi</p>
</dd>
<dt>With Segregated Witness</dt>
<dd>
<p>Transaction A fee: 8,130 satoshi</p>
<p>Transaction B fee: 12,045 satoshi</p>
</dd>
</dl>
++++
Both transactions are less expensive when segregated witness is
implemented. But comparing the costs between the two transactions, we
see that before Segregated Witness, the fee is higher for the
transaction that has a negative Net-new-UTXO. After Segregated Witness,
the transaction fees align with the incentive to minimize new UTXO
creation by not inadvertently penalizing transactions with many inputs.
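The effect of the witness discount can be sketched with a simple fee calculation. The byte sizes below are hypothetical and are not the sizes behind the fees quoted above. The sketch assumes the 75% discount is applied by counting each non-witness byte four times and each witness byte once, then dividing by four, and it ignores the small serialization overhead that segwit adds.

```python
def fee_without_discount(non_witness_bytes: int, witness_bytes: int, rate: int) -> int:
    """Every byte is charged at the full rate (satoshi per byte)."""
    return (non_witness_bytes + witness_bytes) * rate


def fee_with_discount(non_witness_bytes: int, witness_bytes: int, rate: int) -> int:
    """Witness bytes count for one quarter of a non-witness byte."""
    weight = 4 * non_witness_bytes + witness_bytes
    discounted_size = -(-weight // 4)  # ceiling division
    return discounted_size * rate


if __name__ == "__main__":
    rate = 30  # satoshi per byte, as in the example above
    # Hypothetical sizes: A has more inputs (more witness data),
    # B has more outputs (more non-witness data).
    transactions = {"A": (180, 660), "B": (280, 440)}
    for name, (base, witness) in transactions.items():
        print(name,
              fee_without_discount(base, witness, rate),
              fee_with_discount(base, witness, rate))
```

With these assumed sizes, transaction A costs more than B when all bytes are priced equally, but less than B once witness data is discounted, mirroring the reversal described in the text.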
Segregated Witness therefore has two main effects on the fees paid by
Bitcoin users. Firstly, segwit reduces the overall cost of transactions
by discounting witness data and increasing the capacity of the Bitcoin
blockchain. Secondly, segwits discount on witness data corrects a
misalignment of incentives that may have inadvertently created more
bloat in the UTXO set.((("", startref="Tadv07")))((("",
startref="Ssegwit07")))