Understanding the underpinnings of one of the most disruptive technologies of the 21st century and the harbinger of a new era of decentralization
Blockchain technology has taken the world by storm. It has spawned a new era of innovation, expanding the horizons of possibilities in finance, real estate, healthcare, and numerous other fields. Blockchain and decentralized technologies rose to prominence with the rise of cryptocurrencies, most notably, of course, that of Bitcoin. While Bitcoin was influential in that it provided a method of orchestrating borderless peer-to-peer monetary transactions without any intermediaries, its true achievement was bringing blockchain into the limelight, enabling the myriad of possibilities the technology brings far beyond the reaches of digital currency.
A blockchain, at its heart, is an immutable, decentralized ledger that stores data in the form of transactions. A transaction can represent sending money, confirming a vote, or even updating a vaccination record. The key innovation was providing a means of establishing trust among participants, allowing the system to be both decentralized, meaning no one entity or user controls the blockchain, and distributed. The participants of the network are known as nodes, and each node maintains a copy of the blockchain, which is what makes it distributed. This ensures each node can verify the current state of the blockchain without the need for an intermediary. Being distributed makes the system more resilient and fault-tolerant, and reduces the possibility of tampering or manipulation. Furthermore, blockchain enables transparency, with every node able to view and contribute to the ledger.
The upshot is that blockchain enables trustless, peer-to-peer transactions (i.e. nodes transact directly with one another rather than through an intermediary, like a bank) via a distributed and decentralized system. How exactly does blockchain manage to achieve this, however? If blockchain has seen further, it was by standing upon the shoulders of giants; much of blockchain technology is built upon the foundations of modern cryptography, which is the study of techniques that enable the protection of information and secure communication. The two main areas it draws on are hashing and asymmetric cryptography.
Hashing is the process by which data of any size is transformed into a string of bits of a fixed length. Cryptographic hash functions are algorithms that map data of arbitrary size (the input) to a bit string of fixed length (the digest), with some additional security requirements. An ideal cryptographic hash function has five properties. It is deterministic, meaning that the same input will always produce the same output, regardless of where or when the algorithm is run. It is a one-way function, meaning that, given a hash, it is computationally infeasible to determine the original input (i.e. the only way would be via brute force). It is collision-resistant: because there are more possible inputs than digests, collisions must exist, but it should be computationally infeasible to find two distinct messages that produce the same hash. It exhibits what cryptographers call the avalanche effect: a slight change in the original input yields a drastic change in the resulting hash. Finally, it is quick to compute, so that anyone can cheaply verify that a given hash corresponds to a specific message.
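A few of these properties can be observed directly. The sketch below uses Python’s standard hashlib module with SHA-256; the helper names and the sample messages are illustrative choices, not part of any blockchain protocol.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = sha256_hex("Alice pays Bob 5 coins")
second = sha256_hex("Alice pays Bob 6 coins")

# Deterministic: hashing the same input again gives the same digest.
assert first == sha256_hex("Alice pays Bob 5 coins")

# Fixed length: 256 bits is 64 hex characters, whatever the input size.
assert len(first) == len(sha256_hex("x" * 10_000)) == 64

# Avalanche effect: changing one character flips roughly half of the bits.
print(first)
print(second)
print(differing_bits(first, second), "of 256 bits differ")
```

One-wayness and collision resistance cannot be demonstrated this way; they are claims about what no one knows how to compute efficiently.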
There are a variety of cryptographic hash functions available, the most notable being SHA-256, SHA-1, and MD5 (the latter two are no longer considered collision-resistant, as practical collisions have been found for both). The SHA-256 algorithm produces a 256-bit digest, normally represented in hexadecimal (Base 16 instead of Base 2, making the digest more concise). This means that there are 2²⁵⁶ possible hashes; the probability of any given input producing a specific hash is 1 in 2²⁵⁶, or around 8.636 x 10⁻⁷⁸. For some perspective on how incredibly small this number is, the width of a hydrogen atom is about 1.06 x 10⁻¹⁰ metres. Practically, this means that if we wanted to find an input that had a specific hash, there would be no better method than brute-forcing it, requiring 2²⁵⁶ attempts on average (think about rolling a die: how many attempts, on average, will it take to roll a 6?). For a sense of the magnitude of 2²⁵⁶, and just how infeasible this task is, Grant Sanderson of 3Blue1Brown has a great video on the subject.
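The same magnitudes can also be checked with a few lines of arithmetic. The guessing rate below is a made-up round number chosen purely for illustration, not a measurement of any real hardware.

```python
possible_digests = 2 ** 256
probability = 1 / possible_digests   # chance that one guess hits a specific hash

print(len(str(possible_digests)), "decimal digits in 2^256")
print(f"probability per guess: {probability:.3e}")

# Hypothetical attacker making a trillion guesses per second (an assumption).
GUESSES_PER_SECOND = 10 ** 12
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

# With success probability p per guess, the expected number of guesses is 1/p,
# just as a fair die needs 6 rolls on average to show a 6.
expected_years = possible_digests // (GUESSES_PER_SECOND * SECONDS_PER_YEAR)
print(f"expected search time: about 10^{len(str(expected_years)) - 1} years")
```

For comparison, the universe is roughly 10¹⁰ years old, so even this generous rate does not come close.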
The second half of the relevant cryptographic knowledge is asymmetric cryptography, also called public-key cryptography. Public-key cryptography, as one might imagine, revolves around two keys: the public key and the private key. The private key is generated randomly, using a random number generator (RNG); the public key is then created by an algorithm that takes the private key as its input, mathematically linking the pair. The private key, as the name suggests, must be kept private and is known only to the owner of the key pair; the public key, by contrast, is known to everyone. A nice analogy is to think of your public key as your postal address (which everyone knows and is publicly available) and to think of your private key as the physical key to your mailbox. Anyone can send mail to your mailbox, but only you can access the contents within.
This private key serves two primary purposes: decrypting data that was encrypted using the corresponding public key, and creating digital signatures that enable proof of ownership. The first use case is quite straightforward: using an encryption algorithm, you can encrypt any digital document (which is nothing more than a string of bits) using the public key. The resulting encrypted document can only be decrypted using the corresponding private key. Hence, anyone can send encrypted messages to an individual, and only that individual will be able to decrypt them to recover the original message. The second use case is more subtle: a digital signature is produced by an algorithm that takes in any digital document and the document owner’s private key. Then, a digital signature verification algorithm can be used to verify that it was indeed the owner who signed the document (i.e. the document wasn’t forged or tampered with). This verification algorithm takes in the document, the digital signature, and the owner’s public key, enabling any party to verify the authenticity of the document. Building Ethereum Dapps (Infante, 2019) contains an illustration that illuminates this murkier area of cryptography.
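As a complement in code, both uses of the private key can be sketched with textbook RSA. This is a toy under loud assumptions: the two primes are small, publicly known Mersenne primes, there is no padding, and the function names are illustrative. Real systems use vetted libraries and much larger secret keys.

```python
import hashlib

# Toy key pair. Real RSA uses secret random primes of 1024+ bits and padding.
P, Q = 2**61 - 1, 2**89 - 1          # both are known Mersenne primes
N = P * Q                            # modulus, shared by both keys
E = 65537                            # public exponent: (E, N) is the public key
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def encrypt(message: int) -> int:
    """Anyone can encrypt with the public key."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private exponent can decrypt."""
    return pow(ciphertext, D, N)


def digest(document: bytes) -> int:
    """Hash the document and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Produce a signature from the document and the private exponent."""
    return pow(digest(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Check a signature using only the document and the public key."""
    return pow(signature, E, N) == digest(document)


secret = int.from_bytes(b"hello Bob", "big")
assert decrypt(encrypt(secret)) == secret

contract = b"Alice pays Bob 5 coins"
signature = sign(contract)
print(verify(contract, signature))                     # True
print(verify(b"Alice pays Bob 50 coins", signature))   # False
```

The signature covers a hash of the document rather than the document itself, which is why any edit to the document invalidates it.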
How do hash functions and public-key cryptography tie in with blockchain technology? Let’s start at the very beginning: the creation of a transaction.
A transaction is the transfer of a digital asset between two parties. With cryptocurrencies, this digital asset is currency; however, it could also be a land deed (enabling ownership verification), your personal information (enabling digital identity management), or even pieces of code (enabling logic) as we’ve seen with smart contracts. Let’s run with the example of cryptocurrencies as they’ve proven blockchain’s most popular use case. Let’s say that there are two people, Alice and Bob, who are looking to transact on the blockchain. A recent transaction states that Alice sent Bob 5 coins. But, how do we know it was Alice who sent it, and not just Bob falsifying a transaction? Or what if Alice sent Bob 2 coins, but he modified it to read 5? And on the flip side, if Alice did send 5 coins to Bob, how can we ensure only Bob can access and use them? How can we verify it was the sender who initiated the transaction and that only the receiver can access the funds? This scenario seems to be perfectly aligned for the use of asymmetric cryptography. This is where the concept of a wallet comes into play.
A wallet can be a program, service, or physical device that stores the public and private keys required for making a transaction. The wallet generates a key pair using elliptic curve cryptography, a form of asymmetric cryptography based on the mathematics of elliptic curves. The private key enables owners to access and send funds. When funds are sent, a digital signature generated from the sender’s private key is attached to the transaction; anyone can then use the sender’s public key to verify that the transaction is legitimate. When funds are received, they are recorded against the owner’s public key; only the corresponding private key can access and spend them. The public key is shared with any third party in order to receive cryptocurrency or another digital asset, and is also used in the verification of transactions. Once a transaction is completed, what happens to it? What makes this transaction any different from one that occurs via a bank, apart from some added cryptographic layers? Enter blockchain.
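Before moving on, the wallet’s job can be sketched in pure Python. The text does not name a curve, so this assumes secp256k1 (the curve Bitcoin uses) and plain ECDSA over a SHA-256 hash. It is written for readability only and is not safe for real funds; all function names are illustrative.

```python
import hashlib
import secrets

# secp256k1 domain parameters: the curve y^2 = x^3 + 7 over the integers mod P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                                   # a and b are opposites
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def multiply(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def hash_to_int(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(private_key: int, message: bytes):
    z = hash_to_int(message)
    while True:
        k = secrets.randbelow(N - 1) + 1              # fresh secret per signature
        r = multiply(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s


def verify(public_key, message: bytes, signature) -> bool:
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    z = hash_to_int(message)
    point = add(multiply(z * w % N, G), multiply(r * w % N, public_key))
    return point is not None and point[0] % N == r


private_key = secrets.randbelow(N - 1) + 1    # the wallet keeps this secret
public_key = multiply(private_key, G)         # derived from it, safe to share

transaction = b"Alice pays Bob 5 coins"
signature = sign(private_key, transaction)
print(verify(public_key, transaction, signature))                  # True
print(verify(public_key, b"Alice pays Bob 50 coins", signature))   # False
```

Deriving the public key is a single scalar multiplication, while recovering the private key from it would mean solving the elliptic curve discrete logarithm problem, which is what keeps the pair one-directional.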
A blockchain, as the name suggests, is composed of blocks. Each block is made up of two things: a header and a body. The header contains the identifying information for a particular block; the body contains the blockchain data, generally transactions. The blockchain hence serves as a ledger, tracking all the transactions (much as a bank would in the case of monetary transactions), with the core innovation being its decentralized and distributed infrastructure, which removes the need for any intermediary.
The header of the block contains the metadata: most importantly, the hash of the previous block’s header, a hash known as the Merkle Root that summarizes the block’s transactions, and a special number called the nonce. This is how the blocks are chained together: the hash of the current block is computed from a header that includes the hash of the previous block, providing a cryptographic connection between the two.
The Merkle Root is the root of the Merkle Tree, which is a binary tree (a tree where each node can have at most 2 child nodes) that enables us to quickly determine if the transaction data has been modified or tampered with. Each transaction on the blockchain has a hash associated with it. We can create a tree of these hashes, where each node’s hash is the hash of its children’s hashes joined together (a leaf node’s hash is simply the hash of its associated transaction). This continues until we end up with the Merkle Root, which is the single node that has no parent.
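A minimal sketch of computing a Merkle Root, assuming each parent is the SHA-256 hash of its two children’s hashes concatenated. Two details are assumptions borrowed from common practice rather than from the text: when a level has an odd number of hashes the last one is repeated, and a single round of SHA-256 is used where some chains (Bitcoin, for example) hash twice.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Fold a list of transactions into a single 32-byte root hash."""
    if not transactions:
        raise ValueError("a block needs at least one transaction")
    level = [sha256(tx) for tx in transactions]        # leaf hashes
    while len(level) > 1:
        if len(level) % 2:                             # odd count: repeat the last hash
            level.append(level[-1])
        # Pair up neighbours and hash each pair to form the next level up.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


transactions = [b"Alice pays Bob 5 coins",
                b"Bob pays Carol 2 coins",
                b"Carol pays Dan 1 coin"]
root = merkle_root(transactions)
print(root.hex())

# Changing any single transaction changes the root.
tampered = [b"Alice pays Bob 9 coins"] + transactions[1:]
print(merkle_root(tampered) != root)    # True
```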
Thus, instead of having to compare every transaction, we can simply inspect the Merkle Root, which can be thought of as a fingerprint of the data stored in the block. Because the hash of the current block is created from the hash of the previous block and the current block’s Merkle Root, any change to a block’s data is reflected in its hash. If a party attempts to alter or remove a transaction, the current block’s Merkle Root would change, causing its hash to change. This would then be reflected in the succeeding block, whose hash would change in turn, so the rest of the blockchain would be invalidated. The blockchain has effectively become tamper-resistant, with only valid transactions being accepted. So, where does the nonce value come in? That’s where the mining mechanism of blockchain comes into play.
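Before turning to mining, the tamper-evidence argument can be sketched in a few lines. The block layout (a dictionary with three fields), the all-zero parent hash for the first block, and the function names are simplifying assumptions for illustration; the nonce is left out here.

```python
import hashlib


def h(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()


def merkle_root(transactions):
    level = [h(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def block_hash(previous_hash, transactions):
    # A block's hash depends on its parent's hash and on its own Merkle root.
    return h(previous_hash + merkle_root(transactions))


def build_chain(blocks_of_transactions):
    chain, previous_hash = [], "0" * 64    # placeholder parent for the first block
    for transactions in blocks_of_transactions:
        block = {"previous_hash": previous_hash,
                 "transactions": list(transactions),
                 "hash": block_hash(previous_hash, transactions)}
        chain.append(block)
        previous_hash = block["hash"]
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails verification, or None."""
    previous_hash = "0" * 64
    for index, block in enumerate(chain):
        if block["previous_hash"] != previous_hash:
            return index
        if block["hash"] != block_hash(block["previous_hash"], block["transactions"]):
            return index
        previous_hash = block["hash"]
    return None


chain = build_chain([["Alice pays Bob 2 coins"],
                     ["Bob pays Carol 1 coin", "Carol pays Dan 1 coin"],
                     ["Dan pays Alice 1 coin"]])
print(first_invalid_block(chain))    # None: the chain is intact

chain[0]["transactions"][0] = "Alice pays Bob 5 coins"
print(first_invalid_block(chain))    # 0: the stored hash no longer matches the data

# Recomputing block 0's hash hides the edit there but breaks the link to block 1.
chain[0]["hash"] = block_hash(chain[0]["previous_hash"], chain[0]["transactions"])
print(first_invalid_block(chain))    # 1
```

To hide the edit completely, the attacker would have to recompute every later block as well, which is what mining makes expensive.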
Miners have the task of organizing and verifying transactions, creating blocks, and submitting their blocks to the global blockchain. When a transaction is created, it’s sent to the mempool, a pool of transactions waiting to be confirmed. Miners choose transactions from the mempool (many of these carry a transaction fee as an incentive), verify their authenticity, and add them to a block. Once the block is full (all blocks have a size limit), the miner must create a valid hash for the block. A valid hash is a hash that meets a certain difficulty. Put simply, the difficulty is the number of zeros that a block’s hash must have at the beginning: if the difficulty is 5, the block’s hash must begin with 00000. Since there’s no way of determining in advance which input will produce a hash of the desired difficulty, the best method these miners have is brute force. Since miners can’t alter the previous block’s hash, and the Merkle Root is fixed by the chosen transactions, they vary the nonce value when computing the block’s hash. They keep trying new nonces until, finally, one produces a hash of the desired difficulty. This process of creating valid blocks is known as Proof of Work, and it is an example of what is known as a consensus mechanism.
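The mining loop can be sketched as follows, under simplifying assumptions: the header is just the previous hash, the Merkle Root, and the nonce joined as text; difficulty counts leading zeros of the hexadecimal digest; and nonces are tried in order (0, 1, 2, ...) rather than drawn at random, which makes no difference to the expected amount of work. The function names are illustrative.

```python
import hashlib


def block_hash(previous_hash: str, merkle_root: str, nonce: int) -> str:
    header = f"{previous_hash}{merkle_root}{nonce}"
    return hashlib.sha256(header.encode()).hexdigest()


def mine(previous_hash: str, merkle_root: str, difficulty: int):
    """Try nonces in order until the hash starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = block_hash(previous_hash, merkle_root, nonce)
        if candidate.startswith(prefix):
            return nonce, candidate
        nonce += 1


parent = "0" * 64                                                # placeholder parent hash
root = hashlib.sha256(b"Alice pays Bob 5 coins").hexdigest()     # stand-in Merkle root

nonce, found = mine(parent, root, difficulty=4)
print(nonce, found)

# Finding the nonce takes many attempts; checking it takes a single hash.
assert block_hash(parent, root, nonce) == found
```

Each extra leading hex zero multiplies the expected number of attempts by 16, so a difficulty of 4 needs about 65,536 tries on average, while verification stays a single hash.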
Proof of Work (where work refers to the large amount of time and energy the miners devote to verifying transactions and generating hashes) allows the network to achieve consensus on the current state of the blockchain: only a single block is added at a time, and this addition is propagated to every node in the network. If every node had a different copy of the blockchain, it would be impossible to know what transactions occurred and verify what assets an individual owns. Mining also solves the double-spend problem: due to the verification done by the miners, you can’t send the same digital asset to several parties at once. Once a miner completes a block, it is broadcast to the rest of the network, where other nodes verify the transactions within the block and, if they are valid, propagate it further to their neighbouring nodes. There is no explicit vote: each node checks the block for itself, and the chain backed by the majority of the network’s mining power is the one that keeps growing and is treated as the global blockchain. Hence, if a bad actor wants to modify a transaction in a block or include a fraudulent transaction, it would need to control more than half of the network’s mining power in order to outpace the honest miners. This kind of network attack is known as a 51% Attack and would require massive computational power to succeed.
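The double-spend check that nodes perform when validating a block can be sketched with a deliberately simple model. It assumes an account-balance ledger (a dictionary of balances) and transactions written as (sender, receiver, amount) tuples; real cryptocurrencies track ownership in more elaborate ways, and the function name is illustrative.

```python
def apply_block(balances: dict[str, int],
                transactions: list[tuple[str, str, int]]) -> dict[str, int]:
    """Return the balances after the block, or raise ValueError on overspending."""
    updated = dict(balances)               # never mutate the caller's ledger
    for sender, receiver, amount in transactions:
        if amount <= 0 or updated.get(sender, 0) < amount:
            raise ValueError(f"rejected: {sender} cannot send {amount}")
        updated[sender] -= amount
        updated[receiver] = updated.get(receiver, 0) + amount
    return updated


ledger = {"Alice": 5}

# A single payment of Alice's 5 coins is fine.
print(apply_block(ledger, [("Alice", "Bob", 5)]))

# Sending the same 5 coins to two parties is caught and the whole block rejected.
try:
    apply_block(ledger, [("Alice", "Bob", 5), ("Alice", "Carol", 5)])
except ValueError as error:
    print(error)
```

Because every node runs the same check against its own copy of the ledger, a block containing a double spend is discarded rather than propagated.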
In exchange for their work, a miner whose block is successfully verified and added to the global blockchain receives a reward. In cryptocurrencies, consensus mechanisms thereby serve an additional purpose: the issuance of money. Because every node on the network stores a copy of the blockchain, each new block is propagated to all nodes. This process repeats block after block, and the more participants the network has, the harder it becomes to attack.
That showcases the end-to-end workings of the blockchain network and the lifecycle of a transaction. We can see that a blockchain isn’t much more than a distributed spreadsheet with some cryptographic layers to ensure immutability and security. With the blockchain stored on every node, there is no single point of failure, and with consensus mechanisms, the need for any intermediary is removed; it is a truly fault-tolerant, peer-to-peer network. Anyone can view, verify, and contribute to the blockchain, enabling transparency and greater accessibility.
Blockchain’s potential expands far beyond decentralized finance and cryptocurrencies; blockchain has the potential to revolutionize supply chains, modernize and secure election and voting processes, and even enable control of your digital identity. Smart contracts, which allow code to be stored and executed on the blockchain, enable the development of decentralized applications (dapps), so that entire applications can live on the blockchain, opening up a decentralized universe of possibilities. Blockchain technology is continuously evolving and adapting, with use cases found in fields as broad as finance, logistics, and healthcare. Blockchain will lay the foundation for a new era of decentralization and technological democracy; let’s see what innovations will be built upon its blocks.
The Building Blocks of Blockchain was originally published in DataDrivenInvestor on Medium.