Build a blockchain application from scratch in Python: Understanding blockchain

Build a blockchain application from scratch in Python: Understanding Blockchain

I am not a blockchain expert. I only studied blockchain in a semester. But I know the most effective way to learn something is to build it from scratch. You will have chances to deal with the issues during building it. I am happy to share my view on the process of building a fully functional application: A Simple Blockchain-based Voting System. Open source is available used to store data, any type of data. The key difference is that data is stored in the form of blocks, a chain of blocks using a hash function. For Bitcoin, the data can be the transactions (logs of transfers of Bitcoin from one account to another). The chain of blocks is linked by hashes where the hash of the current block included the hash of the previous block.

A Simple Blockchain Architecture

Blockchain sustains the following characteristics:

  • Decentralization: It means any information stored in blockchain acts as a unit of the whole network. There is no centralized authority; instead, all the information is shared among nodes.
  • Immutability: The data of blockchain is append-only. No delete. You have to add new annotated data to fix something in the past. It makes data on the blockchain traceable.
  • Distributed ledgers: Blockchain can be public or private. But it always belongs to a group, not to anyone. It is typically managed by a peer-to-peer network collectively adhering to a protocol for inter-node communication and validating new blocks.

The Core Idea Behind Blockchain

A little bit of history, in 2008, to implement a system where document timestamps could not be tampered without requiring a trusted party, Nakamoto improved the design of Merkle trees by using a Hashcash-like method to timestamp blocks. Hashcash is a proof-of-work system that was first used to limit email spam and denial-of-service attacks. Hashcash algorithm requires the sender to add an encoding of hashcash stamp to the header of the email. The header requires the recipient’s email address, the date of the message, the counter and the important condition, the first 20 bits of the 160-bit SHA-1 hash of the header are all zeros. The sender has to find the appropriate counter to satisfy the above condition. The only known way to do that is brute force. The recipient can easily check the condition of the email. If it satisfies, the recipient knows that the sender has lost a certain amount of computations to send an email, so they can believe that the sender cannot send this mail to everyone.

5 Industry Transforming Blockchain Applications | Data Driven Investor

The Hashcash-like method in blockchain works with the same mechanism. The data is append-only and organized as blocks. A block consists of the data, previous hash, and nonce number. The nonce number plays a role as the counter and the current hash as the hashcash in the email example.

current_hash = hash(previous_hash + data + nonce)

The number of first zeros in the current hash is called the difficulty parameter. The greater the difficulty, the lower the likelihood of finding the appropriate nonce. In order to append some data, you have to create a new block and find a nonce which makes current hash representation has the number of first zeros enough. It means you have to trade a certain amount of computational power to add a new block. This algorithm is called proof of work and this process is called mining.

Since the hash of a block will be in the header of the next block, you have to recalculate the subsequent blocks if you want to change data in the past. It also means that it is possible to have many different branches and many different data versions. In that case, blockchain maintains a protocol guaranteeing that the longest-chain will be treated as the main-chain. This protocol is called consensus.

Nowadays, there are many types of blockchain with different consensus algorithms. We will not discuss them here.

Why are “hashcash”, “chain”, and “longest-chain”?

In order to comprehend, we consider a scenario called a 51% attack, which is posed when Bitcoin was born. Let imagine that you join Bitcoin and use Bitcoin to buy a car. No trusted-party in Bitcoin. It has to validate your transaction by itself ( checking if there is enough money in your account, this process will be validated by other people in the Bitcoin network ). Suppose you are a hacker and you want to treat that transaction. Bitcoin does not store your balance (Bitcoin trace all of your transactions in the past to get your balance). The only thing you can do is to make the previous transaction disappear from the Bitcoin database. How?

You can clone blockchain to your machine, remove your transaction, and continue sharing in the network and validating other transactions. You will get your own branch. In this branch, you got a car without money.

The problem is other people in the blockchain network only accept the longest-chain. You want to make other people believe that your chain is the real one. Thus, you have to make your chain longer than others. But when you mine your block, others are also mining in the main chain. As a consequence, you must have a computer with greater computing power than all others combined. In other words, your computational power must be at least 51% of the computing power of the entire network to catch up with the main chain. With a public network like Bitcoin, it is possible to consider it as impossible.

Blockchain in Python

In this section, we implement a simple blockchain from scratch by Python. This piece is a summarization of the fantastic article. This post aims to help you easier to follow source code, you should run it for yourself and print all thing that you confuse. If you have already known what it is, check part 2 to build a fully functional Blockchain-based Voting System.

First, you need a class to represent a block:

and a class to represent a blockchain:

transactions are the data that are stored in JSON format. For example, in Bitcoin, it will look like:

"index": "an unique id",
"sender": "private key of sender",
"receiver": "public key of receiver",
"timestamp": "The time at which the transaction was created"

unconfimed_transactions is a pool that stores unvalidated transactions. Miner will pick transactions from the pool to validate.

The first block in the blockchain is always constructed with previous_hash is 0. In order to add a new block to the blockchain, you must implement the proof of work algorithm which will find appropriate nonce by brute-force and mine function which will get unconfirmed_transactions and try to validate those transactions and run proof_of_work and add a new block to the chain.

add_block should be easily appending a new block. But the problem occurs when there are many people mining at the same time. They might get a new block before us, then the chain we use is no longer the longest-chain. So we must check the validity of our chain before adding.

We almost had a blockchain that can run on a single computer. The next thing we should do is create an application used this blockchain. We leverage the Flask framework to build a RESTful application that interacts with our blockchain database.

Finally, to join a blockchain network, you have to adhere to the protocol for inter-node communication, which is the consensus algorithm.

Up to this point, we have understood what is a blockchain, and how to implement a simple one. In the next post, I want to share with you how to implement a fully functional blockchain-based application with different architecture. I also want to introduce a new concept in blockchain, smart-contract, and how to handle it.



Build a blockchain application from scratch in Python: Understanding blockchain was originally published in Data Driven Investor on Medium, where people are continuing the conversation by highlighting and responding to this story.