Components of a Blockchain
Before understanding how a blockchain works, let’s understand the Blockchain Architecture and its components.
One of the most important components of the blockchain architecture is the hash, which is generated by hashing the content of a block. Hashing is a deterministic algorithmic process that calculates a fixed-size output (called a digest) for an input of any size. As figure 2 shows, the same input always yields the same digest, and even a single ‘bit’ of change in the input changes the output completely. The function used here is called a ‘hash function’. It is a one-way function (preimage resistant), which means it is practically impossible to reverse the function and recover the input from a calculated output. In addition, given one input, it is practically infeasible to find a different input that produces the same output (second preimage resistant).
Hash (x) = Fixed size digest
Most blockchains use a hashing algorithm called Secure Hash Algorithm 256 (SHA-256), which gives a fixed output of 256 bits (32 bytes of 8 bits each), meaning there are 2^256 different possible values. Because the number of possible inputs is unlimited while the number of outputs is fixed, collisions (Hash(X) = Hash(Y) for X ≠ Y) must exist. However, the probability of finding such a collision is negligible in practice, and thus SHA-256 is considered collision-resistant. Every time data is added to the block, a new hash ‘fingerprint’ (digest) is generated for the list of data. Even a small change in any data in the list changes the fingerprint of the block, making it very convenient to detect any change in the database.
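The properties described above can be observed with Python’s standard `hashlib` module. The sketch below is only an illustration: the message `b"blockchain"` and the helper names `sha256_hex` and `bit_difference` are made up for this example. It flips a single bit of the input and counts how many of the 256 digest bits change.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hex characters (256 bits)."""
    return hashlib.sha256(data).hexdigest()

def bit_difference(a_hex: str, b_hex: str) -> int:
    """Count how many bits differ between two hex-encoded digests."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")

message = b"blockchain"                            # example input (assumption)
flipped = message[:-1] + bytes([message[-1] ^ 1])  # same input with one bit flipped

d1 = sha256_hex(message)
d2 = sha256_hex(flipped)

print(d1)
print(d2)
print(len(d1) * 4, "bits per digest;", bit_difference(d1, d2), "bits differ")
```

Running the sketch twice prints the same digests, since hashing is deterministic, and the two digests typically differ in roughly half of their bits even though the inputs differ in only one.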
Private/Public Key Cryptography
Private/Public Key cryptography is a fundamental feature used to secure the data flow between two users. A pair of keys is generated, called a public key and a private key, which are mathematically related to each other. The public key can be made public, but the private key needs to be kept secret to keep the data owned by the user secure. This process is also called asymmetric-key cryptography. Although the keys are related to each other, it is practically impossible to derive the private key from the public key. In a blockchain system, the sender uses the receiver’s public key to send the data, and the receiver decrypts the data with his own private key (see figure 3). The relative lengths of the two keys depend on the scheme; in the elliptic-curve cryptography used by Bitcoin, for example, the private key is shorter than the public key. With this method, the data exchange takes place directly between the two parties, on a peer-to-peer principle.
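To make the relationship between the two keys concrete, here is a toy sketch using textbook RSA with tiny primes. It is an assumption made purely for illustration: the article does not name a specific scheme, production blockchains typically rely on elliptic-curve cryptography, and numbers this small offer no security at all. The function names `encrypt` and `decrypt` are hypothetical.

```python
# Toy textbook RSA with tiny primes: for illustration only, NOT secure.
p, q = 61, 53                 # secret primes (far too small for real use)
n = p * q                     # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)           # can be shared with anyone
private_key = (d, n)          # must be kept secret by the receiver

def encrypt(message: int, key) -> int:
    """Sender side: encrypt a small integer with the receiver's public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)

def decrypt(ciphertext: int, key) -> int:
    """Receiver side: recover the message with the matching private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)

message = 65                  # must be smaller than n in this toy scheme
ciphertext = encrypt(message, public_key)
print(ciphertext, decrypt(ciphertext, private_key))
```

Anyone can compute the ciphertext from the public key, but recovering the message requires `d`, which in a real deployment cannot feasibly be computed from `(e, n)` because the primes are hundreds of digits long.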
An address is an alphanumeric string of characters derived by applying a hash function to the user’s public key. Addresses are usually shorter than public keys and are mostly used to send and receive digital assets. In this paper, however, we will be using addresses for different purposes. The address is generated by a simple process:
public key → hash function → address
These addresses are responsible for the pseudo-anonymity in the Blockchain ecosystem by acting as the participant’s digital identity in the network.
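The public key → hash function → address pipeline can be sketched in a few lines. This is a simplified illustration under stated assumptions: the public key bytes are a made-up placeholder, `derive_address` is a hypothetical helper, and the address is simply the first 40 hex characters of a SHA-256 digest. Real networks use their own schemes (Bitcoin, for instance, combines SHA-256 with RIPEMD-160 and a Base58Check encoding).

```python
import hashlib

def derive_address(public_key: bytes, length: int = 40) -> str:
    """Derive a short, fixed-length address by hashing the public key.

    Simplified: keeps the first `length` hex characters of SHA-256(public_key).
    """
    return hashlib.sha256(public_key).hexdigest()[:length]

# Placeholder 65-byte public key (assumption, not a real key).
public_key = bytes.fromhex("04" + "11" * 64)

address = derive_address(public_key)
print("public key:", public_key.hex())
print("address:   ", address)
```

The address is shorter than the key it was derived from, and because the hash function is one-way, publishing the address does not reveal the public key behind it.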
A wallet securely stores the public keys, private keys, and addresses of the user. It can also store the digital signatures of all the data that has been exchanged using the stored addresses and key pairs. If your private key is stolen, the thief gains complete access to the data that the key controls in your wallets. When we hear in the news about cryptocurrencies being stolen, it usually means that someone got access to a user’s private key and transferred all the cryptocurrencies from one wallet address to another. Since the blockchain’s ledger is immutable, the transfer cannot be undone.
A block contains a list of validated data (generally transactions) and is generated or updated after the process of mining. Each block is hashed in its current state so that any change the block undergoes can be detected and traced.
A block is made up of the following components:
- Block Height – Also known as the block number, this is the number of blocks that precede that particular block. The genesis block is the first block generated in a blockchain, so Block Height can also be described as the distance between that particular block and the genesis block.
- Current Block Hash – The newly generated hash value of the current state of the Block.
- Previous Block Hash – The hash of the block formed just before that block. The current block hash will be changed if the hash of any block generated before the current block changes.
- The Merkle Tree Root Hash – Since it is impractical to store the hash of every update the block goes through, the block only stores the Merkle Tree Root Hash in the Block Header. A Merkle Tree keeps combining the hash values of the data until only one root hash remains to be stored. This root hash is called the Merkle Tree Root Hash. This procedure can be used to summarize the data in the block and verify the presence of each and every change that happened in the block. The process can be seen in figure 4.
In figure 4, the 1st layer is the data layer, which contains A, B, C, and D. These are the data elements that need to be summarized. In the 2nd layer, the hash of each data element has been generated using a hash function. The hashes are combined pairwise in the following layers, and the hash of each combined pair is generated. This process is repeated until the tree is left with a single root hash on the topmost layer. The Merkle root is stored in the Block Header, and the hash value of the Block Header depends on the Merkle Root Hash. Any change in any layer of the hierarchy will change both the Merkle Root Hash and the hash of the Block Header.
- Timestamp – A timestamp records the time at which the block is generated or updated by the miner. The miner is responsible for inserting the timestamp in the block. On page 7 of Ethereum’s white paper, Vitalik Buterin lists the following check for whether a block is valid:
“Check that the timestamp of the block is greater than that of the [median of the 11 previous blocks] and less than 2 hours into the future”
- Nonce Value – A nonce (a number used once) is a 32-bit value that the miner varies in order to solve the hash puzzle. Once the puzzle is solved, the miner can add the block to the blockchain.
- Data included in the block
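The Merkle Tree Root Hash described in the list above can be computed with a short routine. The following is a minimal sketch, assuming single SHA-256 hashing and the convention of duplicating the last hash when a layer has an odd number of entries; the helper names `sha256` and `merkle_root` are chosen for this example, and real blockchains differ in details such as the number of hashing rounds and byte order.

```python
import hashlib
from typing import List

def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()

def merkle_root(items: List[bytes]) -> bytes:
    """Combine the hashes of all items pairwise until one root hash remains."""
    if not items:
        raise ValueError("a Merkle tree needs at least one data element")
    level = [sha256(item) for item in items]      # layer of leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])               # odd layer: duplicate last hash
        # Hash each adjacent pair to build the next layer up.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

data = [b"A", b"B", b"C", b"D"]                   # the four elements from the figure
root = merkle_root(data)
tampered_root = merkle_root([b"A", b"B", b"X", b"D"])

print(root.hex())
print(tampered_root.hex())
```

Changing the single element C to X produces a different root, which is why storing only the root in the Block Header is enough to detect a change to any data element.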
Chaining the Blockchain
As observed in the list of components of a block, each block contains its own hash as well as the hash of the previous block. This is how the blocks stay connected and form a linear chain. Any change in a block will change the hash of that block and, in turn, the hashes of all succeeding blocks. This makes it easier to locate the block that has undergone a change.
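The chaining idea can be illustrated with a small sketch. The block layout here (a dictionary with `height`, `previous_hash`, `data`, and `hash` fields) and the function names are assumptions made for the example, not the format of any particular blockchain.

```python
import hashlib
import json

GENESIS_PREVIOUS_HASH = "0" * 64   # placeholder "previous hash" for the first block

def block_hash(block: dict) -> str:
    """Hash the block's contents, including the previous block's hash."""
    fields = {k: block[k] for k in ("height", "previous_hash", "data")}
    return hashlib.sha256(json.dumps(fields, sort_keys=True).encode()).hexdigest()

def build_chain(data_items):
    """Create blocks in order, linking each one to the hash of its predecessor."""
    chain, previous_hash = [], GENESIS_PREVIOUS_HASH
    for height, data in enumerate(data_items):
        block = {"height": height, "previous_hash": previous_hash, "data": data}
        block["hash"] = block_hash(block)
        chain.append(block)
        previous_hash = block["hash"]
    return chain

def first_invalid_block(chain):
    """Return the height of the first inconsistent block, or None if all match."""
    previous_hash = GENESIS_PREVIOUS_HASH
    for block in chain:
        if block["previous_hash"] != previous_hash or block_hash(block) != block["hash"]:
            return block["height"]
        previous_hash = block["hash"]
    return None

chain = build_chain(["genesis", "A pays B", "B pays C"])
print(first_invalid_block(chain))   # None: every stored hash matches

chain[1]["data"] = "A pays Mallory" # tamper with the middle block
print(first_invalid_block(chain))   # 1: the altered block is located
```

If an attacker also recomputed the stored hash of block 1, the check would fail at block 2 instead, because its `previous_hash` would no longer match. Hiding the change would require rewriting every succeeding block.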
Below is the figure to summarize the complete architecture of the blockchain.
How does a Blockchain work?
Below is a step-by-step process that explains the working of a blockchain, based on Bitcoin’s blockchain. To start, consider that A wants to send money or data to B. The transaction will be represented as a block in the network; the architecture of the block has already been discussed above. The transaction is broadcast on the blockchain network. The network (the miners) then works to validate the authenticity of the transaction using a consensus mechanism (discussed later). Once the transaction is deemed authentic by the network, a new block is generated. The new block is then added on top of the most recent block of the blockchain. After the block has been added, the blockchain is updated to its most current state and the transaction is executed.
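The block-generation step can be sketched as a nonce search in the style of Bitcoin’s proof of work. This is a simplified sketch under several assumptions: the header is a plain string rather than Bitcoin’s binary header, the puzzle is "the hash must start with a given number of zero hex digits" rather than a numeric target, the nonce is not limited to 32 bits, and the names `header_digest`, `mine_block`, and `is_valid` are hypothetical.

```python
import hashlib
import time

def header_digest(previous_hash, timestamp, data, nonce) -> str:
    """Hash a simplified block header that includes the nonce."""
    header = f"{previous_hash}|{timestamp}|{data}|{nonce}"
    return hashlib.sha256(header.encode()).hexdigest()

def mine_block(previous_hash, data, difficulty=3, timestamp=None):
    """Vary the nonce until the header hash starts with `difficulty` zeros."""
    if timestamp is None:
        timestamp = int(time.time())
    prefix = "0" * difficulty
    nonce = 0
    while not header_digest(previous_hash, timestamp, data, nonce).startswith(prefix):
        nonce += 1
    return {
        "previous_hash": previous_hash,
        "timestamp": timestamp,
        "data": data,
        "nonce": nonce,
        "hash": header_digest(previous_hash, timestamp, data, nonce),
    }

def is_valid(block, difficulty=3) -> bool:
    """Other nodes verify the work with a single hash computation."""
    digest = header_digest(block["previous_hash"], block["timestamp"],
                           block["data"], block["nonce"])
    return digest == block["hash"] and digest.startswith("0" * difficulty)

latest_hash = "0" * 64                       # stand-in for the most recent block's hash
block = mine_block(latest_hash, "A pays B 5 coins")
print(block["nonce"], block["hash"])
print(is_valid(block))
```

Finding the nonce takes many attempts (about 4,096 on average at this difficulty), while checking it takes one hash, which is what lets the rest of the network cheaply confirm a block before appending it to the chain.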