Blockchain: The Secure Way to Keep Documents: Why and How

A cryptocurrency is a virtual or digital currency secured by cryptography, which makes it difficult to counterfeit or double-spend. Cryptocurrencies run on decentralized networks built on blockchain technology, a distributed ledger maintained by a network of computers. They record transactions in blocks as packets of digital data. In principle, any data can be stored on a blockchain.

There are two main ways to store a document on a blockchain. One is to store the entire document on-chain. The other is to store only a hash of it on-chain and keep the whole document elsewhere, for example on a distributed file storage system or in a centralized database. The second approach is the more efficient one: run the document through a secure hash algorithm such as SHA-256 and then store the resulting hash in a block.


Storing the Whole Document on a Blockchain

Some blockchains make it possible to store a whole document on-chain. However, this is rarely a good idea, because storing large documents is computationally very expensive. Unless the document is extremely important or very small, it is better to store its hash on-chain and keep the whole document elsewhere. On Bitcoin, files such as PDF, DOC or audio can still be stored by encoding them in hexadecimal format.

Access latency also makes it problematic to store a whole document on a blockchain. Blockchain access latency is the time between submitting a transaction to the network and its first confirmation by the network. After that initial confirmation, the transaction becomes progressively more final as further blocks are added on top of it. For document storage, latency determines how long it takes users to upload and download files. Fully decentralized public blockchains have thousands of nodes, and they are slow because it takes time for the network to reach consensus. Processing a single block also takes longer than on a private blockchain. Any file storage system, including one for documents, needs low latency; otherwise it becomes slow, expensive and congested.
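To make the hexadecimal step concrete, here is a minimal sketch in Python. Hexadecimal is an encoding rather than a size reduction: every byte becomes two characters. The sample bytes are a stand-in for a real file, and how the payload would then be embedded in a Bitcoin transaction is deliberately left out.

```python
# Stand-in for the raw bytes of a real file (for example, a small PDF).
document = b"%PDF-1.4 sample document bytes"

# Encode the bytes as a hexadecimal string: two characters per byte.
hex_payload = document.hex()

# Decoding the hex string gives back exactly the original bytes.
restored = bytes.fromhex(hex_payload)

print(len(document), len(hex_payload))
print(restored == document)  # True
```

The doubling in length is one more reason why only very small files are practical to store this way.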


Storing a Hash on a Blockchain

What is a hash? A hash is a fixed-length string computed from input data. The same input always produces the same hash, while in practice a different input produces a different one: even a small change to a document results in a completely new hash value. The most efficient method is therefore to store a document's hash on-chain while keeping the whole document elsewhere.

Compared with whole documents, hash values are very small, which makes them an efficient way to record data on a blockchain. The approach also scales well. To handle multiple documents, one can put their hashes into a distributed hash table and store them on-chain.
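These properties can be checked with a short sketch using SHA-256 from Python's standard hashlib module. The function name document_hash and the sample contract text are assumptions made for illustration only.

```python
import hashlib

def document_hash(data: bytes) -> str:
    """Return the SHA-256 digest of the document bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()

original = b"Contract: Alice pays Bob 100 units."
edited = b"Contract: Alice pays Bob 900 units."  # one character changed

h1 = document_hash(original)
h2 = document_hash(original)
h3 = document_hash(edited)

print(len(h1))   # 64 hex characters, whatever the document size
print(h1 == h2)  # True: the same input gives the same hash
print(h1 == h3)  # False: a small edit gives a different hash
```

For a real file, the bytes would be read from disk (for example with open(path, "rb")) instead of being written inline.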


Why Use Blockchain?


Thanks to its digital signatures and encryption, a blockchain is considered a highly secure and forgery-resistant system. Cryptographically linked blocks give a tamper-resistant record, which is highly effective in preventing document fraud and forgery. Storing a hash of the document is useful when the document itself cannot be stored on the blockchain because of file size limits. Compared with financial transactions, documents often take up a lot of space, so storing a whole document on-chain is usually not feasible. Hashes consume only a small fraction of that space and are therefore a much more efficient option. Storing the hash still offers tamper resistance: if the content of a file changes, its hash value changes too. Regardless of where the document is stored, whether in a distributed database such as one hosted on Azure or in a centralized DBMS, we can verify that it has not been tampered with by rehashing it and comparing the result with the hash stored on the blockchain.
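The rehash-and-compare check can be sketched as follows. This is a minimal illustration, not a full system: the helper names sha256_hex and is_untampered are hypothetical, and the on-chain hash is simulated with a local variable instead of being read from a real blockchain.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hash the document bytes with SHA-256 and return a hex string."""
    return hashlib.sha256(data).hexdigest()

def is_untampered(document: bytes, onchain_hash: str) -> bool:
    """Rehash the off-chain document and compare it with the stored hash."""
    return sha256_hex(document) == onchain_hash

# At registration time the hash is written to the blockchain
# (simulated here by keeping it in a variable).
registered = b"Degree certificate: Jane Doe, 2020"
onchain_hash = sha256_hex(registered)

# Later, the document is fetched from wherever it is stored off-chain.
fetched_ok = b"Degree certificate: Jane Doe, 2020"
fetched_forged = b"Degree certificate: John Doe, 2020"

print(is_untampered(fetched_ok, onchain_hash))      # True
print(is_untampered(fetched_forged, onchain_hash))  # False
```

Only the short hash has to live on-chain; the check works the same way whichever storage system holds the document.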


A public blockchain is an open network whose information is publicly accessible. Because it is permissionless, anyone can read and write data on the blockchain, and no single participant controls the data. If we use a public blockchain such as Bitcoin or Ethereum, whatever we store there, whether the document or its hash, is visible to the public and stored permanently. Once data is included in a block, no one can change it.

To restrict access to documents, or to give permanent visibility only to a preselected group, one can use a private blockchain. Besides providing decentralization and tamper resistance, the main benefit of a private blockchain is speed. Private blockchains have far fewer participants, so the network needs less time to reach consensus. As a result, more transactions can be processed; private blockchains can handle thousands of transactions per second.
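Why stored data is so hard to change can be illustrated with a toy chain in which every block records the hash of the block before it. This is a simplified sketch under stated assumptions: the block fields and the helper names block_hash, add_block and is_valid are hypothetical, and there is no network, consensus or mining, unlike a real blockchain.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's contents in a deterministic (sorted-key) JSON form."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def add_block(chain: list, document_hash: str) -> None:
    """Append a block that commits to the hash of the previous block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev,
                  "document_hash": document_hash})

def is_valid(chain: list) -> bool:
    """Check that every block still points at its predecessor's hash."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True

chain = []
for doc in (b"deed", b"diploma", b"invoice"):
    add_block(chain, hashlib.sha256(doc).hexdigest())

print(is_valid(chain))  # True

# Rewriting an earlier block breaks the link stored in the next block.
chain[1]["document_hash"] = hashlib.sha256(b"forged diploma").hexdigest()
print(is_valid(chain))  # False
```

In this toy version an attacker could simply recompute every later block; on a real network the consensus mechanism is what makes that rewrite impractical.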


Decentralized blockchains allow transactions to be made directly from person to person without third-party assistance. They play a vital role for users who do not want to place their trust in a central authority, since a single point of failure is less likely and there is less censorship. But not all blockchains are alike. If the consensus protocol is not sufficiently decentralized, or if it allows full nodes to reverse or censor transactions, users will face the same problems as with traditional systems.


  • December 24, 2020
  • Shreya Abhyankar