ROBB: recurrent proximal policy optimization reinforcement learning for optimal block formation in bitcoin blockchain network
Abstract
Blockchain is a ground-breaking technology that has changed how we manage and
store protected data. It is a decentralized ledger that enables safe, open, and unchangeable
record-keeping. It relies on a distributed network of nodes rather than a
single central authority to check and verify transactions, guaranteeing that each entry
is correct and unchangeable. Transactions in a blockchain network are grouped
into blocks, which are then linked together in a chronological and immutable chain.
Block size, the maximum size of each block in the chain, is a critical parameter in
blockchain technology. However, the block size of a blockchain cannot simply be
changed: doing so is challenging and creates security issues. Block size is crucial
because it affects the number of transactions processed per second, the confirmation
time, and overall network efficiency. Confirmation must be fast enough to ensure
stable earnings for the miners. Moreover, blockchain struggles to support broader
applications because of high transaction fees and long verification times. We
propose ROBB, a reinforcement learning model that creates blocks efficiently by
considering the current network state and previous transactions. We first cast the
problem as a reinforcement learning environment so that it could be solved with
multiple reinforcement learning algorithms. We developed a blockchain simulator to
replicate the network environment and integrated it with OpenAI Gym to turn it into
a reinforcement learning environment. The model was trained in the simulator on
randomly generated transactions. Finally, we designed a reward function that lets
the agent hold transactions and create a block from the pending transactions
when it determines that the environment is favourable. In the final results, ROBB
successfully minimized the waiting time for transactions and used each block to its
full capacity, which is crucial for private blockchains used for medical records.
Additionally, it optimized the use of block space, building on the findings of
previous researchers.
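
As an illustration only, the hold-or-create decision described in the abstract can be sketched as a toy environment that follows the Gym reset/step convention. The class name BlockFormationEnv, the block capacity, the random arrival process, and the reward weights are all assumptions made for this sketch. They are not the ROBB simulator or its reward function, and the sketch deliberately avoids importing Gym so that it runs on its own.

```python
import random


class BlockFormationEnv:
    """Toy hold-or-create-block environment (hypothetical, not the ROBB simulator)."""

    HOLD, CREATE_BLOCK = 0, 1

    def __init__(self, block_capacity=10, max_steps=50, seed=0):
        self.block_capacity = block_capacity  # assumed max transactions per block
        self.max_steps = max_steps            # assumed episode length
        self.rng = random.Random(seed)
        self.pending = []
        self.t = 0

    def reset(self):
        self.pending = []  # arrival step of each pending transaction, oldest first
        self.t = 0
        return self._obs()

    def _obs(self):
        # Observation: (number of pending transactions, wait of the oldest one).
        oldest_wait = self.t - self.pending[0] if self.pending else 0
        return (len(self.pending), oldest_wait)

    def step(self, action):
        reward = 0.0
        if action == self.CREATE_BLOCK and self.pending:
            included = self.pending[:self.block_capacity]
            self.pending = self.pending[self.block_capacity:]
            # Assumed reward term: fraction of the block that was filled.
            reward += len(included) / self.block_capacity
        # Assumed penalty term: each transaction still waiting costs a little.
        reward -= 0.01 * len(self.pending)
        # Advance time and add a random number of new transactions.
        self.t += 1
        self.pending.extend([self.t] * self.rng.randint(0, 3))
        done = self.t >= self.max_steps
        return self._obs(), reward, done, {}


def run_episode(env, policy):
    """Run one episode with a policy mapping an observation to an action."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total


def threshold_policy(obs, capacity=10):
    # Simple baseline: create a block only once enough transactions are pending.
    pending_count, _ = obs
    return BlockFormationEnv.CREATE_BLOCK if pending_count >= capacity else BlockFormationEnv.HOLD
```

Calling run_episode(BlockFormationEnv(), threshold_policy) returns the total reward of a fixed baseline. A learned policy, such as the recurrent PPO agent named in the title, would replace threshold_policy and trade off block fullness against waiting time.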