Benchmarking Erasure Coding Schemes in OpenStack Swift
Abstract
Erasure coding (EC) is a data protection technique that allows data to be reconstructed
from parity fragments, which eliminates the need for complete data replication. EC
offers storage efficiency, lower storage cost, and strong fault
tolerance, making it an attractive alternative to replication in Swift. The basic idea is to encode
a given amount of data into fragments such that the original data can be recovered even if some of the coded fragments are lost. The time efficiency of EC methods becomes increasingly
important in guaranteeing optimal system performance as data volumes continue
to increase rapidly. A number of variables, such as the particular algorithm used,
data size, the number of storage nodes, hardware resources, and network conditions,
can affect how quickly EC works. The primary subject of our analysis is the
Reed-Solomon erasure coding algorithm. The study investigates the encoding speed
of the algorithm, considering factors like data size and the number of parity blocks
generated. To address time efficiency and fault tolerance challenges in cloud-based object storage systems, our paper focuses on evaluating and
improving existing mechanisms. It comprehensively analyzes time efficiency mechanisms, such as data placement policies and scheduling algorithms, to enhance data
retrieval and storage processes. The time efficiency of EC is further examined
by measuring how long the cloud storage system (Swift) takes
to store the files of two datasets, so that the storage durations
for the same files can be compared. The paper also assesses
fault tolerance mechanisms, including redundancy schemes, error correction codes
and distributed data placement strategies to improve system resilience. The research proposes innovative approaches to minimize access latency, improve overall
time efficiency and ensure data availability even in the presence of failures.
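
As an illustration of the kind of measurement described above, the following is a minimal sketch of how encoding time could be recorded against data size and parity count. It is not the benchmark used in this study: the functions `encode` and `benchmark`, the choice of k = 4 data fragments, and the Cauchy-matrix construction over GF(2^8) are all assumptions made for this example. A production Swift deployment delegates encoding to PyECLib and liberasurecode rather than to pure Python code like this.

```python
import os
import time

# GF(2^8) antilog/log tables for the primitive polynomial 0x11d.
EXP = [0] * 512
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two elements of GF(2^8)."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    """Multiplicative inverse of a nonzero element of GF(2^8)."""
    return EXP[255 - LOG[a]]


def encode(data, k, m):
    """Split data into k fragments and compute m parity fragments.

    Hypothetical helper: a systematic code whose parity rows come from a
    Cauchy matrix with entries 1 / (i XOR (m + j)).
    """
    if k < 1 or m < 0 or k + m > 256:
        raise ValueError("need k >= 1, m >= 0 and k + m <= 256")
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")
    frags = [padded[j * frag_len:(j + 1) * frag_len] for j in range(k)]
    parity = []
    for i in range(m):
        acc = 0
        for j in range(k):
            coeff = gf_inv(i ^ (m + j))
            # Per-coefficient lookup table: byte v maps to coeff * v.
            table = bytes(gf_mul(coeff, v) for v in range(256))
            acc ^= int.from_bytes(frags[j].translate(table), "big")
        parity.append(acc.to_bytes(frag_len, "big"))
    return frags, parity


def benchmark(sizes, parity_counts, k=4, repeats=3):
    """Return {(size, m): best encode time in seconds} on random data."""
    results = {}
    for size in sizes:
        data = os.urandom(size)
        for m in parity_counts:
            best = float("inf")
            for _ in range(repeats):
                start = time.perf_counter()
                encode(data, k, m)
                best = min(best, time.perf_counter() - start)
            results[(size, m)] = best
    return results


if __name__ == "__main__":
    timings = benchmark([64 * 1024, 1024 * 1024], [1, 2, 4])
    for (size, m), secs in timings.items():
        print(f"size={size:>8} B  k=4  m={m}  encode={secs * 1e3:.2f} ms")
```

With k data fragments and m parity fragments, the stored volume is (k + m) / k times the original size, for example 1.5 times for k = 4 and m = 2, compared with 3 times for three-way replication. Absolute timings from a pure Python sketch are not representative of an optimized library; only the trend across data sizes and parity counts is meaningful.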