Erasure coding for data resiliency

How to increase the resilience of your cluster with UltiHash's parity-based storage

Ensuring data resiliency is a critical concern for distributed storage systems, particularly in mitigating the risks posed by outages or hardware failures. UltiHash employs Reed-Solomon erasure coding to improve service availability and enhance data durability. This method organizes data into units called stripes: each stripe is split into k data shards, and m additional parity shards are computed and stored for redundancy.

With this Reed-Solomon-based erasure coding scheme, an UltiHash storage cluster consisting of k + m data nodes offers a usable capacity equal to that of k nodes, while tolerating the failure of up to m nodes without losing data or interrupting service availability.

If data nodes fail, the parity shards allow the cluster to reconstruct the original data while keeping the service available, without requiring additional maintenance downtime. This approach provides a robust and efficient mechanism for improving service uptime and mitigating the risk of data loss.
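
To make the relationship between k, m, capacity, and fault tolerance concrete, here is a minimal arithmetic sketch (plain Python, illustrative only; it is not UltiHash code), using the 70Gi-per-node figure from the configuration example below:

# Illustrative only: raw vs. usable capacity and fault tolerance of a k + m erasure-coded group.
def group_properties(k: int, m: int, node_capacity_gib: int) -> dict:
    n = k + m  # total number of data nodes in the group
    return {
        "nodes": n,
        "raw_capacity_gib": n * node_capacity_gib,
        "usable_capacity_gib": k * node_capacity_gib,  # capacity of k nodes is usable
        "tolerated_node_failures": m,                  # any m nodes may fail without data loss
    }

print(group_properties(k=4, m=2, node_capacity_gib=70))
# {'nodes': 6, 'raw_capacity_gib': 420, 'usable_capacity_gib': 280, 'tolerated_node_failures': 2}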


How to configure storage groups with erasure coding

UltiHash storage groups can be configured with or without erasure coding, depending on your requirements for availability, resilience, and performance.

To configure erasure coding in the UltiHash storage cluster, you can define storage groups using Reed-Solomon coding parameters. Each group is characterized by k data shards and m parity shards, allowing the system to tolerate up to m node failures without data loss. Storage groups with erasure coding enabled are specified in the configuration file as follows:

storage:
  groups:
    - id: 0
      type: ERASURE_CODING
      storages: 6 # number of storage instances in the group (data_shards + parity_shards)
      data_shards: 4 # number of data shards (k) per stripe
      parity_shards: 2 # number of parity shards (m) per stripe
      stripe_size_kib: 256 # stripe size in KiB; must be evenly divisible by data_shards
      storageClass: local-path # storage class used for the volumes
      size: 70Gi # volume size allocated for each storage instance

In this example:

  • 4 data shards and 2 parity shards, allowing the system to tolerate up to two storage instance failures without data loss.

  • A total of 6 storage instances is used, matching the required total of data_shards + parity_shards.

  • A stripe_size_kib value of 256, meaning data is handled in stripes of 256 KiB, each of which is split across the data shards (worked through in the sketch after this list).

  • Each storage instance uses the local-path storage class with 70Gi of allocated volume.
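
To work through these numbers, here is a minimal sketch of the per-stripe arithmetic (plain Python, illustrative only, based on the values in the configuration above):

# Illustrative stripe arithmetic for data_shards=4, parity_shards=2, stripe_size_kib=256.
data_shards = 4
parity_shards = 2
stripe_size_kib = 256

# The stripe size must divide evenly across the data shards (see the recommendations below).
assert stripe_size_kib % data_shards == 0

shard_size_kib = stripe_size_kib // data_shards                   # 64 KiB per shard
on_disk_kib = shard_size_kib * (data_shards + parity_shards)      # 384 KiB stored per 256 KiB stripe
parity_fraction = parity_shards / (data_shards + parity_shards)   # 2 of 6 shards

print(shard_size_kib, on_disk_kib, f"{parity_fraction:.0%}")      # 64 384 33%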

Implications:

  • Fault Tolerance: The group can lose any 2 of its 6 nodes and still recover the original data.

  • Storage Overhead: The parity overhead is 2 out of 6 shards, or approximately 33%.

  • Efficiency: This configuration balances improved fault tolerance with relatively efficient storage usage compared to full replication.

Recommendations and Limitations:

  • stripe sizes must be evenly divisible by the number of data shards

  • stripe sizes beyond 4096 KiB rarely make sense and lead to degraded performance

  • currently, only a single storage group is supported

By carefully selecting k and m values, you can balance storage efficiency and fault tolerance to meet your specific requirements.
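
The recommendations above are straightforward to check before deploying a group. The following sketch (a hypothetical validate_ec_group helper, not part of UltiHash) applies those constraints to a candidate configuration and summarizes the resulting trade-off:

# Hypothetical helper: check an erasure-coded group definition against the documented
# constraints and summarize its fault tolerance and parity overhead.
def validate_ec_group(storages: int, data_shards: int, parity_shards: int, stripe_size_kib: int) -> str:
    if storages != data_shards + parity_shards:
        raise ValueError("storages must equal data_shards + parity_shards")
    if stripe_size_kib % data_shards != 0:
        raise ValueError("stripe size must be evenly divisible by the number of data shards")
    if stripe_size_kib > 4096:
        raise ValueError("stripe sizes beyond 4096 KiB are not recommended")
    parity_fraction = parity_shards / storages
    shard_size_kib = stripe_size_kib // data_shards
    return (f"tolerates {parity_shards} node failure(s), "
            f"parity overhead {parity_fraction:.0%}, {shard_size_kib} KiB per shard")

print(validate_ec_group(storages=6, data_shards=4, parity_shards=2, stripe_size_kib=256))
# tolerates 2 node failure(s), parity overhead 33%, 64 KiB per shard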

How to configure storage groups without erasure coding

For use cases where data resilience is not a major concern, e.g. when storing temporary data or intermediate results, UltiHash supports storage groups that expose the raw storage resources with no erasure coding applied:

storage:
  groups:
    - id: 0
      type: ROUND_ROBIN
      storages: 6 # number of storage instances in the group
      storageClass: local-path # storage class used for the volumes
      size: 70Gi # volume size allocated for each storage instance

In this example:

  • 6 storage instances are defined, and data is distributed sequentially across them.

  • Each storage instance uses the local-path storage class with 70Gi of allocated volume.

Implications:

  • Data Distribution: Objects are written to storage backends in round-robin order, without redundancy or parity. This ensures even distribution of storage utilization across the devices (see the sketch after this list).

  • Fault Tolerance: There is no inherent fault tolerance—if a storage instance fails, the data it holds may be permanently lost unless external redundancy mechanisms (e.g., backups or replication) are used.

  • Storage Efficiency: Full storage capacity is available for use. No additional space is reserved for parity or replication, making this setup 100% efficient in terms of raw capacity.

  • Performance: This method provides low overhead and potentially better write throughput, as there is no encoding or parity computation involved.
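
For illustration, the round-robin placement described above can be modelled as follows (a conceptual sketch only, assuming six storage instances as in the example; this is not UltiHash's actual placement logic):

# Conceptual sketch: sequential (round-robin) placement of objects across storage instances.
from itertools import cycle

storage_instances = [f"storage-{i}" for i in range(6)]  # 6 instances, as in the example above
placement = cycle(storage_instances)

# Each incoming object goes to the next instance in turn; no parity or redundancy is added.
for obj in ["obj-a", "obj-b", "obj-c", "obj-d", "obj-e", "obj-f", "obj-g"]:
    print(f"{obj} -> {next(placement)}")
# obj-a -> storage-0, obj-b -> storage-1, ..., obj-g -> storage-0 (wraps around)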

This configuration is best suited for scenarios where performance and storage efficiency are more critical than fault tolerance, or where durability is managed at a higher layer in the stack.
