Storage group configuration
UltiHash supports storage groups configured with or without erasure coding, depending on user requirements for availability, resilience, and performance.
Configuring erasure coded storage groups
To configure erasure coding in the UltiHash storage cluster, you can define storage groups using Reed-Solomon coding parameters. Each group is characterized by k data shards and m parity shards, allowing the system to tolerate up to m node failures without data loss. Storage groups with erasure coding enabled are specified in the configuration file as follows:
In this example:
4 data shards and 2 parity shards are defined, allowing the system to tolerate up to two storage instance failures without data loss.
A total of 6 storage instances is used, matching the required total of data_shards + parity_shards.
A stripe_size_kib value of 256 means data is handled in stripes of 256 KiB, which are then split up across the data shards.
Each storage instance uses the local-path storage class with 70Gi of allocated volume.
Implications:
Fault Tolerance: The group can lose any 2 of its 6 nodes and still recover the original data.
Storage Overhead: The parity overhead is 2 out of 6 shards, or approximately 33%.
Efficiency: This configuration balances improved fault tolerance with relatively efficient storage usage compared to full replication.
Recommendations and Limitations:
stripe sizes must be evenly divisible by the number of data shards (e.g. a 256 KiB stripe across 4 data shards yields 64 KiB per shard)
stripe sizes beyond 4096 KiB rarely make sense and lead to degraded performance
currently, only a single storage group is supported
By carefully selecting k and m values, you can balance storage efficiency and fault tolerance to meet your specific requirements.
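As general Reed-Solomon arithmetic (independent of any specific UltiHash release): a group with k data shards and m parity shards tolerates m simultaneous instance failures, carries a parity overhead of m / (k + m), and leaves k / (k + m) of the raw capacity usable. The 4 + 2 example above gives 2/6 ≈ 33% overhead and 67% usable capacity, whereas an 8 + 2 layout would reduce the overhead to 20% at the cost of a wider stripe. Whether a particular k/m combination is supported depends on your deployment.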
Configuring storage groups without erasure coding
In use cases where data resilience is not a major concern, e.g. where temporary data or intermediate results are stored, UltiHash supports setting up storage groups that expose the raw storage resources with no erasure coding applied:
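Again as a minimal sketch with placeholder keys (only the local-path storage class and the 70Gi volume size come from this section; the exact schema depends on your deployment), such a group simply omits the erasure-coding parameters:

storage_groups:
  - name: raw-group                 # placeholder name
    storages:                       # 6 instances, no parity shards
      count: 6
      storage_class: local-path
      volume_size: 70Gi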
In this example:
6 storage instances are defined, and data is distributed sequentially across them.
Each storage instance uses the local-path storage class with 70Gi of allocated volume.
Implications:
Data Distribution: Objects are written to storage backends in round-robin order, without redundancy or parity. This ensures even distribution of storage utilization across the devices.
Fault Tolerance: There is no inherent fault tolerance—if a storage instance fails, the data it holds may be permanently lost unless external redundancy mechanisms (e.g., backups or replication) are used.
Storage Efficiency: Full storage capacity is available for use. No additional space is reserved for parity or replication, making this setup 100% efficient in terms of raw capacity.
Performance: This method provides low overhead and potentially better write throughput, as there is no encoding or parity computation involved.
This configuration is best suited for scenarios where performance and storage efficiency are more critical than fault tolerance, or where durability is managed at a higher layer in the stack.