Set up object versioning
How to enable object versioning on UltiHash
UltiHash supports S3-compatible object versioning, a critical feature for teams working with evolving datasets, model artifacts, and analytic outputs. Versioning allows you to track every change to an object over time and retrieve or roll back to any previous version on demand, all without altering your workflows.
Unlike traditional storage systems where versioning often results in costly storage bloat, UltiHash’s deduplication engine ensures redundant data across versions is only stored once. You get traceability without the storage overhead.
When multiple versions of the same object share overlapping binary content (e.g., large datasets with only partial changes, or models with identical weights in early layers), UltiHash detects the duplicated parts and stores them only once. This dramatically reduces incremental storage growth: instead of duplicating whole objects through versioning, only the changes are saved. You can keep detailed version histories without drastically growing your storage footprint.
Enabling Versioning
UltiHash’s versioning is modeled on S3 semantics and available via the standard S3-compatible API. Versioning is bucket-scoped and disabled by default: you need to explicitly turn it on for each bucket through the UltiHash S3-compatible API before it starts tracking versions.
Once turned on, every update or overwrite to an object is saved as a new, immutable version rather than replacing the old one.
Each version is automatically assigned a unique versionId, which is returned after each write (PUT) operation. This lets you retrieve, compare, or roll back to any specific version whenever needed.
aws s3api put-bucket-versioning \
  --bucket your-bucket-name \
  --versioning-configuration Status=Enabled
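To confirm that the setting took effect, you can read the bucket’s versioning state back. The sketch below assumes that UltiHash answers the standard S3 GetBucketVersioning call and that your AWS CLI is already configured to reach your UltiHash endpoint. versioning_status is a hypothetical helper name, not part of UltiHash or the AWS CLI.

```shell
# Hypothetical helper: print the versioning status of one bucket.
# Assumes the service supports the S3 GetBucketVersioning call.
# Usage: versioning_status <bucket> [extra aws arguments]
versioning_status() {
  local bucket="$1"
  shift
  # --query/--output reduce the JSON response to the bare Status value.
  aws s3api get-bucket-versioning \
    --bucket "$bucket" \
    --query 'Status' \
    --output text "$@"
}
```

Running versioning_status your-bucket-name should print Enabled if the service reports the status the way S3 does. Any extra arguments you pass after the bucket name are forwarded to the AWS CLI unchanged.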
Once versioning is enabled:
- Every PUT or overwrite generates a new immutable version with its own versionId, so older versions are always preserved.
- By default, GET requests return the latest version unless you explicitly request a specific versionId.
- Overwriting an object no longer deletes previous data; every change is saved as a new version. If you need to permanently remove an object (including all its versions), you must explicitly delete each version using its versionId.
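Permanently removing an object therefore means looping over its versions. The following is a minimal sketch of that loop, assuming UltiHash supports the standard S3 ListObjectVersions call and deletion by version ID. delete_all_versions is a hypothetical helper name, and the sketch also assumes the object key contains no single quotes.

```shell
# Hypothetical helper: permanently delete every stored version of one key.
# Assumes ListObjectVersions and DeleteObject with --version-id are supported.
# Usage: delete_all_versions <bucket> <key>
delete_all_versions() {
  local bucket="$1" key="$2"
  # --prefix also matches longer keys, so the query keeps exact matches only.
  aws s3api list-object-versions \
      --bucket "$bucket" --prefix "$key" \
      --query "Versions[?Key=='$key'].VersionId" \
      --output text |
    tr '\t' '\n' |
    while read -r version_id; do
      # Skip blank lines and the "None" printed when no version matches.
      [ -n "$version_id" ] && [ "$version_id" != "None" ] || continue
      aws s3api delete-object \
        --bucket "$bucket" --key "$key" \
        --version-id "$version_id" < /dev/null
    done
}
```

Because this deletion is irreversible, it is worth listing the version IDs first and reviewing them before running the loop against real data.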
Example: Tracking a Model Artifact
# Enable versioning on the bucket
aws s3api put-bucket-versioning \
  --bucket models \
  --versioning-configuration Status=Enabled
# Upload the initial version
aws s3api put-object --bucket models --key resnet50.pt --body ./resnet50.pt
# returns versionId: "3a4b6..."
# Upload a new version
aws s3api put-object --bucket models --key resnet50.pt --body ./resnet50_v2.pt
# returns versionId: "5f1c2..."
In this example, we first upload an initial model file (resnet50.pt) to the models bucket. Later, we upload an updated file (resnet50_v2.pt) to the same object key, which is the unique name (or "path") used to reference that object in the bucket.
While the key stays the same, the actual object content changes, and with versioning enabled, each change automatically creates a new, immutable version. Each PUT command returns a unique versionId, so every version stays available for audit, comparison, or rollback without overwriting earlier data.
When retrieving an object, if you don’t specify a versionId, you’ll always receive the latest version by default. To download a specific version instead, include the --version-id parameter:
aws s3api get-object --bucket models --key resnet50.pt --version-id 3a4b6... ./resnet50_v1.pt
The final argument, ./resnet50_v1.pt, is the local filename you choose for the downloaded file; UltiHash does not set it automatically.
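Downloading an old version does not change what a plain GET returns. One way to roll back, sketched here under assumptions, is to upload the old content again so that it becomes the newest version. rollback_to_version is a hypothetical helper name, and the sketch assumes the PUT response carries a VersionId field as it does on S3.

```shell
# Hypothetical helper: make an older version current by re-uploading it.
# Prints the versionId of the newly created version.
# Usage: rollback_to_version <bucket> <key> <version-id>
rollback_to_version() {
  local bucket="$1" key="$2" version_id="$3" tmp status
  tmp=$(mktemp) || return 1
  # Fetch the requested version, then write it back under the same key.
  aws s3api get-object \
      --bucket "$bucket" --key "$key" \
      --version-id "$version_id" "$tmp" > /dev/null &&
    aws s3api put-object \
      --bucket "$bucket" --key "$key" --body "$tmp" \
      --query 'VersionId' --output text
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

The history stays intact with this approach: the older version is still stored under its original versionId, and the rollback itself is recorded as one more version. Since the re-uploaded content is identical to data UltiHash already holds, deduplication should keep the extra storage small.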
Namespace Behavior
UltiHash treats each object as unique based on its full namespace, that is, the combination of bucket name and object key. For example:
bucket-1/data/file.csv
bucket-1/data-v2/file.csv
In this example, even if the content of both files is identical, they’re considered two separate objects with their own independent version histories because the namespaces are different.
Deduplication works independently of versioning: even if versioning is not enabled, UltiHash’s deduplication engine still detects shared content and reduces the storage it takes. Versioning, by contrast, always operates per object key.
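To see that each key carries its own history, you can list the versions of one exact key at a time. This is a sketch that assumes UltiHash supports the standard S3 ListObjectVersions call; list_versions is a hypothetical helper name, and the key is assumed to contain no single quotes.

```shell
# Hypothetical helper: list the versions of exactly one object key.
# Each output row holds VersionId, IsLatest and LastModified, tab-separated.
# Usage: list_versions <bucket> <key>
list_versions() {
  local bucket="$1" key="$2"
  # --prefix narrows the listing; the query then keeps exact key matches,
  # so data/file.csv is never mixed with keys that merely start with it.
  aws s3api list-object-versions \
    --bucket "$bucket" --prefix "$key" \
    --query "Versions[?Key=='$key'].[VersionId, IsLatest, LastModified]" \
    --output text
}
```

Running list_versions bucket-1 data/file.csv and list_versions bucket-1 data-v2/file.csv should return two unrelated sets of version IDs, even when both files hold the same bytes.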