Test with Docker
How to set up a local test environment for UltiHash using Docker
This guide shows you how to set up a local test environment for UltiHash, using Docker Compose for container orchestration.
The main steps are as follows:
1. Install prerequisite tools
2. Set up UltiHash with Docker Compose
3. Integrate sample data + see space savings
This setup is intended for local testing - not production use.
1. Install prerequisite tools
Before you start setting up the UltiHash cluster, you need some tools installed. If you already have any of these installed, you can simply skip that step.
Install Docker Engine
Docker provides a containerized virtual environment for UltiHash to run on.
You can find general instructions for installing Docker Engine at docs.docker.com/engine/install.
To quickly install, run:
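The install command for this step is not reproduced here. As a sketch, assuming a Linux host, Docker's own documentation offers a convenience script at get.docker.com for test and development machines. The function name `install_docker` is a hypothetical name introduced here; wrapping the commands in a function means pasting the block does not start an installation by itself.

```shell
# Hypothetical helper: download and run Docker's convenience install script.
# Assumes a Linux host with curl and sudo available.
install_docker() {
  curl -fsSL https://get.docker.com -o get-docker.sh &&
    sudo sh get-docker.sh
}

# Usage:
#   install_docker
```

If your distribution is not supported by the script, follow the general instructions linked above instead.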
After installing Docker, you may need to add your user to the Docker group.
Run:
Make sure to restart your computer at this stage to apply the group changes.
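The group command for this step is not reproduced here. As a sketch, assuming a Linux host, Docker's post-installation steps add the current user to the `docker` group with `usermod`. The helper `user_in_group` is a hypothetical name introduced here so you can confirm the change once you are logged in again.

```shell
# Add the current user to the "docker" group (from Docker's Linux post-install steps).
# Left commented out so that pasting this block changes nothing by itself.
# sudo usermod -aG docker "$USER"

# Hypothetical helper: succeed if user $1 belongs to group $2.
user_in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

# Usage, after restarting:
#   user_in_group "$USER" docker && echo "docker group is active"
```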
Install AWS CLI
The AWS CLI is a unified tool to manage AWS services from the command line. In this guide, you'll use it to work with buckets on your local UltiHash cluster.
You can find general instructions for installing the AWS CLI at docs.aws.amazon.com/cli/latest/userguide/getting-started-install.
To quickly install, run:
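The install command for this step is not reproduced here. As a sketch, assuming a Linux x86_64 host, the AWS documentation installs AWS CLI version 2 by downloading a zip archive and running the bundled installer. The function name `install_aws_cli` is a hypothetical name introduced here so that pasting the block does not start an installation by itself.

```shell
# Hypothetical helper: download, unpack, and install AWS CLI v2.
# Assumes Linux x86_64 with curl, unzip, and sudo available.
install_aws_cli() {
  curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" &&
    unzip awscliv2.zip &&
    sudo ./aws/install
}

# Usage:
#   install_aws_cli
```

For other platforms or architectures, use the general instructions linked above.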
Done! You've successfully installed all the prerequisites for testing UltiHash.
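Before moving on, you may want to confirm that both tools are actually on your PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper introduced here, not part of UltiHash.

```shell
# Hypothetical helper: print each tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  return "$rc"
}

# Usage:
#   missing_tools docker aws && echo "all prerequisites found"
```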
Next, you'll set up your local cluster using Docker Compose.
2. Set up UltiHash with Docker Compose
Now you’ll set up the local UltiHash environment using Docker Compose. This involves authenticating with the UltiHash registry, downloading the necessary configuration file, and running the UltiHash services locally.
Set up authentication with the registry
Before you can download and run UltiHash, you need to authenticate with the UltiHash registry. The registry is where the container images (required for running UltiHash) are stored.
For this step, you'll need these credentials from your UltiHash Dashboard:
Registry login
Registry password
Log in to the UltiHash registry with your credentials:
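The login command for this step is not reproduced here, and this guide does not state the registry address. As a sketch, `docker login` accepts a registry host, a username via `-u`, and the password on standard input via `--password-stdin`, which keeps the password out of your shell history. The function name `registry_login` and the `<registry-host>` placeholder are assumptions introduced for illustration.

```shell
# Hypothetical helper: log in to a container registry, passing the password
# on stdin rather than on the command line.
#   $1 = registry host (placeholder; use the address provided with your credentials)
#   $2 = registry login
#   $3 = registry password
registry_login() {
  printf '%s' "$3" | docker login "$1" -u "$2" --password-stdin
}

# Usage:
#   registry_login "<registry-host>" "<registry-login>" "<registry-password>"
```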
Download compose.yaml
The compose.yaml file is a Docker Compose configuration file that defines all the services, volumes, and settings needed to run UltiHash.
Download:
Set up credentials and license
To enable access to UltiHash services, you need to export your credentials and license key. These environment variables will be used for authentication.
Run the following commands:
Make sure to replace <customer-id>, <access-token>, and <monitoring-token> with the 'Customer ID', 'Access token', and 'Monitoring token' from your Dashboard.
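The export commands and the exact variable names are not reproduced here. One way to find out which variables your compose.yaml expects, assuming it uses Docker Compose's `${NAME}` interpolation syntax, is to list the names it references and check which are still unset. Both helpers below are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: list the variable names referenced as ${NAME}
# (including ${NAME:-default}) in file $1. Bare $NAME references are not matched.
compose_vars() {
  grep -o '\${[A-Za-z_][A-Za-z0-9_]*' "$1" | sed 's/^\${//' | sort -u
}

# Hypothetical helper: print the referenced variables that are unset or empty
# in the current shell. Variables with a default in the file are listed too.
unset_compose_vars() {
  for name in $(compose_vars "$1"); do
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# Usage:
#   unset_compose_vars compose.yaml
```

Remember that Docker Compose only sees variables you have exported, so set them with `export NAME=value` rather than a plain assignment.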
Start UltiHash services
Change the working directory to the folder where you saved compose.yaml. For example:
Start the UltiHash cluster:
If successful, Docker Compose will download the necessary images (if they’re not already cached) and start the UltiHash services.
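The commands for this step are not reproduced here. As a sketch, assuming Docker Compose v2 (the `docker compose` subcommand), `up -d` starts the services in the background and `ps` lists their status. The function name `start_cluster` is a hypothetical name introduced here; it runs in a subshell so your current directory stays unchanged.

```shell
# Hypothetical helper: start the services defined in <dir>/compose.yaml in the
# background, then list their status.
#   $1 = directory containing compose.yaml
start_cluster() {
  (
    cd "$1" &&
      docker compose up -d &&
      docker compose ps
  )
}

# Usage:
#   start_cluster /home/user/Downloads
```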
Done!
You’ve successfully set up your local UltiHash cluster. Next, let's integrate sample data + see space savings.
3. Integrate sample data + see space savings
Now that UltiHash is running on your local cluster, let's integrate some sample data.
Prepare dataset
If you already have a dataset you want to test, you can skip this step.
Alternatively, you can download one of these datasets from Kaggle:
Remember to unzip your test dataset if you download it from Kaggle.
Create a bucket
Object storage systems like UltiHash use a top-level container called a bucket. To facilitate scalability, buckets don’t have a traditional hierarchical folder structure: instead, each object in a bucket has a unique key (which can resemble a file path, simulating directories).
To create a bucket, run:
Make sure to replace <bucket-name> with your chosen bucket name, e.g. test-bucket.
You can see your newly created bucket by running:
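The bucket commands are not reproduced here. As a sketch, the AWS CLI's `s3api create-bucket` and `s3api list-buckets` subcommands can be pointed at a non-AWS endpoint with `--endpoint-url`. The address `http://127.0.0.1:8080` is an assumption (check which port your compose.yaml publishes), and `UH_ENDPOINT` and `uh_s3api` are hypothetical names introduced for illustration. The sketch also assumes the AWS CLI already has whatever credentials your cluster expects.

```shell
# Assumed local endpoint; adjust to the host and port your compose.yaml publishes.
UH_ENDPOINT="${UH_ENDPOINT:-http://127.0.0.1:8080}"

# Hypothetical wrapper: run an `aws s3api` subcommand against the local cluster.
uh_s3api() {
  aws s3api "$@" --endpoint-url "$UH_ENDPOINT"
}

# Usage:
#   uh_s3api create-bucket --bucket "<bucket-name>"
#   uh_s3api list-buckets
```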
Download scripts
We've prepared some scripts to make the testing process easier.
Download the following scripts for uploading and downloading:
Integrate sample data
Now that you have a bucket in which to put objects, let's use the upload script to integrate your sample data.
To integrate your dataset, run:
Make sure to replace <upload-script-path> with the path to the upload script you downloaded, e.g. /home/user/Downloads/uh-upload.py.
Also replace <bucket-name> with your bucket name.
Finally, replace <dataset-path> with the path to the directory for the dataset you prepared or downloaded, e.g. /home/user/Downloads/test-dataset.
A bar should display the ongoing progress of your integration.
Once the integration is complete, you can run the following command to see your objects:
Make sure to replace <bucket-name> with your bucket name.
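The listing command is not reproduced here. As a sketch, the AWS CLI's `s3api list-objects-v2` subcommand returns the objects in a bucket, and a `--query` expression can narrow the output to just the object keys. The function name `list_keys` is hypothetical, and the endpoint you pass in is an assumption that must match your local cluster.

```shell
# Hypothetical helper: print the keys of the objects in a bucket as text.
#   $1 = bucket name
#   $2 = endpoint URL of the local cluster (assumed, e.g. http://127.0.0.1:8080)
list_keys() {
  aws s3api list-objects-v2 --bucket "$1" --endpoint-url "$2" \
    --query 'Contents[].Key' --output text
}

# Usage:
#   list_keys "<bucket-name>" "http://127.0.0.1:8080"
```

Each key printed this way is the object's full name within the bucket, so keys that contain slashes will look like file paths.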
You can also download an entire bucket by running:
Make sure to replace <download-script-path> with the path to the download script you downloaded, e.g. /home/user/Downloads/uh-download.py.
Also replace <destination-path> with the path to the directory you want to download the bucket to, e.g. /home/user/Downloads.
Finally, replace <bucket-name> with the name of the bucket to download.
See space savings in your cluster
You can see the storage space UltiHash is saving across the entire cluster by running the uh-see-space-savings script:
Make sure to replace <see-space-savings-script-path> with the path to the space-savings script you downloaded, e.g. /home/user/Downloads/uh-see-space-savings.py.
Done! You’ve successfully integrated a dataset into a local test cluster, and can see the space saved by UltiHash's built-in deduplication.