Test with Docker

How to set up a local test environment for UltiHash using Docker

This guide shows you how to set up a local test environment for UltiHash, using Docker Compose for container orchestration.

To test UltiHash, you first need to sign up for a free account.

If you want to test UltiHash in a Kubernetes environment, you can do so with Minikube.

The main steps are as follows:

1. Install prerequisite tools
2. Set up UltiHash with Docker Compose
3. Integrate sample data + see space savings

For now, UltiHash is only supported on Linux. This guide provides commands to be run in your terminal, and assumes you're running Ubuntu LTS on an AMD64 (x86_64) architecture. Other distributions and ARM architectures should work fine, although some commands may need slight adjustment.
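If you're not sure which of these cases your machine falls into, a small check like the following sketch can tell you. The function name and the three labels are illustrative, not part of UltiHash, and the check looks only at the operating system and CPU architecture, not at the distribution.

```python
import platform

# Architecture names that correspond to AMD64, which this guide's commands
# are written for. The grouping is this sketch's own.
DOCUMENTED_ARCHES = {"x86_64", "amd64"}


def classify_platform(system: str, machine: str) -> str:
    """Return 'documented', 'may-need-adjustment', or 'unsupported'."""
    if system != "Linux":
        return "unsupported"
    if machine.lower() in DOCUMENTED_ARCHES:
        return "documented"
    # Other architectures (for example ARM) should work, per the guide,
    # but some commands may need slight adjustment.
    return "may-need-adjustment"


if __name__ == "__main__":
    print(classify_platform(platform.system(), platform.machine()))
```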

1. Install prerequisite tools

Before you start setting up the UltiHash cluster, you need some tools installed. If you already have any of these installed, you can simply skip that step.
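To see which tools are already present, you could run a short check like this sketch. It assumes the Docker and AWS CLI executables are named `docker` and `aws`, and that the Python packages import as `boto3` and `tqdm`. The helper names are illustrative.

```python
import importlib.util
import shutil


def missing_commands(commands):
    """Return the commands from `commands` that are not found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


def missing_modules(modules):
    """Return the top-level Python modules from `modules` that cannot be found."""
    return [mod for mod in modules if importlib.util.find_spec(mod) is None]


if __name__ == "__main__":
    print("Missing commands:", missing_commands(["docker", "aws"]))
    print("Missing Python packages:", missing_modules(["boto3", "tqdm"]))
```

Anything listed as missing corresponds to a step below that you still need to complete.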

1

Install Docker Engine

Docker provides a containerized environment for UltiHash to run in.

You can find general instructions for installing Docker Engine at docs.docker.com/engine/install.

To quickly install, run:


After installing Docker, you may need to add your user to the Docker group.

Run:

2

Install AWS CLI

The AWS CLI is a unified tool to manage AWS services from the command line.

You can find general instructions for installing the AWS CLI at docs.aws.amazon.com/cli/latest/userguide/getting-started-install.

To quickly install, run:

3

Install boto3 (and tqdm)

The Amazon Web Services (AWS) SDK for Python (often referred to as boto3) allows you to interact with AWS services programmatically.

To install, run:

tqdm is a Python package that provides a progress bar, which will be used in the upload scripts.

To install, run:

2. Set up UltiHash with Docker Compose

Now you’ll set up the local UltiHash environment using Docker Compose. This involves authenticating with the UltiHash registry, downloading the necessary configuration file, and running the UltiHash services locally.

1

Set up authentication with the registry

Before you can download and run UltiHash, you need to authenticate with the UltiHash registry. The registry is where the container images required for running UltiHash are stored.

For this step, you'll need these credentials from your UltiHash Dashboard:

  • Registry login

  • Registry password
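If you'd rather script the login than type it, the sketch below wraps Docker's standard `docker login` command and passes the password on standard input so it doesn't appear in your shell history. The registry hostname is not given here, so `registry.example.com` in the usage note is a placeholder. The function names are illustrative.

```python
import subprocess


def docker_login_command(registry: str, username: str) -> list:
    """Build the `docker login` argument list (the password goes via stdin)."""
    return ["docker", "login", "--username", username, "--password-stdin", registry]


def docker_login(registry: str, username: str, password: str) -> int:
    """Run `docker login` and return its exit code (0 means success)."""
    result = subprocess.run(
        docker_login_command(registry, username),
        input=password.encode(),
        check=False,
    )
    return result.returncode
```

For example, `docker_login("registry.example.com", "<registry login>", "<registry password>")` with the hostname replaced by the actual UltiHash registry address and the two values from your dashboard.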


Log in to the UltiHash registry with your credentials:

2

Download compose.yaml

The compose.yaml file is a Docker Compose configuration file that defines all the services, volumes, and settings needed to run UltiHash.

Download:


Trouble downloading? Try right-clicking and selecting 'Save link as...' or similar.

3

Set up credentials and license

To enable access to UltiHash services, you need to export your credentials and license key. These environment variables will be used for authentication.
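After exporting the variables, a quick check like this sketch can confirm they are visible to new processes in the same shell. The names in `REQUIRED_VARS` are assumptions: `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the standard variables read by the AWS CLI and boto3, and `UH_LICENSE` is a hypothetical name for the license key. Replace them with whatever this step actually exports.

```python
import os

# Placeholder names: replace with the variables this step actually exports.
REQUIRED_VARS = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "UH_LICENSE"]


def unset_variables(names, environ=None):
    """Return the names that are missing or empty in the environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = unset_variables(REQUIRED_VARS)
    print("All set" if not missing else f"Not set: {', '.join(missing)}")
```

Variables exported in one terminal are not visible in another, so run the check in the same shell you'll use to start the services.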

Run the following commands:

4

Start UltiHash services

Change the working directory to the folder where you saved compose.yaml. For example:

Start the UltiHash cluster:

If successful, Docker Compose will download the necessary images (if they’re not already cached) and start the UltiHash services.
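One way to confirm that the services are reachable is to test whether the published port accepts TCP connections. The sketch below does that with the standard library. Port 8080 is an assumption, so use whichever port your compose.yaml publishes.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 8080 is an assumption; use the port published in your compose.yaml.
    print(is_listening("127.0.0.1", 8080))
```

A result of `True` only shows that something is listening on that port. It does not verify that UltiHash itself is healthy.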

3. Integrate sample data + see space savings

Now that UltiHash is running on your local cluster, let's integrate some sample data.

1

Prepare dataset

If you already have a dataset you want to test, you can skip this step.

Alternatively, you can download one of these datasets from Kaggle:

UltiHash's deduplication results can vary significantly depending on the dataset you integrate. For testing, try datasets likely to contain repeated content, such as document libraries with shared templates, multimedia collections with common graphics, or code repositories.
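To get a rough feel for how repetitive a dataset is before integrating it, you could count exact duplicate files, as in the sketch below. This is not how UltiHash deduplicates, which this guide does not describe. The sketch only detects files whose entire contents are identical, so treat the result as a loose indication and not a prediction. The function name is illustrative.

```python
import hashlib
from pathlib import Path


def whole_file_duplicate_stats(root):
    """Return (total_bytes, unique_bytes, savings_fraction) for files under root.

    Files are compared by the SHA-256 of their full contents, so only exact
    whole-file duplicates are counted. Each file is read into memory, which
    is fine for a sketch but not for very large files.
    """
    seen = set()
    total = unique = 0
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        total += len(data)
        if digest not in seen:
            seen.add(digest)
            unique += len(data)
    savings = 0.0 if total == 0 else 1 - unique / total
    return total, unique, savings
```

For example, a folder holding two identical 100-byte files and one distinct 50-byte file gives 250 total bytes, 150 unique bytes, and a savings fraction of 0.4.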

2

Create a bucket

Object storage systems like UltiHash use a top-level container called a bucket. To facilitate scalability, buckets don’t have a traditional hierarchical folder structure: instead, each object in a bucket has a unique key (which can resemble a file path, simulating directories).
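As an illustration of how keys can mirror a directory layout, the sketch below derives an object key from a file's path relative to the dataset folder. The function name and the optional `prefix` parameter are illustrative, not part of UltiHash or boto3.

```python
from pathlib import Path


def object_key(root, file_path, prefix=""):
    """Derive an object key from a file's path relative to `root`.

    Keys always use forward slashes, whatever the local path separator is.
    """
    relative = Path(file_path).relative_to(root).as_posix()
    clean_prefix = prefix.strip("/")
    return f"{clean_prefix}/{relative}" if clean_prefix else relative
```

With a dataset in `/data`, the file `/data/img/cat.png` gets the key `img/cat.png`. The slash in the key is just a character: the bucket stores it as one flat name, and tools present it as a folder.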

To create a bucket, run:


You can see your newly created bucket by running:

3

Download scripts

We've prepared some scripts to make the testing process easier.

Download the following scripts for uploading and downloading:


Trouble downloading these scripts? Try right-clicking and selecting 'Save link as...' or similar.

4

Integrate sample data

Now that you have a bucket in which to put objects, let's use the upload script to integrate your sample data.

To integrate your dataset, run:

A bar should display the ongoing progress of your integration.
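For a sense of what an upload with a progress bar involves, here is a minimal sketch using boto3 and tqdm. It is not the upload script provided with this guide. The function names, and the idea of passing the endpoint URL as an argument, are assumptions made for illustration.

```python
from pathlib import Path


def upload_tree(client, bucket, root, on_bytes=None):
    """Upload every file under `root` to `bucket` and return the keys used.

    `client` is any object offering boto3's S3 method
    `upload_file(Filename, Bucket, Key, Callback=...)`. `on_bytes` is called
    with the number of bytes transferred, for example a tqdm bar's `update`.
    """
    root = Path(root)
    keys = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        key = path.relative_to(root).as_posix()
        client.upload_file(str(path), bucket, key, Callback=on_bytes)
        keys.append(key)
    return keys


def run(endpoint_url, bucket, directory):
    """Upload `directory` to `bucket` at `endpoint_url` with a progress bar."""
    import boto3           # third-party, installed earlier in this guide
    from tqdm import tqdm  # third-party, installed earlier in this guide

    total = sum(p.stat().st_size for p in Path(directory).rglob("*") if p.is_file())
    client = boto3.client("s3", endpoint_url=endpoint_url)
    with tqdm(total=total, unit="B", unit_scale=True) as bar:
        return upload_tree(client, bucket, directory, on_bytes=bar.update)
```

Calling `run("<endpoint URL>", "<bucket name>", "<dataset folder>")` uploads the folder, with boto3 picking up your credentials from the environment variables exported earlier.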


Once the integration is complete, you can run the following command to see your objects:


You can also download an entire bucket by running:

5

See space savings in your cluster

You can see the storage space UltiHash is saving across the entire cluster by running the uh-see-space-savings script:
