SuperAnnotate
SuperAnnotate is a platform for data management. With advanced annotation and QA tools, data curation, automation features, native integrations, and data governance, it enables enterprises to build high-quality datasets and ML pipelines.
The UltiHash cluster integrates with SuperAnnotate as a storage solution via a custom integration. Reading the corresponding article before proceeding with this guide is strongly encouraged.
An UltiHash cluster, deployed in a Kubernetes environment and exposed via a public HTTPS endpoint. To deploy an UltiHash cluster on a Kubernetes cluster, follow the deployment instructions. The public HTTPS endpoint for the UltiHash cluster can be provisioned at the Ingress object level.
kubectl, installed and configured to access the Kubernetes cluster where the UltiHash cluster is deployed.
AWS CLI, installed and configured with the credentials of the deployed UltiHash cluster so that it can access the cluster.
Git, installed.
In SuperAnnotate, open the Team Setup page and click Integrations.
Add a new custom integration.
Give the integration a name and a Request URL. The Request URL must point to the public HTTPS endpoint of the pre-signed URLs generator and is expected in the format https://public_domain_name/integrate. In this guide, the pre-signed URLs generator is exposed under the domain name generator.ultihash.example, so the Request URL is https://generator.ultihash.example/integrate.
Save the displayed Secret; it has to be passed on to the pre-signed URLs generator later.
Before clicking the Create button, deploy the pre-signed URLs generator alongside the UltiHash cluster. Otherwise, creating the integration fails with an error.
The generator of pre-signed URLs is the core component of the custom integration. It is a Kubernetes deployment that resides in the same Kubernetes cluster as the UltiHash cluster and generates S3 pre-signed URLs on demand for the requested files stored on the UltiHash cluster.
Clone the public GitHub repository with the scripts and switch to the directory named superannotate.
In that directory, find the file presigned-urls-generator.yaml. Open it in your favorite editor and replace the following placeholders:
<ultihash-cluster-namespace> - Kubernetes namespace where your UltiHash cluster resides.
<ultihash-endpoint-url> - HTTPS endpoint of your UltiHash cluster, e.g. https://cluster.ultihash.example.
<custom-integration-secret> - secret value from the integration page, e.g. oAKg5YlvxSqiNF3ni2MUq4uJmnvexRcx67utGa62Gf7zymxYx9Ua2d5q1ZNh6p67.
<urls-generator-domain-name> - domain name under which the URLs generator is exposed publicly, e.g. generator.ultihash.example.
All placeholders that have to be replaced are marked with the comment # REQUIRED for easier navigation. Finally, apply the edited Kubernetes definitions to the cluster with kubectl apply -f presigned-urls-generator.yaml.
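For repeatable deployments, the placeholder substitution can also be scripted instead of done by hand. The following is a minimal Python sketch and not part of the UltiHash scripts repository: the helper name fill_placeholders and the namespace value ultihash are hypothetical, the remaining values are the example values from this guide, and the two-line sample text is made up purely to demonstrate the substitution. It assumes the placeholders appear in the file literally, angle brackets included.

```python
# Placeholder names as listed in this guide.
PLACEHOLDERS = (
    "ultihash-cluster-namespace",
    "ultihash-endpoint-url",
    "custom-integration-secret",
    "urls-generator-domain-name",
)


def fill_placeholders(manifest_text: str, values: dict) -> str:
    """Replace every known <placeholder> in the manifest text with its value.

    Hypothetical helper: raises KeyError if a value is missing, so that a
    half-filled manifest is never produced silently.
    """
    missing = [name for name in PLACEHOLDERS if name not in values]
    if missing:
        raise KeyError("no value provided for: " + ", ".join(missing))
    for name in PLACEHOLDERS:
        manifest_text = manifest_text.replace("<" + name + ">", values[name])
    return manifest_text


# Example values from this guide; the namespace "ultihash" is an assumption.
example_values = {
    "ultihash-cluster-namespace": "ultihash",
    "ultihash-endpoint-url": "https://cluster.ultihash.example",
    "custom-integration-secret": "oAKg5YlvxSqiNF3ni2MUq4uJmnvexRcx67utGa62Gf7zymxYx9Ua2d5q1ZNh6p67",
    "urls-generator-domain-name": "generator.ultihash.example",
}

# Made-up fragment for demonstration only; the real
# presigned-urls-generator.yaml will look different.
sample = (
    "namespace: <ultihash-cluster-namespace>  # REQUIRED\n"
    "host: <urls-generator-domain-name>  # REQUIRED\n"
)

filled = fill_placeholders(sample, example_values)
print(filled)
```

To use it on the real file, read presigned-urls-generator.yaml, pass its contents through fill_placeholders, and write the result back before applying the definitions.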
Once the Kubernetes definitions have been applied successfully, click the Create button on the page of your custom integration. The integration should now be created without errors.
Create a bucket on the UltiHash cluster and upload the data that SuperAnnotate needs to access. Suppose we have created a bucket named test that contains two uploaded images: image1.jpg and image2.jpg.
It is recommended to configure CORS for the bucket containing the data (the bucket test in our case). To do so, create a text file named cors.json that holds the CORS rules for the bucket.
Apply the CORS configuration from cors.json to the bucket test with the AWS CLI, pointing it at the endpoint URL of the UltiHash cluster (https://cluster.ultihash.example in this guide).
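The two CORS steps can be sketched as follows. This is a minimal Python sketch under stated assumptions: it assumes cors.json follows the JSON layout read by the AWS CLI s3api put-bucket-cors command (a top-level CORSRules list), and the rule shown (read-only GET and HEAD access from any origin) is an illustrative choice, not a configuration confirmed by UltiHash or SuperAnnotate. Where possible, restrict AllowedOrigins to the SuperAnnotate origin you actually use.

```python
import json

# Assumed rule set: let browsers on any origin read objects.
# Tighten "AllowedOrigins" for production use.
cors_configuration = {
    "CORSRules": [
        {
            "AllowedOrigins": ["*"],
            "AllowedMethods": ["GET", "HEAD"],
            "AllowedHeaders": ["*"],
        }
    ]
}

# Write the configuration to cors.json in the current directory.
with open("cors.json", "w", encoding="utf-8") as f:
    json.dump(cors_configuration, f, indent=2)

# Show what was written.
print(json.dumps(cors_configuration, indent=2))
```

With cors.json in place, the AWS CLI call would be along the lines of aws s3api put-bucket-cors --bucket test --cors-configuration file://cors.json --endpoint-url https://cluster.ultihash.example.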
In SuperAnnotate, create a project that corresponds to the type of data being annotated. Since the dataset uploaded to the UltiHash cluster in the previous step contains images, create a project of type Image.
Inside the project, click Add -> Upload Images. Switch to the External storage tab and select the integration created earlier.
To import the files from the UltiHash cluster, create a CSV file that lists the files SuperAnnotate should read, and upload it into the file field on the External storage tab. In our case, the file lists the two images in the bucket test, giving a URL and a name for each.
Each line represents a single file stored on the UltiHash cluster. The URL is the S3 URL that defines the path to the file. For example, s3://test/image1.jpg means that a file named image1.jpg is inside the bucket test on the UltiHash cluster. The name is the filename that will be displayed in SuperAnnotate.
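For a bucket with more than a handful of files, the CSV can be generated rather than typed. The sketch below is a hedged illustration: the header names url and name are an assumption based on the description above and should be checked against the SuperAnnotate documentation, and the helper names build_csv and split_s3_url are hypothetical.

```python
import csv
import io


def build_csv(bucket: str, keys: list) -> str:
    """Return CSV text with one row per object: its S3 URL and display name."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url", "name"])  # assumed header names
    for key in keys:
        # The display name is the last path segment of the object key.
        writer.writerow(["s3://" + bucket + "/" + key, key.rsplit("/", 1)[-1]])
    return buffer.getvalue()


def split_s3_url(url: str) -> tuple:
    """Split s3://bucket/key into (bucket, key)."""
    prefix = "s3://"
    if not url.startswith(prefix):
        raise ValueError("not an S3 URL: " + url)
    bucket, _, key = url[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError("S3 URL needs both a bucket and a key: " + url)
    return bucket, key


# The two example images from the bucket "test" used in this guide.
csv_text = build_csv("test", ["image1.jpg", "image2.jpg"])
print(csv_text)

# s3://test/image1.jpg refers to the key image1.jpg in the bucket test.
print(split_s3_url("s3://test/image1.jpg"))
```

Writing csv_text to a file (for example files.csv, a name chosen here for illustration) produces the file to upload on the External storage tab.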
After the CSV file has been uploaded, SuperAnnotate shows the list of files to be imported from the UltiHash cluster.
Click Upload. Once the upload has finished, the files appear in the root of your project and are ready to be annotated.