UltiHash documentation
API use

You can find premade Python scripts for uploading and downloading data in the Upload + download scripts section.

UltiHash offers a powerful, S3-compatible API that allows developers to interact with storage clusters using familiar commands and libraries designed for Amazon S3. In addition to supporting standard S3 operations, UltiHash extends the functionality by providing unique features such as deduplication metrics.

S3 compatibility can be achieved in different ways depending on your environment. For example, Python developers typically use the boto3 library, which is part of the AWS SDK for Python and provides a straightforward interface for interacting with UltiHash as if it were S3. In contrast, data processing tools like PySpark don't use boto3; instead, they rely on connectors like s3a, which is part of the Hadoop ecosystem and optimized for distributed data processing. This distinction is important: boto3 is great for scripting and general-purpose workloads, while s3a is better suited for large-scale data operations in frameworks like Spark.

Because UltiHash is S3-compatible, you can use any S3-compliant SDK to interact with it. The AWS SDKs offer extensive support across programming languages including Python, Java, Node.js, and more. These SDKs are well documented and maintained, making them the ideal choice for most developers.

For developers looking to explore SDKs across various languages and environments, we highly recommend visiting AWS Developer Tools, which offers comprehensive support for integrating with S3-compatible APIs, including UltiHash.

For any further questions or advanced use cases, our support team is always ready to assist you.

Essential Operations

UltiHash supports a wide range of S3 API operations, enabling you to manage your data effectively. Here are the core operations you can perform:

  • CreateBucket: Create new buckets in your UltiHash storage.

aws s3api create-bucket --bucket your-bucket-name --endpoint-url https://your-ultihash-endpoint
  • PutObject: Upload files to your UltiHash buckets.

aws s3 cp local-file.txt s3://your-bucket-name/ --endpoint-url https://your-ultihash-endpoint
  • GetObject: Retrieve files from your UltiHash storage.

aws s3 cp s3://your-bucket-name/file.txt local-file.txt --endpoint-url https://your-ultihash-endpoint
  • ListObjectsV2: List the contents of your buckets.

aws s3api list-objects-v2 --bucket your-bucket-name --endpoint-url https://your-ultihash-endpoint
  • DeleteObject: Remove a single object from a bucket.

aws s3api delete-object --bucket your-bucket-name --key your-object-key --endpoint-url https://your-ultihash-endpoint

To remove all objects in a bucket at once:

aws s3 rm s3://your-bucket-name/ --recursive --endpoint-url https://your-ultihash-endpoint
  • DeleteBucket: Manage your storage by removing unnecessary buckets (buckets should be empty before removal).

aws s3api delete-bucket --bucket your-bucket-name --endpoint-url https://your-ultihash-endpoint

All supported API Functions

S3 Compatibility Layer

  • AbortMultipartUpload

  • CreateMultipartUpload

  • CompleteMultipartUpload

  • CopyObject

  • CreateBucket

  • DeleteBucket

  • DeleteBucketPolicy

  • DeleteObject

  • DeleteObjects

  • GetBucketPolicy

  • GetObject

  • HeadBucket

  • HeadObject

  • ListBuckets

  • ListMultipartUploads

  • ListObjects

  • ListObjectsV2

  • PutBucketPolicy

  • PutObject

  • UploadPart

IAM Compatibility Layer

  • CreateAccessKey

  • CreateUser

  • DeleteAccessKey

  • DeleteUser

  • DeleteUserPolicy

  • GetUserPolicy

  • ListUserPolicies

  • PutUserPolicy

For more details on available SDKs and language-specific guides, check out the AWS SDK hub.

Deduplication Metrics

get_effective_size.py: https://github.com/UltiHash/scripts/tree/main/boto3/ultihash_info

UltiHash extends beyond the standard S3 API with features like deduplication metrics, which allow you to query the effective size of your data after deduplication. This unique functionality is crucial for optimizing storage and understanding your actual storage usage.

Example:

# Retrieve deduplicated data size from UltiHash
get_effective_size.py --url https://your-ultihash-endpoint
