# Install Self-Hosted on AWS

For cloud deployments, UltiHash integrates seamlessly with AWS and Elastic Block Store (EBS). Unlike traditional object storage solutions that charge based on the number of requests, which leads to opaque and unpredictable costs, UltiHash eliminates these uncertainties. Instead, users can choose from different storage classes based on performance needs, which is especially useful for I/O-intensive and mission-critical workloads.

***

This guide describes the full installation process of UltiHash in an AWS environment, including:

* provisioning an EKS cluster in a dedicated VPC
* deploying the essential Kubernetes controllers
* installing UltiHash on the EKS cluster

This guide outlines the recommended UltiHash setup for managing 10 TB of data. The UltiHash cluster is deployed on a single EC2 instance of type `r8g.4xlarge`, with a Network Load Balancer routing traffic to it. The cluster uses `gp3` volumes optimized for performance. If you have different storage requirements, you can freely change the volume sizes in the configuration, and for production you may select any EC2 instance type and EBS volume configuration that fits your needs. The diagram below depicts the resources the Terraform scripts deploy in an AWS account.

By default the deployment is done in a single AZ (see the diagram below). However, this can be adjusted; see [#eks-cluster-setup](#eks-cluster-setup "mention").

<figure><img src="/files/ZiaccFhQHbO9ZKQgmjNq" alt=""><figcaption><p>Diagram of the deployed resources</p></figcaption></figure>

**Expected performance:**

* Write throughput: up to 200 MB/s
* Read throughput: up to 1000 MB/s

**Expected costs:**

* Hourly:
  * EC2 cost: 1.14 USD
  * EBS cost: 1.58 USD
  * UltiHash Pay-as-you-go license cost: 0.14 USD
* Monthly:
  * EC2 cost: 829.98 USD
  * EBS cost: 1152.38 USD
  * UltiHash 1 month subscription license cost: 92.16 USD

The UltiHash license is available either as a pay-as-you-go license, priced per used GiB per hour, or as a subscription license, priced per used GiB for the subscription duration, with 1-, 12-, 24-, and 36-month contract variants.
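The monthly EC2 and EBS figures above are roughly the hourly rates multiplied by the hours in a month. A quick sketch of that arithmetic (assuming ~730 hours per month; actual AWS invoices depend on month length and usage):

```python
# Rough conversion of the hourly estimates to monthly figures.
# Assumes ~730 hours per month; actual billing varies with month length.
HOURS_PER_MONTH = 730

hourly = {"EC2": 1.14, "EBS": 1.58}  # USD per hour, from the estimates above
monthly = {name: rate * HOURS_PER_MONTH for name, rate in hourly.items()}

for name, cost in monthly.items():
    print(f"{name}: ~{cost:.2f} USD/month")
```

These come out close to the monthly estimates listed above; the 1-month subscription license figure is discounted relative to the pay-as-you-go rate accumulated over a full month.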

**List of billable AWS services:**

* mandatory: EKS, EC2, S3, KMS
* optional: SQS, EventBridge

**Estimated amount of time to complete a deployment:** \~45 minutes.

<details>

<summary>System hardware requirements</summary>

* **Storage:** NVMe SSDs are required for optimal disk performance.
* **Network:** 10 Gbps interface minimum between nodes.
* **Kubernetes:** Version 1.20+ with Nginx Ingress and a CSI Controller installed.
* **Containerization:** Docker 19.03+ or Containerd 1.3+.
* **Helm:** Version 3.x.
* **Cloud:** for AWS, EC2 instances with Elastic Block Store (EBS). GCP/Azure support is in development.

Resource needs will vary depending on the amount of data being stored and managed. For best performance, especially with larger datasets, it’s essential to provision additional resources accordingly.

</details>

***

{% stepper %}
{% step %}

## Prerequisites

#### Skills

* good knowledge of the following AWS services: IAM, VPC, EKS, EC2
* high-level knowledge of Terraform

#### Remote Environment

* access to an AWS account
  * **Warning:** do not use the AWS account root user to provision and manage the deployed resources! Instead, create an IAM user with sufficient privileges to manage these AWS services: IAM, VPC, EKS, EC2.
  * IAM permissions required to deploy and manage UltiHash cluster are listed in [#iam-permissions-required-to-deploy-and-manage-an-uh-cluster](#iam-permissions-required-to-deploy-and-manage-an-uh-cluster "mention")
  * make sure your AWS account has sufficient limits before deploying UltiHash cluster: [#manage-aws-service-limits](#manage-aws-service-limits "mention")
  * the UltiHash cluster can be installed in any region where [AWS EKS is supported](https://docs.aws.amazon.com/general/latest/gr/eks.html)

#### Local Environment

* [installed](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) and [configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html#cli-configure-files-methods) AWS CLI
* [installed](https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli) terraform
* [installed](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.28.md#v1280) kubectl version 1.30
* personal credentials found on the [UltiHash dashboard](https://www.ultihash.io/user/dashboard)

{% endstep %}

{% step %}

## Setup S3 Bucket for Terraform States

The Terraform state for this setup is stored in S3, so a dedicated S3 bucket must be provisioned. Execute the following command, replacing the `<bucket-name>` and `<aws-region>` placeholders:

```
aws s3api create-bucket --bucket <bucket-name> --create-bucket-configuration LocationConstraint=<aws-region> --region <aws-region> 
```

The S3 bucket will be created with default encryption of type SSE-S3 (Amazon S3 managed keys) enabled.
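The Terraform projects used in the following steps reference this bucket through an S3 backend block in their `main.tf`. A sketch of what that configuration looks like (the exact `key` path in the repository may differ):

```hcl
terraform {
  backend "s3" {
    bucket = "<bucket-name>"                   # the bucket created above
    key    = "eks-cluster/terraform.tfstate"   # illustrative state path
    region = "<aws-region>"
  }
}
```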

{% endstep %}

{% step %}

## Clone the scripts repository

Clone the repository by executing the command below:

```
git clone https://github.com/UltiHash/scripts.git
```

Its contents will be used in the following steps to set up UltiHash in your AWS environment.

{% endstep %}

{% step %}

## EKS Cluster Setup

UltiHash has to be deployed on a Kubernetes cluster, so an EKS cluster must be provisioned on AWS. For this purpose, use [this Terraform project](https://github.com/UltiHash/scripts/tree/main/terraform/aws/eks-cluster). The project deploys a dedicated VPC and provisions in it an EKS cluster with a single `c5.large` machine to host the essential Kubernetes controllers.

**Note:** by default the EKS cluster is provisioned with a public endpoint that is reachable over the Internet. If the EKS cluster endpoint should be private, change the parameter [cluster\_endpoint\_public\_access](https://github.com/UltiHash/scripts/blob/f1790a97440bc42cb902e188275a63622f3375c0/terraform/aws/eks-cluster/eks.tf#L32C3-L32C40) from *true* to *false*.

Once the [scripts](#clone-the-scripts-repository) repository is cloned, perform the following actions to deploy the Terraform project:

1. Update the `bucket name` and its `region` in [main.tf](https://github.com/UltiHash/scripts/blob/main/terraform/aws/eks-cluster/main.tf) with the ones created at [the previous step](#setup-s3-bucket-for-terraform-states).
2. Update the configuration in [config.tfvars](https://github.com/UltiHash/scripts/blob/main/terraform/aws/eks-cluster/config.tfvars). The only required change is the parameter `cluster_admins`: specify the list of ARNs of IAM users and/or IAM roles that need access to the provisioned EKS cluster. Other parameters can be left unchanged.
3. Initialize and apply the Terraform project

   ```
   cd scripts/terraform/aws/eks-cluster
   terraform init
   terraform apply --var-file config.tfvars
   ```

   Wait until the installation is completed.
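For reference on step 2 above, `cluster_admins` takes a list of IAM ARNs; a sketch with a placeholder account ID and names:

```hcl
# config.tfvars (excerpt) - the ARNs below are placeholders
cluster_admins = [
  "arn:aws:iam::123456789012:user/deploy-admin",
  "arn:aws:iam::123456789012:role/platform-team",
]
```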

Make sure access to the EKS cluster has been granted to the required IAM users and roles. To check, download the `kubeconfig` for the EKS cluster by executing the command below. Replace `<cluster-name>` (by default `ultihash-test`) and `<aws-region>` (by default `eu-central-1`) with the corresponding values defined in [config.tfvars](https://github.com/UltiHash/scripts/blob/main/terraform/aws/eks-cluster/config.tfvars).

```
aws eks update-kubeconfig --name <cluster-name> --region <aws-region>
```

Execute the following `kubectl` command to check the available EKS cluster nodes:

```
kubectl get nodes
```

The command should output the name of the single provisioned EC2 instance.

{% endstep %}

{% step %}

## Install Controllers on EKS

The next step is to install the essential Kubernetes controllers on the provisioned EKS cluster. For this purpose, use [this Terraform project](https://github.com/UltiHash/scripts/tree/main/terraform/aws/eks-cluster-controllers). The project deploys the following Kubernetes controllers on the EKS cluster:

* `Nginx Ingress` - exposes UltiHash outside of the EKS cluster with a Network Load Balancer.
* `Load Balancer Controller` - provisions a Network Load Balancer for the `Nginx Ingress` controller.
* `Karpenter` - provisions EC2 instances on-demand to host UltiHash workloads.
* `EBS CSI Driver` - a CSI controller that automatically provisions persistent volumes for the UltiHash workloads. The volumes use the `gp3` storage class and are optimized for performance. The default storage class provisions unencrypted EBS volumes; to provision encrypted EBS volumes, create a new storage class like [this](https://gist.github.com/AndrzejKomarnicki/3926bae40060cb07a66a3f193cbbcd7e).
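If you need encrypted volumes, a storage class along the lines of the linked gist can be created using the EBS CSI driver's `encrypted` parameter. A sketch (the name and field values are illustrative, not the repository's exact manifest):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted            # illustrative name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"              # provisions encrypted EBS volumes
volumeBindingMode: WaitForFirstConsumer
```

Apply it with `kubectl apply -f` and reference its name in the UltiHash helm values if you use a custom storage class.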

Perform the following actions to deploy the Terraform project:

1. Update the `bucket name` and its `region` in [main.tf](https://github.com/UltiHash/scripts/blob/main/terraform/aws/eks-cluster-controllers/main.tf) with the ones created at [the previous step](#setup-s3-bucket-for-terraform-states).
2. Update the configuration in [config.tfvars](https://github.com/UltiHash/scripts/blob/main/terraform/aws/eks-cluster-controllers/config.tfvars) if required. The helm values for the deployed controllers are found [here](https://github.com/UltiHash/scripts/tree/main/terraform/aws/eks-cluster-controllers/controllers-values). It is not recommended to change any of these configurations; the only parameter that should be selected in advance is the `Network Load Balancer type` (`internal` or `internet-facing`) in this [file](https://github.com/UltiHash/scripts/blob/main/terraform/aws/eks-cluster-controllers/controllers-values/nginx-ingress.yaml).
3. If you need to change the instance type for the UltiHash services, update it in the following [Karpenter manifest](https://github.com/UltiHash/scripts/blob/main/terraform/aws/eks-cluster-controllers/karpenter-manifests/10TB.node-pool.yaml).
4. Initialize and apply the Terraform project

   ```
   cd scripts/terraform/aws/eks-cluster-controllers
   terraform init
   terraform apply --var-file config.tfvars
   ```

   Wait until the installation is completed. A Network Load Balancer should be provisioned in the same region as the EKS cluster.

{% endstep %}

{% step %}

## UltiHash installation

The last step is to install UltiHash. For this purpose, use [this Terraform project](https://github.com/UltiHash/scripts/tree/main/terraform/aws/ultihash). Perform the following actions to deploy the Terraform project:

1. Update the `bucket name` and its `region` in the [main.tf](https://github.com/UltiHash/scripts/blob/main/terraform/aws/ultihash/main.tf) with the ones done at [the previous step](#setup-s3-bucket-for-terraform-states).
2. Update the configuration in [config.tfvars](https://github.com/UltiHash/scripts/blob/main/terraform/aws/ultihash/config.tfvars) with the credentials obtained from your account on [ultihash.io](https://www.ultihash.io/). The credentials in the `config.tfvars` are mocked. The helm values for UltiHash are found [here](https://github.com/UltiHash/scripts/blob/main/terraform/aws/ultihash/ultihash-helm-values.yaml). Adjust the helm values to set your custom storage class if required.
3. Initialize and apply the Terraform project

   ```
   cd scripts/terraform/aws/ultihash
   terraform init
   terraform apply --var-file config.tfvars
   ```

   Wait until the installation is completed.

The UltiHash cluster is installed in the `default` Kubernetes namespace; use `kubectl` to see the deployed workloads:

```
kubectl get all
```

To get access to the deployed UltiHash cluster, configure your AWS CLI/SDK with the UltiHash root credentials:

```bash
# Obtain credentials for the UltiHash root user
aws_access_key_id=`kubectl get secret ultihash-super-user-credentials -o jsonpath="{.data.access-key-id}" | base64 --decode`
aws_secret_access_key=`kubectl get secret ultihash-super-user-credentials -o jsonpath="{.data.secret-key}" | base64 --decode`
      
# Set the credentials for the UltiHash root user
export AWS_ACCESS_KEY_ID=$aws_access_key_id
export AWS_SECRET_ACCESS_KEY=$aws_secret_access_key
```
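The `base64 --decode` in the commands above is needed because Kubernetes stores secret values base64-encoded. A minimal Python illustration of that round trip, using a made-up key value:

```python
import base64

# Kubernetes secrets store values base64-encoded, as returned by the
# kubectl jsonpath lookups above; decoding yields the raw credential.
stored = base64.b64encode(b"EXAMPLE-ACCESS-KEY").decode()  # as stored in .data
raw = base64.b64decode(stored).decode()                    # as used by the CLI

print(raw)  # EXAMPLE-ACCESS-KEY
```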

Finally, access the UltiHash cluster with the AWS CLI/SDK, using the domain name of the Network Load Balancer provisioned at [the previous step](#install-controllers-on-eks) (the endpoint URL below is an example; substitute your own load balancer's domain name):

```bash
aws s3api list-buckets --endpoint-url http://ultihash-test-6a925a272ca1f954.elb.eu-central-1.amazonaws.com/
```

{% endstep %}
{% endstepper %}

<details>

<summary>How to uninstall UltiHash on AWS</summary>

To uninstall all previously deployed AWS resources, follow the steps below:

{% hint style="info" %}
Make sure you are in a new Terminal window when uninstalling.
{% endhint %}

First, uninstall UltiHash by running the following commands:

```
cd scripts/terraform/aws/ultihash
terraform destroy --var-file config.tfvars
kubectl delete pvc --all
```

Next, uninstall the Kubernetes controllers:

```
cd scripts/terraform/aws/eks-cluster-controllers
terraform destroy --var-file config.tfvars
```

Finally, uninstall the EKS cluster:

```
cd scripts/terraform/aws/eks-cluster
terraform destroy --var-file config.tfvars
```

</details>

***

### More information

{% hint style="info" %}

### Manage AWS service limits

When deploying an UltiHash cluster on Amazon EKS, it is important to ensure that your AWS account has sufficient **EC2 vCPU-based instance limits** in the selected region. Amazon EKS worker nodes are backed by EC2 instances, and if vCPU quotas are too low, the cluster may fail to scale or provision nodes, causing deployment failures.

The relevant quota is: **Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances**\
Default limit: **5 vCPUs per region**

If the EKS cluster attempts to launch EC2 instances exceeding your vCPU quota, node provisioning will fail, and workloads may not start or scale properly. If you need more vCPUs in your region than the quota provides, we recommend increasing the quota proactively before scaling out your UltiHash cluster.

Check your current **Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances** quota on the [Service Quotas Console for EC2](https://eu-central-1.console.aws.amazon.com/servicequotas/home/services/ec2/quotas/L-1216C47A) and, if it is not enough, create a quota increase request by clicking the **Request increase at account level** button in the top right corner.
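To illustrate why the default quota is usually insufficient for this guide's setup: it runs one `c5.large` node (2 vCPUs) for the controllers and one `r8g.4xlarge` node (16 vCPUs) for UltiHash — vCPU counts taken from AWS's instance type documentation and worth verifying for your chosen types:

```python
# Rough vCPU budget for the recommended setup versus the default quota.
DEFAULT_QUOTA = 5  # default "Running On-Demand Standard" vCPU limit per region

vcpus = {"c5.large": 2, "r8g.4xlarge": 16}  # controllers node + UltiHash node
needed = sum(vcpus.values())

print(f"vCPUs needed: {needed}, default quota: {DEFAULT_QUOTA}")
assert needed > DEFAULT_QUOTA  # a quota increase request is required first
```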
{% endhint %}

{% hint style="info" %}

## Enforce Least Privilege Access

Whenever interacting with AWS cloud, we strongly encourage you to **follow the principle of least privilege**. This means permissions should be limited to the **minimum actions and resources** required for each role or service to function.

**Why this matters:**

* Reduces the **attack surface** and limits the impact of compromised credentials or components.
* Prevents **unintentional changes** or access to unauthorized resources.
* Aligns with AWS **security best practices** and the **Well-Architected Framework**.
* Enables better auditing, control, and compliance with security standards.

More information on this topic can be found at [this AWS link.](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html#grant-least-privilege)
{% endhint %}

## IAM permissions required to deploy and manage an UH cluster

The IAM user or role used to provision and manage a UH cluster in an AWS account should have the following IAM permissions. The permissions below apply to all resources; after a successful deployment they can be scoped to specific resource ARNs for improved security.

<details>

<summary><strong>S3 permissions (required to manage Terraform states in S3):</strong></summary>

<pre class="language-json"><code class="lang-json"><strong>{
</strong>    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:CreateBucket",
                "s3:ListBucket"
            ],
            "Resource": "*"
        }
    ]
}
</code></pre>

</details>

<details>

<summary><strong>EventBridge permissions (required by Karpenter to manage EC2 interruption events):</strong></summary>

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "events:TagResource",
                "events:DeleteRule",
                "events:PutTargets",
                "events:DescribeRule",
                "events:PutRule",
                "events:ListTagsForResource",
                "events:RemoveTargets",
                "events:ListTargetsByRule"
            ],
            "Resource": "*"
        }
    ]
}
```

</details>

<details>

<summary><strong>SQS permissions (required by Karpenter to manage EC2 interruption events):</strong></summary>

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sqs:DeleteQueue",
                "sqs:GetQueueAttributes",
                "sqs:ListQueueTags",
                "sqs:CreateQueue",
                "sqs:SetQueueAttributes"
            ],
            "Resource": "*"
        }
    ]
}
```

</details>

<details>

<summary><strong>KMS permissions (required by EKS cluster to manage Kubernetes secrets):</strong></summary>

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kms:TagResource",
                "kms:ListAliases",
                "kms:CreateAlias",
                "kms:CreateKey",
                "kms:DeleteAlias"
            ],
            "Resource": "*"
        }
    ]
}
```

</details>

<details>

<summary><strong>EKS permissions (required to manage EKS cluster):</strong></summary>

<pre class="language-json"><code class="lang-json"><strong>{
</strong>    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "eks:DeleteAccessEntry",
                "eks:ListNodegroups",
                "eks:DescribeAddonConfiguration",
                "eks:UpdateAddon",
                "eks:ListAddons",
                "eks:AssociateAccessPolicy",
                "eks:ListAccessEntries",
                "eks:CreateNodegroup",
                "eks:DescribeAccessEntry",
                "eks:DescribeAddon",
                "eks:DeleteCluster",
                "eks:ListAssociatedAccessPolicies",
                "eks:DescribeNodegroup",
                "eks:DeleteAddon",
                "eks:DeleteNodegroup",
                "eks:DisassociateAccessPolicy",
                "eks:TagResource",
                "eks:CreateAddon",
                "eks:CreateAccessEntry",
                "eks:UpdateNodegroupConfig",
                "eks:DescribeCluster",
                "eks:ListAccessPolicies",
                "eks:DescribeAddonVersions",
                "eks:CreateCluster"
            ],
            "Resource": "*"
        }
    ]
}
</code></pre>

</details>

<details>

<summary><strong>IAM permissions (required by EKS cluster and EC2 instances):</strong></summary>

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "iam:GetRole",
                "iam:GetPolicyVersion",
                "iam:GetPolicy",
                "iam:DeletePolicy",
                "iam:CreateRole",
                "iam:DeleteRole",
                "iam:AttachRolePolicy",
                "iam:CreateOpenIDConnectProvider",
                "iam:CreatePolicy",
                "iam:ListInstanceProfilesForRole",
                "iam:PassRole",
                "iam:DetachRolePolicy",
                "iam:ListPolicyVersions",
                "iam:ListAttachedRolePolicies",
                "iam:ListRolePolicies",
                "iam:GetOpenIDConnectProvider",
                "iam:DeleteOpenIDConnectProvider",
                "iam:TagOpenIDConnectProvider"
            ],
            "Resource": "*"
        }
    ]
}
```

</details>

<details>

<summary><strong>EC2 permissions (required to manage EC2 instances):</strong></summary>

<pre class="language-json"><code class="lang-json"><strong>{
</strong>    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:DeleteSubnet",
                "ec2:AttachInternetGateway",
                "ec2:DeleteRouteTable",
                "ec2:AssociateRouteTable",
                "ec2:DescribeInternetGateways",
                "ec2:CreateRoute",
                "ec2:CreateInternetGateway",
                "ec2:RevokeSecurityGroupEgress",
                "ec2:DeleteInternetGateway",
                "ec2:DescribeNetworkAcls",
                "ec2:DescribeRouteTables",
                "ec2:DescribeLaunchTemplates",
                "ec2:CreateTags",
                "ec2:CreateRouteTable",
                "ec2:RunInstances",
                "ec2:DetachInternetGateway",
                "ssm:GetParameters",
                "ec2:DisassociateRouteTable",
                "ec2:RevokeSecurityGroupIngress",
                "ec2:DescribeSecurityGroupRules",
                "ec2:DeleteNatGateway",
                "ec2:DeleteVpc",
                "ec2:CreateSubnet",
                "ec2:DescribeSubnets",
                "ec2:DeleteNetworkAclEntry",
                "ec2:DisassociateAddress",
                "ec2:DescribeAddresses",
                "ec2:CreateNatGateway",
                "ec2:CreateVpc",
                "ec2:DescribeAddressesAttribute",
                "ec2:DescribeVpcAttribute",
                "ec2:DescribeNetworkInterfaces",
                "ec2:CreateSecurityGroup",
                "ec2:ModifyVpcAttribute",
                "ec2:DeleteLaunchTemplateVersions",
                "ec2:ReleaseAddress",
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:DeleteLaunchTemplate",
                "ec2:DeleteRoute",
                "ec2:DescribeLaunchTemplateVersions",
                "ec2:DescribeNatGateways",
                "ec2:AllocateAddress",
                "ec2:DescribeSecurityGroups",
                "ec2:CreateLaunchTemplateVersion",
                "ec2:CreateLaunchTemplate",
                "ec2:DescribeVpcs",
                "ec2:DeleteSecurityGroup",
                "ec2:CreateNetworkAclEntry"
            ],
            "Resource": "*"
        }
    ]
}
</code></pre>

</details>

***

### Frequent issues troubleshooting

<details>

<summary>Helm chart install or upgrade failure</summary>

**Symptoms:**

* `helm install` or `helm upgrade` hangs or returns an error
* Application pods do not start
* Helm status is stuck at `pending-install` or `failed`

**Steps to resolve:**

* **Inspect the Helm release status:**

  ```bash
  helm status <release_name> -n <namespace>
  ```
* **Check for resource creation errors or pending pods:**

  ```bash
  kubectl get pods -n <namespace>
  ```
* **Describe a failing pod to view events and errors:**

  ```bash
  kubectl describe pod <pod_name> -n <namespace>
  ```
* **Debug with Helm’s dry run mode:**

  ```bash
  helm upgrade <release_name> oci://registry.ultihash.io/stable/ultihash-cluster \
    -n <namespace> --dry-run --values values.yaml --debug
  ```
* After the issue has been found and fixed, proceed with the install or upgrade.

**Recommendation:** Always use `--dry-run` and `--debug` to validate changes before applying them in production.

</details>

<details>

<summary>Missing or incorrect values in values.yaml</summary>

**Symptoms:**

* Helm fails with a rendering error
* Application fails at runtime due to missing config (e.g., secrets, ports, env vars)

**Steps to resolve:**

* **Compare your values file with the chart defaults:**

  ```
  helm show values oci://registry.ultihash.io/stable/ultihash-cluster
  ```
* **Test the rendered templates locally:**

  ```
  helm template <your_release_name> oci://registry.ultihash.io/stable/ultihash-cluster --values <your_values.yaml>
  ```
* **Reapply the corrected configuration:**

  ```
  helm upgrade <release_name> oci://registry.ultihash.io/stable/ultihash-cluster \
    -n <namespace> --values <your_values.yaml>
  ```

**Recommendation:** Use a version-controlled values file and validate changes in a staging environment before rolling out to production.

</details>

<details>

<summary>Application pods stuck in CrashLoopBackOff or ImagePullBackOff</summary>

**Purpose:** Diagnose runtime pod failures due to misconfiguration or image issues.

**Symptoms:**

* Pods keep restarting or cannot pull the container image

**Steps to resolve:**

* **Inspect the pod state:**

  ```
  kubectl get pods -n <namespace>
  ```
* **Check the logs of the failing pod:**

  ```
  kubectl logs <pod_name> -n <namespace>
  ```
* **Correct the config causing failure, then upgrade:**

  ```
  helm upgrade <release_name> oci://registry.ultihash.io/stable/ultihash-cluster \
    -n <namespace> --values <your_values.yaml>
  ```

**Recommendation:** Ensure that image repositories are accessible and secrets for private registries are correctly configured in the cluster.

</details>

