Deploy MOSTLY AI to an Azure AKS cluster
You can deploy MOSTLY AI to an Azure AKS cluster. This page lists the prerequisites and walks through the pre-deployment, deployment, and post-deployment tasks for a successful installation.
Prerequisites
- Create an Azure account.
- Create an Azure subscription.
- Install Azure CLI.
- Install kubectl.
- Install helm.
- Prepare a fully-qualified domain name (FQDN). A TLS certificate is optional.
- Obtain deployment details from your Customer Experience Engineer:
  - MOSTLY AI Helm chart. Required for Task 5.
  - (Optional) MOSTLY AI image repository pull secret. Required only if you intend to pull the container images from the MOSTLY AI image repository. Optional for Task 5.
  - First-time login credentials for the MOSTLY AI application. Required for Task 8.
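Before you start, you can confirm that the command-line prerequisites are in place. The following is a minimal sketch: `missing_tools` is a hypothetical helper name used for illustration and is not part of any Azure or MOSTLY AI tooling.

```shell
#!/bin/sh
# Print the name of every listed tool that is not found on PATH.
# missing_tools is a hypothetical helper for illustration only.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# az, kubectl, and helm are the CLIs listed in the prerequisites.
missing=$(missing_tools az kubectl helm)
if [ -n "$missing" ]; then
  # Word splitting is intended so that the names print on one line.
  echo "Install before continuing:" $missing
else
  echo "All required CLIs are installed."
fi
```

The check only looks for the executables; it does not verify versions or whether you are logged in.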
Pre-deployment
Before you deploy MOSTLY AI to an Azure AKS cluster, you need to complete several pre-deployment tasks.
- Task 1. Create an AKS cluster
- Task 2. Log in to the Azure CLI
- Task 3. Connect to the Azure AKS cluster
- Task 4. Install NGINX
Task 1. Create an AKS cluster
Use the Create Kubernetes cluster wizard in Azure to create your AKS cluster.
Steps
- Log in to Microsoft Azure and select Create a resource.
- In the services search bar, type aks and press Enter.
- From the results, select Azure Kubernetes Service (AKS).
- Click Create. Step result: The Create Kubernetes cluster wizard starts.
- Configure the Basics page of the wizard.
  💡 You can change the settings below to suit your needs. The steps provide a minimal configuration to help you set up a new cluster quickly.
  - For Subscription, select the Azure subscription you want to use for the cluster. For more information, see the Prerequisites.
  - For Resource group, define a new resource group for your MOSTLY AI cluster. For more information, see Manage resource groups in the Azure documentation.
  - For Cluster preset configuration, select Production Standard.
  - For Kubernetes cluster name, define the cluster name. For example, MOSTLYAI-AKS.
  - (Optional) For Region, select the region to use for the cluster.
  - Retain the remaining default options and click Next.
- Configure the Node pools page of the wizard.
  - For the agentpool node pool, click the Node size entry.
  - On the Update node pool page, for Node size, click Choose a size.
  - On the Select a VM size page, select D8s_v3 and click Select.
  - Back on the Update node pool page, for Scale method, select Manual and keep Node count at 1.
  - For the userpool node pool, click the Node size entry.
  - On the Update node pool page, for Node size, click Choose a size.
  - On the Select a VM size page, select Standard_D16ls_v5 and click Select.
  - Back on the Update node pool page, for Scale method, select Manual and keep Node count at 2.
  - Click Next.
- Configure the Networking page of the wizard.
  - For Network policy, select None.
- (Optional) Review the Integrations, Advanced, and Tags pages.
- Click Review + Create.
- When the final configuration validation passes, click Create.
Result
The cluster deployment starts. Wait several minutes until Azure creates your AKS cluster.
When the cluster is created, Azure reports that the deployment is complete.
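If you prefer the command line over the wizard, the Task 1 settings can be approximated with the Azure CLI. This is a sketch under stated assumptions, not an exact equivalent of the wizard: the resource group name, the region (`westeurope`), and the full VM size name `Standard_D8s_v3` are assumptions, and the Production Standard preset has no direct counterpart here. The script requires that you are already logged in to the Azure CLI (see Task 2) and only prints the commands unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch: approximate the Task 1 wizard settings with the Azure CLI.
# By default the commands are only printed; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
RESOURCE_GROUP="MOSTLYAI-AKS"   # assumption: same name as the cluster
CLUSTER_NAME="MOSTLYAI-AKS"     # example cluster name from the steps above
LOCATION="westeurope"           # assumption: choose your own region

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

create_cluster() {
  run az group create --name "$RESOURCE_GROUP" --location "$LOCATION"
  # System node pool with one node (assumed full size name for D8s_v3).
  run az aks create --resource-group "$RESOURCE_GROUP" --name "$CLUSTER_NAME" \
    --nodepool-name agentpool --node-count 1 --node-vm-size Standard_D8s_v3 \
    --generate-ssh-keys
  # User node pool for the worker nodes with two nodes.
  run az aks nodepool add --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" --name userpool --mode User \
    --node-count 2 --node-vm-size Standard_D16ls_v5
}

create_cluster
```

No network policy flag is passed, which is intended to mirror the None selection in the wizard. Review the printed commands against your own requirements before you run them for real.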
Task 2. Log in to the Azure CLI
As listed in the Prerequisites, install the Azure CLI and use it to log in to your account.
Steps
- In a web browser, log in to your Azure account.
- Log in to the Azure CLI.
az login
Result
The command opens a browser window where it authenticates the Azure CLI.
In the command line, you should see output similar to the following:
Real values are obfuscated with asterisks (*).
A web browser has been opened at https://login.microsoftonline.com/organizations/oauth2/v2.0/authorize. Please continue the login in the web browser. If no web browser is available or if the web browser fails to open, use device code flow with `az login --use-device-code`.
[
  {
    "cloudName": "AzureCloud",
    "homeTenantId": "********",
    "id": "********",
    "isDefault": true,
    "managedByTenants": [],
    "name": "Pay-As-You-Go",
    "state": "Enabled",
    "tenantId": "********",
    "user": {
      "name": "******@mostly.ai",
      "type": "user"
    }
  }
]
Task 3. Connect to the Azure AKS cluster
You can find the exact Azure CLI commands to connect to your AKS cluster on the cluster web page.
Steps
- From the AKS cluster page, click Connect to cluster.
- Copy and run the commands to connect to your cluster.
  - Use the Azure CLI to set the account subscription.
    💡 Real values are obfuscated with asterisks (*).
    az account set --subscription ********-4a22-414c-a2a6-************
  - Set the credentials to register your AKS cluster as the current CLI context.
    az aks get-credentials --resource-group MOSTLYAI-AKS --name MOSTLYAI-AKS
    You should see output similar to the following:
    Merged "MOSTLYAI-AKS" as current context in /Users/mostlyzach/.kube/config
- (Optional) Check the cluster name of your current CLI context.
  kubectl config view --minify -o jsonpath='{.clusters[].name}'
  Depending on the cluster name you defined (in our case MOSTLYAI-AKS), you should see output similar to the following:
  MOSTLYAI-AKS%
Result
Your CLI is now set up to use the new Azure AKS cluster as the default K8S context. You can now run kubectl and helm commands against this context.
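To double-check the connection before you continue, you can compare the active context with the name you expect and list the cluster nodes. The sketch below assumes the example cluster name MOSTLYAI-AKS; `verify_cluster_access` is a hypothetical helper name.

```shell
#!/bin/sh
# Fail if the active kubectl context is not the expected one,
# otherwise list the nodes of the cluster.
# verify_cluster_access is a hypothetical helper for illustration only.
verify_cluster_access() {
  expected="$1"
  current=$(kubectl config current-context)
  if [ "$current" != "$expected" ]; then
    echo "Current context is '$current', expected '$expected'" >&2
    return 1
  fi
  kubectl get nodes -o wide
}
```

For example, `verify_cluster_access MOSTLYAI-AKS` should list the nodes of the agentpool and userpool node pools once the cluster is reachable.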
Task 4. Install NGINX
If you do not have an ingress controller running for your cluster, you can install NGINX to allow external traffic via HTTP or HTTPS to your cluster.
For more information, see Basic configuration for an unmanaged ingress controller in the Azure Kubernetes Services documentation.
Steps
- Add the helm chart repository for the Ingress NGINX Controller.
  NAMESPACE=ingress-basic
  helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
  helm repo update
  You should see the following result:
  "ingress-nginx" has been added to your repositories
- Install the Ingress NGINX Controller.
  helm install ingress-nginx ingress-nginx/ingress-nginx \
    --create-namespace \
    --namespace $NAMESPACE \
    --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz
  You should see a result similar to the following. The release name and namespace in this sample output may differ from the ones you used.
  NAME: nginx-ingress
  LAST DEPLOYED: Thu Nov 9 10:04:24 2023
  NAMESPACE: default
  STATUS: deployed
  REVISION: 1
  TEST SUITE: None
  NOTES:
  The ingress-nginx controller has been installed.
  It may take a few minutes for the LoadBalancer IP to be available.
  You can watch the status by running 'kubectl --namespace default get services -o wide -w nginx-ingress-ingress-nginx-controller'

  An example Ingress that makes use of the controller:
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: example
      namespace: foo
    spec:
      ingressClassName: nginx
      rules:
        - host: www.example.com
          http:
            paths:
              - pathType: Prefix
                backend:
                  service:
                    name: exampleService
                    port:
                      number: 80
                path: /
      # This section is only required if TLS is to be enabled for the Ingress
      tls:
        - hosts:
          - www.example.com
          secretName: example-tls

  If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:

    apiVersion: v1
    kind: Secret
    metadata:
      name: example-tls
      namespace: foo
    data:
      tls.crt: <base64 encoded cert>
      tls.key: <base64 encoded key>
    type: kubernetes.io/tls
- Get the load balancer IP address to assign to your cluster.
  kubectl --namespace ingress-basic get services -o wide -w ingress-nginx-controller
  The output should be similar to the following:
  💡 Real values are obfuscated with asterisks (*).
  NAME                                     TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                      AGE    SELECTOR
  nginx-ingress-ingress-nginx-controller   LoadBalancer   10.0.***.***   20.72.***.***   80:30454/TCP,443:32319/TCP   8m8s   app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-ingress,app.kubernetes.io/name=ingress-nginx
- Copy the value in the EXTERNAL-IP column and assign it to your FQDN through your DNS provider.
Result
Your cluster can now accept incoming connections. Also, if configured, the cluster external IP address is accessible through your FQDN for the MOSTLY AI app.
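After you update the DNS record, you can check whether your FQDN already resolves to the load balancer address. The following sketch assumes the `ingress-basic` namespace and the `ingress-nginx-controller` service name used in the commands above, and that `dig` is installed; `ip_in_list` and `check_dns` are hypothetical helper names.

```shell
#!/bin/sh
# Succeed if the first argument equals any of the remaining arguments.
ip_in_list() {
  expected="$1"
  shift
  for ip in "$@"; do
    if [ "$ip" = "$expected" ]; then
      return 0
    fi
  done
  return 1
}

# Compare the ingress controller's external IP with the addresses
# that the FQDN resolves to. Requires kubectl (connected to the
# cluster) and dig.
check_dns() {
  fqdn="$1"
  lb_ip=$(kubectl --namespace ingress-basic get service ingress-nginx-controller \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  # dig +short prints one record per line; word splitting is intended.
  if ip_in_list "$lb_ip" $(dig +short "$fqdn"); then
    echo "$fqdn resolves to $lb_ip"
  else
    echo "$fqdn does not resolve to $lb_ip yet" >&2
    return 1
  fi
}
```

Call it with your own domain, for example `check_dns yourfqdn.com`. DNS changes can take a while to propagate, so a failed check shortly after the update is not necessarily an error.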
Deployment
After you complete the pre-deployment tasks, you can configure your deployment in the values.yaml file and use the helm command to start the deployment process.
- Task 5. Configure MOSTLY AI Helm chart
- Task 6. Add taints to the worker nodes pool
- Task 7. Deploy MOSTLY AI
Task 5. Configure MOSTLY AI Helm chart
The MOSTLY AI Helm chart defines default configurations for your MOSTLY AI Kubernetes deployment. Before you can deploy, you need to configure some of the default values to match your cluster configuration.
As mentioned in the Prerequisites, you need to obtain the MOSTLY AI Helm chart from your Customer Experience Engineer.
Steps
- Open the values.yaml file in a text editor.
- At the start of the file, set the application domain name to your FQDN. Set the domain name for minio in the same way, as listed below.
  💡 minio is the shared storage service.
  values.yaml
  _customerInstallation:
    domainNames:
      mostly-ai: &fqdn yourfqdn.com
      minio: &fqdnMinio minio-yourfqdn.com
- (Optional) Apply one of the configurations below depending on whether you intend to use TLS-encrypted access to the MOSTLY AI application.
  ➡️ You use a TLS certificate. Replace your-tls-secret with the TLS secret name as defined in your cluster configuration.
  💡 Your IT department or Kubernetes administrator creates the FQDN and its TLS certificate and adds it to the configuration of your cluster. When added, it comes with a TLS secret name that you can define in the values.yaml file. For details, see Configure your domain TLS certificate.
  values.yaml
  _customerInstallation:
    ...
    deploymentSettings:
      tlsSecretName: &tlsSecretName your-tls-secret
      ...
  ➡️ You do not use a TLS certificate. Replace your-tls-secret with an empty string and, for global.tls, set enabled to false.
  values.yaml
  _customerInstallation:
    ...
    deploymentSettings:
      tlsSecretName: &tlsSecretName [] # your-tls-secret
      ...
  global:
    ...
    tls:
      enabled: false
    ...
- (Optional) If you host third-party container images in an internal repository, replace docker.io in registryFor3rdPartyComponents.
  values.yaml
  _customerInstallation:
    ...
    deploymentSettings:
      ...
      registryFor3rdPartyComponents: &registryFor3rdPartyComponents docker.io
      ...
- (Optional) If you need to host MOSTLY AI container images in an internal repository, replace harbor.env.mostlylab.com/mostlyai in mostlyRegistry.
  values.yaml
  _customerInstallation:
    ...
    deploymentSettings:
      ...
      mostlyRegistry: &mostlyRegistry harbor.env.mostlylab.com/mostlyai
      ...
- (Optional) If you intend to use the MOSTLY AI image repository at harbor.env.mostlylab.com/mostlyai, set its secret in mostlyRegistryDockerConfigJson.
  💡 To obtain the secret, contact your MOSTLY AI Customer Experience Engineer.
  values.yaml
  _customerInstallation:
    ...
    deploymentSettings:
      ...
      mostlyRegistryDockerConfigJson: &mostlyRegistryDockerConfigJson <HARBOR_SECRET>
      ...
- Define the Default compute under mostly-app.
  Use CPU and memory resources that are slightly below the ones defined for your worker nodes. For example, if your worker node has 14 CPU cores and 24GB memory, define 10 CPU cores and 20GB memory in your Helm chart.
  values.yaml
  mostly-app:
    ...
    mostly:
      defaultComputePool:
        name: Default
        type: KUBERNETES
        toleration: engine-jobs
        resources:
          cpu: 10
          memory: 20
          gpu: 0
    ...
- Enable the ingress annotations for NGINX.
  - Assign the ingressClassName to be nginx under global.ingress.ingressClassName.
    values.yaml
    global:
      ...
      ingress:
        annotations:
          route.openshift.io/termination: edge
        ingressClassName: nginx
      ...
  - Enable the ingress annotations under mostly-app.
    values.yaml
    mostly-app:
      ingress:
        annotations:
          nginx.ingress.kubernetes.io/proxy-body-size: 10240m
          nginx.ingress.kubernetes.io/proxy-buffer-size: 128k
          nginx.org/proxy-connect-timeout: 3000s
          nginx.org/proxy-read-timeout: 3000s
          nginx.org/client-max-body-size: 3000m
        # annotations: {}
  - Enable the ingress annotations under mostly-keycloak.
    values.yaml
    mostly-keycloak:
      deployment:
        resources: {}
        tolerations: []
        affinity: {}
      ingress:
        annotations:
          nginx.ingress.kubernetes.io/proxy-body-size: 10240m
          nginx.ingress.kubernetes.io/proxy-buffer-size: 128k
        # annotations: {}
Result
Your values.yaml file is now configured with the required settings for your MOSTLY AI deployment.
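The sizing rule for the Default compute in Task 5 (stay slightly below the resources of a worker node) can be sketched as simple arithmetic. The headroom of 4 CPU cores and 4 GB is an assumption taken from the example in the step, not an official recommendation, and both function names are hypothetical.

```shell
#!/bin/sh
# suggest_pool_size NODE_CPU NODE_MEM_GB CPU_HEADROOM MEM_HEADROOM_GB
# Prints cpu and memory values to enter under defaultComputePool.resources.
suggest_pool_size() {
  echo "cpu: $(( $1 - $3 ))"
  echo "memory: $(( $2 - $4 ))"
}

# List what each node can actually allocate (requires cluster access).
show_allocatable() {
  kubectl get nodes \
    -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory
}

# Example from the step: a 14-core, 24 GB worker node with headroom of 4 each.
suggest_pool_size 14 24 4 4
```

Run `show_allocatable` after connecting to the cluster to see the real per-node values. Kubernetes typically reports allocatable memory in Ki, so convert it to GB before you compare.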
Task 6. Add taints to the worker nodes pool
The MOSTLY AI worker pods can only run on nodes with the engine-jobs taint. You need to add this taint to the worker nodes pool.
Steps
- For your Azure AKS cluster, select Settings > Node pools.
- Select the userpool node pool.
- For Taints, click edit.
- Add the engine-jobs taint with the NoSchedule effect.
  - For Key, enter scheduling.mostly.ai/node.
  - For Value, enter engine-jobs.
  - For Effect, select NoSchedule.
  - Click Save.
Result
The worker nodes for the generator training and synthetic dataset generation jobs are now tainted with the engine-jobs taint.
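You can confirm from the command line that the taint reached the nodes. The sketch below builds the taint in the usual key=value:effect notation from the values entered in Task 6 and lists the taints of every node; `format_taint` and `show_node_taints` are hypothetical helper names. Recent Azure CLI versions may also accept the same string through `az aks nodepool update --node-taints`, but treat that as an assumption to verify against your CLI version.

```shell
#!/bin/sh
# Build a taint string in the key=value:effect form.
format_taint() {
  printf '%s=%s:%s\n' "$1" "$2" "$3"
}

# List every node with its taints (requires cluster access).
# Nodes in the userpool node pool should show the engine-jobs taint.
show_node_taints() {
  kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
}

# The taint from Task 6.
format_taint scheduling.mostly.ai/node engine-jobs NoSchedule
```

After saving the taint in Azure, run `show_node_taints` and check that only the worker nodes carry it.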
Task 7. Deploy MOSTLY AI
With all required configurations made in the values.yaml file and the worker nodes tainted correctly, you can now create a separate namespace and deploy MOSTLY AI to it.
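Before you deploy, you can catch YAML and templating mistakes by linting the chart and rendering it locally with your values. This is an optional sketch that installs nothing; it assumes the chart directory `./mostly-combined` and the release name `mostly-ai` used in the deployment command of this task, and `validate_chart` is a hypothetical helper name.

```shell
#!/bin/sh
# Lint the chart and render it locally with your values.
# Nothing is installed in the cluster.
validate_chart() {
  chart="${1:-./mostly-combined}"
  values="${2:-values.yaml}"
  helm lint "$chart" --values "$values" || return 1
  # Discard the rendered manifests; only the exit status matters here.
  helm template mostly-ai "$chart" --values "$values" --namespace mostly-ai >/dev/null
}
```

Run `validate_chart` from the directory that contains the chart and your values.yaml file. A non-zero exit status points to a problem in the chart or the values.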
Steps
- Create the mostly-ai namespace.
  kubectl create ns mostly-ai
- Deploy the MOSTLY AI Helm chart.
  helm upgrade --install mostly-ai ./mostly-combined --values values.yaml --namespace mostly-ai
  The result from the command should be similar to the following. If you see errors, see the Troubleshooting section.
  Release "mostly-ai" does not exist. Installing it now.
  NAME: mostly-ai
  LAST DEPLOYED: Fri Nov 10 18:45:58 2023
  NAMESPACE: mostly-ai
  STATUS: deployed
  REVISION: 1
  TEST SUITE: None
- In Azure, go to your AKS cluster and select Workloads.
  Initially, all MOSTLY AI pods are still starting and connecting to storage and to the other services. Wait a few minutes until AKS provisions the pods and all pods connect successfully to the required services.
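As an alternative to watching the Workloads page, you can wait for the pods from the command line. The sketch below assumes the `mostly-ai` namespace from this task and a 10-minute timeout chosen for illustration; `wait_for_pods` is a hypothetical helper name.

```shell
#!/bin/sh
# Block until every pod in the namespace reports Ready,
# or until the timeout expires.
wait_for_pods() {
  namespace="${1:-mostly-ai}"
  timeout="${2:-600s}"
  if ! kubectl wait --namespace "$namespace" --for=condition=Ready pods --all \
      --timeout="$timeout"; then
    # Show the pod list so you can see which pods are not ready.
    kubectl get pods --namespace "$namespace"
    return 1
  fi
}
```

Call `wait_for_pods` after the helm command returns. Pods that run to completion never report Ready, so if the namespace contains such pods, the wait can time out even though the deployment is healthy.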
Post-deployment
With the MOSTLY AI pods running, you can now log in to your MOSTLY AI deployment for the first time.
Task 8. Log in to your MOSTLY AI deployment
Log in for the first time to your MOSTLY AI deployment to set a new password for the superadmin user.
Prerequisites
Contact MOSTLY AI to obtain the superadmin credentials. You need them to log in for the first time.
Steps
- Open your FQDN in your browser.
  Step result: The Sign in page for your MOSTLY AI deployment opens.
- Enter the superadmin credentials and click Sign in.
- Provide a new password and click Change password.
Result
Your superadmin password is now changed and you can use it to log in again to your MOSTLY AI deployment.
Uninstall and cleanup
To uninstall, you can delete the Kubernetes namespace that holds all MOSTLY AI pods (by default, we suggest that you name this namespace mostly-ai). To clean up, you can delete your AKS cluster afterwards.
Delete MOSTLY AI namespace
kubectl delete namespace mostly-ai
Delete AKS cluster
Use the Azure CLI to delete your AKS cluster.
Steps
- Obtain your cluster name and resource group name in Azure.
- Delete your cluster with the following command.
  az aks delete --name MyClusterName --resource-group MyResourceGroup
- Enter y and press Enter at the prompt.
  Are you sure you want to perform this operation? (y/n): y
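For scripted cleanup, the confirmation prompt can be skipped. The sketch below wraps the same delete command with the `--yes` and `--no-wait` flags of the Azure CLI; `delete_cluster` is a hypothetical helper name, and the cluster and resource group names are the placeholders from the step above.

```shell
#!/bin/sh
# Delete an AKS cluster without the confirmation prompt and
# return immediately instead of waiting for the operation to finish.
delete_cluster() {
  if [ "$#" -ne 2 ]; then
    echo "usage: delete_cluster CLUSTER_NAME RESOURCE_GROUP" >&2
    return 2
  fi
  az aks delete --name "$1" --resource-group "$2" --yes --no-wait
}
```

For example, `delete_cluster MyClusterName MyResourceGroup`. Because there is no prompt, double-check both names before you run it.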