InstallationDeployDeployment checklist

Deployment checklist

The checklist below provides a list of prerequisites to ensure a successful installation process. Before you contact MOSTLY AI to complete the installation or troubleshoot installation issues, make sure to complete the checklist.

General checklist for Kubernetes clusters

Compute (CPU and memory) requirements

Storage requirements

Networking requirements

  • Your Kubernetes cluster can access the container image repository.
    • By default, MOSTLY AI serves the container images from the MOSTLY AI image repository at nexus.test.mostlylab.com. To deploy the images, your Kubernetes cluster needs to have Internet access.
    • If your internal IT policies require that you pull the images from an internal repository, ensure that your Kubernetes cluster has access to it. For more information, Configure an internal image repository.
  • Your Kubernetes cluster has network access to the defined storage classes and to the data sources (databases and cloud object storage providers) from which you want to pull original data.
  • Collaborate with your IT department and Customer Experience Engineer to configure an domain SSL certificate for your Kubernetes cluster and for the MOSTLY AI Helm chart.

Access and permissions requirements

  • Your Kubernetes cluster user has permissions to read and write into the storage class.
  • On the AI worker nodes where MOSTLY AI jobs run, you should have no taints defined that might not allow pods to be created with the minimum and maximum resource requirements specified in the values.yaml for the engine. Otherwise, MOSTLY AI jobs will fail to run.
  • If you already have taints on your worker nodes, you need to add tolerations on the MOSTLY AI pods in the values.yaml file, under agent.tolerations and engine.tolerations.
values.yaml
    ...
    tolerations:
        # Replace with the actual key label of the taint
        # For example: `Tainted-worker:NoSchedule`
        - key: "Tainted-worker" 
            operator: "Exists"
            effect: "NoSchedule"
    ...
  • On the AI worker nodes where MOSTLY AI jobs run, make sure that no other pods belonging to other applications can run so as not to interfere with MOSTLY AI workloads. You can apply taints on nodes dedicated to MOSTLY AI workloads so as to prevent other workloads from running.
  • Verify that any resource quotas created for your namespace allow MOSTLY AI to successfully run worker nodes based on their requirements.
  • If you have specific username requirements to access databases or other resources, update the Helm chart values.yaml file. In specific cases, due to Oracle security policies you might need to allowlist container users in Oracle.

Other requirements

  • Disable any tools or service mesh services in the MOSTLY AI namespace that intercept communications between pods and require manual approval to proceed. If you have enabled such services, such as Linkerd or Istio, MOSTLY AI jobs might be prevented from starting and completing successfully.
  • Work with your IT team to enable backups of the PostgreSQL database.

AWS EKS Kubernetes cluster checklist

  • AWS EKS cluster running Kubernetes 1.23 or higher. For more information, see Amazon EKS documentation.
  • Compute resources. The AWS EKS cluster has at least six worker nodes of type m5.xlarge. For more information, see Amazon EC2 instance types.
  • Storage (required for single- and multi-node). Integrate with Amazon Elastic Block Store (EBS) with your EKS cluster by installing and configuring the aws-ebs-csi-driver. For more information, Amazon EBS CSI driver.
  • Networking. Create a Virtual Private Cloud (VPC) with a /16 subnet netmask. This provides up to 65,536 private IPv4 addresses. For more information, see the AWS VPC documentation.
  • Networking. Integrate with Amazon Elastic Load Balancing (ALB) to automatically distribute your incoming traffic across multiple targets, such as EC2 instances, containers, and IP addresses, in one or more Availability Zones. Install the aws-load-balancer-controller in your EKS cluster to manage your ALBs and create an Ingress that uses this controller​.
    For more information, see Installing the AWS Load Balancer Controller add-on.
  • Networking. Create a Domain name in Route 53 that will point to your ALB​. Amazon Route 53 is a highly available and scalable Domain Name System (DNS) web service. To configure a domain registered in Route 53 to point to AWS ALB, you can use the Amazon Route 53 Documentation​. For specific configurations and support, contact AWS Support.
  • Security. AWS Certificate Manager (ACM) helps you to provision, manage, and renew publicly trusted TLS certificates on AWS based websites. Create an SSL certificate in ACM that can be used with your ALB​. For more information, see AWS Certificate Manager > Requesting a public certificate.

MOSTLY AI Helm charts checklist

  • Obtain the MOSTLY AI Helm charts from your Customer Experience Engineer.
  • Obtain a Docker pull image secret from your Customer Experience Engineer.
  • Ensure Internet connectivity to pull the Docker images from the MOSTLY AI repository.
  • Get acquainted with the default configuration values.yaml file in the MOSTLY AI Helm charts.
  • Make sure that the nodes in your Kubernetes cluster can accommodate the container resource requirements defined in the values.yaml file.