Installation
Compute resources

Overview

The MOSTLY AI Platform runs Kubernetes jobs to complete AI tasks. Depending on the size of the original data, a job may require a large amount of memory and CPU to complete successfully. It is therefore important that each job is assigned to a node with enough resources. Ideally, a node is dedicated to a single job so that all of its resources are available to that job.

Memory is crucial because insufficient memory causes job failure. The memory required for a job depends on the size and complexity of the original dataset.

CPU is less critical. Fewer CPUs slow down the job but do not cause failures.

Kubernetes nodes configuration

MOSTLY AI recommends that you run the platform on dedicated, high-capacity nodes. An alternative is to implement auto-scaling. By default, the MOSTLY AI Helm chart uses affinity-based node selectors.

Before you deploy, either label your nodes as indicated below or remove the selectors from the values.yaml file (not recommended).

  • Web application nodes: mostly-app=yes
  • AI worker nodes: mostly-worker=yes
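
You can apply these labels with kubectl before you deploy. For example, kubectl label nodes <node-name> mostly-worker=yes labels an AI worker node, where <node-name> is a placeholder for one of your node names; use mostly-app=yes in the same way for web application nodes.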

Worker node configuration

Assign jobs to nodes where they will succeed and complete quickly. Use the following values.yaml parameters to configure the AI workloads on the worker nodes:

  • CPU and memory requirements: Kubernetes assigns the AI job to a node that has at least this amount of resources available.
    • engine.resources.requests.cpu: Cores in millicores (1/1000 of a core). For example, configure 3 cores by setting cpu: 3000m
    • engine.resources.requests.memory: Memory in gigabytes. For example, configure 8 GB of memory by setting memory: 8Gi
  • CPU and memory limits: Kubernetes does not assign more resources than these amounts to a single AI job, which avoids overloaded and blocked nodes. Unlike requirements, limits are not considered when pods are assigned to nodes. Limits cap the resource consumption of a job after it is scheduled to a node.
    • engine.resources.limits.cpu: Cores in millicores (1/1000 of a core). For example, configure a 5-core limit by setting cpu: 5000m
    • engine.resources.limits.memory: Memory in gigabytes. For example, configure a 24 GB memory limit by setting memory: 24Gi
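
The following sketch puts these parameters together in a values.yaml fragment, using the example values from the list above. The nesting is assumed to mirror the dotted parameter names; the numbers are illustrative rather than a sizing recommendation.

values.yaml
  engine:
    resources:
      requests:
        cpu: 3000m      # request 3 cores for each AI job
        memory: 8Gi     # request 8 GB of memory for each AI job
      limits:
        cpu: 5000m      # cap a single AI job at 5 cores
        memory: 24Gi    # cap a single AI job at 24 GB of memory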

Guidelines for worker node configuration

For efficient workloads, you might need to adjust the worker node requirements and limits depending on the size of your nodes, the cluster configuration, and the size of the data you want to synthesize.

To run n jobs in parallel per node, the general rule of thumb is:

  • Set aside 1 CPU and 1-2 GB memory for other workloads (Kubernetes, OS, and so on)
  • Set aside 0.5 CPU and 5 GB of memory for each job that runs in parallel on a node
  • Assign CPU limits with the formula: (Number of CPUs - n * 0.5 CPU - 1 CPU) / n
  • Assign memory limits with the formula: (Total memory - n * 5 GB - 1 GB) / n
  • Minimize the gap between the CPU and memory requirements and limits. You can also set the requirements and limits to the same values.
  • If the sizes of the worker nodes vary (not recommended), base the requirements on the smallest node capacity.

Example: Let us assume a cluster that consists of 3 nodes, each with 32 CPUs and 128 GB of memory.
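
Applying the rule of thumb above to one of these nodes: with one job per node (n = 1), the CPU limit is (32 - 1 * 0.5 - 1) / 1 = 30.5 cores and the memory limit is (128 - 1 * 5 - 1) / 1 = 122 GB; with two jobs per node (n = 2), the limits per job are (32 - 2 * 0.5 - 1) / 2 = 15 cores and (128 - 2 * 5 - 1) / 2 = 58.5 GB. The configurations below round these values down where needed.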

Here are two alternatives for configuring the nodes:

  • One job per node: This is the safest alternative because it dedicates almost all of a node's resources to a single job. It is less efficient if your original data is not particularly big or complex in terms of the number of columns, rows, and sequences.

    values.yaml
      resources:
        limits:
          cpu: 30000m
          memory: 122Gi
        requests:
          cpu: 28000m
          memory: 110Gi
  • Two jobs per node: This alternative makes better use of resources as long as you do not intend to synthesize exceptionally big and complex datasets.

    values.yaml
      resources:
        limits:
          cpu: 15000m
          memory: 58Gi
        requests:
          cpu: 13000m
          memory: 50Gi