You can optionally use the table settings to specify:

  • Whether AI model training needs to be done quickly or accurately.

  • Whether you want detailed accuracy and privacy charts for this table’s QA report.

Additionally, if the results of an earlier job were not of the desired accuracy or took too long to generate, you can open the advanced settings to improve AI model training performance.

You can find the table settings at the very bottom of the Column details tab. Select the table you want to configure from the table list and scroll down. Read on below to learn more about each setting.

Table settings

Training goal

Use the Optimize for dropdown menu to select a training goal that best suits your use case.
The following options are available:

Accuracy

Recommended for ML/Analytics use cases
This option trains the table’s AI model to achieve the highest attainable synthetic data accuracy. The training is stopped when the validation loss stops improving.

Speed

Recommended for Testing use cases
This option trains the table’s AI model to deliver accurate synthetic data using significantly shorter training times. The training is stopped as soon as the rate of improvement of the validation loss decreases.
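The difference between the two training goals can be pictured with a small sketch. It is not MOSTLY AI's implementation: the function names, the `patience` parameter, and the `min_rel_improvement` threshold are assumptions made for illustration, and the validation losses are made-up numbers.

```python
def stop_epoch_accuracy(val_losses, patience=3):
    """Accuracy-style rule (assumed): stop once the validation loss has
    not improved for `patience` consecutive epochs."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1


def stop_epoch_speed(val_losses, min_rel_improvement=0.05):
    """Speed-style rule (assumed): stop as soon as the relative improvement
    between two consecutive epochs drops below a threshold."""
    for epoch in range(1, len(val_losses)):
        previous, current = val_losses[epoch - 1], val_losses[epoch]
        relative_improvement = (previous - current) / previous
        if relative_improvement < min_rel_improvement:
            return epoch
    return len(val_losses) - 1


# Illustrative validation losses, one value per epoch (index 0 = first epoch).
losses = [1.0, 0.7, 0.5, 0.45, 0.44, 0.43, 0.43, 0.43, 0.43]
print("Speed stops at epoch index:", stop_epoch_speed(losses))
print("Accuracy stops at epoch index:", stop_epoch_accuracy(losses))
```

With these example numbers, the Speed-style rule stops several epochs earlier and gives up the last small improvements that the Accuracy-style rule still collects.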

Generate detailed QA report

The Generate detailed QA report toggle allows you to disable the generation of detailed accuracy and privacy charts for this table. An executive summary stating the synthetic data accuracy and whether the privacy tests passed or failed will always be available.

This option allows you to speed up the analysis of the synthetic data if its accuracy and privacy are in good shape. However, if the accuracy is lower than 90% or the privacy tests fail, the detailed accuracy and privacy charts will be generated anyway.
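The rule described above can be summarized as a short decision function. This is only a sketch of the documented behavior; the function and parameter names are hypothetical and do not correspond to an actual MOSTLY AI API.

```python
def detailed_charts_generated(toggle_enabled, accuracy, privacy_tests_passed):
    """Return True if detailed accuracy and privacy charts are produced.

    accuracy is a fraction between 0 and 1 (0.90 corresponds to 90%).
    """
    # Charts are always generated when the toggle is on. When it is off,
    # they are still generated if accuracy is below 90% or privacy tests fail.
    return toggle_enabled or accuracy < 0.90 or not privacy_tests_passed


# Toggle off, good results: only the executive summary is produced.
print(detailed_charts_generated(False, 0.95, True))
# Toggle off, low accuracy: the detailed charts are generated anyway.
print(detailed_charts_generated(False, 0.85, True))
```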

Advanced settings

Open the advanced settings if you want to improve AI model training performance. Based on the results of previous jobs, you can use these settings to improve synthetic data accuracy or reduce training time. Read on below to learn how to use them.


Maximum training epochs

An epoch is one complete pass of the table forward and backward through the neural network. MOSTLY AI will start new epochs until the neural network has optimally learned your dataset’s features. Unfortunately, it’s not possible to know beforehand how many epochs are needed.

This setting allows you to limit the number of epochs to, for instance, 2, 5, or 10. This significantly reduces the time needed to generate your synthetic dataset, but at the cost of accuracy.
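The following sketch illustrates how an epoch cap interacts with training that would otherwise continue until the model stops improving. The loop, the `patience` parameter, and the loss values are assumptions chosen for illustration; they do not reflect MOSTLY AI's internal training code.

```python
def epochs_run(val_losses, max_epochs=None, patience=2):
    """Simulate training over a sequence of per-epoch validation losses.

    Training ends when the epoch cap is reached or when the loss has not
    improved for `patience` epochs, whichever comes first.
    Returns (number of completed epochs, best validation loss seen).
    """
    best = float("inf")
    stale = 0
    completed = 0
    for loss in val_losses:
        if max_epochs is not None and completed >= max_epochs:
            break  # the cap cuts training short
        completed += 1
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break  # no further improvement, stop on our own
    return completed, best


# Illustrative validation losses; lower is better.
losses = [0.9, 0.6, 0.45, 0.40, 0.38, 0.38, 0.38]
print("No cap:     ", epochs_run(losses))
print("Cap of 2:   ", epochs_run(losses, max_epochs=2))
```

In this example, the capped run finishes after 2 epochs instead of 7, but ends with a noticeably higher validation loss, which mirrors the trade-off between training time and accuracy.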


Model size

If the synthetization job runs into memory issues, takes too long to complete, or produces synthetic data with less than desired accuracy, you can choose a different AI model size to mitigate the issue.

Model size dimensions

Smaller sizes require less memory, run faster, and reduce synthetic data accuracy, whereas bigger sizes increase accuracy, require more memory, and take more time to complete.


Batch size

MOSTLY AI won’t pass the entire dataset into the neural net at once. Instead, it divides your dataset into batches and updates the neural network’s parameters after each batch.

Setting the batch size to 1 will update these parameters after processing each training example. This results in the longest training time but allows you to process the largest possible models.

Setting a large batch size can significantly speed up the training, but requires more memory.

If you get out-of-memory errors during the training stage, you can try to resolve them by decreasing the batch size.
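To make the trade-off concrete, the sketch below computes how many parameter updates one epoch takes for a given batch size, and shows the "decrease the batch size after an out-of-memory error" strategy as a retry loop. In MOSTLY AI you change the batch size manually in the advanced settings; the `train_fn` callable, the `fake_train` stand-in, and the memory limit of 256 are hypothetical and exist only to make the example runnable.

```python
import math


def updates_per_epoch(n_samples, batch_size):
    """Number of parameter updates needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


def train_with_fallback(train_fn, batch_size, min_batch_size=1):
    """Call train_fn(batch_size); on MemoryError, halve the batch size and retry.

    Returns (result of train_fn, batch size that finally worked).
    """
    while True:
        try:
            return train_fn(batch_size), batch_size
        except MemoryError:
            if batch_size <= min_batch_size:
                raise  # cannot shrink any further
            batch_size = max(min_batch_size, batch_size // 2)


def fake_train(batch_size):
    """Stand-in for a training job that only fits batches of up to 256 rows."""
    if batch_size > 256:
        raise MemoryError("batch does not fit into memory")
    return "trained"


for size in (1, 256, 2048):
    print(size, "->", updates_per_epoch(10_000, size), "updates per epoch")

print(train_with_fallback(fake_train, batch_size=2048))
```

A batch size of 1 needs one update per training example, which is why it is the slowest option, while larger batches need far fewer updates per epoch but must fit into memory.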