
Holdout set

Holdout data (also called testing data) is a portion of the original data that is withheld from the datasets used to train and validate synthetic data models. Its purpose is to provide a final, unbiased comparison between a machine learning model trained on the original data and one trained on the synthetic data. Accurate synthetic data should not overfit to the original training set; it should generalize, so that a model trained on the synthetic data achieves results on the original holdout set comparable to those of a model trained on the original training set.
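The comparison described above can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration: the one-feature toy dataset, the nearest-centroid classifier, the 20% holdout fraction, and in particular `toy_synthesize`, a hypothetical stand-in that samples from per-label Gaussians and does not represent any real synthetic data generator. The point is only the workflow: the holdout rows are set aside first, the generator sees the training rows alone, and both downstream models are scored on the same holdout rows.

```python
import random
import statistics


def split_holdout(rows, holdout_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and set aside a holdout portion."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]  # (train, holdout)


def group_by_label(rows):
    """Collect feature values per label."""
    groups = {}
    for x, label in rows:
        groups.setdefault(label, []).append(x)
    return groups


def toy_synthesize(train_rows, n, seed=1):
    """Hypothetical stand-in for a generator: fits one Gaussian per label
    on the training rows only, then samples n new rows."""
    rng = random.Random(seed)
    groups = group_by_label(train_rows)
    labels = list(groups)
    weights = [len(groups[label]) for label in labels]
    synthetic = []
    for _ in range(n):
        label = rng.choices(labels, weights)[0]
        xs = groups[label]
        x = rng.gauss(statistics.fmean(xs), statistics.pstdev(xs))
        synthetic.append((x, label))
    return synthetic


def fit_centroids(rows):
    """Toy downstream model: the mean feature value of each label."""
    return {label: statistics.fmean(xs) for label, xs in group_by_label(rows).items()}


def accuracy(centroids, rows):
    """Share of rows whose nearest centroid matches their label."""
    hits = sum(
        min(centroids, key=lambda c: abs(x - centroids[c])) == label
        for x, label in rows
    )
    return hits / len(rows)


# Toy "original" data: one numeric feature, two labels with shifted means.
data_rng = random.Random(42)
original = [(data_rng.gauss(0.0, 1.0), 0) for _ in range(500)]
original += [(data_rng.gauss(3.0, 1.0), 1) for _ in range(500)]

# 1. Hold out rows before any training or synthesis happens.
train_rows, holdout_rows = split_holdout(original)

# 2. The generator is fitted on the training rows only.
synthetic_rows = toy_synthesize(train_rows, n=len(train_rows))

# 3. Train one downstream model per dataset, score both on the same holdout.
acc_original = accuracy(fit_centroids(train_rows), holdout_rows)
acc_synthetic = accuracy(fit_centroids(synthetic_rows), holdout_rows)

print(f"trained on original:  {acc_original:.3f}")
print(f"trained on synthetic: {acc_synthetic:.3f}")
```

If the two accuracies are close, the synthetic data has preserved the signal the downstream model needs. A large gap in favor of the original training data would suggest that the synthetic data lost information.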
