The training dataset is the portion of the original data used to train the model that generates the synthetic data. The model learns the patterns in the original data from this set alone. It is the largest of the original-data splits used in synthetic data generation, larger than both the validation set and the holdout set.
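As an illustration of how the three sets relate, the following minimal sketch splits a collection of original records into training, validation, and holdout sets. The function name `split_original_data`, the 80/10/10 proportions, and the fixed seed are assumptions made for this example; the actual proportions depend on the tool and the dataset.

```python
import random


def split_original_data(rows, train_frac=0.8, validation_frac=0.1, seed=0):
    """Shuffle the original rows and split them into three disjoint sets.

    The fractions are illustrative assumptions, not prescribed values.
    Whatever is left after the training and validation sets becomes the holdout set.
    """
    if not 0 < train_frac < 1 or not 0 <= validation_frac < 1:
        raise ValueError("fractions must lie between 0 and 1")
    if train_frac + validation_frac >= 1:
        raise ValueError("training and validation fractions must leave room for a holdout set")

    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_validation = round(len(shuffled) * validation_frac)

    training = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    holdout = shuffled[n_train + n_validation:]
    return training, validation, holdout


# Hypothetical original data: 1000 record identifiers.
training, validation, holdout = split_original_data(range(1000))
print(len(training), len(validation), len(holdout))
```

With these assumed proportions the training set holds most of the records, and only that set would be passed to the model for training; the validation and holdout sets stay unseen by the model.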