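Distance to closest record (DCR) is one of the possible (dis-)similarity-based privacy tests. It compares the overall distribution of distances from synthetic records to their closest records in the original data against the same distribution measured for actual holdout records, which serve as the reference. A typical example of bad synthetic data is original target data that has merely been perturbed by noise: such records sit much closer to the original data than genuinely new records would. The DCR is intended to capture such scenarios.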
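
The comparison can be illustrated with a minimal sketch. It assumes purely numeric data, Euclidean distance, and a brute-force nearest-record search. The function name `distance_to_closest_record`, the toy Gaussian data, and the noise scale are all assumptions made for illustration, not part of any particular library or of the test as implemented in a specific product.

```python
import numpy as np


def distance_to_closest_record(query, reference):
    """For each row in `query`, return the Euclidean distance to its nearest row in `reference`."""
    # Pairwise differences via broadcasting: shape (n_query, n_reference, n_features)
    diffs = query[:, None, :] - reference[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Keep only the distance to the closest reference record
    return dists.min(axis=1)


rng = np.random.default_rng(0)

# Toy data (assumption): original training records and holdout records
# drawn from the same distribution.
train = rng.normal(size=(500, 4))
holdout = rng.normal(size=(500, 4))

# "Bad" synthetic data: the original records perturbed by a little noise.
leaky_synthetic = train + rng.normal(scale=0.01, size=train.shape)

dcr_synthetic = distance_to_closest_record(leaky_synthetic, train)
dcr_holdout = distance_to_closest_record(holdout, train)

# Share of synthetic records that sit closer to the original data
# than the median holdout record does.
share_closer = float(np.mean(dcr_synthetic < np.median(dcr_holdout)))

print("median DCR, synthetic vs. original:", np.median(dcr_synthetic))
print("median DCR, holdout vs. original:  ", np.median(dcr_holdout))
print("share of synthetic records closer than median holdout:", share_closer)
```

If the synthetic distances are systematically smaller than the holdout distances, as they are for the noise-perturbed copy here, the synthetic data is suspiciously close to the original records. For real datasets, mixed-type columns would need an appropriate encoding and distance measure, and an indexed nearest-neighbor search would likely replace the brute-force computation.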