If you don’t want to touch your production dataset or database, or it is too large to handle easily, you can make a copy or sample the part of it that matters most for synthesis. Large databases often contain tables that are not relevant, so sampling lets you and your team keep only the tables and data you consider worth synthesizing. In addition, MOSTLY AI allows you to create Data Catalogs for your databases, in which you select only the tables and columns you want to synthesize and rank their importance. You can also use the Reference Manager to make sure that the references are properly mapped.
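As a rough illustration of sampling a database without touching the original, the sketch below copies a random fraction of one table plus only the rows of a second table that reference it, so the references stay consistent in the copy. It uses Python's built-in `sqlite3` module, not any MOSTLY AI API. The function name `sample_with_references` and all table and column names are hypothetical, and it assumes a simple one-parent, one-child layout.

```python
import random
import sqlite3


def sample_with_references(src, dst, parent, child, parent_key, child_fk,
                           fraction, seed=0):
    """Copy a random fraction of `parent` rows from `src` to `dst`, plus
    only the `child` rows whose foreign key points at a sampled parent.

    Table and column names are interpolated into SQL, so they must come
    from a trusted source (hypothetical helper, for illustration only).
    """
    rng = random.Random(seed)
    keys = [row[0] for row in src.execute(
        f"SELECT {parent_key} FROM {parent} ORDER BY {parent_key}")]
    # Keep at least one row, but never more than the table holds.
    n = min(len(keys), max(1, round(len(keys) * fraction)))
    sampled = set(rng.sample(keys, n))

    # Parent first, then child, so the child's references can resolve.
    for table, column in ((parent, parent_key), (child, child_fk)):
        # Recreate the table in the destination with its original schema.
        ddl = src.execute(
            "SELECT sql FROM sqlite_master WHERE type='table' AND name=?",
            (table,)).fetchone()[0]
        dst.execute(ddl)

        cursor = src.execute(f"SELECT * FROM {table}")
        col_names = [d[0] for d in cursor.description]
        idx = col_names.index(column)
        placeholders = ", ".join("?" for _ in col_names)

        # Keep only rows that belong to a sampled parent key.
        rows = [r for r in cursor if r[idx] in sampled]
        dst.executemany(
            f"INSERT INTO {table} VALUES ({placeholders})", rows)

    dst.commit()
    return sampled
```

Calling it with, for example, a `customers` table and an `orders` table and `fraction=0.3` would leave the source connection unchanged and write the reduced, reference-consistent copy to the destination connection. Real schemas with several levels of references would need the same idea applied table by table.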
This video tutorial can guide you through the steps to achieve the best results when generating synthetic data with MOSTLY AI.