Prepare your data
Before you synthesize data with MOSTLY AI, review the following considerations and requirements. They help you avoid unexpected errors, maintain the privacy of the subjects (people, companies, or any other entities), and improve the accuracy of the generated synthetic data.
How you prepare your data depends to a certain extent on the type of synthetic dataset that you want to generate in MOSTLY AI. For more information, see Types of synthetic datasets.
- If your original data is in CSV files, see CSV file requirements.
- If you want to synthesize two-table or multi-table relational data, see Subject and linked table requirements.
In the context of MOSTLY AI, subject tables are the tables that contain the private information of people, companies, or other entities.
Linked tables typically have foreign key relationships to subject tables. To synthesize subject and linked tables successfully, you need to understand how the original data in those tables should be structured.
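
As an illustration of the subject and linked table relationship described above, the following minimal sketch checks two general properties of relational data before synthesis: that the key in the subject table is unique, and that every foreign key in the linked table points to an existing subject. These checks are assumed here as ordinary relational hygiene, not as a statement of MOSTLY AI's exact requirements. The table names (`customers`, `purchases`), column names, and helper functions are hypothetical, and the code does not use any MOSTLY AI API.

```python
def find_duplicate_keys(subject_rows, primary_key):
    """Return the set of primary key values that occur more than once."""
    seen, duplicates = set(), set()
    for row in subject_rows:
        key = row[primary_key]
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return duplicates


def find_orphan_rows(subject_rows, linked_rows, primary_key, foreign_key):
    """Return linked rows whose foreign key matches no subject row."""
    subject_ids = {row[primary_key] for row in subject_rows}
    return [row for row in linked_rows if row[foreign_key] not in subject_ids]


# Hypothetical subject table: one row per customer (note the repeated ID 2).
customers = [
    {"customer_id": 1},
    {"customer_id": 2},
    {"customer_id": 2},
]

# Hypothetical linked table: each purchase references a customer.
purchases = [
    {"purchase_id": 10, "customer_id": 1},
    {"purchase_id": 11, "customer_id": 3},  # no customer with ID 3 exists
]

duplicates = find_duplicate_keys(customers, "customer_id")
orphans = find_orphan_rows(customers, purchases, "customer_id", "customer_id")

print("Duplicate subject keys:", duplicates)
print("Linked rows without a subject:", orphans)
```

If either result is non-empty, the original data probably needs cleaning before you use it as input for synthesis.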