Synthesize database tables

MOSTLY AI supports connections to popular databases, which you can use as data sources to generate synthetic datasets from database tables. You can define the relationships between tables to maintain referential integrity and retain the correlations between tables in the synthetic dataset.

You can also deliver the synthetic data to a separate database.
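To make the idea of referential integrity concrete, here is a minimal sketch of a check you could run against a pair of related tables, whether original or synthetic. It is not part of MOSTLY AI. The `customers` and `orders` tables, their columns, and the `find_orphans` helper are hypothetical names chosen for illustration, and an in-memory SQLite database stands in for your real database.

```python
import sqlite3


def find_orphans(conn, child, fk, parent, pk):
    """Return the child.fk values that have no matching parent.pk row."""
    # Table and column names are interpolated directly into the SQL,
    # so pass trusted identifiers only.
    query = (
        f"SELECT c.{fk} FROM {child} AS c "
        f"LEFT JOIN {parent} AS p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL"
    )
    return [row[0] for row in conn.execute(query)]


def build_demo_db():
    """Create a hypothetical two-table database in memory."""
    conn = sqlite3.connect(":memory:")
    # orders.customer_id is meant to point at customers.id. No constraint is
    # declared here, so the check below is what detects broken links.
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount REAL
        );
        INSERT INTO customers VALUES (1, 'EU'), (2, 'US');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 1, 12.5);
    """)
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    orphans = find_orphans(conn, "orders", "customer_id", "customers", "id")
    print("Orphaned foreign keys:", orphans)
```

An empty result means every row in the child table links to an existing row in the parent table, which is the property that defining relationships between tables is meant to preserve.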

Steps

  1. Create a connector for your database.

    Note
    In a database connector, you define the connection details, credentials, and other details to access specific tables in your database.

  2. Create a database catalog.

    Note
    In a database catalog, you add the tables from your database that you want to synthesize.
    You also define AI model training settings, relationships between tables, and output settings.

    For more information, see Configure a catalog.

  3. In the catalog, configure the relationships between tables.
  4. From the Catalogs tab, open the catalog.
  5. Click Next.
  6. (Optional) On the Synthetic datasets / Start job screen, review the synthetic dataset configuration.

    For more information, see Configure a synthetic dataset.

  7. Configure a data destination.
    1. Select Output settings.
    2. For Data destination, select Download as CSV/Parquet.
      Tip
      If you want to deliver the generated synthetic dataset to another database, see Configure a data destination.

  8. To start generating the synthetic dataset, click Create a synthetic dataset.

Result

The Synthetic datasets tab opens, where you can track the progress of the new synthetic dataset.

What's next

After the synthetic dataset completes, you can preview and download it.
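If you chose the CSV download, you may also want to inspect a file locally. The following is a minimal sketch using only the Python standard library. The `preview_csv` helper and the file name `customers.csv` are hypothetical, and the sketch assumes the downloaded file is UTF-8 encoded with a header row.

```python
import csv
from itertools import islice


def preview_csv(path, n=5):
    """Return the column names and the first n rows of a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # islice stops after n rows, so large files are not read in full.
        rows = list(islice(reader, n))
        return reader.fieldnames, rows


if __name__ == "__main__":
    # "customers.csv" is a placeholder for one of the downloaded table files.
    columns, rows = preview_csv("customers.csv")
    print(columns)
    for row in rows:
        print(row)
```

All values come back as strings, so convert numeric columns before comparing them with the original data.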