Fair synthetic data

MOSTLY AI can generate fair synthetic data that satisfies statistical parity. You select a target column for fairness (for example, income), and the generation removes bias in that column with respect to the sensitive columns in your dataset, such as race, sex, age, or any other attribute that you define as sensitive.
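
For reference, statistical parity is commonly stated as the condition below. The symbols are notation introduced here for illustration and do not come from the product: Y is the fairness target column and S is a sensitive column (or a combination of sensitive columns).

```latex
P(Y = y \mid S = s) = P(Y = y) \quad \text{for every category } y \text{ and every group } s
```

In words, the share of each target category (for example, high income) is the same in every sensitive group.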

Prerequisites

To use Fairness, the model responsible for the table containing missing or null values must have flexible generation enabled.

If you are using the UI of the MOSTLY AI Platform, you can generate fair synthetic data with a new synthetic dataset.

Steps

  1. Start a new synthetic dataset.
  2. On the Synthetic dataset configuration page, click a table to expand its generation options.
  3. Configure fair generation.
    1. For Fairness target column, select a categorical column. The data in the column will be generated with statistical parity for the categories in the sensitive columns you select.
      For example, income.
    2. For Fairness sensitive columns, select at least one column whose categories will influence the generation of the target column.
      For example, race, gender, age.
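
Once a dataset is generated, one way to check how close the target column is to statistical parity is to compare the share of a target category across the groups of a sensitive column. The sketch below is illustrative only: it uses plain Python rather than any MOSTLY AI API, the helper names `category_rates` and `parity_gap` are hypothetical, and the records are made-up toy data, not platform output.

```python
from collections import defaultdict


def category_rates(rows, target, sensitive):
    """Return {group: {category: share}} for `target` within each `sensitive` group."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        counts[row[sensitive]][row[target]] += 1
    rates = {}
    for group, by_category in counts.items():
        total = sum(by_category.values())
        rates[group] = {cat: n / total for cat, n in by_category.items()}
    return rates


def parity_gap(rows, target, sensitive, category):
    """Largest difference between groups in the share of `category` (0 means parity)."""
    rates = category_rates(rows, target, sensitive)
    shares = [group_rates.get(category, 0.0) for group_rates in rates.values()]
    return max(shares) - min(shares)


# Made-up toy records (assumption: column names mirror the examples in the steps).
rows = [
    {"sex": "Female", "income": ">50K"},
    {"sex": "Female", "income": "<=50K"},
    {"sex": "Female", "income": "<=50K"},
    {"sex": "Female", "income": "<=50K"},
    {"sex": "Male", "income": ">50K"},
    {"sex": "Male", "income": ">50K"},
    {"sex": "Male", "income": "<=50K"},
    {"sex": "Male", "income": "<=50K"},
]

gap = parity_gap(rows, "income", "sex", ">50K")
print(f"parity gap: {gap:.2f}")
```

A gap near zero suggests the selected category is distributed evenly across the groups; a large gap in the original data and a small one in the fair synthetic data is the effect you would expect to see.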