MOSTLY AI Python SDK

The MOSTLY AI Python SDK gives you full programmatic access to the features of the MOSTLY AI Platform, both in a local environment and by connecting to a remote MOSTLY AI Platform.

| Intent | Primitive |
|---|---|
| Train a generator on tabular or language data | g = mostly.train(config) |
| Generate any number of synthetic data records | sd = mostly.generate(g, config) |
| Live probe the generator on demand | df = mostly.probe(g, config) |
| Connect to any data source within your org | c = mostly.connect(config) |

For a complete API reference, see the Python SDK package documentation.

Installation

Use pip install to install the latest version of mostlyai.

shell
# CPU
pip install -U "mostlyai[local]"
 
# GPU
pip install -U "mostlyai[local-gpu]"

GPU support in Local mode is available on Linux only.

To use one of the supported connectors in Local mode, add the matching optional dependency to the install command. The available options are: databricks, googlebigquery, hive, mssql, mysql, oracle, postgres, snowflake.

shell
pip install -U "mostlyai[local,databricks]"
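
When the set of connectors differs between environments, the requirement string can be assembled programmatically. The sketch below is illustrative only: `mostlyai_requirement` is a hypothetical helper and not part of the SDK. The mode and connector names are the ones listed above.

```python
# Optional dependency names taken from the installation notes above.
MODES = {"local", "local-gpu"}
CONNECTORS = {
    "databricks", "googlebigquery", "hive", "mssql",
    "mysql", "oracle", "postgres", "snowflake",
}


def mostlyai_requirement(mode="local", connectors=()):
    """Build a pip requirement string such as 'mostlyai[local,databricks]'.

    Hypothetical helper for illustration; not provided by the SDK.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    unknown = sorted(set(connectors) - CONNECTORS)
    if unknown:
        raise ValueError(f"unknown connectors: {unknown}")
    # dict.fromkeys removes duplicates while keeping the given order
    extras = [mode, *dict.fromkeys(connectors)]
    return f"mostlyai[{','.join(extras)}]"


print(mostlyai_requirement("local", ["postgres", "snowflake"]))
# prints: mostlyai[local,postgres,snowflake]
```

The resulting string can be passed to `pip install -U`, quoted so that the shell does not interpret the square brackets.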

Local and Client modes

The Python SDK is designed to work in a local environment (your computer or any supported Python environment) or by connecting to a remote MOSTLY AI Platform (such as https://app.mostly.ai). See the comparison below.

| | Local mode | Client mode |
|---|---|---|
| Prerequisites | Local Python installation (in Local mode) | • Remote MOSTLY AI Platform • Platform API key • Local Python installation (in Client mode) |
| Installation | Install the Python SDK in Local mode | 1. Deploy MOSTLY AI Platform in a Kubernetes cluster 2. Connect to the Platform with Python SDK in Client mode |
| Service | Use a locally running server that provides the REST API | Connect to the MOSTLY AI platform REST API |
| Compute | Uses local compute resources (CPU, GPU) | Uses compute resources available on the MOSTLY AI Platform |

The same API is available in Local and Client modes. The only difference is how you instantiate the MostlyAI client. The example below instantiates it in Local mode.

python
from mostlyai.sdk import MostlyAI
 
mostly = MostlyAI(local=True)
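
In Client mode, the client is instantiated with the `api_key` and `base_url` arguments instead of `local=True`, as in the quick start further down. The sketch below picks the constructor arguments from the `MOSTLYAI_API_KEY` and `MOSTLYAI_BASE_URL` environment variables mentioned there. `client_kwargs` is a hypothetical helper, and falling back to Local mode when no key is set is an assumption made for illustration, not documented SDK behavior.

```python
import os


def client_kwargs(env=None):
    """Return keyword arguments for MostlyAI(...) based on environment variables.

    Hypothetical helper: Client mode if an API key is set, Local mode otherwise.
    """
    env = os.environ if env is None else env
    api_key = env.get("MOSTLYAI_API_KEY")
    if not api_key:
        # No key available: assume Local mode
        return {"local": True}
    return {
        "api_key": api_key,
        # Default URL is the hosted Platform named in this document
        "base_url": env.get("MOSTLYAI_BASE_URL", "https://app.mostly.ai"),
    }


# Usage (requires the mostlyai package to be installed):
#   from mostlyai.sdk import MostlyAI
#   mostly = MostlyAI(**client_kwargs())
print(client_kwargs({}))
# prints: {'local': True}
```

Keeping the key in an environment variable rather than in source code also avoids committing it to version control.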

Get an API key

Get your API key for the REST API or Python SDK from your user profile menu in the web application.

Steps

  1. Hover over the profile menu in the upper right and select API key.
  2. Click Generate API Key.

What’s next

Your key is immediately copied to your clipboard. You can now use it for the REST API or to instantiate your Python SDK in Client mode.

Examples

As you explore the Generators, Synthetic datasets, and Connectors pages, you will find Python code snippets that show how to accomplish a task with the Python SDK. Use the UI tab to see the steps in the MOSTLY AI Platform web application, and the Python SDK tab to see how to accomplish the same task with the Python SDK.

Quick start

The quick start below trains a generator locally on a tabular dataset, probes it, generates synthetic data, and exports the generator to a file. It then imports the generator into a remote MOSTLY AI Platform.

python
import pandas as pd
from mostlyai.sdk import MostlyAI
 
# initialize a client in Local mode
mostly = MostlyAI(local=True)
 
# initialize a second client in Client mode (used for the import at the end)
mostly_remote = MostlyAI(
    api_key='INSERT_YOUR_API_KEY',   # or set env var `MOSTLYAI_API_KEY`
    base_url='https://app.mostly.ai' # or set env var `MOSTLYAI_BASE_URL`
)
 
# train a generator
df = pd.read_csv('https://github.com/mostly-ai/public-demo-data/raw/dev/census/census.csv.gz')
g = mostly.train(data=df)
 
# probe for some samples
syn = mostly.probe(g, size=10)
 
# generate a synthetic dataset
sd = mostly.generate(g, size=2_000)
 
# start using it
sd.data()
 
# export a local generator
g.export_to_file('generator_census.zip')
 
# import into a remote platform
g_remote = mostly_remote.generators.import_from_file('generator_census.zip')
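
Before importing an exported generator into a remote Platform, it can be useful to confirm that the export file exists and is a readable ZIP archive. The sketch below uses only the Python standard library. `check_generator_archive` is a hypothetical helper, and it makes no assumption about what the SDK stores inside the archive.

```python
import zipfile
from pathlib import Path


def check_generator_archive(path):
    """Return the path as a Path if it is a non-empty ZIP archive; raise otherwise.

    Hypothetical pre-flight check; not part of the SDK.
    """
    archive = Path(path)
    if not archive.is_file():
        raise FileNotFoundError(f"no generator archive at {archive}")
    if not zipfile.is_zipfile(archive):
        raise ValueError(f"{archive} is not a ZIP archive")
    with zipfile.ZipFile(archive) as zf:
        # An archive without members cannot contain a generator
        if not zf.namelist():
            raise ValueError(f"{archive} is empty")
    return archive


# Usage with the quick start above (requires the mostlyai package):
#   path = check_generator_archive('generator_census.zip')
#   g_remote = mostly_remote.generators.import_from_file(str(path))
```

Failing early on a missing or truncated file gives a clearer error than a rejected upload.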