Python SDK

MOSTLY AI Python SDK

The MOSTLY AI Python SDK gives you full programmatic access to the MOSTLY AI Platform features, both in a local environment and when connected to a remote MOSTLY AI Platform instance.

All primitives are available whether you run the SDK locally or connect it to a remote MOSTLY AI Platform instance.

Intent | Primitive
Train a generator on tabular or language data | g = mostly.train(config)
Generate any number of synthetic data records | sd = mostly.generate(g, config)
Live probe the generator on demand | df = mostly.probe(g, config)
Connect to any data source within your org | c = mostly.connect(config)

For a complete API reference, see the Python SDK package documentation.

Installation

Use pip to install the latest release of mostlyai:

shell
pip install -U mostlyai
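
To confirm that the installation succeeded, you can ask Python for the installed package version. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("mostlyai")
    if found:
        print(f"mostlyai version: {found}")
    else:
        print("mostlyai is not installed in this environment")
```

If the package is reported as missing, check that pip installed it into the same Python environment that runs the script.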

Local and remote use

You can use the Python SDK in a local environment or by connecting to a remote MOSTLY AI Platform. The Python SDK is designed to work in both environments. The same API is available in both cases, with minor differences in the way you instantiate the client.
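
Because the API is the same in both cases, workflow code can take the client as a parameter and stay unaware of where it runs. The sketch below assumes only the `train(data=...)` and `generate(generator, size=...)` calls that appear in the quick start on this page; the function name `train_and_generate` is hypothetical.

```python
def train_and_generate(client, data, size):
    """Train a generator on `data`, then generate `size` synthetic records.

    `client` is any MostlyAI client, created either with local=True
    or with a base URL and API key.
    """
    generator = client.train(data=data)
    synthetic_dataset = client.generate(generator, size=size)
    return generator, synthetic_dataset
```

For example, `train_and_generate(MostlyAI(local=True), df, 2_000)` would run the workflow on local compute, while passing a remote client would run the same steps on the Platform.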

For a brief overview, see the comparison in the table below.

Aspect | Local environment | MOSTLY AI Platform (Remote)
Prerequisites | Local Python installation | Kubernetes cluster; Platform API key; Local Python installation
Installation | Install the Python SDK in your local environment | 1. Deploy MOSTLY AI Platform. 2. Connect to the Platform with Python SDK.
Service | Use a locally running server that provides the REST API | Connect to the MOSTLY AI Platform REST API
Compute | Uses any local compute resources | Uses compute resources available on the MOSTLY AI Platform

Use in a local environment

Use the SDK locally (local computer or any on-premises environment) by instantiating the client with local=True.

python
from mostlyai.sdk import MostlyAI
 
mostly = MostlyAI(local=True)

Use with a remote MOSTLY AI Platform

Use the SDK with any remote MOSTLY AI Platform instance by providing the base URL and API key.

python
from mostlyai.sdk import MostlyAI
 
api_key = "mostly-**********"
base_url = "https://app.mostly.ai"  # replace with your Platform URL
 
mostly_remote = MostlyAI(base_url=base_url, api_key=api_key)

See the next section to learn how to get an API key.

Get an API key

For remote use, you need a Platform API key. Get your API key for the REST API or Python SDK from your user profile menu in the web application.

Steps

  1. Hover over the profile menu in the upper right and select API key.
  2. Click Generate API Key.

What’s next

Your key is immediately copied to your clipboard. You can now use it for the REST API or to instantiate your Python SDK.
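
To keep the key out of your source code, you can read it from the environment variables `MOSTLYAI_API_KEY` and `MOSTLYAI_BASE_URL`, which the quick start on this page mentions as alternatives to passing the values directly. The sketch below is one possible approach, not the SDK's own mechanism; the helper names `remote_client_kwargs` and `connect_remote` are hypothetical.

```python
import os
from typing import Mapping, Optional


def remote_client_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect MostlyAI(...) keyword arguments from environment variables.

    Raises RuntimeError naming any variable that is missing or empty.
    """
    env = os.environ if env is None else env
    names = {"api_key": "MOSTLYAI_API_KEY", "base_url": "MOSTLYAI_BASE_URL"}
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {arg: env[var] for arg, var in names.items()}


def connect_remote():
    """Create a remote client; the SDK is imported only when this is called."""
    from mostlyai.sdk import MostlyAI

    return MostlyAI(**remote_client_kwargs())
```

Failing early with the name of the missing variable tends to be easier to debug than an authentication error returned later by the Platform.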

Examples

As you explore the Generators, Synthetic datasets, and Connectors pages, you will find Python code snippets that show how to accomplish a task with the Python SDK. Use the UI tab to see the steps in the MOSTLY AI Platform web application, and use the Local Python environment and Remote platform tabs to switch between the suggested local and remote usage of the Python SDK.

In general, the local and remote use of the Python SDK overlaps completely in terms of available classes and methods. Small differences may exist to demonstrate best practices or conventions when using the SDK in a local or remote environment.

Python SDK and UI tabs

Quick start

The quick start below trains a generator locally on a tabular dataset, probes it, generates synthetic data, and exports the generator to a file. It then imports the generator into a remote MOSTLY AI Platform.

python
import pandas as pd
from mostlyai.sdk import MostlyAI
 
# initialize a local client
mostly = MostlyAI(local=True)

# initialize a remote client
mostly_remote = MostlyAI(
    api_key='INSERT_YOUR_API_KEY',   # or set env var `MOSTLYAI_API_KEY`
    base_url='https://app.mostly.ai' # or set env var `MOSTLYAI_BASE_URL`
)
 
# train a generator
df = pd.read_csv('https://github.com/mostly-ai/public-demo-data/raw/dev/census/census.csv.gz')
g = mostly.train(data=df)
 
# probe for some samples
syn = mostly.probe(g, size=10)
 
# generate a synthetic dataset
sd = mostly.generate(g, size=2_000)
 
# start using it
sd.data()
 
# export a local generator
g.export_to_file('generator_census.zip')
 
# import into a remote platform
g_remote = mostly_remote.generators.import_from_file('generator_census.zip')
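
Before importing an exported generator elsewhere, it can be useful to check that the file exists and is readable. The sketch below assumes the export is a standard ZIP archive, as the `.zip` extension suggests; it makes no assumption about the archive's contents, and the helper name `list_archive` is hypothetical.

```python
import zipfile
from pathlib import Path


def list_archive(path):
    """Return the member names of a ZIP archive.

    Raises ValueError if the file is missing or is not a valid ZIP archive.
    """
    path = Path(path)
    # is_zipfile returns False for missing files as well as for non-ZIP content
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a readable ZIP archive")
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

Calling `list_archive('generator_census.zip')` after the export step would list the files in the archive, or raise an error if the export did not produce a usable file.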