MOSTLY AI Python SDK
The MOSTLY AI Python SDK enables full programmatic use of the MOSTLY AI Platform features, both in a local environment and when connected to a remote MOSTLY AI Platform instance.
All primitives listed in the table below are available in either mode.
Intent | Primitive |
---|---|
Train a generator on tabular or language data | g = mostly.train(config) |
Generate any number of synthetic data records | sd = mostly.generate(g, config) |
Live probe the generator on demand | df = mostly.probe(g, config) |
Connect to any data source within your org | c = mostly.connect(config) |
For a complete API reference, see the Python SDK package documentation.
Installation
Use pip to install the latest release of mostlyai:
pip install -U mostlyai
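To confirm that the installation succeeded, a minimal sketch like the one below can report the installed version. It relies only on the standard library; `installed_version` is a hypothetical helper name used for illustration and is not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "mostlyai"):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report what was found in the current environment.
found = installed_version()
print(f"mostlyai version: {found}" if found else "mostlyai is not installed")
```

If the helper returns None, the package is not visible to the Python interpreter you are running, which often means pip installed it into a different environment.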
Local and remote use
You can use the Python SDK in a local environment or by connecting to a remote MOSTLY AI Platform. The same API is available in both cases; the only difference is how you instantiate the client.
For a brief overview, see the comparison in the table below.
Aspect | Local environment | MOSTLY AI Platform (Remote) |
---|---|---|
Prerequisites | Local Python installation | • Kubernetes cluster • Platform API key • Local Python installation |
Installation | Install the Python SDK in your local environment | 1. Deploy MOSTLY AI Platform. 2. Connect to the Platform with Python SDK. |
Service | Use a locally running server that provides the REST API | Connect to the MOSTLY AI Platform REST API |
Compute | Uses any local compute resources | Uses compute resources available on the MOSTLY AI Platform |
Use in a local environment
Use the SDK locally (on your local computer or in any on-premises environment) by instantiating the client with local=True.
from mostlyai.sdk import MostlyAI
mostly = MostlyAI(local=True)
Use with a remote MOSTLY AI Platform
Use the SDK with any remote MOSTLY AI Platform instance by providing the base URL and API key.
from mostlyai.sdk import MostlyAI
api_key = "mostly-**********"  # replace with your Platform API key
base_url = "https://app.mostly.ai"  # replace with your Platform URL
mostly_remote = MostlyAI(base_url=base_url, api_key=api_key)
See the next section to learn how to get an API key.
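Because the two modes differ only in the arguments passed to the client, the choice can be isolated in one small function. The sketch below is an illustration under that assumption; `client_kwargs` is a hypothetical helper, not an SDK function, and it only builds the keyword arguments shown in the two snippets above.

```python
from typing import Optional


def client_kwargs(base_url: Optional[str] = None, api_key: Optional[str] = None) -> dict:
    """Build keyword arguments for MostlyAI(...) (hypothetical helper, not part of the SDK)."""
    # No remote settings at all: fall back to local mode.
    if base_url is None and api_key is None:
        return {"local": True}
    # Remote mode needs both values; fail early if one is missing.
    if not base_url or not api_key:
        raise ValueError("remote use needs both base_url and api_key")
    return {"base_url": base_url, "api_key": api_key}


# Usage (requires the SDK to be installed):
# from mostlyai.sdk import MostlyAI
# mostly = MostlyAI(**client_kwargs())
```

Keeping the decision in one place means the rest of a script can stay identical for local and remote runs.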
Get an API key
For remote use, you need a Platform API key. Get your API key for the REST API or Python SDK from your user profile menu in the web application.
Steps
- Hover over the profile menu in the upper right and select API key.
- Click Generate API Key.
What’s next
Your key is immediately copied to your clipboard. You can now use it for the REST API or to instantiate your Python SDK.
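Rather than pasting the key into source code, you may prefer to keep it in an environment variable. The sketch below reads the `MOSTLYAI_API_KEY` and `MOSTLYAI_BASE_URL` variables named in the quick start on this page; the helper names `credentials_from_env` and `mask` are hypothetical and exist only for this illustration.

```python
import os


def credentials_from_env(env=None) -> dict:
    """Collect client settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    api_key = env.get("MOSTLYAI_API_KEY")
    if not api_key:
        raise RuntimeError("MOSTLYAI_API_KEY is not set")
    creds = {"api_key": api_key}
    # The base URL is optional here; include it only when it is set.
    base_url = env.get("MOSTLYAI_BASE_URL")
    if base_url:
        creds["base_url"] = base_url
    return creds


def mask(secret: str, visible: int = 7) -> str:
    """Hide all but the first `visible` characters, for safe logging."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


# Usage (requires the SDK to be installed):
# from mostlyai.sdk import MostlyAI
# mostly_remote = MostlyAI(**credentials_from_env())
```

Masking the key before printing it keeps the secret out of logs and notebooks that might be shared.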
Examples
As you explore the Generators, Synthetic datasets, and Connectors pages, you will find Python code snippets that show how to accomplish a task with the Python SDK. Use the UI tab for the UI steps in the MOSTLY AI Platform and the Local Python environment and Remote platform tabs to switch between suggested local and remote usage of the Python SDK.
In general, the local and remote use of the Python SDK overlaps completely in terms of available classes and methods. Small differences may exist to demonstrate best practices or conventions when using the SDK in a local or remote environment.
Quick start
To whet your appetite, the quick start below trains a generator locally on a tabular dataset, probes it, generates synthetic data, and exports the generator to a file. It then imports the generator into a remote MOSTLY AI Platform.
import pandas as pd
from mostlyai.sdk import MostlyAI
# initialize client (locally or remotely)
mostly = MostlyAI(local=True)
mostly_remote = MostlyAI(
    api_key='INSERT_YOUR_API_KEY',     # or set env var `MOSTLYAI_API_KEY`
    base_url='https://app.mostly.ai',  # or set env var `MOSTLYAI_BASE_URL`
)
# train a generator
df = pd.read_csv('https://github.com/mostly-ai/public-demo-data/raw/dev/census/census.csv.gz')
g = mostly.train(data=df)
# probe for some samples
syn = mostly.probe(g, size=10)
# generate a synthetic dataset
sd = mostly.generate(g, size=2_000)
# start using it
sd.data()
# export a local generator
g.export_to_file('generator_census.zip')
# import into a remote platform (use the path the generator was exported to)
g_remote = mostly_remote.generators.import_from_file('generator_census.zip')
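Before importing an exported generator, it can help to confirm that the file exists and is a readable ZIP archive. The following is a minimal sketch using only the standard library; `is_importable_archive` is a hypothetical helper, and it checks only the container format, not whether the archive holds a valid generator.

```python
import zipfile
from pathlib import Path


def is_importable_archive(path) -> bool:
    """Return True if `path` is an existing file that looks like a ZIP archive."""
    p = Path(path)
    return p.is_file() and zipfile.is_zipfile(p)


# Usage, following the quick start above:
# if is_importable_archive('generator_census.zip'):
#     g_remote = mostly_remote.generators.import_from_file('generator_census.zip')
```

A check like this catches a wrong path or a truncated file before any request is sent to the remote Platform.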