Synthesize a cloud bucket dataset
Teams often share datasets with colleagues and other collaborators through cloud object storage. MOSTLY AI integrates with cloud object storage providers, so you can use datasets uploaded to a cloud bucket as a data source for synthetic data.
- Create a connector for your cloud bucket.
In a cloud storage connector, you define the connection details and credentials to access files in your cloud buckets. In the list below, you can find instructions to create a connector to one of the supported cloud object storage providers.
- Create a cloud storage catalog.
- With the cloud storage catalog open, click Next.
- (Optional) On the Synthetic datasets / Start job screen, review the synthetic dataset configuration.
For more information, see Configure a synthetic dataset.
- Configure a data destination.
- Select Output settings.
- For Data destination, select Download as CSV/Parquet.
If you want to deliver the generated synthetic dataset to the same or another cloud bucket (or even a database), see Configure a data destination.
- To start generating the synthetic dataset, click Create a synthetic dataset.
The Synthetic datasets tab opens, where you can track the progress of the synthetic dataset being generated from your cloud storage bucket.
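If you chose Download as CSV/Parquet and downloaded the CSV file once the job finished, a quick local check can confirm that the file has the columns and row count you expect. The following is a minimal sketch that uses only the Python standard library. It is not part of MOSTLY AI, and the function name `summarize_csv` and the file name in the usage note are hypothetical.

```python
import csv
from collections import Counter


def summarize_csv(path, encoding="utf-8"):
    """Summarize a CSV file: column names, row count, empty cells per column.

    Hypothetical helper for illustration; assumes the first line is a header.
    """
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.DictReader(f)
        columns = list(reader.fieldnames or [])
        empty = Counter()
        rows = 0
        for row in reader:
            rows += 1
            for col in columns:
                # Count missing values and empty strings alike.
                if not row.get(col):
                    empty[col] += 1
    return {
        "columns": columns,
        "rows": rows,
        "empty": {col: empty[col] for col in columns},
    }
```

For example, `summarize_csv("synthetic_dataset.csv")` (an assumed file name) returns a dictionary with the keys `columns`, `rows`, and `empty`, which you can compare against the same summary of the original file from your bucket.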