Use Databricks for synthetic data

With MOSTLY AI, you can connect to a Databricks SQL Warehouse and use it as a data source or destination for your synthetic data.

Prerequisites

To create a Databricks connector, you need your SQL Warehouse connection details, a Databricks catalog name, and a Databricks personal access token. The sections below provide step-by-step guidance for each prerequisite.

Get connection details for your Databricks SQL Warehouse

  1. In Databricks, open the workspace that contains the SQL Warehouse you want to use.

  2. Open the sidebar menu and select SQL Warehouses.

  3. From the list, open the SQL warehouse you want to use for synthetic data.

  4. Select the Connection details tab.

  5. Copy the necessary connection details (hostname, port, protocol, and HTTP path) for the MOSTLY AI Databricks connector.

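Before you paste the copied values into MOSTLY AI, it can help to check that they have the expected shape. The following is a minimal Python sketch, not part of MOSTLY AI or Databricks. `WarehouseDetails` and `check_details` are hypothetical names, and the default port and protocol are assumptions that you should replace with the values shown on your Connection details tab.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WarehouseDetails:
    """Values copied from the Connection details tab (hypothetical container)."""
    hostname: str
    http_path: str
    port: int = 443           # assumption: replace with the port you copied
    protocol: str = "https"   # assumption: replace with the protocol you copied


def check_details(d: WarehouseDetails) -> List[str]:
    """Return a list of problems; an empty list means nothing obvious is wrong."""
    problems = []
    # The host field expects a bare host name, not a URL.
    if "://" in d.hostname or "/" in d.hostname:
        problems.append("hostname should not contain a scheme or a path")
    if not d.http_path.startswith("/"):
        problems.append("HTTP path should start with '/'")
    if not (0 < d.port < 65536):
        problems.append("port is out of range")
    return problems
```

An empty result only means the values look plausible. The connection test that MOSTLY AI runs when you save the connector remains the real check.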

Get Databricks catalog name

  1. From the Databricks sidebar menu, select Data.

  2. Copy the name of the catalog you want to use in MOSTLY AI.

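If you prefer to confirm the catalog name with a query instead of the UI, the sketch below runs the Databricks SQL statement `SHOW CATALOGS` through a DB-API style cursor. It is an illustration only: `list_catalogs` is a hypothetical helper, and the commented wiring assumes that you have installed the optional `databricks-sql-connector` package, which MOSTLY AI does not require.

```python
from typing import Any, List


def list_catalogs(cursor: Any) -> List[str]:
    """Return the catalog names visible to the connected user.

    `cursor` is any DB-API 2.0 style cursor connected to the SQL warehouse.
    """
    cursor.execute("SHOW CATALOGS")
    # Each result row holds one catalog name in its first column.
    return [row[0] for row in cursor.fetchall()]


# Example wiring, assuming databricks-sql-connector is installed
# (the values in angle brackets are placeholders for your own details):
#
#   from databricks import sql
#   with sql.connect(server_hostname="<hostname>",
#                    http_path="<http-path>",
#                    access_token="<token>") as conn:
#       with conn.cursor() as cur:
#           print(list_catalogs(cur))
```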

Create a Databricks personal access token

  1. In Databricks, open your account menu and select User Settings.

  2. Under Settings, select Developer.

  3. Click Manage for Access Tokens.

  4. Click Generate new token.

  5. In the Generate new token window, enter a name that identifies where you intend to use the token.

    💡 Adjust the expiration of the token in the Lifetime (days) box.

  6. Click Generate.

  7. Copy the access token and save it in a secure location.

    ⚠️ Save the token before you close the window. Databricks does not display it again.

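One common way to keep the token out of scripts and notebooks is to read it from an environment variable. The sketch below is a minimal example of that approach. The helper names `load_token` and `mask` are hypothetical, and the variable name `DATABRICKS_TOKEN` is an assumption that you can change.

```python
import os


def load_token(var_name: str = "DATABRICKS_TOKEN") -> str:
    """Read the access token from the environment instead of hard-coding it."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"environment variable {var_name} is not set")
    return token


def mask(token: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the token can be logged safely."""
    return "*" * max(len(token) - visible, 0) + token[-visible:]
```

For example, `print(mask(load_token()))` lets you confirm which token is loaded without printing the full value.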

Create a Databricks connector

If you use the web application, create a new Databricks connector from the Connectors page.

Steps

  1. From the Connectors tab, click Create connector.
  2. On the Connect to database tab, select Databricks.
  3. Configure the Databricks connector.
    1. For Name, enter a name you can distinguish from other connectors.
    2. For Access type, select whether you want to use the connector as a source or destination.
    3. For Host, enter your SQL warehouse server hostname. For more information, see the Prerequisites above.
    4. For HTTP path, enter your SQL warehouse HTTP path.
    5. For Access token, enter your Databricks personal access token.
    6. For Catalog, enter the name of your Databricks catalog.
  4. Click Save to save your new Databricks connector.

    MOSTLY AI tests the connection. If you see an error, check the connection details, update them, and click Save again.

    To save the connector despite the errors, click Save anyway.
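To avoid retyping or mixing up values, you can gather the six form inputs in one place before you fill in the form. The following Python sketch is illustrative only. The function name `build_connector_form` is hypothetical, and the dictionary keys mirror the form labels above. They are not an official MOSTLY AI API schema.

```python
from typing import Dict

ACCESS_TYPES = {"source", "destination"}


def build_connector_form(name: str, access_type: str, host: str,
                         http_path: str, access_token: str,
                         catalog: str) -> Dict[str, str]:
    """Collect the form values and reject obvious mistakes.

    The keys mirror the form labels; they are NOT an official API schema.
    """
    if access_type not in ACCESS_TYPES:
        raise ValueError(f"access_type must be one of {sorted(ACCESS_TYPES)}")
    form = {
        "Name": name,
        "Access type": access_type,
        "Host": host,
        "HTTP path": http_path,
        "Access token": access_token,
        "Catalog": catalog,
    }
    # Every field in the form is expected to be filled in.
    empty = [label for label, value in form.items() if not value.strip()]
    if empty:
        raise ValueError(f"missing values for: {', '.join(empty)}")
    return form
```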

Authenticate with a Service principal

With MOSTLY AI, you can use a Service principal account to access original data stored in Databricks.

In the web application, the Databricks connector form includes the fields needed to authenticate with a Service principal account.

Steps

  1. To use a Service principal for authentication in your Databricks connector, select the Authenticate with Service Principal checkbox.
  2. Configure the Databricks connector.
    1. For Name, enter a name you can distinguish from other connectors.
    2. For Access type, select whether you want to use the connector as a source or destination.
    3. For Host, enter your SQL warehouse server hostname. For more information, see the Prerequisites above.
    4. For HTTP path, enter your SQL warehouse HTTP path.
    5. For Catalog, enter the name of your Databricks catalog.
    6. For Tenant ID, enter your tenant ID.
    7. For Client ID, enter your client ID.
    8. For Client secret, enter your client secret.
  3. Click Save to save your new Databricks connector.

    MOSTLY AI tests the connection. If you see an error, check the connection details, update them, and click Save again.

    To save the connector despite the errors, click Save anyway.
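The two authentication modes differ only in which credential fields they need. As a rough checklist, the sketch below lists the fields for each mode as described in the steps on this page. The names `required_fields` and `missing_fields` are hypothetical, and the assumption that the Access token field is not needed with a Service principal is inferred from the steps above rather than confirmed by an API reference.

```python
from typing import Dict, List

COMMON_FIELDS = ["Name", "Access type", "Host", "HTTP path", "Catalog"]
TOKEN_FIELDS = ["Access token"]
SERVICE_PRINCIPAL_FIELDS = ["Tenant ID", "Client ID", "Client secret"]


def required_fields(use_service_principal: bool) -> List[str]:
    """Return the form fields to fill in for the chosen authentication mode."""
    extra = SERVICE_PRINCIPAL_FIELDS if use_service_principal else TOKEN_FIELDS
    return COMMON_FIELDS + extra


def missing_fields(values: Dict[str, str], use_service_principal: bool) -> List[str]:
    """Return the required fields that have no value yet."""
    return [f for f in required_fields(use_service_principal) if not values.get(f)]
```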

What’s next

Depending on whether you created a source or a destination connector, you can use the connector as: