Connect to Databricks

With MOSTLY AI, you can connect to a Databricks SQL Warehouse and use it as a data source or destination for your synthetic data.

Prerequisites

To create a Databricks connector, you need your SQL Warehouse connection details, a Databricks catalog name, and a Databricks personal access token. The sections below walk through each prerequisite step by step.

Get connection details for your Databricks SQL Warehouse

  1. In Databricks, open the workspace that contains the SQL Warehouse you want to use.

    Databricks - Open workspace
  2. Open the sidebar and from the main menu, select SQL.

    Databricks - Sidebar Select SQL
  3. Open the sidebar menu again and select SQL Warehouses.

    Databricks - Sidebar Select SQL Warehouses
  4. From the list, open the SQL warehouse you want to use for synthetic data.

    Databricks - Open SQL Warehouses
  5. Select the Connection details tab.

    Databricks - SQL Warehouses Connection details
  6. Copy the necessary connection details (hostname, port, protocol, and HTTP path) for the MOSTLY AI Databricks connector.

    Databricks - Get connection details
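
Values pasted from the Connection details tab often carry stray whitespace or a leading `https://`. The following is a minimal sketch, not part of MOSTLY AI or Databricks, of how the copied details could be tidied before you enter them in the connector form. The names `WarehouseDetails` and `parse_details` are hypothetical, the example hostname and HTTP path in the usage note are made up, and the defaults of port 443 and protocol `https` are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehouseDetails:
    """Connection details copied from the SQL warehouse (hypothetical holder)."""
    hostname: str
    http_path: str
    port: int = 443          # assumption: HTTPS default port
    protocol: str = "https"  # assumption: warehouses are reached over HTTPS


def parse_details(hostname: str, http_path: str, port: int = 443,
                  protocol: str = "https") -> WarehouseDetails:
    """Trim pasted values and reject ones that are obviously malformed."""
    host = hostname.strip()
    # The Hostname field expects a bare host, so drop a pasted URL scheme.
    for prefix in ("https://", "http://"):
        if host.lower().startswith(prefix):
            host = host[len(prefix):]
    host = host.rstrip("/")
    if not host or "/" in host or " " in host:
        raise ValueError(f"not a bare hostname: {hostname!r}")

    path = http_path.strip()
    if not path:
        raise ValueError("HTTP path must not be empty")

    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")

    return WarehouseDetails(host, path, port, protocol)
```

For example, `parse_details(" https://dbc-example.cloud.databricks.com/ ", "/sql/1.0/warehouses/abc123")` yields a bare hostname and the unchanged HTTP path, which are the two values the connector form asks for.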

Get Databricks catalog name

  1. From the Databricks sidebar menu, select Data.

    Databricks - Sidebar select Data
  2. Copy the name of the catalog you want to use in MOSTLY AI.

    Databricks - Copy catalog name

Create a Databricks personal access token

  1. In Databricks, open your account menu and select User Settings.

    Databricks account - User settings
  2. On the Access tokens tab, click Generate new token.

    Databricks account - Generate new token
  3. In the dialog window, enter a name that identifies where you intend to use the token.

    [NOTE] To change when the token expires, adjust the value in the Lifetime (days) box.

  4. Click Generate.

    Databricks account - Adjust token name and expiry
  5. Copy the access token and save it in a secure location.

    ⚠️ Do not close the dialog window until you have saved the token in a location that you can access later.

    Databricks account - Copy token
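
Once you have the hostname, HTTP path, and token, you may want to confirm that they work together before entering them in MOSTLY AI. The sketch below is one possible check, under stated assumptions: the token is kept in an environment variable (here `DATABRICKS_TOKEN`, a naming choice you can change), and the optional `smoke_test` helper relies on the separately installed `databricks-sql-connector` package. The function names `load_token` and `smoke_test` are hypothetical and not part of either product.

```python
import os
from typing import Mapping


def load_token(env: Mapping[str, str] = os.environ,
               var: str = "DATABRICKS_TOKEN") -> str:
    """Read the personal access token from the environment, not from source code."""
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(f"set {var} to your Databricks personal access token")
    return token


def smoke_test(hostname: str, http_path: str, token: str) -> bool:
    """Run SELECT 1 against the SQL warehouse and report whether it succeeded.

    Assumes the third-party package databricks-sql-connector is installed;
    it is imported lazily so load_token works without it.
    """
    from databricks import sql

    with sql.connect(server_hostname=hostname,
                     http_path=http_path,
                     access_token=token) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            return cursor.fetchone()[0] == 1
```

Calling `smoke_test(hostname, http_path, load_token())` with your own values should return `True` if the warehouse accepts the token. An authentication or connection error at this point suggests rechecking the values copied in the steps above.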

Create a Databricks connector

  1. From the Connectors tab, click Create connector.

    Click Create connector button

    The Create connector drawer appears on the right.

  2. On the Connect to database tab, select Databricks.

    Select Databricks connector
  3. Configure the Databricks connector.

    1. For Connector name, enter a name for the Databricks connector.

      A name that combines Databricks with the catalog name can help you tell this connector apart from other Databricks connectors.

    2. For Connection type, select whether you want to use the connector as a source or destination.

      You can select only data source connectors when you create a new catalog.

      Similarly, you can select only data destination connectors when you configure a destination for the new synthetic dataset.

    3. For Hostname, enter your SQL warehouse server hostname. For more information, see the Prerequisites above.

    4. For HTTP path, enter your SQL warehouse HTTP path.

    5. For Access token, enter your Databricks personal access token.

    6. For Catalog, enter the name of your Databricks catalog.

    7. For Schema, enter the schema you want to use.

      If you leave Schema empty, MOSTLY AI uses the schema named default.

      Configure Databricks connector
  4. Click Save.
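
The form fields above can be summarized as a small sketch that collects the values and applies the defaults described in the steps. This is purely illustrative: `connector_fields` is a hypothetical helper, the dictionary keys simply mirror the form labels and are not a MOSTLY AI API payload, and the `Databricks_<catalog>` naming pattern is only one way to read the naming suggestion.

```python
def connector_fields(catalog: str, hostname: str, http_path: str,
                     access_token: str, schema: str = "",
                     connection_type: str = "source",
                     name: str = "") -> dict:
    """Collect the values for the Databricks connector form (illustrative only)."""
    if connection_type not in ("source", "destination"):
        raise ValueError("connection_type must be 'source' or 'destination'")
    if not catalog.strip():
        raise ValueError("catalog name is required")

    return {
        # Assumed naming pattern: Databricks plus the catalog name.
        "Connector name": name.strip() or f"Databricks_{catalog.strip()}",
        "Connection type": connection_type,
        "Hostname": hostname.strip(),
        "HTTP path": http_path.strip(),
        "Access token": access_token,
        "Catalog": catalog.strip(),
        # An empty Schema falls back to the schema named "default".
        "Schema": schema.strip() or "default",
    }
```

Keeping the values in one place like this can make it easier to create a matching pair of connectors, one as a source and one as a destination, by changing only `connection_type`.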

Result

Your Databricks connector is now saved.

What's next

You can now use the Databricks connector as a data source when you create a new catalog.

You can also use the Databricks connector as a destination.

You can use different types of data sources and destinations for a synthetic dataset. For example, if your data source is a Databricks database, you can deliver the generated synthetic data to any of the supported databases or cloud storage providers.