From 6e00bdcc3ee545b397646507c6c53d119b2b99ed Mon Sep 17 00:00:00 2001 From: Gergelj Kis Date: Fri, 17 Apr 2026 14:16:10 +0200 Subject: [PATCH 1/3] readme fixed --- .../azure/azure-vnet-injection/README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection/README.md b/workspace-setup/terraform-examples/azure/azure-vnet-injection/README.md index 7ff1eca..6849dc7 100644 --- a/workspace-setup/terraform-examples/azure/azure-vnet-injection/README.md +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection/README.md @@ -91,12 +91,12 @@ Before proceeding, ensure your VNet meets the following requirements: If you want Terraform to automatically load values for variables from a file, the file must be named either `terraform.tfvars`, `terraform.tfvars.json`, or end with `.auto.tfvars` or `.auto.tfvars.json`. If your file has a custom name (like `random_name.tfvars`), you must provide it explicitly using the `-var-file` flag when running Terraform commands. -You can use the `terraform.tfvars.example` file as a base for your variables. Leter renaming this file to `terraform.tfvars` will automatically load the values for the variables. +You can use the `terraform.tfvars.example` file as a base for your variables. Later renaming this file to `terraform.tfvars` will automatically load the values for the variables. ### List of variables - tenant_id - - You Azure tenant ID + - Your Azure Tenant ID - azure_subscription_id - Your Azure Subscription ID - resource_group_name @@ -130,7 +130,7 @@ You can use the `terraform.tfvars.example` file as a base for your variables. Le - subnet_private_cidr - The CIDR address of the second subnet - managed_resource_group_name - - The name of the managed resource group + - The name of the managed resource group. This is an optional field; if no name is provided, Azure will generate one automatically. 
## Deploy @@ -156,7 +156,7 @@ After successful deployment: terraform output workspace_url # Get the workspace ID -terraform output workspace_id +terraform output databricks_workspace_id ``` Navigate to the workspace URL and log in with your Databricks credentials. From aba3a3b8b4165392857acd2b9dae6efe40b036d1 Mon Sep 17 00:00:00 2001 From: Gergelj Kis Date: Fri, 17 Apr 2026 14:16:26 +0200 Subject: [PATCH 2/3] Added PLAYBOOK.md file --- .../azure/azure-vnet-injection/PLAYBOOK.md | 447 ++++++++++++++++++ 1 file changed, 447 insertions(+) create mode 100644 workspace-setup/terraform-examples/azure/azure-vnet-injection/PLAYBOOK.md diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection/PLAYBOOK.md b/workspace-setup/terraform-examples/azure/azure-vnet-injection/PLAYBOOK.md new file mode 100644 index 0000000..d0d900d --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection/PLAYBOOK.md @@ -0,0 +1,447 @@ +# Deployment Playbook: Azure Databricks Workspace with VNet Injection + +This playbook walks you through deploying an Azure Databricks workspace with VNet injection using the Terraform code provided in the `tf/` directory. Follow each step in order. + +--- + +## Table of Contents + +1. [What This Deploys](#1-what-this-deploys) +2. [Prerequisites](#2-prerequisites) +3. [Gather Required Information](#3-gather-required-information) +4. [Authenticate to Azure](#4-authenticate-to-azure) +5. [Configure Variables](#5-configure-variables) +6. [Deploy](#6-deploy) +7. [Verify the Deployment](#7-verify-the-deployment) +8. [Tear Down / Cleanup](#8-tear-down--cleanup) +9. [Troubleshooting](#9-troubleshooting) +10. [Additional Resources](#10-additional-resources) + +--- + +## 1. What This Deploys + +This Terraform project provisions a **Premium-tier** Azure Databricks workspace deployed into your own Virtual Network (VNet injection) with **Secure Cluster Connectivity** (no public IP on cluster nodes). 
The deployment supports two modes: creating a new VNet or injecting into an existing one. + +### Resources Created + + +| Category | Resources | +| ------------------- | ----------------------------------------------------------------------------------------------------------------------------- | +| **Resource Groups** | Workspace resource group; VNet resource group (if creating a new VNet); Azure-managed resource group for Databricks internals | +| **Networking** | VNet (new or existing), public subnet, private subnet, Network Security Group (NSG), NAT Gateway with static public IP | +| **Databricks** | Premium workspace with VNet injection, Unity Catalog metastore (new or existing assignment), workspace admin user assignment | + + +### Architecture at a Glance + +``` +Azure Subscription +│ +├── Resource Group (workspace) +│ └── Databricks Workspace (Premium, no public IP) +│ └── Managed Resource Group (created by Azure) +│ +└── Resource Group (VNet — new or existing) + ├── Virtual Network + │ ├── Public Subnet (delegated to Databricks) + │ └── Private Subnet (delegated to Databricks) + ├── Network Security Group (associated with both subnets) + ├── NAT Gateway (associated with both subnets) + └── Static Public IP (attached to NAT Gateway) +``` + +### Key Security Characteristics + +- **No public IPs** on cluster nodes (`no_public_ip = true`). +- **NAT Gateway** provides a single, stable egress IP for allowlisting. +- **Default outbound access disabled** on both subnets — all egress must go through the NAT Gateway. +- Both subnets are **delegated** to `Microsoft.Databricks/workspaces`. + +--- + +## 2. 
Prerequisites + +Before you begin, ensure you have the following: + +### Tools + + +| Tool | Minimum Version | Install Link | +| --------- | --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Terraform | ~> 1.3 | [Install Terraform](https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli#install-terraform) | +| Azure CLI | Latest | [Mac](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos) / [Windows](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-windows) | + + +### Access & Permissions + +- **Azure**: Contributor role on the target Azure subscription. Subscription-level access is required because Databricks provisioning creates resources in a separate managed resource group. +- **Databricks**: Account admin access to your Databricks account. +- The `admin_user` email you plan to use must already exist as a user in the Databricks account. + +### Network Planning + +Decide which deployment mode you will use: + + +| Mode | When to Use | Variable Setting | +| ----------------- | ------------------------------------------------------------------------------------------------------------------ | ------------------------- | +| **New VNet** | You want Terraform to create the VNet, subnets, and resource group for you. | `create_new_vnet = true` | +| **Existing VNet** | You already have a VNet and want to inject Databricks into it. The VNet and its resource group must already exist. | `create_new_vnet = false` | + + +In both cases: + +- The VNet CIDR must be between `/16` and `/24`. +- You need two dedicated, non-overlapping subnets (public and private) that are **not used by other resources**. The subnets will be created by Terraform in both modes. + +--- + +## 3. Gather Required Information + +Collect the following values before starting. You will need them in Step 5. 
+ +### Azure + + +| Value | Where to Find It | +| ------------------- | ---------------------------------------------------------------------------------------------- | +| **Tenant ID** | Azure Portal > Microsoft Entra ID > Overview, or run `az account show --query tenantId -o tsv` | +| **Subscription ID** | Azure Portal > Subscriptions, or run `az account show --query id -o tsv` | + + +### Databricks + + +| Value | Where to Find It | +| ----------------------------------------- | ----------------------------------------------------------------------------------------------------- | +| **Account ID** | [Databricks Account Console](https://accounts.azuredatabricks.net/) > Settings > Account details | +| **Admin user email** | The email of a user that already exists in the Databricks account | +| **Existing metastore ID** (if applicable) | Account Console > Catalog > Metastore details. Leave empty if you want Terraform to create a new one. | + + +### Naming + +Choose names for the following resources. These must be unique within your subscription/account: + +- Resource group name (e.g., `rg-databricks-prod`) +- Workspace name (e.g., `databricks-workspace-prod`) +- Root storage account name — **lowercase letters and numbers only, 3-24 characters** (e.g., `dbaborootprod01`) +- VNet name (e.g., `vnet-databricks-prod`) +- VNet resource group name (e.g., `rg-vnet-databricks-prod`) +- Managed resource group name (optional — Azure will auto-generate if not provided) +- New metastore name (only if creating a new metastore) + +### Network CIDRs + +Plan your address space. Example: + + +| Parameter | Example Value | +| ------------------- | ------------- | +| VNet CIDR | `10.0.0.0/16` | +| Public subnet CIDR | `10.0.1.0/24` | +| Private subnet CIDR | `10.0.2.0/24` | + + +The subnet CIDRs must fall within the VNet CIDR range and must not overlap with each other. + +--- + +## 4. 
Authenticate to Azure + +Open a terminal and navigate to the `workspace-setup/terraform-examples/azure/azure-vnet-injection/tf/` directory: + +```sh +cd ./workspace-setup/terraform-examples/azure/azure-vnet-injection/tf/ +``` + +### Interactive Login (recommended for first-time deployment) + +```sh +az login +``` + +This opens a browser window for authentication. If you have multiple subscriptions, set the correct one: + +```sh +az account set --subscription "<subscription-id>" +``` + +Verify your active subscription: + +```sh +az account show +``` + +### Service Principal Login (for CI/CD pipelines) + +If you are running this from an automated pipeline, authenticate with a service principal: + +```sh +az login --service-principal -u <app-id> -p <client-secret> --tenant <tenant-id> +``` + +See the [README](README.md) for detailed instructions on creating a service principal. + +--- + +## 5. Configure Variables + +### 5.1 Create Your Variables File + +Copy the example file: + +```sh +cp terraform.tfvars.example terraform.tfvars +``` + +### 5.2 Edit the Variables + +Open `terraform.tfvars` in your editor and fill in every value. Below are two complete examples for the two deployment modes. 
+ +#### Example A: Creating a New VNet + +```hcl +# Azure Configuration +tenant_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +azure_subscription_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +tags = { + "Owner" = "Jane Doe" + "Environment" = "Production" +} + +# Databricks Workspace Configuration +resource_group_name = "rg-databricks-prod" +workspace_name = "databricks-workspace-prod" +admin_user = "jane.doe@company.com" +root_storage_name = "daborootprod01" +location = "westeurope" +databricks_account_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +existing_metastore_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +new_metastore_name = "" +managed_resource_group_name = null + +# Network Configuration — new VNet +create_new_vnet = true +vnet_name = "vnet-databricks-prod" +vnet_resource_group_name = "rg-vnet-databricks-prod" +cidr = "10.0.0.0/16" +subnet_public_cidr = "10.0.1.0/24" +subnet_private_cidr = "10.0.2.0/24" +``` + +#### Example B: Using an Existing VNet + +```hcl +# Azure Configuration +tenant_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +azure_subscription_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +tags = { + "Owner" = "Jane Doe" + "Environment" = "Production" +} + +# Databricks Workspace Configuration +resource_group_name = "rg-databricks-prod" +workspace_name = "databricks-workspace-prod" +admin_user = "jane.doe@company.com" +root_storage_name = "daborootprod01" +location = "westeurope" +databricks_account_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +existing_metastore_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +new_metastore_name = "" +managed_resource_group_name = null + +# Network Configuration — existing VNet +create_new_vnet = false +vnet_name = "my-existing-vnet" +vnet_resource_group_name = "my-existing-vnet-rg" +cidr = "10.0.0.0/16" +subnet_public_cidr = "10.0.10.0/24" +subnet_private_cidr = "10.0.11.0/24" +``` + +> **Note on metastore configuration:** If `existing_metastore_id` is left empty (`""`), Terraform will create a new metastore using 
`new_metastore_name`. If you provide an existing metastore ID, `new_metastore_name` is ignored. Most regions already have a metastore — check the Account Console first. + +### 5.3 Variable Reference + + +| Variable | Required | Default | Description | +| ----------------------------- | -------- | ------------- | ----------------------------------------------------------------------------------------------------------------------- | +| `tenant_id` | Yes | — | Your Azure Tenant ID | +| `azure_subscription_id` | Yes | — | Your Azure Subscription ID | +| `resource_group_name` | Yes | — | Name of the resource group for the Databricks workspace | +| `managed_resource_group_name` | No | `null` | Name of the managed resource group. Must differ from `resource_group_name`. If `null`, Azure generates one. | +| `tags` | No | `{}` | Map of tags applied to all resources | +| `databricks_account_id` | Yes | — | Your Databricks account ID (treated as sensitive) | +| `workspace_name` | Yes | — | Name of the Databricks workspace | +| `admin_user` | Yes | — | Email of the user to assign as workspace and metastore admin | +| `root_storage_name` | Yes | — | Root storage account name. Lowercase letters and numbers only, 3-24 characters. | +| `location` | Yes | — | Azure region ([supported regions](https://learn.microsoft.com/en-us/azure/databricks/resources/supported-regions)) | +| `existing_metastore_id` | No | `""` | ID of an existing metastore. Leave empty to create a new one. | +| `new_metastore_name` | No | `""` | Name for a new metastore. Only used when `existing_metastore_id` is empty. Alphanumeric, hyphens, and underscores only. 
| +| `create_new_vnet` | No | `true` | `true` to create a new VNet; `false` to use an existing one | +| `vnet_name` | Yes | — | Name of the VNet (new or existing) | +| `vnet_resource_group_name` | Yes | — | Resource group containing the VNet (created if new, must exist if existing) | +| `cidr` | No | `10.0.0.0/20` | CIDR block for the VNet (between /16 and /24) | +| `subnet_public_cidr` | Yes | — | CIDR for the public (host) subnet | +| `subnet_private_cidr` | Yes | — | CIDR for the private (container) subnet | + + +--- + +## 6. Deploy + +Run the following commands from the `tf/` directory: + +### 6.1 Initialize Terraform + +```sh +terraform init +``` + +This downloads the required providers (`azurerm ~> 4.50`, `databricks ~> 1.84`) and initializes the working directory. You should see **"Terraform has been successfully initialized"**. + +### 6.2 Review the Plan + +```sh +terraform plan +``` + +Review the output carefully. It will show you every resource that Terraform intends to create. Verify that: + +- The correct subscription and region are being used. +- Resource names match your expectations. +- The number of resources to be created looks right (roughly 13-16 depending on whether you're creating a new VNet and metastore). + +### 6.3 Apply + +```sh +terraform apply +``` + +Terraform will display the plan again and prompt for confirmation. Type `yes` to proceed. + +The deployment typically takes **10-15 minutes**. The most time-consuming step is the Databricks workspace provisioning itself. + +When complete, Terraform will print the outputs, including your workspace URL. + +--- + +## 7. 
Verify the Deployment + +### 7.1 Check Terraform Outputs + +```sh +terraform output workspace_url +terraform output databricks_workspace_id +terraform output nat_gateway_public_ip +``` + +All available outputs: + + +| Output | Description | +| --------------------------- | ----------------------------------------------------------------------- | +| `workspace_url` | The URL of the deployed Databricks workspace | +| `databricks_workspace_id` | The Azure resource ID of the workspace | +| `nat_gateway_public_ip` | The static public IP used for egress (useful for firewall allowlisting) | +| `vnet_id` | Azure resource ID of the VNet | +| `public_subnet_id` | Azure resource ID of the public subnet | +| `private_subnet_id` | Azure resource ID of the private subnet | +| `nat_gateway_id` | Azure resource ID of the NAT Gateway | +| `security_group_id` | Azure resource ID of the Network Security Group | +| `managed_resource_group_id` | Azure resource ID of the Databricks managed resource group | + + +### 7.2 Access the Workspace + +1. Open the `workspace_url` in your browser. +2. Log in with the credentials of the `admin_user` you configured. +3. Verify Unity Catalog is attached: navigate to **Catalog** in the left sidebar. You should see the metastore assigned to your workspace. + +### 7.3 Validate Networking + +To confirm VNet injection and Secure Cluster Connectivity are working: + +1. Create a small test cluster in the workspace. +2. While the cluster is starting, navigate to the Azure Portal: + - Open the VNet resource and verify both subnets show Databricks delegation. + - Open the managed resource group — you should see Databricks-managed resources (disks, NICs) appearing without public IPs. +3. Once the cluster is running, check the NAT Gateway's metrics in the Azure Portal to confirm egress traffic is flowing through it. 
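The delegation check in step 2 can also be scripted with the Azure CLI. The following is a minimal sketch rather than part of this project: the function names are illustrative, and the subnet name is a placeholder because the names Terraform assigns to the subnets are not stated here. It assumes an authenticated `az` session.

```sh
# Reads `az ... -o tsv` output on stdin and succeeds only if one of the
# listed delegations is the Databricks service.
has_databricks_delegation() {
  grep -qx 'Microsoft.Databricks/workspaces'
}

# Hypothetical helper: $1 = VNet resource group, $2 = VNet name, $3 = subnet name.
# Queries the subnet's delegations and pipes them into the check above.
check_subnet_delegation() {
  az network vnet subnet show \
    --resource-group "$1" --vnet-name "$2" --name "$3" \
    --query "delegations[].serviceName" -o tsv | has_databricks_delegation
}

# Example call (replace the placeholder with the subnet name shown in the portal):
# check_subnet_delegation rg-vnet-databricks-prod vnet-databricks-prod <public-subnet-name>
```

Run the helper once per subnet. A non-zero exit status means the subnet is not delegated to `Microsoft.Databricks/workspaces`.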
+ +### 7.4 Note the NAT Gateway IP + +The `nat_gateway_public_ip` output provides the single egress IP for all outbound traffic from your Databricks clusters. Use this IP for: + +- Firewall allowlisting on external data sources. +- Network security rules on downstream services. + +--- + +## 8. Tear Down / Cleanup + +To destroy all resources created by this Terraform project: + +```sh +terraform destroy +``` + +Type `yes` when prompted. This will remove the workspace, networking resources, resource groups, and the metastore (if Terraform created it). + +> **Warning:** This is irreversible. Ensure you have backed up any data, notebooks, or configurations from the workspace before destroying it. + +> **Note:** If you manually created resources inside the managed resource group (or the workspace itself), `terraform destroy` may fail. Remove those resources manually first, then retry. + +--- + +## 9. Troubleshooting + +### "Managed resource group name should not be same as resource group name" + +The `managed_resource_group_name` variable must differ from `resource_group_name`. Either choose a different name or set it to `null` to let Azure generate one. + +### "root_storage_name can only contain lowercase letters and numbers" + +The root storage account name has strict naming rules: only lowercase letters (`a-z`) and numbers (`0-9`), between 3 and 24 characters. No hyphens, underscores, or uppercase characters. + +### Deployment fails at workspace creation with a VNet error + +- Verify your VNet CIDR is between `/16` and `/24`. +- Ensure the public and private subnet CIDRs fall within the VNet CIDR and do not overlap. +- If using an existing VNet (`create_new_vnet = false`), confirm the VNet and its resource group already exist and you have the correct names. +- Ensure the subnets are not already in use by other services. 
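The CIDR checks listed above can be run locally before `terraform plan`. This is a minimal sketch in POSIX shell, assuming plain dotted-quad IPv4 CIDRs without leading zeros. The function names are illustrative and not part of this project.

```sh
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  # The subshell keeps the IFS change and positional parameters local.
  ( IFS=.; set -- $1; echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 )) )
}

# First address (as an integer) of a CIDR block such as 10.0.1.0/24.
cidr_first() {
  n=$(ip_to_int "${1%/*}")
  size=$(( 1 << (32 - ${1#*/}) ))
  echo $(( $n - $n % $size ))
}

# Last address (as an integer) of a CIDR block.
cidr_last() {
  echo $(( $(cidr_first "$1") + (1 << (32 - ${1#*/})) - 1 ))
}

# Succeeds if the prefix length of CIDR $1 is between /16 and /24.
vnet_prefix_ok() {
  p=${1#*/}
  [ "$p" -ge 16 ] && [ "$p" -le 24 ]
}

# Succeeds if CIDR $1 lies entirely inside CIDR $2.
cidr_within() {
  [ "$(cidr_first "$1")" -ge "$(cidr_first "$2")" ] &&
    [ "$(cidr_last "$1")" -le "$(cidr_last "$2")" ]
}

# Succeeds if CIDRs $1 and $2 share at least one address.
cidr_overlap() {
  [ "$(cidr_first "$1")" -le "$(cidr_last "$2")" ] &&
    [ "$(cidr_first "$2")" -le "$(cidr_last "$1")" ]
}

# Example using the values from the new-VNet example in this playbook.
vnet=10.0.0.0/16 public=10.0.1.0/24 private=10.0.2.0/24
vnet_prefix_ok "$vnet"         || echo "VNet prefix must be between /16 and /24"
cidr_within "$public" "$vnet"  || echo "public subnet is outside the VNet"
cidr_within "$private" "$vnet" || echo "private subnet is outside the VNet"
if cidr_overlap "$public" "$private"; then echo "subnets overlap"; fi
```

With the example values the script prints nothing, because every check passes. These checks cover only address arithmetic; whether a subnet is already used by another service still has to be confirmed in Azure.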
+ +### "Error: retrieving User" or permission errors on the Databricks provider + +- Confirm the `admin_user` email exists in the Databricks account (Account Console > User Management). +- Ensure your `databricks_account_id` is correct. +- Verify your Azure CLI session is authenticated and has the correct subscription set. + +### Terraform state lock errors + +If a previous `terraform apply` was interrupted, you may see state lock errors. Wait a few minutes for the lock to expire, or remove it manually using the lock ID printed in the error message: + +```sh +terraform force-unlock <LOCK_ID> +``` + +### Metastore assignment fails + +- If providing `existing_metastore_id`, confirm the metastore ID is correct and belongs to the same region as your workspace. +- If creating a new metastore, ensure `new_metastore_name` is not empty and contains only alphanumeric characters, hyphens, and underscores. +- A region can only have one metastore. If one already exists, use `existing_metastore_id` instead of creating a new one. + +--- + +## 10. Additional Resources + +- [Azure Databricks VNet Injection Documentation](https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/vnet-inject) +- [Secure Cluster Connectivity (No Public IP)](https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/secure-cluster-connectivity) +- [Terraform Databricks Provider Documentation](https://registry.terraform.io/providers/databricks/databricks/latest/docs) +- [Terraform AzureRM Provider Documentation](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs) +- [Databricks Security Reference Architecture (Terraform)](https://github.com/databricks/terraform-databricks-sra/tree/main/azure) +- [Terraform Examples with Private Link](https://github.com/databricks/terraform-databricks-examples/tree/main/examples/adb-with-private-link-standard) +- [Azure Databricks Supported Regions](https://learn.microsoft.com/en-us/azure/databricks/resources/supported-regions) + From 
6d5985aecdb373da6f13a0f29ae6eda3453552c0 Mon Sep 17 00:00:00 2001 From: Gergelj Kis Date: Wed, 13 May 2026 13:44:54 +0200 Subject: [PATCH 3/3] Playbook contents moved to the readme file --- .../azure/azure-vnet-injection/PLAYBOOK.md | 447 --------------- .../azure/azure-vnet-injection/README.md | 515 +++++++++++++----- 2 files changed, 384 insertions(+), 578 deletions(-) delete mode 100644 workspace-setup/terraform-examples/azure/azure-vnet-injection/PLAYBOOK.md diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection/PLAYBOOK.md b/workspace-setup/terraform-examples/azure/azure-vnet-injection/PLAYBOOK.md deleted file mode 100644 index d0d900d..0000000 --- a/workspace-setup/terraform-examples/azure/azure-vnet-injection/PLAYBOOK.md +++ /dev/null @@ -1,447 +0,0 @@ -# Deployment Playbook: Azure Databricks Workspace with VNet Injection - -This playbook walks you through deploying an Azure Databricks workspace with VNet injection using the Terraform code provided in the `tf/` directory. Follow each step in order. - ---- - -## Table of Contents - -1. [What This Deploys](#1-what-this-deploys) -2. [Prerequisites](#2-prerequisites) -3. [Gather Required Information](#3-gather-required-information) -4. [Authenticate to Azure](#4-authenticate-to-azure) -5. [Configure Variables](#5-configure-variables) -6. [Deploy](#6-deploy) -7. [Verify the Deployment](#7-verify-the-deployment) -8. [Tear Down / Cleanup](#8-tear-down--cleanup) -9. [Troubleshooting](#9-troubleshooting) -10. [Additional Resources](#10-additional-resources) - ---- - -## 1. What This Deploys - -This Terraform project provisions a **Premium-tier** Azure Databricks workspace deployed into your own Virtual Network (VNet injection) with **Secure Cluster Connectivity** (no public IP on cluster nodes). The deployment supports two modes: creating a new VNet or injecting into an existing one. 
- -### Resources Created - - -| Category | Resources | -| ------------------- | ----------------------------------------------------------------------------------------------------------------------------- | -| **Resource Groups** | Workspace resource group; VNet resource group (if creating a new VNet); Azure-managed resource group for Databricks internals | -| **Networking** | VNet (new or existing), public subnet, private subnet, Network Security Group (NSG), NAT Gateway with static public IP | -| **Databricks** | Premium workspace with VNet injection, Unity Catalog metastore (new or existing assignment), workspace admin user assignment | - - -### Architecture at a Glance - -``` -Azure Subscription -│ -├── Resource Group (workspace) -│ └── Databricks Workspace (Premium, no public IP) -│ └── Managed Resource Group (created by Azure) -│ -└── Resource Group (VNet — new or existing) - ├── Virtual Network - │ ├── Public Subnet (delegated to Databricks) - │ └── Private Subnet (delegated to Databricks) - ├── Network Security Group (associated with both subnets) - ├── NAT Gateway (associated with both subnets) - └── Static Public IP (attached to NAT Gateway) -``` - -### Key Security Characteristics - -- **No public IPs** on cluster nodes (`no_public_ip = true`). -- **NAT Gateway** provides a single, stable egress IP for allowlisting. -- **Default outbound access disabled** on both subnets — all egress must go through the NAT Gateway. -- Both subnets are **delegated** to `Microsoft.Databricks/workspaces`. - ---- - -## 2. 
Prerequisites - -Before you begin, ensure you have the following: - -### Tools - - -| Tool | Minimum Version | Install Link | -| --------- | --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Terraform | ~> 1.3 | [Install Terraform](https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli#install-terraform) | -| Azure CLI | Latest | [Mac](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos) / [Windows](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-windows) | - - -### Access & Permissions - -- **Azure**: Contributor role on the target Azure subscription. Subscription-level access is required because Databricks provisioning creates resources in a separate managed resource group. -- **Databricks**: Account admin access to your Databricks account. -- The `admin_user` email you plan to use must already exist as a user in the Databricks account. - -### Network Planning - -Decide which deployment mode you will use: - - -| Mode | When to Use | Variable Setting | -| ----------------- | ------------------------------------------------------------------------------------------------------------------ | ------------------------- | -| **New VNet** | You want Terraform to create the VNet, subnets, and resource group for you. | `create_new_vnet = true` | -| **Existing VNet** | You already have a VNet and want to inject Databricks into it. The VNet and its resource group must already exist. | `create_new_vnet = false` | - - -In both cases: - -- The VNet CIDR must be between `/16` and `/24`. -- You need two dedicated, non-overlapping subnets (public and private) that are **not used by other resources**. The subnets will be created by Terraform in both modes. - ---- - -## 3. Gather Required Information - -Collect the following values before starting. You will need them in Step 5. 
- -### Azure - - -| Value | Where to Find It | -| ------------------- | ---------------------------------------------------------------------------------------------- | -| **Tenant ID** | Azure Portal > Microsoft Entra ID > Overview, or run `az account show --query tenantId -o tsv` | -| **Subscription ID** | Azure Portal > Subscriptions, or run `az account show --query id -o tsv` | - - -### Databricks - - -| Value | Where to Find It | -| ----------------------------------------- | ----------------------------------------------------------------------------------------------------- | -| **Account ID** | [Databricks Account Console](https://accounts.azuredatabricks.net/) > Settings > Account details | -| **Admin user email** | The email of a user that already exists in the Databricks account | -| **Existing metastore ID** (if applicable) | Account Console > Catalog > Metastore details. Leave empty if you want Terraform to create a new one. | - - -### Naming - -Choose names for the following resources. These must be unique within your subscription/account: - -- Resource group name (e.g., `rg-databricks-prod`) -- Workspace name (e.g., `databricks-workspace-prod`) -- Root storage account name — **lowercase letters and numbers only, 3-24 characters** (e.g., `dbaborootprod01`) -- VNet name (e.g., `vnet-databricks-prod`) -- VNet resource group name (e.g., `rg-vnet-databricks-prod`) -- Managed resource group name (optional — Azure will auto-generate if not provided) -- New metastore name (only if creating a new metastore) - -### Network CIDRs - -Plan your address space. Example: - - -| Parameter | Example Value | -| ------------------- | ------------- | -| VNet CIDR | `10.0.0.0/16` | -| Public subnet CIDR | `10.0.1.0/24` | -| Private subnet CIDR | `10.0.2.0/24` | - - -The subnet CIDRs must fall within the VNet CIDR range and must not overlap with each other. - ---- - -## 4. 
Authenticate to Azure - -Open a terminal and navigate to the `workspace-setup/terraform-examples/azure/azure-vnet-injection/tf/` directory: - -```sh -cd ./workspace-setup/terraform-examples/azure/azure-vnet-injection/tf/ -``` - -### Interactive Login (recommended for first-time deployment) - -```sh -az login -``` - -This opens a browser window for authentication. If you have multiple subscriptions, set the correct one: - -```sh -az account set --subscription "<subscription-id>" -``` - -Verify your active subscription: - -```sh -az account show -``` - -### Service Principal Login (for CI/CD pipelines) - -If you are running this from an automated pipeline, authenticate with a service principal: - -```sh -az login --service-principal -u <app-id> -p <client-secret> --tenant <tenant-id> -``` - -See the [README](README.md) for detailed instructions on creating a service principal. - ---- - -## 5. Configure Variables - -### 5.1 Create Your Variables File - -Copy the example file: - -```sh -cp terraform.tfvars.example terraform.tfvars -``` - -### 5.2 Edit the Variables - -Open `terraform.tfvars` in your editor and fill in every value. Below are two complete examples for the two deployment modes. 
- -#### Example A: Creating a New VNet - -```hcl -# Azure Configuration -tenant_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -azure_subscription_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -tags = { - "Owner" = "Jane Doe" - "Environment" = "Production" -} - -# Databricks Workspace Configuration -resource_group_name = "rg-databricks-prod" -workspace_name = "databricks-workspace-prod" -admin_user = "jane.doe@company.com" -root_storage_name = "daborootprod01" -location = "westeurope" -databricks_account_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -existing_metastore_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -new_metastore_name = "" -managed_resource_group_name = null - -# Network Configuration — new VNet -create_new_vnet = true -vnet_name = "vnet-databricks-prod" -vnet_resource_group_name = "rg-vnet-databricks-prod" -cidr = "10.0.0.0/16" -subnet_public_cidr = "10.0.1.0/24" -subnet_private_cidr = "10.0.2.0/24" -``` - -#### Example B: Using an Existing VNet - -```hcl -# Azure Configuration -tenant_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -azure_subscription_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -tags = { - "Owner" = "Jane Doe" - "Environment" = "Production" -} - -# Databricks Workspace Configuration -resource_group_name = "rg-databricks-prod" -workspace_name = "databricks-workspace-prod" -admin_user = "jane.doe@company.com" -root_storage_name = "daborootprod01" -location = "westeurope" -databricks_account_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -existing_metastore_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -new_metastore_name = "" -managed_resource_group_name = null - -# Network Configuration — existing VNet -create_new_vnet = false -vnet_name = "my-existing-vnet" -vnet_resource_group_name = "my-existing-vnet-rg" -cidr = "10.0.0.0/16" -subnet_public_cidr = "10.0.10.0/24" -subnet_private_cidr = "10.0.11.0/24" -``` - -> **Note on metastore configuration:** If `existing_metastore_id` is left empty (`""`), Terraform will create a new metastore using 
`new_metastore_name`. If you provide an existing metastore ID, `new_metastore_name` is ignored. Most regions already have a metastore — check the Account Console first. - -### 5.3 Variable Reference - - -| Variable | Required | Default | Description | -| ----------------------------- | -------- | ------------- | ----------------------------------------------------------------------------------------------------------------------- | -| `tenant_id` | Yes | — | Your Azure Tenant ID | -| `azure_subscription_id` | Yes | — | Your Azure Subscription ID | -| `resource_group_name` | Yes | — | Name of the resource group for the Databricks workspace | -| `managed_resource_group_name` | No | `null` | Name of the managed resource group. Must differ from `resource_group_name`. If `null`, Azure generates one. | -| `tags` | No | `{}` | Map of tags applied to all resources | -| `databricks_account_id` | Yes | — | Your Databricks account ID (treated as sensitive) | -| `workspace_name` | Yes | — | Name of the Databricks workspace | -| `admin_user` | Yes | — | Email of the user to assign as workspace and metastore admin | -| `root_storage_name` | Yes | — | Root storage account name. Lowercase letters and numbers only, 3-24 characters. | -| `location` | Yes | — | Azure region ([supported regions](https://learn.microsoft.com/en-us/azure/databricks/resources/supported-regions)) | -| `existing_metastore_id` | No | `""` | ID of an existing metastore. Leave empty to create a new one. | -| `new_metastore_name` | No | `""` | Name for a new metastore. Only used when `existing_metastore_id` is empty. Alphanumeric, hyphens, and underscores only. 
| -| `create_new_vnet` | No | `true` | `true` to create a new VNet; `false` to use an existing one | -| `vnet_name` | Yes | — | Name of the VNet (new or existing) | -| `vnet_resource_group_name` | Yes | — | Resource group containing the VNet (created if new, must exist if existing) | -| `cidr` | No | `10.0.0.0/20` | CIDR block for the VNet (between /16 and /24) | -| `subnet_public_cidr` | Yes | — | CIDR for the public (host) subnet | -| `subnet_private_cidr` | Yes | — | CIDR for the private (container) subnet | - - ---- - -## 6. Deploy - -Run the following commands from the `tf/` directory: - -### 6.1 Initialize Terraform - -```sh -terraform init -``` - -This downloads the required providers (`azurerm ~> 4.50`, `databricks ~> 1.84`) and initializes the working directory. You should see **"Terraform has been successfully initialized"**. - -### 6.2 Review the Plan - -```sh -terraform plan -``` - -Review the output carefully. It will show you every resource that Terraform intends to create. Verify that: - -- The correct subscription and region are being used. -- Resource names match your expectations. -- The number of resources to be created looks right (roughly 13-16 depending on whether you're creating a new VNet and metastore). - -### 6.3 Apply - -```sh -terraform apply -``` - -Terraform will display the plan again and prompt for confirmation. Type `yes` to proceed. - -The deployment typically takes **10-15 minutes**. The most time-consuming step is the Databricks workspace provisioning itself. - -When complete, Terraform will print the outputs, including your workspace URL. - ---- - -## 7. 
Verify the Deployment - -### 7.1 Check Terraform Outputs - -```sh -terraform output workspace_url -terraform output databricks_workspace_id -terraform output nat_gateway_public_ip -``` - -All available outputs: - - -| Output | Description | -| --------------------------- | ----------------------------------------------------------------------- | -| `workspace_url` | The URL of the deployed Databricks workspace | -| `databricks_workspace_id` | The Azure resource ID of the workspace | -| `nat_gateway_public_ip` | The static public IP used for egress (useful for firewall allowlisting) | -| `vnet_id` | Azure resource ID of the VNet | -| `public_subnet_id` | Azure resource ID of the public subnet | -| `private_subnet_id` | Azure resource ID of the private subnet | -| `nat_gateway_id` | Azure resource ID of the NAT Gateway | -| `security_group_id` | Azure resource ID of the Network Security Group | -| `managed_resource_group_id` | Azure resource ID of the Databricks managed resource group | - - -### 7.2 Access the Workspace - -1. Open the `workspace_url` in your browser. -2. Log in with the credentials of the `admin_user` you configured. -3. Verify Unity Catalog is attached: navigate to **Catalog** in the left sidebar. You should see the metastore assigned to your workspace. - -### 7.3 Validate Networking - -To confirm VNet injection and Secure Cluster Connectivity are working: - -1. Create a small test cluster in the workspace. -2. While the cluster is starting, navigate to the Azure Portal: - - Open the VNet resource and verify both subnets show Databricks delegation. - - Open the managed resource group — you should see Databricks-managed resources (disks, NICs) appearing without public IPs. -3. Once the cluster is running, check the NAT Gateway's metrics in the Azure Portal to confirm egress traffic is flowing through it. 
- -### 7.4 Note the NAT Gateway IP - -The `nat_gateway_public_ip` output provides the single egress IP for all outbound traffic from your Databricks clusters. Use this IP for: - -- Firewall allowlisting on external data sources. -- Network security rules on downstream services. - ---- - -## 8. Tear Down / Cleanup - -To destroy all resources created by this Terraform project: - -```sh -terraform destroy -``` - -Type `yes` when prompted. This will remove the workspace, networking resources, resource groups, and the metastore (if Terraform created it). - -> **Warning:** This is irreversible. Ensure you have backed up any data, notebooks, or configurations from the workspace before destroying it. - -> **Note:** If you manually created resources inside the managed resource group (or the workspace itself), `terraform destroy` may fail. Remove those resources manually first, then retry. - ---- - -## 9. Troubleshooting - -### "Managed resource group name should not be same as resource group name" - -The `managed_resource_group_name` variable must differ from `resource_group_name`. Either choose a different name or set it to `null` to let Azure generate one. - -### "root_storage_name can only contain lowercase letters and numbers" - -The root storage account name has strict naming rules: only lowercase letters (`a-z`) and numbers (`0-9`), between 3 and 24 characters. No hyphens, underscores, or uppercase characters. - -### Deployment fails at workspace creation with a VNet error - -- Verify your VNet CIDR is between `/16` and `/24`. -- Ensure the public and private subnet CIDRs fall within the VNet CIDR and do not overlap. -- If using an existing VNet (`create_new_vnet = false`), confirm the VNet and its resource group already exist and you have the correct names. -- Ensure the subnets are not already in use by other services. 
- -### "Error: retrieving User" or permission errors on the Databricks provider - -- Confirm the `admin_user` email exists in the Databricks account (Account Console > User Management). -- Ensure your `databricks_account_id` is correct. -- Verify your Azure CLI session is authenticated and has the correct subscription set. - -### Terraform state lock errors - -If a previous `terraform apply` was interrupted, you may see state lock errors. Wait a few minutes for the lock to expire, or remove it manually: - -```sh -terraform force-unlock -``` - -### Metastore assignment fails - -- If providing `existing_metastore_id`, confirm the metastore ID is correct and belongs to the same region as your workspace. -- If creating a new metastore, ensure `new_metastore_name` is not empty and contains only alphanumeric characters, hyphens, and underscores. -- A region can only have one metastore. If one already exists, use `existing_metastore_id` instead of creating a new one. - ---- - -## 10. Additional Resources - -- [Azure Databricks VNet Injection Documentation](https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/vnet-inject) -- [Secure Cluster Connectivity (No Public IP)](https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/secure-cluster-connectivity) -- [Terraform Databricks Provider Documentation](https://registry.terraform.io/providers/databricks/databricks/latest/docs) -- [Terraform AzureRM Provider Documentation](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs) -- [Databricks Security Reference Architecture (Terraform)](https://github.com/databricks/terraform-databricks-sra/tree/main/azure) -- [Terraform Examples with Private Link](https://github.com/databricks/terraform-databricks-examples/tree/main/examples/adb-with-private-link-standard) -- [Azure Databricks Supported Regions](https://learn.microsoft.com/en-us/azure/databricks/resources/supported-regions) - diff --git 
a/workspace-setup/terraform-examples/azure/azure-vnet-injection/README.md b/workspace-setup/terraform-examples/azure/azure-vnet-injection/README.md index 6849dc7..d0d900d 100644 --- a/workspace-setup/terraform-examples/azure/azure-vnet-injection/README.md +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection/README.md @@ -1,194 +1,447 @@ -# Azure VNet injection Workspace Setup Guide (with VNet deployment) +# Deployment Playbook: Azure Databricks Workspace with VNet Injection -## Requirements +This playbook walks you through deploying an Azure Databricks workspace with VNet injection using the Terraform code provided in the `tf/` directory. Follow each step in order. -- Terraform is installed on your local machine: [link](https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli#install-terraform) -- Azure CLI is installed on your local machine: [Mac](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos?view=azure-cli-latest#install-with-homebrew) or [Windows](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-windows?view=azure-cli-latest&pivots=winget) -- Azure CLI configured with appropriate credentials -- Databricks account created -- Databricks account admin access -- Contributor rights to your Azure subscription (Contributor rights on the resource group level are not sufficient, as Databricks provisioning creates resources in a separate managed resource group, which requires subscription-level access.) +--- -## Before you begin +## Table of Contents -In this deployment, we define key configuration values, such as subscription ID, resource group location, CIDR block, asset naming, and others, as variables. This keeps our code organized and makes it easy to adjust settings without changing the core infrastructure definitions. You can choose to define these variables directly or reference them from a separate configuration file for better modularity. 
In this document, we will create a configuration file to store them separately (`terraform.tfvars.example`). +1. [What This Deploys](#1-what-this-deploys) +2. [Prerequisites](#2-prerequisites) +3. [Gather Required Information](#3-gather-required-information) +4. [Authenticate to Azure](#4-authenticate-to-azure) +5. [Configure Variables](#5-configure-variables) +6. [Deploy](#6-deploy) +7. [Verify the Deployment](#7-verify-the-deployment) +8. [Tear Down / Cleanup](#8-tear-down--cleanup) +9. [Troubleshooting](#9-troubleshooting) +10. [Additional Resources](#10-additional-resources) -## Authenticate the Azure CLI +--- -### Option 1: Interactive user login (for users) +## 1. What This Deploys -```sh -az login +This Terraform project provisions a **Premium-tier** Azure Databricks workspace deployed into your own Virtual Network (VNet injection) with **Secure Cluster Connectivity** (no public IP on cluster nodes). The deployment supports two modes: creating a new VNet or injecting into an existing one. 
+ +### Resources Created + + +| Category | Resources | +| ------------------- | ----------------------------------------------------------------------------------------------------------------------------- | +| **Resource Groups** | Workspace resource group; VNet resource group (if creating a new VNet); Azure-managed resource group for Databricks internals | +| **Networking** | VNet (new or existing), public subnet, private subnet, Network Security Group (NSG), NAT Gateway with static public IP | +| **Databricks** | Premium workspace with VNet injection, Unity Catalog metastore (new or existing assignment), workspace admin user assignment | + + +### Architecture at a Glance + +``` +Azure Subscription +│ +├── Resource Group (workspace) +│ └── Databricks Workspace (Premium, no public IP) +│ └── Managed Resource Group (created by Azure) +│ +└── Resource Group (VNet — new or existing) + ├── Virtual Network + │ ├── Public Subnet (delegated to Databricks) + │ └── Private Subnet (delegated to Databricks) + ├── Network Security Group (associated with both subnets) + ├── NAT Gateway (associated with both subnets) + └── Static Public IP (attached to NAT Gateway) ``` -This command opens a browser for user authentication, and it is commonly referred to as U2M (User-to-machine) authentication. This command is sufficient for all operations in this document. +### Key Security Characteristics + +- **No public IPs** on cluster nodes (`no_public_ip = true`). +- **NAT Gateway** provides a single, stable egress IP for allowlisting. +- **Default outbound access disabled** on both subnets — all egress must go through the NAT Gateway. +- Both subnets are **delegated** to `Microsoft.Databricks/workspaces`. + +--- + +## 2. 
Prerequisites + +Before you begin, ensure you have the following: + +### Tools + + +| Tool | Minimum Version | Install Link | +| --------- | --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Terraform | ~> 1.3 | [Install Terraform](https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli#install-terraform) | +| Azure CLI | Latest | [Mac](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos) / [Windows](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-windows) | + + +### Access & Permissions + +- **Azure**: Contributor role on the target Azure subscription. Subscription-level access is required because Databricks provisioning creates resources in a separate managed resource group. +- **Databricks**: Account admin access to your Databricks account. +- The `admin_user` email you plan to use must already exist as a user in the Databricks account. + +### Network Planning + +Decide which deployment mode you will use: + + +| Mode | When to Use | Variable Setting | +| ----------------- | ------------------------------------------------------------------------------------------------------------------ | ------------------------- | +| **New VNet** | You want Terraform to create the VNet, subnets, and resource group for you. | `create_new_vnet = true` | +| **Existing VNet** | You already have a VNet and want to inject Databricks into it. The VNet and its resource group must already exist. | `create_new_vnet = false` | + + +In both cases: + +- The VNet CIDR must be between `/16` and `/24`. +- You need two dedicated, non-overlapping subnets (public and private) that are **not used by other resources**. The subnets will be created by Terraform in both modes. + +--- + +## 3. 
Gather Required Information -### Option 2: Service principal login (for automation, CI/CD) +Collect the following values before starting. You will need them in Step 5. -Choose this option if you want to deploy the Terraform script to a Git repository and integrate it into your CI/CD processes after completing this guide. It is the recommended approach for automation in non-interactive environments such as pipelines or scripts. +### Azure -Steps to Create a Service Principal via Azure CLI: -1. Log in to Azure via Azure CLI +| Value | Where to Find It | +| ------------------- | ---------------------------------------------------------------------------------------------- | +| **Tenant ID** | Azure Portal > Microsoft Entra ID > Overview, or run `az account show --query tenantId -o tsv` | +| **Subscription ID** | Azure Portal > Subscriptions, or run `az account show --query id -o tsv` | + + +### Databricks + + +| Value | Where to Find It | +| ----------------------------------------- | ----------------------------------------------------------------------------------------------------- | +| **Account ID** | [Databricks Account Console](https://accounts.azuredatabricks.net/) > Settings > Account details | +| **Admin user email** | The email of a user that already exists in the Databricks account | +| **Existing metastore ID** (if applicable) | Account Console > Catalog > Metastore details. Leave empty if you want Terraform to create a new one. | + + +### Naming + +Choose names for the following resources. 
These must be unique within your subscription/account: + +- Resource group name (e.g., `rg-databricks-prod`) +- Workspace name (e.g., `databricks-workspace-prod`) +- Root storage account name — **lowercase letters and numbers only, 3-24 characters** (e.g., `dbaborootprod01`) +- VNet name (e.g., `vnet-databricks-prod`) +- VNet resource group name (e.g., `rg-vnet-databricks-prod`) +- Managed resource group name (optional — Azure will auto-generate if not provided) +- New metastore name (only if creating a new metastore) + +### Network CIDRs + +Plan your address space. Example: + + +| Parameter | Example Value | +| ------------------- | ------------- | +| VNet CIDR | `10.0.0.0/16` | +| Public subnet CIDR | `10.0.1.0/24` | +| Private subnet CIDR | `10.0.2.0/24` | + + +The subnet CIDRs must fall within the VNet CIDR range and must not overlap with each other. + +--- + +## 4. Authenticate to Azure + +Open a terminal and navigate to the `workspace-setup/terraform-examples/azure/azure-vnet-injection/tf/` directory: ```sh -az login +cd ./workspace-setup/terraform-examples/azure/azure-vnet-injection/tf/ ``` -This command opens a browser to authenticate your Azure user account. +### Interactive Login (recommended for first-time deployment) -2. (Optional) Choose the Target Subscription +```sh +az login +``` -If you have multiple subscriptions, set your target subscription: +This opens a browser window for authentication. If you have multiple subscriptions, set the correct one: ```sh -az account set --subscription "" +az account set --subscription "" ``` -You can find your subscription ID with: +Verify your active subscription: ```sh az account show ``` -3. 
Create the Service Principal -Use the following command to create a service principal, specifying the name, role, and scope: +### Service Principal Login (for CI/CD pipelines) + +If you are running this from an automated pipeline, authenticate with a service principal: ```sh -az ad sp create-for-rbac --name "" --role --scopes /subscriptions/ +az login --service-principal -u -p --tenant ``` -- ``: Desired service principal name. -- ``: e.g. Contributor, Reader, Owner. -- ``: Your Azure Subscription ID. +See the [README](README.md) for detailed instructions on creating a service principal. + +--- -The command outputs JSON with appId, password, and tenant. +## 5. Configure Variables -**Important**: Save the password (client secret) immediately; you cannot retrieve it later. +### 5.1 Create Your Variables File -4. Use the Newly Created SP Credentials +Copy the example file: -You can now use the output values: -- `appId` for the username -- `password` as the client secret -- `tenant` as the tenant ID +```sh +cp terraform.tfvars.example terraform.tfvars +``` -For authentication in automation (like CI/CD or scripts), use: +### 5.2 Edit the Variables + +Open `terraform.tfvars` in your editor and fill in every value. Below are two complete examples for the two deployment modes. 
+ +#### Example A: Creating a New VNet + +```hcl +# Azure Configuration +tenant_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +azure_subscription_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +tags = { + "Owner" = "Jane Doe" + "Environment" = "Production" +} + +# Databricks Workspace Configuration +resource_group_name = "rg-databricks-prod" +workspace_name = "databricks-workspace-prod" +admin_user = "jane.doe@company.com" +root_storage_name = "daborootprod01" +location = "westeurope" +databricks_account_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +existing_metastore_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +new_metastore_name = "" +managed_resource_group_name = null + +# Network Configuration — new VNet +create_new_vnet = true +vnet_name = "vnet-databricks-prod" +vnet_resource_group_name = "rg-vnet-databricks-prod" +cidr = "10.0.0.0/16" +subnet_public_cidr = "10.0.1.0/24" +subnet_private_cidr = "10.0.2.0/24" +``` + +#### Example B: Using an Existing VNet + +```hcl +# Azure Configuration +tenant_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +azure_subscription_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +tags = { + "Owner" = "Jane Doe" + "Environment" = "Production" +} + +# Databricks Workspace Configuration +resource_group_name = "rg-databricks-prod" +workspace_name = "databricks-workspace-prod" +admin_user = "jane.doe@company.com" +root_storage_name = "daborootprod01" +location = "westeurope" +databricks_account_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +existing_metastore_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" +new_metastore_name = "" +managed_resource_group_name = null + +# Network Configuration — existing VNet +create_new_vnet = false +vnet_name = "my-existing-vnet" +vnet_resource_group_name = "my-existing-vnet-rg" +cidr = "10.0.0.0/16" +subnet_public_cidr = "10.0.10.0/24" +subnet_private_cidr = "10.0.11.0/24" +``` + +> **Note on metastore configuration:** If `existing_metastore_id` is left empty (`""`), Terraform will create a new metastore using 
`new_metastore_name`. If you provide an existing metastore ID, `new_metastore_name` is ignored. Most regions already have a metastore — check the Account Console first. + +### 5.3 Variable Reference + + +| Variable | Required | Default | Description | +| ----------------------------- | -------- | ------------- | ----------------------------------------------------------------------------------------------------------------------- | +| `tenant_id` | Yes | — | Your Azure Tenant ID | +| `azure_subscription_id` | Yes | — | Your Azure Subscription ID | +| `resource_group_name` | Yes | — | Name of the resource group for the Databricks workspace | +| `managed_resource_group_name` | No | `null` | Name of the managed resource group. Must differ from `resource_group_name`. If `null`, Azure generates one. | +| `tags` | No | `{}` | Map of tags applied to all resources | +| `databricks_account_id` | Yes | — | Your Databricks account ID (treated as sensitive) | +| `workspace_name` | Yes | — | Name of the Databricks workspace | +| `admin_user` | Yes | — | Email of the user to assign as workspace and metastore admin | +| `root_storage_name` | Yes | — | Root storage account name. Lowercase letters and numbers only, 3-24 characters. | +| `location` | Yes | — | Azure region ([supported regions](https://learn.microsoft.com/en-us/azure/databricks/resources/supported-regions)) | +| `existing_metastore_id` | No | `""` | ID of an existing metastore. Leave empty to create a new one. | +| `new_metastore_name` | No | `""` | Name for a new metastore. Only used when `existing_metastore_id` is empty. Alphanumeric, hyphens, and underscores only. 
| +| `create_new_vnet` | No | `true` | `true` to create a new VNet; `false` to use an existing one | +| `vnet_name` | Yes | — | Name of the VNet (new or existing) | +| `vnet_resource_group_name` | Yes | — | Resource group containing the VNet (created if new, must exist if existing) | +| `cidr` | No | `10.0.0.0/20` | CIDR block for the VNet (between /16 and /24) | +| `subnet_public_cidr` | Yes | — | CIDR for the public (host) subnet | +| `subnet_private_cidr` | Yes | — | CIDR for the private (container) subnet | + + +--- + +## 6. Deploy + +Run the following commands from the `tf/` directory: + +### 6.1 Initialize Terraform ```sh -az login --service-principal -u -p --tenant -``` - -For more information on creating a Service Principal, visit the [following link](https://learn.microsoft.com/en-us/cli/azure/azure-cli-sp-tutorial-1?view=azure-cli-latest&tabs=bash). - - -## General Requirements for VNet -Before proceeding, ensure your VNet meets the following requirements: - -- The address space for the VNet must use a CIDR block between /16 and /24. - -## Variables - -If you want Terraform to automatically load values for variables from a file, the file must be named either `terraform.tfvars`, `terraform.tfvars.json`, or end with `.auto.tfvars` or `.auto.tfvars.json`. If your file has a custom name (like `random_name.tfvars`), you must provide it explicitly using the `-var-file` flag when running Terraform commands. - -You can use the `terraform.tfvars.example` file as a base for your variables. Later renaming this file to `terraform.tfvars` will automatically load the values for the variables. 
- -### List of variables - -- tenant_id - - Your Azure Tenant ID -- azure_subscription_id - - Your Azure Subscription ID -- resource_group_name - - The name of the resource group where the Databricks Workspace will be deployed -- tags - - A map of tags to assign to the resources -- databricks_account_id - - ID of the Databricks Account -- workspace_name - - The name of the Databricks workspace -- admin_user - - The email of the user to assign admin access to the workspace and the new metastore -- root_storage_name - - The name of the root storage account. Can only consist of lowercase letters and numbers, and must be between 3 and 24 characters long. -- location - - The Azure region to deploy the workspace to. See [supported regions](https://learn.microsoft.com/en-us/azure/databricks/resources/supported-regions). -- existing_metastore_id - - The ID of the existing metastore. Leave empty to create a new metastore. -- new_metastore_name - - The name of the new metastore. -- create_new_vnet - - Whether to create a new VNet or use an existing one -- vnet_name - - The name of the virtual network -- vnet_resource_group_name - - The name of the VNet resource group -- cidr - - The CIDR address of the virtual network -- subnet_public_cidr - - The CIDR address of the first subnet -- subnet_private_cidr - - The CIDR address of the second subnet -- managed_resource_group_name - - The name of the managed resource group. This is an optional field; if no name is provided, Azure will generate one automatically. - - -## Deploy - -```bash -# Initialize Terraform terraform init +``` + +This downloads the required providers (`azurerm ~> 4.50`, `databricks ~> 1.84`) and initializes the working directory. You should see **"Terraform has been successfully initialized"**. -# Review the execution plan +### 6.2 Review the Plan + +```sh terraform plan +``` -# Apply the configuration +Review the output carefully. It will show you every resource that Terraform intends to create. 
Verify that: + +- The correct subscription and region are being used. +- Resource names match your expectations. +- The number of resources to be created looks right (roughly 13-16 depending on whether you're creating a new VNet and metastore). + +### 6.3 Apply + +```sh terraform apply ``` -Occasionally, you'll be asked to confirm certain actions; type yes when prompted. The deployment typically takes 10-15 minutes. Once the execution finishes, the terminal will output the URL of the created workspace. +Terraform will display the plan again and prompt for confirmation. Type `yes` to proceed. -## Access Your Workspace +The deployment typically takes **10-15 minutes**. The most time-consuming step is the Databricks workspace provisioning itself. -After successful deployment: -```bash -# Get the workspace URL -terraform output workspace_url +When complete, Terraform will print the outputs, including your workspace URL. + +--- + +## 7. Verify the Deployment -# Get the workspace ID +### 7.1 Check Terraform Outputs + +```sh +terraform output workspace_url terraform output databricks_workspace_id +terraform output nat_gateway_public_ip ``` -Navigate to the workspace URL and log in with your Databricks credentials. 
+All available outputs: -## File Structure -This project uses a flat, organized structure with purpose-specific files instead of a monolithic `main.tf`: +| Output | Description | +| --------------------------- | ----------------------------------------------------------------------- | +| `workspace_url` | The URL of the deployed Databricks workspace | +| `databricks_workspace_id` | The Azure resource ID of the workspace | +| `nat_gateway_public_ip` | The static public IP used for egress (useful for firewall allowlisting) | +| `vnet_id` | Azure resource ID of the VNet | +| `public_subnet_id` | Azure resource ID of the public subnet | +| `private_subnet_id` | Azure resource ID of the private subnet | +| `nat_gateway_id` | Azure resource ID of the NAT Gateway | +| `security_group_id` | Azure resource ID of the Network Security Group | +| `managed_resource_group_id` | Azure resource ID of the Databricks managed resource group | + +### 7.2 Access the Workspace + +1. Open the `workspace_url` in your browser. +2. Log in with the credentials of the `admin_user` you configured. +3. Verify Unity Catalog is attached: navigate to **Catalog** in the left sidebar. You should see the metastore assigned to your workspace. + +### 7.3 Validate Networking + +To confirm VNet injection and Secure Cluster Connectivity are working: + +1. Create a small test cluster in the workspace. +2. While the cluster is starting, navigate to the Azure Portal: + - Open the VNet resource and verify both subnets show Databricks delegation. + - Open the managed resource group — you should see Databricks-managed resources (disks, NICs) appearing without public IPs. +3. Once the cluster is running, check the NAT Gateway's metrics in the Azure Portal to confirm egress traffic is flowing through it. + +### 7.4 Note the NAT Gateway IP + +The `nat_gateway_public_ip` output provides the single egress IP for all outbound traffic from your Databricks clusters. 
Use this IP for: + +- Firewall allowlisting on external data sources. +- Network security rules on downstream services. + +--- + +## 8. Tear Down / Cleanup + +To destroy all resources created by this Terraform project: + +```sh +terraform destroy ``` -tf/ -├── azure.tf # Azure resources -├── databricks.tf # Databricks workspace -├── network.tf # VNet, subnets, and networking -├── outputs.tf # All output values -├── providers.tf # Provider configurations -├── terraform.tfvars.example # Configuration template -├── variables.tf # All input variable definitions -├── versions.tf # Version of the providers -``` -**Note:** There is no `main.tf` file in this project. Instead, resources are organized into descriptive, purpose-specific files. +Type `yes` when prompted. This will remove the workspace, networking resources, resource groups, and the metastore (if Terraform created it). + +> **Warning:** This is irreversible. Ensure you have backed up any data, notebooks, or configurations from the workspace before destroying it. + +> **Note:** If you manually created resources inside the managed resource group (or the workspace itself), `terraform destroy` may fail. Remove those resources manually first, then retry. + +--- + +## 9. Troubleshooting + +### "Managed resource group name should not be same as resource group name" + +The `managed_resource_group_name` variable must differ from `resource_group_name`. Either choose a different name or set it to `null` to let Azure generate one. + +### "root_storage_name can only contain lowercase letters and numbers" + +The root storage account name has strict naming rules: only lowercase letters (`a-z`) and numbers (`0-9`), between 3 and 24 characters. No hyphens, underscores, or uppercase characters. + +### Deployment fails at workspace creation with a VNet error + +- Verify your VNet CIDR is between `/16` and `/24`. +- Ensure the public and private subnet CIDRs fall within the VNet CIDR and do not overlap. 
+- If using an existing VNet (`create_new_vnet = false`), confirm the VNet and its resource group already exist and you have the correct names. +- Ensure the subnets are not already in use by other services. + +### "Error: retrieving User" or permission errors on the Databricks provider + +- Confirm the `admin_user` email exists in the Databricks account (Account Console > User Management). +- Ensure your `databricks_account_id` is correct. +- Verify your Azure CLI session is authenticated and has the correct subscription set. + +### Terraform state lock errors + +If a previous `terraform apply` was interrupted, you may see state lock errors. First confirm that no other Terraform process is still using this state. If the lock is stale, release it manually with the lock ID printed in the error message: + +```sh +terraform force-unlock <LOCK_ID> +``` -Terraform will automatically load all `.tf` files in the directory, so the absence of `main.tf` doesn't affect functionality. +### Metastore assignment fails +- If providing `existing_metastore_id`, confirm the metastore ID is correct and belongs to the same region as your workspace. +- If creating a new metastore, ensure `new_metastore_name` is not empty and contains only alphanumeric characters, hyphens, and underscores. +- A Databricks account can have only one metastore per region. If one already exists in the workspace region, use `existing_metastore_id` instead of creating a new one. -## Terraform template examples and more documentation: +--- -Keep in mind that the git code is not always up to date. You should use these templates as an example and not directly copy and paste. Please note that the code in the template projects is provided for your exploration only and is not formally supported by Databricks with Service Level Agreements (SLAs). They are provided AS-IS, and we do not make any guarantees of any kind. +## 10.
Additional Resources -- [Deploy with Private Link](https://github.com/databricks/terraform-databricks-examples/tree/main/examples/adb-with-private-link-standard) -- [Security Reference Architecture Template](https://github.com/databricks/terraform-databricks-sra/tree/main/azure) - - This is a template that adheres to the best security practices we recommend. -- [Terraform Databricks provider documentation](https://registry.terraform.io/providers/databricks/databricks/latest/docs) -- [Configure a workspace with VNet injection](https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/vnet-inject) +- [Azure Databricks VNet Injection Documentation](https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/vnet-inject) +- [Secure Cluster Connectivity (No Public IP)](https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/secure-cluster-connectivity) +- [Terraform Databricks Provider Documentation](https://registry.terraform.io/providers/databricks/databricks/latest/docs) +- [Terraform AzureRM Provider Documentation](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs) +- [Databricks Security Reference Architecture (Terraform)](https://github.com/databricks/terraform-databricks-sra/tree/main/azure) +- [Terraform Examples with Private Link](https://github.com/databricks/terraform-databricks-examples/tree/main/examples/adb-with-private-link-standard) +- [Azure Databricks Supported Regions](https://learn.microsoft.com/en-us/azure/databricks/resources/supported-regions)
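
Two of the naming rules from the Troubleshooting section can be checked locally before running `terraform plan`. The following is a minimal sketch, not part of the `tf/` project: the function names `valid_root_storage_name` and `valid_managed_rg_name` are hypothetical helpers introduced here for illustration, and the checks only mirror the rules stated in this playbook (3-24 lowercase letters and digits; managed resource group name different from the workspace resource group name).

```sh
# Hypothetical pre-flight helpers; these names are assumptions, not part of tf/.

# Succeeds if $1 is 3-24 characters long and uses only a-z and 0-9.
valid_root_storage_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{3,24}$'
}

# $1 = resource_group_name, $2 = managed_resource_group_name.
# Succeeds if $2 is empty (Azure generates a name) or differs from $1.
valid_managed_rg_name() {
  [ -z "$2" ] || [ "$1" != "$2" ]
}

# Example usage with the sample values from Section 5.2.
name="daborootprod01"
if valid_root_storage_name "$name"; then
  echo "root_storage_name OK: $name"
else
  echo "root_storage_name invalid: $name" >&2
fi
```

The values still need to be copied into `terraform.tfvars` by hand; the sketch only catches the two naming errors earlier than `terraform plan` would.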