Effortlessly manage Databricks workspaces with Terraform

Some of you may know that if you create a Databricks workspace on Azure, you’ll pay around €40 per month even if it’s completely idle. Why? Because Azure requires a NAT Gateway for Databricks, and that alone comes with a cost.

Now, I often work with Databricks for demos, testing, and exploration, and I usually set up three environments (dev, test, prod). That would mean €120 per month—just for resources sitting there doing nothing! No thanks.

💡 The Solution? Create & Destroy with Terraform

Instead of keeping Databricks running 24/7, I thought—why not create it when I need it and destroy it when I’m done? Thanks to Terraform, I can now spin up a Databricks workspace in minutes and tear it down just as fast—saving money without extra effort!


🚀 Step 1: Install & Configure Your Environment

Before running Terraform, make sure you have everything set up:

🔧 Install Required Tools

  1. Terraform (I used v3.117.0 of the azurerm provider)
  2. Azure CLI (I used v2.63.0)

🔑 Authenticate with Azure

Once you’ve installed Azure CLI, log into your Azure account and select the right subscription:

az login
az account set --subscription "<your-subscription-id>"


🛠️ What Resources Do We Actually Need?

To deploy Databricks properly, we need to set up several essential Azure resources. Here’s what they are and why we need them:

  • Resource Group → Acts as a container for all the resources, keeping everything organized.
  • Virtual Network (VNet) → Provides networking for Databricks to communicate securely.
  • Network Security Group (NSG) → Controls inbound and outbound traffic for security.
  • Public & Private Subnets
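    • Public Subnet → The host subnet: every cluster node gets a host network interface here.
    • Private Subnet → The container subnet: carries the traffic of the Databricks runtime containers on the cluster nodes, without public exposure.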
  • NSG-Subnet Associations → Ensures NSG rules apply to the subnets.
  • Databricks Workspace → The actual workspace where clusters, notebooks, and jobs run.


📜 Terraform Code: Let’s Deploy!

Now, let’s define these resources using Terraform. Here’s how the code looks:

🔹 providers.tf

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      # Stay on the 3.x line used here; 4.x also requires a subscription_id for the provider
      version = "~> 3.117"
    }
  }
}

provider "azurerm" {
  features {}
}



🔹 variables.tf

variable "env" {
  type        = string
  default     = "dev"
  description = "Environment name, used as a suffix in resource names."
}

variable "location" {
  type        = string
  default     = "westeurope"
  description = "Location of the resources."
}

variable "project_name" {
  type        = string
  default     = "cncdemo"
  description = "Name of a project."
}

variable "cidr" {
  type        = string
  default     = "10.179.0.0/20"
  description = "Address space of the virtual network; both subnets are carved out of it."
}

variable "tags" {
  type        = map(string)
  description = "Optional tags to add to resources"
  default     = {}
}
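
If you want Terraform to reject a mistyped environment name before anything is created, a validation block can help. The following is a minimal sketch and not part of the original setup: it would replace the env variable above (declaring the same variable twice is an error), and the allowed values dev, test, and prod are an assumption based on the three environments mentioned earlier.

```hcl
# Sketch only: replaces the existing "env" variable in variables.tf.
variable "env" {
  type        = string
  default     = "dev"
  description = "Environment name, used as a suffix in resource names."

  validation {
    # Assumed set of environments; adjust to your own naming.
    condition     = contains(["dev", "test", "prod"], var.env)
    error_message = "The env value must be one of: dev, test, prod."
  }
}
```

With this in place, terraform plan stops with the error message above when env is set to anything outside the list.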



🔹 dev.tfvars

env = "dev"
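
The same pattern extends to the other environments. A possible test.tfvars might look like the sketch below; the file name and the CIDR value are assumptions, chosen only so that the address range does not overlap with the dev default.

```hcl
# test.tfvars (hypothetical file, not part of the original setup)
env = "test"

# Assumed non-overlapping range; any free /20 works.
cidr = "10.180.0.0/20"
```

Keep in mind that each environment needs its own state, for example through a separate Terraform workspace (terraform workspace new test). Otherwise, applying test.tfvars against the dev state makes Terraform replace the dev resources instead of creating new ones next to them.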


🔹 main.tf

resource "azurerm_resource_group" "rg" {
  location = var.location
  name     = "rg-${lower(var.project_name)}-${lower(var.env)}"
}

resource "azurerm_virtual_network" "vnet" {
  name                 = "vnet-${lower(var.project_name)}-${lower(var.env)}"
  location             = azurerm_resource_group.rg.location
  resource_group_name  = azurerm_resource_group.rg.name
  address_space        = [var.cidr]
}


resource "azurerm_network_security_group" "nsg_dbx" {
  location            = azurerm_resource_group.rg.location
  name                = "nsg-dbx-${lower(var.project_name)}-${lower(var.env)}"
  resource_group_name = azurerm_resource_group.rg.name
}


resource "azurerm_subnet" "subnet_pub_dbx" {
  name                  = "sub-pub-${lower(var.project_name)}-${lower(var.env)}"
  resource_group_name   = azurerm_resource_group.rg.name
  virtual_network_name  = azurerm_virtual_network.vnet.name
  address_prefixes      = [cidrsubnet(var.cidr, 3, 0)]
  # https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/subnet

  delegation {
    name = "databricks"
    service_delegation {
      name = "Microsoft.Databricks/workspaces"
      actions = [
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
        "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action",
      ]
    }
  }
}

resource "azurerm_subnet_network_security_group_association" "pubsub_sg_dbx_association" {
  network_security_group_id = azurerm_network_security_group.nsg_dbx.id
  subnet_id                 = azurerm_subnet.subnet_pub_dbx.id
}



resource "azurerm_subnet" "subnet_priv_dbx" {
  name                 = "sub-priv-${lower(var.project_name)}-${lower(var.env)}"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = [cidrsubnet(var.cidr, 3, 1)]

  # https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/subnet

  delegation {
    name = "databricks"
    service_delegation {
      name = "Microsoft.Databricks/workspaces"
      actions = [
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
        "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action",
      ]
    }
  }
}



resource "azurerm_subnet_network_security_group_association" "privsub_sg_dbx_association" {
  network_security_group_id = azurerm_network_security_group.nsg_dbx.id
  subnet_id                 = azurerm_subnet.subnet_priv_dbx.id
}



resource "azurerm_databricks_workspace" "dbx" {
  name                        = "dbx-${lower(var.project_name)}-${lower(var.env)}"
  resource_group_name         = azurerm_resource_group.rg.name
  location                    = azurerm_resource_group.rg.location
  sku                         = "premium"

  custom_parameters {
    virtual_network_id  = azurerm_virtual_network.vnet.id
    public_subnet_name  = azurerm_subnet.subnet_pub_dbx.name
    private_subnet_name = azurerm_subnet.subnet_priv_dbx.name
    public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.pubsub_sg_dbx_association.id
    private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.privsub_sg_dbx_association.id 
  }
  # We need this explicit dependency; otherwise destroy doesn't clean things up correctly
  depends_on = [
    azurerm_subnet_network_security_group_association.pubsub_sg_dbx_association,
    azurerm_subnet_network_security_group_association.privsub_sg_dbx_association
  ]
}
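
Once the apply finishes, it helps to see where the workspace lives without opening the Azure portal. A small outputs.tf along these lines could do that. The file name and the output names are assumptions; workspace_url is an attribute exported by azurerm_databricks_workspace.

```hcl
# outputs.tf (hypothetical file name)

# URL of the workspace; the provider returns the host name without a scheme.
output "databricks_workspace_url" {
  value = "https://${azurerm_databricks_workspace.dbx.workspace_url}"
}

# Address ranges that cidrsubnet() carved out of var.cidr.
output "subnet_prefixes" {
  value = {
    public  = azurerm_subnet.subnet_pub_dbx.address_prefixes
    private = azurerm_subnet.subnet_priv_dbx.address_prefixes
  }
}
```

With the default cidr of 10.179.0.0/20, cidrsubnet(var.cidr, 3, 0) and cidrsubnet(var.cidr, 3, 1) add three bits to the prefix and yield 10.179.0.0/23 and 10.179.2.0/23. After a successful apply, terraform output prints these values.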

🛠️ Deploy & Destroy with Terraform

Now that we have everything set up, let’s deploy Databricks!

🚀 Deploy Resources

Run these commands:

terraform init -upgrade
terraform plan -var-file=dev.tfvars -out main.tfplan
terraform apply main.tfplan

And just like that, Databricks is live! 🎉

🗑️ Destroy Resources When Done

Once you’re finished using Databricks, destroy everything to avoid costs:

terraform plan -destroy -var-file=dev.tfvars -out main.destroy.tfplan
terraform apply main.destroy.tfplan

And boom—no more unnecessary costs!


🔍 Summary

Using Terraform, you can automate the entire Databricks setup, ensuring you only pay for what you use. This is just the beginning, and I aim to expand on this setup, as it turned out to be a lot of fun.

Happy coding!

Filip