Some of you may know that if you create a Databricks workspace on Azure, you’ll pay around €40 per month even if it’s completely idle. Why? Because Azure requires a NAT Gateway for Databricks, and that alone comes with a cost.
Now, I often work with Databricks for demos, testing, and exploration, and I usually set up three environments (dev, test, prod). That would mean €120 per month—just for resources sitting there doing nothing! No thanks.
💡 The Solution? Create & Destroy with Terraform
Instead of keeping Databricks running 24/7, I thought—why not create it when I need it and destroy it when I’m done? Thanks to Terraform, I can now spin up a Databricks workspace in minutes and tear it down just as fast—saving money without extra effort!
🚀 Step 1: Install & Configure Your Environment
Before running Terraform, make sure you have everything set up:
🔧 Install Required Tools
- Terraform (the Terraform CLI is versioned 1.x; the v3.117.0 I used is the version of the azurerm provider)
- Azure CLI (I used v2.63.0)
🔑 Authenticate with Azure
Once you’ve installed Azure CLI, log into your Azure account and select the right subscription:
az login
az account set --subscription "<your-subscription-id>"
🛠️ What Resources Do We Actually Need?
To deploy Databricks properly, we need to set up several essential Azure resources. Here’s what they are and why we need them:
- Resource Group → Acts as a container for all the resources, keeping everything organized.
- Virtual Network (VNet) → Provides networking for Databricks to communicate securely.
- Network Security Group (NSG) → Controls inbound and outbound traffic for security.
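- Public & Private Subnets → Two subnets delegated to Databricks, which places its cluster nodes in them:
  - Public Subnet → The host subnet, used by the cluster virtual machines.
  - Private Subnet → The container subnet, used by the Databricks runtime containers and internal cluster traffic.
- NSG-Subnet Associations → Ensure the NSG rules apply to both subnets.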
- Databricks Workspace → The actual workspace where clusters, notebooks, and jobs run.
📜 Terraform Code: Let’s Deploy!
Now, let’s define these resources using Terraform. Here’s how the code looks:
🔹 providers.tf
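terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      # Stay on the 3.x line used in this post; azurerm 4.x also requires
      # subscription_id to be set for the provider.
      version = "~> 3.117"
    }
  }
}

provider "azurerm" {
  features {}
}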
🔹 variables.tf
variable "env" {
  type    = string
  default = "dev"
}

variable "location" {
  type        = string
  default     = "westeurope"
  description = "Location of the resources."
}

variable "project_name" {
  type        = string
  default     = "cncdemo"
  description = "Name of a project."
}

variable "cidr" {
  type    = string
  default = "10.179.0.0/20"
}

variable "tags" {
  type        = map(string)
  description = "Optional tags to add to resources"
  default     = {}
}
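Since the same code is meant to serve dev, test, and prod, it may help to reject a mistyped environment name before anything gets created. The following is a minimal sketch, not part of the original setup: it assumes a Terraform version with custom variable validation (0.13 or later) and assumes the three environment names mentioned earlier. It would replace the plain env variable above.

```hcl
# Hypothetical variant of the "env" variable with input validation.
variable "env" {
  type        = string
  default     = "dev"
  description = "Target environment; used as a suffix in resource names."

  validation {
    # Only the three environments named in this post are accepted (assumption).
    condition     = contains(["dev", "test", "prod"], var.env)
    error_message = "The env value must be one of: dev, test, prod."
  }
}
```

With this in place, a plan run with an unexpected env value fails during variable validation instead of creating oddly named resources.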
🔹 dev.tfvars
env = "dev"
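The other environments can get their own variable files in the same way. The file below is only an illustration: the file name and the CIDR value are assumptions, and the address range is changed only so that the environments would not overlap if their networks were ever connected.

```hcl
# test.tfvars (hypothetical file for a second environment)
env  = "test"
cidr = "10.180.0.0/20" # assumed range, chosen not to overlap with the dev default
```

It would be passed with -var-file=test.tfvars. Keep in mind that with a single state, switching the variable file makes Terraform plan to replace the dev resources rather than add a second environment, so each environment needs its own state, for example via terraform workspace new test.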
🔹 main.tf
resource "azurerm_resource_group" "rg" {
  location = var.location
  name     = "rg-${lower(var.project_name)}-${lower(var.env)}"
}

resource "azurerm_virtual_network" "vnet" {
  name                = "vnet-${lower(var.project_name)}-${lower(var.env)}"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  address_space       = [var.cidr]
}

resource "azurerm_network_security_group" "nsg_dbx" {
  location            = azurerm_resource_group.rg.location
  name                = "nsg-dbx-${lower(var.project_name)}-${lower(var.env)}"
  resource_group_name = azurerm_resource_group.rg.name
}

resource "azurerm_subnet" "subnet_pub_dbx" {
  name                 = "sub-pub-${lower(var.project_name)}-${lower(var.env)}"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = [cidrsubnet(var.cidr, 3, 0)]

  # https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/subnet
  delegation {
    name = "databricks"
    service_delegation {
      name = "Microsoft.Databricks/workspaces"
      actions = [
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
        "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action",
      ]
    }
  }
}

resource "azurerm_subnet_network_security_group_association" "pubsub_sg_dbx_association" {
  network_security_group_id = azurerm_network_security_group.nsg_dbx.id
  subnet_id                 = azurerm_subnet.subnet_pub_dbx.id
}

resource "azurerm_subnet" "subnet_priv_dbx" {
  name                 = "sub-priv-${lower(var.project_name)}-${lower(var.env)}"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = [cidrsubnet(var.cidr, 3, 1)]

  # https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/subnet
  delegation {
    name = "databricks"
    service_delegation {
      name = "Microsoft.Databricks/workspaces"
      actions = [
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
        "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action",
      ]
    }
  }
}

resource "azurerm_subnet_network_security_group_association" "privsub_sg_dbx_association" {
  network_security_group_id = azurerm_network_security_group.nsg_dbx.id
  subnet_id                 = azurerm_subnet.subnet_priv_dbx.id
}

resource "azurerm_databricks_workspace" "dbx" {
  name                = "dbx-${lower(var.project_name)}-${lower(var.env)}"
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
  sku                 = "premium"

  custom_parameters {
    virtual_network_id  = azurerm_virtual_network.vnet.id
    public_subnet_name  = azurerm_subnet.subnet_pub_dbx.name
    private_subnet_name = azurerm_subnet.subnet_priv_dbx.name

    public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.pubsub_sg_dbx_association.id
    private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.privsub_sg_dbx_association.id
  }

  # We need this, otherwise destroy doesn't clean things up correctly
  depends_on = [
    azurerm_subnet_network_security_group_association.pubsub_sg_dbx_association,
    azurerm_subnet_network_security_group_association.privsub_sg_dbx_association,
  ]
}
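The two cidrsubnet calls in main.tf are worth a closer look. With the default /20 range and 3 additional prefix bits, each subnet becomes a /23 (512 addresses), and the last argument picks which /23 to take. The standalone sketch below is only for illustration: it needs no provider, and the local and output names are made up for this example.

```hcl
# Standalone illustration of cidrsubnet(); can be applied in an empty directory.
locals {
  cidr = "10.179.0.0/20" # same value as the default of var.cidr
}

output "public_subnet_prefix" {
  # /20 plus 3 new bits gives a /23; network number 0 is the first block
  value = cidrsubnet(local.cidr, 3, 0) # 10.179.0.0/23
}

output "private_subnet_prefix" {
  # network number 1 is the next /23 block
  value = cidrsubnet(local.cidr, 3, 1) # 10.179.2.0/23
}
```

Three extra bits give eight possible /23 blocks inside the /20, so six remain free for other subnets.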
🛠️ Deploy & Destroy with Terraform
Now that we have everything set up, let’s deploy Databricks!
🚀 Deploy Resources
Run these commands:
terraform init -upgrade
terraform plan -var-file=dev.tfvars -out main.tfplan
terraform apply main.tfplan
and finally…
And just like that—Databricks is live! 🎉
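To avoid hunting for the new workspace in the Azure portal, the URL can be exposed as a Terraform output. This is a small optional addition rather than part of the original setup: the file name and output name are assumptions, and it relies on the workspace_url attribute exported by the azurerm_databricks_workspace resource defined in main.tf.

```hcl
# outputs.tf (hypothetical file, placed next to main.tf)
output "databricks_workspace_url" {
  description = "URL of the deployed Databricks workspace."
  # workspace_url is exported without a scheme, so https:// is prepended here
  value = "https://${azurerm_databricks_workspace.dbx.workspace_url}"
}
```

After the next plan and apply, terraform output databricks_workspace_url prints the address.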
🗑️ Destroy Resources When Done
Once you’re finished using Databricks, destroy everything to avoid costs:
terraform plan -destroy -var-file=dev.tfvars -out main.destroy.tfplan
terraform apply main.destroy.tfplan
And boom—no more unnecessary costs!
🔍 Summary
Using Terraform, you can automate the entire Databricks setup, ensuring you only pay for what you use. This is just the beginning, and I plan to expand on this setup, as it turned out to be a lot of fun.
Happy coding!
Filip