This template provides a self-contained deployment of a Databricks cluster pre-configured with Claude Code CLI for AI-assisted development directly on the cluster.
- Unity Catalog Volume for init script storage
- Databricks cluster with Claude Code CLI auto-installed on startup
- MLflow experiment for tracing Claude Code sessions
- Bash helper functions for easy usage
- Copy `terraform.tfvars.example` to `terraform.tfvars`
- Update `terraform.tfvars` with your values:
  - `databricks_resource_id`: Your Azure Databricks workspace resource ID
  - `cluster_name`: Name for your cluster
  - `catalog_name`: Unity Catalog name to use
- (Optional) Customize cluster configuration in `terraform.tfvars` (node type, autoscaling, etc.)
- (Optional) Configure your remote backend
- Run `terraform init` to initialize Terraform and download the providers
- Run `terraform plan` to review the resources that will be created
- Run `terraform apply` to create the resources
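Following the steps above, a minimal `terraform.tfvars` might look like this (all values below are placeholders to replace with your own):

```hcl
# Required values (placeholders - substitute your own)
databricks_resource_id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/my-rg/providers/Microsoft.Databricks/workspaces/my-workspace"
cluster_name           = "claude-dev-cluster"
catalog_name           = "main"

# Optional overrides (see the Inputs table below for defaults)
cluster_mode            = "SINGLE_NODE"
autotermination_minutes = 60
```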
- Databricks workspace with Unity Catalog enabled
- Unity Catalog with an existing catalog and schema
- Unity Catalog metastore must have a root storage credential configured (required for volumes)
- Permission to create clusters
- (For Azure) Authenticated via `az login` or environment variables
- Databricks Runtime 14.3 LTS or higher recommended
Note: If you encounter an error about missing root storage credential, you need to configure the metastore's root storage credential first. See Databricks documentation for details.
After the cluster starts, you can connect via SSH to use Claude Code and other development tools.
Use the Databricks CLI to set up SSH access to your new cluster:
```bash
# Authenticate if needed
databricks auth login --host https://your-workspace-url.cloud.databricks.com

# Set up SSH config (replace 'claude-dev' with your preferred alias)
databricks ssh setup --name claude-dev
# Select your cluster from the list when prompted
```

This creates an entry in your `~/.ssh/config` file.
- Install the Remote - SSH extension in VSCode or Cursor.
- Open the Command Palette (`Cmd+Shift+P` / `Ctrl+Shift+P`).
- Select Remote-SSH: Connect to Host.
- Choose `claude-dev` (or the alias you created).
- Select Linux as the platform.
- Once connected, open your persistent workspace folder: `/Workspace/Users/<your-email>/`.
Important: Work Storage Location
⚠️ DO NOT use Databricks Repos (`/Repos/...`) for active development work. Repos folders can be unreliable for persistent storage and may lose uncommitted changes during cluster restarts or sync operations.

✅ Use `/Workspace/Users/<your-email>/` instead. This location provides reliable persistent storage. You can use regular git commands to manage version control (see "Using Git in /Workspace" section below).
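The location check behind this warning can be approximated with a few lines of shell. This is a simplified sketch with a hypothetical function name, not the actual implementation of the bundled helpers:

```shell
# Hypothetical helper: warn when a path is under /Repos, accept /Workspace paths
check_workspace_location() {
  case "$1" in
    /Repos/*) echo "warn" ;;   # unreliable for persistent work
    *)        echo "ok"   ;;   # e.g. /Workspace/Users/<your-email>/
  esac
}

check_workspace_location /Repos/me/project        # prints: warn
check_workspace_location /Workspace/Users/me/app  # prints: ok
```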
Open the terminal in your remote VSCode/Cursor session and run:
```bash
# 1. Load environment variables and helpers
source ~/.bashrc

# 2. Enable MLflow tracing (optional but recommended)
claude-tracing-enable

# 3. Start Claude Code
claude
```

First-time setup tips:
- Claude will ask for file permissions; use `Shift+Tab` to auto-allow edits in the current directory.
- If you need to refresh credentials, run `claude-refresh-token`.
VSCode and Cursor automatically forward ports. For example, to run a Streamlit app:
- Create `app.py`:

  ```python
  import streamlit as st

  st.title("Databricks Remote App")
  st.write("Running on cluster!")
  ```

- Run it:

  ```bash
  streamlit run app.py --server.port 8501
  ```

- Click "Open in Browser" in the popup notification to view it at `localhost:8501`.
You don't need to configure a virtual environment. Databricks manages it for you.
- In the remote terminal, find the Python interpreter path:

  ```bash
  echo $DATABRICKS_VIRTUAL_ENV
  # Output example: /local_disk0/.ephemeral_nfs/envs/pythonEnv-xxxx/bin/python
  ```

- In VSCode/Cursor, open the Command Palette and select Python: Select Interpreter.
- Paste the path from above.
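As an alternative to the Command Palette, you can write the interpreter path into the project's `.vscode/settings.json` yourself. A minimal sketch; the fallback path is illustrative only, on a real cluster `DATABRICKS_VIRTUAL_ENV` is already set:

```shell
# Use the cluster-provided interpreter path; the fallback value is an example only
VENV_PY="${DATABRICKS_VIRTUAL_ENV:-/local_disk0/.ephemeral_nfs/envs/pythonEnv-example/bin/python}"

# Point VS Code / Cursor at that interpreter for this project
mkdir -p .vscode
printf '{\n  "python.defaultInterpreterPath": "%s"\n}\n' "$VENV_PY" > .vscode/settings.json
cat .vscode/settings.json
```

`python.defaultInterpreterPath` is the setting the Python extension reads when no interpreter has been selected interactively.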
To keep your agent running even if you disconnect:
```bash
# Start a new session
tmux new -s claude-session

# Detach (Ctrl+B, then D)

# Reattach later
tmux attach -t claude-session
```

This allows you to leave long-running tasks (like "Build a data pipeline") executing on the cluster while you are offline.
Since /Workspace doesn't have native Repos integration, use standard git commands:
```bash
# Navigate to your workspace directory
cd /Workspace/Users/<your-email>/

# Option 1: Clone an existing repository
git clone https://github.com/your-org/your-repo.git
cd your-repo

# Option 2: Initialize a new repository
mkdir my-project && cd my-project
git init
git remote add origin https://github.com/your-org/your-repo.git

# Configure git (first time only)
git config user.name "Your Name"
git config user.email "your.email@company.com"

# Regular git workflow
git add .
git commit -m "Your commit message"
git push origin main
```

Git Authentication Options:
- Personal Access Token (PAT) - Recommended:

  ```bash
  # GitHub: Create at https://github.com/settings/tokens
  # Use token as password when prompted
  git clone https://github.com/your-org/repo.git
  ```

- SSH Keys:

  ```bash
  # Generate SSH key on the cluster
  ssh-keygen -t ed25519 -C "your.email@company.com"

  # Add to GitHub: Copy output and add at https://github.com/settings/keys
  cat ~/.ssh/id_ed25519.pub

  # Clone using SSH
  git clone git@github.com:your-org/repo.git
  ```

- Git Credential Manager:

  ```bash
  # Store credentials to avoid repeated prompts
  git config --global credential.helper store
  ```
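With the `store` helper, you can also pre-seed the credentials file so the first push never prompts. A sketch assuming a GitHub PAT; `your-username` and `YOUR_PAT` are placeholders:

```shell
# Enable the plain-file credential helper
git config --global credential.helper store

# Pre-seed credentials (placeholders: your-username, YOUR_PAT)
printf 'https://your-username:YOUR_PAT@github.com\n' > ~/.git-credentials

# The file stores the token in plain text, so restrict access to it
chmod 600 ~/.git-credentials
```

Note that `store` keeps the token unencrypted on disk; on a shared cluster, prefer per-session prompts or SSH keys if that is a concern.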
| Command | Purpose |
|---|---|
| `check-claude` | Verify Claude CLI installation and configuration |
| `claude-debug` | Show detailed Claude configuration |
| `claude-refresh-token` | Regenerate Claude settings from environment |
| `claude-token-status` | Check token freshness and auto-refresh status |
| `claude-tracing-enable` | Enable MLflow tracing for Claude sessions |
| `claude-tracing-status` | Check tracing status |
| `claude-tracing-disable` | Disable tracing |
| Command | Purpose |
|---|---|
| `git-workspace-init` | Interactive setup for git in /Workspace (clone or init) |
| `git-workspace-check` | Verify location and check for uncommitted/unpushed changes |
| `git-workspace-setup-auth` | Configure git authentication (PAT, SSH, or credential helper) |
These helpers warn you if working in /Repos and ensure your work is backed up in git.
| Command | Purpose |
|---|---|
| `claude-vscode-setup` | Show Remote SSH setup instructions |
| `claude-vscode-env` | Get Python interpreter path for IDE |
| `claude-vscode-check` | Verify Remote SSH configuration |
| `claude-vscode-config` | Generate settings.json snippet |
For air-gapped or restricted network environments, use the separate offline module: adb-coding-assistants-cluster-offline. See the Offline Installation Guide for detailed instructions.
Single-node example:

```hcl
cluster_mode = "SINGLE_NODE"
num_workers  = 0
node_type_id = "Standard_D8pds_v6"
```

Standard mode with autoscaling:

```hcl
cluster_mode = "STANDARD"
num_workers  = null # Enable autoscaling
min_workers  = 2
max_workers  = 8
node_type_id = "Standard_D8pds_v6"
```

This example uses Databricks unified authentication. Authentication can be provided via:
- Azure CLI (recommended for local development):

  ```bash
  az login
  terraform apply
  ```

- Environment Variables (recommended for CI/CD):

  ```bash
  export DATABRICKS_HOST="https://adb-xxx.azuredatabricks.net"
  export DATABRICKS_TOKEN="dapi..."
  terraform apply
  ```

- Configuration Profile:

  ```bash
  export DATABRICKS_CONFIG_PROFILE="my-profile"
  terraform apply
  ```
For more details on authentication, see the Databricks unified authentication documentation.
Check cluster event logs in the Databricks UI under Compute → Your Cluster → Event Log.
Common issues:
- Network connectivity to download packages
- Unity Catalog volume permissions
- Insufficient cluster permissions
```bash
# Reload bashrc
source ~/.bashrc

# Verify PATH
check-claude
```

```bash
# Check environment variables
check-claude

# Regenerate configuration
claude-refresh-token
```

| Name | Version |
|---|---|
| terraform | >= 1.0 |
| azurerm | >=4.31.0 |
| databricks | >=1.81.1 |
| Name | Version |
|---|---|
| azurerm | 4.57.0 |
No modules.
| Name | Type |
|---|---|
| azurerm_client_config.current | data source |
| azurerm_databricks_workspace.this | data source |
| azurerm_resource_group.this | data source |
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| catalog_name | Unity Catalog name for the volume | string | n/a | yes |
| cluster_name | Name of the Databricks cluster | string | n/a | yes |
| databricks_resource_id | The Azure resource ID for the Databricks workspace. Format: /subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.Databricks/workspaces/{workspace-name} | string | n/a | yes |
| autotermination_minutes | Minutes of inactivity before cluster auto-terminates | number | 30 | no |
| cluster_mode | Cluster mode: STANDARD or SINGLE_NODE | string | "STANDARD" | no |
| init_script_source_path | Local path to the init script | string | null | no |
| max_workers | Maximum number of workers for autoscaling | number | 3 | no |
| min_workers | Minimum number of workers for autoscaling | number | 1 | no |
| mlflow_experiment_name | MLflow experiment name for Claude Code tracing | string | "/Workspace/Shared/claude-code-tracing" | no |
| node_type_id | Node type for the cluster. Default is Standard_D8pds_v6 (modern, premium SSD + local NVMe). If unavailable in your region, consider Standard_DS13_v2 as fallback. | string | "Standard_D8pds_v6" | no |
| num_workers | Number of worker nodes (null for autoscaling) | number | null | no |
| schema_name | Schema name for the volume | string | "default" | no |
| spark_version | Databricks Runtime version | string | "17.3.x-cpu-ml-scala2.13" | no |
| tags | Custom tags for the cluster | map(string) | { | no |
| volume_name | Volume name to store init scripts | string | "coding_assistants" | no |
| Name | Description |
|---|---|
| cluster_id | The ID of the created cluster |
| cluster_name | Name of the created cluster |
| cluster_url | URL to access the cluster in Databricks UI |
| init_script_path | Path to the init script in the volume |
| mlflow_experiment_name | MLflow experiment name for tracing |
| setup_instructions | Instructions for using the cluster |
| volume_full_name | Full name of the volume |
| volume_path | Path to the volume containing init scripts |