
Provisioning Databricks Cluster with Claude Code CLI

This template provides a self-contained deployment of a Databricks cluster pre-configured with Claude Code CLI for AI-assisted development directly on the cluster.

What Gets Deployed

  • Unity Catalog Volume for init script storage
  • Databricks cluster with Claude Code CLI auto-installed on startup
  • MLflow experiment for tracing Claude Code sessions
  • Bash helper functions for easy usage

How to Use

  1. Copy terraform.tfvars.example to terraform.tfvars
  2. Update terraform.tfvars with your values:
    • databricks_resource_id: Your Azure Databricks workspace resource ID
    • cluster_name: Name for your cluster
    • catalog_name: Unity Catalog name to use
  3. (Optional) Customize cluster configuration in terraform.tfvars (node type, autoscaling, etc.)
  4. (Optional) Configure your remote backend
  5. Run terraform init to initialize Terraform and download the required providers
  6. Run terraform plan to review the resources that will be created
  7. Run terraform apply to create the resources
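For step 2, a minimal terraform.tfvars might look like the sketch below. All values are placeholders (the resource ID, names, and overrides must match your environment); variable names follow the Inputs table at the end of this document.

```hcl
# terraform.tfvars -- placeholder values, replace with your own
databricks_resource_id = "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Databricks/workspaces/<workspace-name>"
cluster_name           = "claude-dev-cluster"
catalog_name           = "main"

# Optional overrides (defaults are listed in the Inputs table)
cluster_mode            = "SINGLE_NODE"
num_workers             = 0
autotermination_minutes = 30
```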

Prerequisites

  • Databricks workspace with Unity Catalog enabled
  • Unity Catalog with an existing catalog and schema
  • Unity Catalog metastore must have a root storage credential configured (required for volumes)
  • Permission to create clusters
  • (For Azure) Authenticated via az login or environment variables
  • Databricks Runtime 14.3 LTS or higher recommended

Note: If you encounter an error about missing root storage credential, you need to configure the metastore's root storage credential first. See Databricks documentation for details.

Post-Deployment

After the cluster starts, you can connect via SSH to use Claude Code and other development tools.

1. Configure SSH Tunnel

Use the Databricks CLI to set up SSH access to your new cluster:

# Authenticate if needed
databricks auth login --host https://your-workspace-url.cloud.databricks.com

# Set up SSH config (replace 'claude-dev' with your preferred alias)
databricks ssh setup --name claude-dev
# Select your cluster from the list when prompted

This creates an entry in your ~/.ssh/config file.

2. Connect via VSCode or Cursor

  1. Install the Remote - SSH extension in VSCode or Cursor.
  2. Open the Command Palette (Cmd+Shift+P / Ctrl+Shift+P).
  3. Select Remote-SSH: Connect to Host.
  4. Choose claude-dev (or the alias you created).
  5. Select Linux as the platform.
  6. Once connected, open your persistent workspace folder: /Workspace/Users/<your-email>/.

⚠️ Important: Work Storage Location

DO NOT use Databricks Repos (/Repos/...) for active development work. Repos folders can be unreliable for persistent storage and may lose uncommitted changes during cluster restarts or sync operations.

Use /Workspace/Users/<your-email>/ instead. This location provides reliable persistent storage. You can use regular git commands to manage version control (see "Using Git in /Workspace" section below).

3. Launch Claude Code

Open the terminal in your remote VSCode/Cursor session and run:

# 1. Load environment variables and helpers
source ~/.bashrc

# 2. Enable MLflow tracing (optional but recommended)
claude-tracing-enable

# 3. Start Claude Code
claude

First-time setup tips:

  • Claude will ask for file permissions; use Shift+Tab to auto-allow edits in the current directory.
  • If you need to refresh credentials, run claude-refresh-token.

4. Remote Web App Development (Port Forwarding)

VSCode and Cursor automatically forward ports. For example, to run a Streamlit app:

  1. Create app.py:
    import streamlit as st
    st.title("Databricks Remote App")
    st.write("Running on cluster!")
  2. Run it:
    streamlit run app.py --server.port 8501
  3. Click "Open in Browser" in the popup notification to view it at localhost:8501.
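If the automatic port-forwarding notification doesn't appear, you can forward the port manually from your local machine over the SSH alias created earlier (claude-dev and port 8501 here match the examples above):

```shell
# Forward local port 8501 to port 8501 on the cluster
ssh -L 8501:localhost:8501 claude-dev
# Then browse to http://localhost:8501
```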

5. Using the Databricks Python Interpreter

You don't need to configure a virtual environment. Databricks manages it for you.

  1. In the remote terminal, find the Python interpreter path:
    echo $DATABRICKS_VIRTUAL_ENV
    # Output example: /local_disk0/.ephemeral_nfs/envs/pythonEnv-xxxx/bin/python
  2. In VSCode/Cursor, open the Command Palette and select Python: Select Interpreter.
  3. Paste the path from above.
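To confirm the interpreter is active, run a quick sanity check from the integrated terminal or a scratch file (this is generic Python, not Databricks-specific); sys.executable should match the path printed by echo $DATABRICKS_VIRTUAL_ENV:

```python
import sys

# Path of the interpreter currently running this code; on the cluster it
# should match the path reported by $DATABRICKS_VIRTUAL_ENV
print(sys.executable)
```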

6. Persistent Sessions with tmux

To keep your agent running even if you disconnect:

# Start a new session
tmux new -s claude-session

# Detach (Ctrl+B, then D)
# Reattach later
tmux attach -t claude-session

This allows you to leave long-running tasks (like "Build a data pipeline") executing on the cluster while you are offline.
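You can also start the session already detached so the task begins immediately in the background (standard tmux flags; claude here stands in for any long-running command):

```shell
# Start detached (-d) and run a command inside the new session
tmux new-session -d -s claude-session 'claude'

# List running sessions, then reattach when you return
tmux ls
tmux attach -t claude-session
```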

7. Using Git in /Workspace

Since /Workspace doesn't have native Repos integration, use standard git commands:

# Navigate to your workspace directory
cd /Workspace/Users/<your-email>/

# Option 1: Clone an existing repository
git clone https://github.com/your-org/your-repo.git
cd your-repo

# Option 2: Initialize a new repository
mkdir my-project && cd my-project
git init
git remote add origin https://github.com/your-org/your-repo.git

# Configure git (first time only)
git config user.name "Your Name"
git config user.email "your.email@company.com"

# Regular git workflow
git add .
git commit -m "Your commit message"
git push origin main

Git Authentication Options:

  1. Personal Access Token (PAT) - Recommended:

    # GitHub: Create at https://github.com/settings/tokens
    # Use token as password when prompted
    git clone https://github.com/your-org/repo.git
  2. SSH Keys:

    # Generate SSH key on the cluster
    ssh-keygen -t ed25519 -C "your.email@company.com"
    
    # Add to GitHub: Copy output and add at https://github.com/settings/keys
    cat ~/.ssh/id_ed25519.pub
    
    # Clone using SSH
    git clone git@github.com:your-org/repo.git
  3. Git Credential Manager:

    # Store credentials to avoid repeated prompts
    git config --global credential.helper store
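If you want to inspect repository state by hand (roughly what the git-workspace-check helper reports; the helper may do more), standard git commands work from any repo under /Workspace:

```shell
# Run from inside your repo under /Workspace/Users/<your-email>/
git status --porcelain        # any output means uncommitted changes
git log --oneline @{u}..HEAD  # local commits not yet pushed to upstream
```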

Helper Commands

Claude CLI Commands

| Command | Purpose |
| --- | --- |
| check-claude | Verify Claude CLI installation and configuration |
| claude-debug | Show detailed Claude configuration |
| claude-refresh-token | Regenerate Claude settings from environment |
| claude-token-status | Check token freshness and auto-refresh status |
| claude-tracing-enable | Enable MLflow tracing for Claude sessions |
| claude-tracing-status | Check tracing status |
| claude-tracing-disable | Disable tracing |

Git Workspace Commands

| Command | Purpose |
| --- | --- |
| git-workspace-init | Interactive setup for git in /Workspace (clone or init) |
| git-workspace-check | Verify location and check for uncommitted/unpushed changes |
| git-workspace-setup-auth | Configure git authentication (PAT, SSH, or credential helper) |

These helpers warn you if you are working in /Repos and help ensure your work is backed up in git.

VS Code/Cursor Remote Commands

| Command | Purpose |
| --- | --- |
| claude-vscode-setup | Show Remote SSH setup instructions |
| claude-vscode-env | Get Python interpreter path for IDE |
| claude-vscode-check | Verify Remote SSH configuration |
| claude-vscode-config | Generate settings.json snippet |

Offline Installation

For air-gapped or restricted network environments, use the separate offline module: adb-coding-assistants-cluster-offline. See the Offline Installation Guide for detailed instructions.

Configuration Examples

Single-Node Development Cluster

cluster_mode = "SINGLE_NODE"
num_workers  = 0
node_type_id = "Standard_D8pds_v6"

Autoscaling Production Cluster

cluster_mode = "STANDARD"
num_workers  = null  # Enable autoscaling
min_workers  = 2
max_workers  = 8
node_type_id = "Standard_D8pds_v6"

Authentication

This example uses Databricks unified authentication. Authentication can be provided via:

  1. Azure CLI (recommended for local development):

    az login
    terraform apply
  2. Environment Variables (recommended for CI/CD):

    export DATABRICKS_HOST="https://adb-xxx.azuredatabricks.net"
    export DATABRICKS_TOKEN="dapi..."
    terraform apply
  3. Configuration Profile:

    export DATABRICKS_CONFIG_PROFILE="my-profile"
    terraform apply

For more details on authentication, see the Databricks unified authentication documentation.

Troubleshooting

Init Script Fails

Check cluster event logs in the Databricks UI under Compute > Your Cluster > Event Log.

Common issues:

  • Network connectivity to download packages
  • Unity Catalog volume permissions
  • Insufficient cluster permissions

Claude Not Found After Login

# Reload bashrc
source ~/.bashrc

# Verify PATH
check-claude

Authentication Issues

# Check environment variables
check-claude

# Regenerate configuration
claude-refresh-token


Requirements

| Name | Version |
| --- | --- |
| terraform | >= 1.0 |
| azurerm | >= 4.31.0 |
| databricks | >= 1.81.1 |

Providers

| Name | Version |
| --- | --- |
| azurerm | 4.57.0 |

Modules

No modules.

Resources

| Name | Type |
| --- | --- |
| azurerm_client_config.current | data source |
| azurerm_databricks_workspace.this | data source |
| azurerm_resource_group.this | data source |

Inputs

| Name | Description | Type | Default | Required |
| --- | --- | --- | --- | --- |
| catalog_name | Unity Catalog name for the volume | string | n/a | yes |
| cluster_name | Name of the Databricks cluster | string | n/a | yes |
| databricks_resource_id | The Azure resource ID for the Databricks workspace. Format: /subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.Databricks/workspaces/{workspace-name} | string | n/a | yes |
| autotermination_minutes | Minutes of inactivity before cluster auto-terminates | number | 30 | no |
| cluster_mode | Cluster mode: STANDARD or SINGLE_NODE | string | "STANDARD" | no |
| init_script_source_path | Local path to the init script | string | null | no |
| max_workers | Maximum number of workers for autoscaling | number | 3 | no |
| min_workers | Minimum number of workers for autoscaling | number | 1 | no |
| mlflow_experiment_name | MLflow experiment name for Claude Code tracing | string | "/Workspace/Shared/claude-code-tracing" | no |
| node_type_id | Node type for the cluster. Default is Standard_D8pds_v6 (modern, premium SSD + local NVMe). If unavailable in your region, consider Standard_DS13_v2 as fallback. | string | "Standard_D8pds_v6" | no |
| num_workers | Number of worker nodes (null for autoscaling) | number | null | no |
| schema_name | Schema name for the volume | string | "default" | no |
| spark_version | Databricks Runtime version | string | "17.3.x-cpu-ml-scala2.13" | no |
| tags | Custom tags for the cluster | map(string) | {"Environment": "dev", "Purpose": "coding-assistants"} | no |
| volume_name | Volume name to store init scripts | string | "coding_assistants" | no |

Outputs

| Name | Description |
| --- | --- |
| cluster_id | The ID of the created cluster |
| cluster_name | Name of the created cluster |
| cluster_url | URL to access the cluster in Databricks UI |
| init_script_path | Path to the init script in the volume |
| mlflow_experiment_name | MLflow experiment name for tracing |
| setup_instructions | Instructions for using the cluster |
| volume_full_name | Full name of the volume |
| volume_path | Path to the volume containing init scripts |