hiroyukinakazato-db commented Sep 30, 2025

Changes

This PR adds Switch, an LLM-powered transpiler, as an optional component in Lakebridge's install-transpile workflow. Switch installation is controlled by the new --include-llm-transpiler flag.

What does this PR do?

Implements complete Switch transpiler integration with idempotent installation, workspace deployment, job management, and resource configuration including Unity Catalog Volume for Switch.

Relevant implementation details

CLI Integration:

  • Add --include-llm-transpiler flag to install-transpile command (default: false)
  • Switch installation is opt-in, allowing users to choose between LSP-only and LLM-powered transpilation
  • Bladebridge and Morpheus continue to install by default

SwitchInstaller Implementation:

  • Extends TranspilerInstaller with unified constructor signature
  • WheelInstaller pattern for PyPI package management (databricks-switch-plugin)
  • Workspace deployment: uploads Switch package from site-packages to /Users/{user}/.lakebridge/switch/
  • Job creation: creates LAKEBRIDGE_Switch job with NotebookTask for parallel LLM processing
  • Resource configuration: prompts for catalog, schema, and volume for Switch
  • Idempotent behavior: supports reinstallation after uninstall with full recovery
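The idempotent create-or-update behavior listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeJobsApi` is an in-memory stand-in (not the Databricks SDK), `ensure_job` is a hypothetical helper, and a plain dict plays the role of the job record kept in InstallState. The actual code in this PR may differ.

```python
from dataclasses import dataclass, field


@dataclass
class FakeJobsApi:
    """In-memory stand-in for a jobs service (hypothetical, not the Databricks SDK)."""

    jobs: dict[int, str] = field(default_factory=dict)
    next_id: int = 1

    def create(self, name: str) -> int:
        job_id = self.next_id
        self.next_id += 1
        self.jobs[job_id] = name
        return job_id

    def update(self, job_id: int, name: str) -> None:
        self.jobs[job_id] = name


def ensure_job(api: FakeJobsApi, state: dict[str, int], name: str) -> int:
    """Update the recorded job if it still exists, otherwise create a new one.

    `state` is a hypothetical name -> job ID mapping standing in for InstallState.
    """
    job_id = state.get(name)
    if job_id is not None and job_id in api.jobs:
        # The job recorded by a previous install is still there: update in place.
        api.update(job_id, name)
        return job_id
    # First install, or the recorded job was removed (e.g. by uninstall): recreate.
    job_id = api.create(name)
    state[name] = job_id
    return job_id
```

Calling `ensure_job` twice with the same state leaves a single job; if the job is deleted in between, the next call creates a fresh one and records its ID.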

Uninstall Integration:

  • Removes Switch job from workspace via InstallState
  • Logs manual cleanup instructions for validation schema and Switch resources (catalog, schema, volume)
  • Integrated into databricks labs uninstall lakebridge workflow

Caveats/things to watch out for when reviewing:

  • Opt-in only: Switch is NOT installed by default. Users must explicitly pass the --include-llm-transpiler flag to install Switch
  • User agent: Added the entry include-llm-transpiler, "true" to the user agent
  • Pylint configuration: Updated max-args from 12 to 13 in pyproject.toml to accommodate new CLI parameter
  • Resource lifecycle: Catalog, schema, and volume for Switch are prompted during install but require manual cleanup after uninstall
  • Job management: Switch job is tracked in InstallState and managed independently from Reconciliation jobs

Linked issues

Resolves #2048

Console Output

~ ❯ databricks labs install lakebridge@<branch_name>
10:12:38     INFO [d.l.lakebridge.install] Detected installed transpilers: ['Bladebridge', 'Morpheus']
10:12:38     INFO [d.l.lakebridge.install] Checking for Bladebridge upgrades...
10:12:39     INFO [d.l.l.transpiler.installers] databricks-bb-plugin v0.1.19 already installed
10:12:39     INFO [d.l.lakebridge.install] Checking for Morpheus upgrades...
10:12:39     INFO [d.l.l.transpiler.installers] Databricks databricks-morph-plugin transpiler v0.6.10 already installed
10:12:39     INFO [d.labs.lakebridge] Successfully Setup Lakebridge Components Locally
10:12:39     INFO [d.labs.lakebridge] For more information, please visit https://databrickslabs.github.io/lakebridge/
~ ❯ databricks labs lakebridge install-transpile --include-llm-transpiler true
10:15:18     INFO [d.l.l.transpiler.installers] Databricks databricks-morph-plugin transpiler v0.6.10 already installed
10:15:18     INFO [d.l.l.transpiler.installers] databricks-bb-plugin v0.1.19 already installed
10:15:18     INFO [d.l.lakebridge.install] Configuring lakebridge `transpile`.
10:15:19     INFO [d.l.lakebridge.install] Couldn't find existing `transpile` installation
10:15:19  WARNING [d.l.lakebridge.install] Installation is not interactive, skipping configuration of transpilers.
10:15:19  WARNING [d.l.lakebridge.install] feature/llm-transpile is not a valid version.
10:15:31     INFO [d.l.lakebridge.install] Installing Switch transpiler to workspace.
10:15:32     INFO [d.l.l.deployment.switch] Copying resources to /Users/<>/.lakebridge/switch in workspace.......
10:16:27     INFO [d.l.l.deployment.switch] Completed Copying resources to /Users/<>/.lakebridge/switch in workspace...
10:16:28     INFO [d.l.l.deployment.switch] Setting up Switch job in workspace...
10:16:29     INFO [d.l.l.deployment.switch] Creating new Switch job
10:16:34     INFO [d.l.l.deployment.switch] Switch job created/updated: https://<workspacename>/jobs/<job_id>
10:16:34     INFO [d.l.lakebridge.install] Installation completed successfully! Please refer to the documentation for the next steps.

Functionality

  • added relevant user documentation
  • added new CLI command
  • modified existing command: databricks labs lakebridge install-transpile (adds --include-llm-transpiler flag and Switch installation support)
  • modified existing command: databricks labs uninstall lakebridge (adds Switch job and resource cleanup)

Tests

  • manually tested
  • added unit tests
  • added integration tests

- Add Switch installer with resource configuration and job creation
- Implement uninstall functionality with proper cleanup
- Add comprehensive test coverage for SwitchInstaller
- Improve path handling and type-safe configuration
- Add include-llm-transpiler option for flexible installation
hiroyukinakazato-db added the labels enhancement (New feature or request) and feat/cli (actions that are visible to the user) Sep 30, 2025
github-actions bot commented Sep 30, 2025

✅ 51/51 passed, 9 flaky, 3m48s total

Flaky tests:

  • 🤪 test_validate_non_empty_tables (12ms)
  • 🤪 test_validate_mixed_checks (198ms)
  • 🤪 test_validate_successful_schema_check (194ms)
  • 🤪 test_validate_invalid_schema_path (1ms)
  • 🤪 test_validate_invalid_schema_check (1ms)
  • 🤪 test_transpile_teradata_sql_non_interactive[True] (18.071s)
  • 🤪 test_transpile_teradata_sql (19.651s)
  • 🤪 test_transpiles_informatica_to_sparksql_non_interactive[True] (3.125s)
  • 🤪 test_transpile_teradata_sql_non_interactive[False] (7.494s)

Running from acceptance #2883

Implement SwitchInstaller to integrate Switch transpiler with Lakebridge:
- Install Switch package to local virtual environment and deploy to workspace
- Create and manage Databricks job for Switch transpilation
- Configure Switch resources (catalog, schema, volume) interactively
- Support job-level parameters with JobParameterDefinition for flexibility
- Handle installation state and job lifecycle management
- Add comprehensive test suite covering installation, job management, and configuration

The SwitchInstaller was failing to find the config when the config.yml
used "Switch" (capitalized) as the name, while the code only checked
for "switch" (lowercase). This caused job creation to fail with a
"config.yml not found" error.

Updated _get_switch_job_parameters() to check both the display name
(capitalized) and transpiler ID (lowercase) to handle both cases.
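
The lookup fix described in this commit message can be illustrated with a small sketch that accepts either spelling. This is not the code from the PR: `find_transpiler_config` and its `installed` mapping are hypothetical names, and the real `_get_switch_job_parameters()` may resolve the config differently.

```python
from pathlib import Path


def find_transpiler_config(installed: dict[str, Path], display_name: str, transpiler_id: str) -> Path:
    """Return the config path registered under the display name or the transpiler ID.

    `installed` is a hypothetical mapping from the name found in each config.yml
    to that file's path; it stands in for however configs are enumerated.
    """
    # Try the capitalized display name first, then the lowercase transpiler ID.
    for candidate in (display_name, transpiler_id):
        if candidate in installed:
            return installed[candidate]
    raise FileNotFoundError(f"config.yml not found for {display_name!r} or {transpiler_id!r}")
```

With this shape, a config registered as "Switch" and one registered as "switch" both resolve, and only a genuinely missing config raises.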
hiroyukinakazato-db marked this pull request as ready for review October 8, 2025 02:14
hiroyukinakazato-db requested a review from a team as a code owner October 8, 2025 02:14
        wheel_name = self._PYPI_PACKAGE_NAME.replace("-", "_")
        return wheel_name in artifact.name and artifact.suffix == ".whl"

    def install(self, artifact: Path | None = None) -> bool:
Collaborator

This is how we have separated things out: the installer installs things locally, deployer deploys to the workspace. You need something similar to recon deployer that way you don't need to have workspace client inside TranspilerInstaller

Author

Thanks for the feedback. Updated in 8439314 - created SwitchDeployment for workspace operations (following ReconDeployment pattern) and removed workspace dependencies from TranspilerInstaller.

Separates Switch transpiler's local installation logic from workspace
deployment, following established patterns (BladebridgeInstaller for
local installation, ReconDeployment for workspace deployment).

Key changes:
- Add SwitchDeployment class (~260 lines) for workspace operations
- Simplify SwitchInstaller to match BladebridgeInstaller pattern (~20 lines)
- Add include_llm and switch_resources fields to TranspileConfig
- Update WorkspaceInstallation to use SwitchDeployment
- Refactor tests to avoid protected member access using fixture separation
- Group Switch-related tests in TestSwitchInstallation class
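
The separation described in this commit (local installation in one class, workspace operations in another) can be sketched as below. All names here (`Workspace`, `LocalInstaller`, `WorkspaceDeployment`) are hypothetical and chosen for illustration; they are not the classes in this PR, and the sketch makes no claim about how SwitchDeployment is actually implemented.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol


class Workspace(Protocol):
    """Hypothetical minimal workspace interface, needed only by the deployment."""

    def upload(self, local: Path, remote: str) -> None: ...


@dataclass
class LocalInstaller:
    """Local-only step: holds no workspace client."""

    install_dir: Path

    def install(self) -> Path:
        # Prepare the local directory; repeated calls are harmless.
        self.install_dir.mkdir(parents=True, exist_ok=True)
        return self.install_dir


@dataclass
class WorkspaceDeployment:
    """Workspace-only step: receives what was installed locally and uploads it."""

    workspace: Workspace
    remote_root: str

    def deploy(self, local_dir: Path) -> None:
        self.workspace.upload(local_dir, self.remote_root)
```

Because only the deployment class takes a workspace object, the installer can be tested without any workspace fake at all.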

Collaborator

gueniai left a comment

LGTM

Comment on lines 122 to 123
    except (RuntimeError, ValueError, InvalidParameterValue) as e:
        logger.error(f"Failed to create/update Switch job: {e}")
Contributor

This path will lead to confusion, because we return normally. Subsequent logs include:

  • DEBUG: Switch deployment completed.
  • INFO: Installation completed successfully! Please refer to the documentation for the next steps.

Neither is true. The easiest fix right now is to return a boolean indicating success. (Alternatively, we could re-raise the exception.)

Note: we also currently lose the stack trace on failure. Maybe use logger.exception() instead.
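
A rough sketch of the suggestion in this comment, under stated assumptions: `deploy_job` and `install` are hypothetical names, `create_or_update` is a stand-in callable for the job creation step, and `InvalidParameterValue` from the original `except` clause is left out so the example runs with the standard library alone.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def deploy_job(create_or_update: Callable[[], None]) -> bool:
    """Run the job-creation step and report whether it succeeded."""
    try:
        create_or_update()
    except (RuntimeError, ValueError):
        # logger.exception logs at ERROR level and includes the traceback.
        logger.exception("Failed to create/update Switch job")
        return False
    return True


def install(create_or_update: Callable[[], None]) -> bool:
    """Only report success when the deployment step actually succeeded."""
    if not deploy_job(create_or_update):
        logger.error("Installation did not complete.")
        return False
    logger.info("Installation completed successfully!")
    return True
```

The caller checks the returned boolean before logging success, so a failed job creation no longer ends with a success message.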

5 participants