Add llm-transpile command with Switch runner and integration tests
#2078
Open
hiroyukinakazato-db wants to merge 111 commits into main from feature/llm-transpile
+556
−1
Changes from 85 commits
Commits
dd726b5
feat: integrate Switch transpiler with Lakebridge installer
hiroyukinakazato-db febb62d
Merge branch 'main' into feature/switch-installer-integration
hiroyukinakazato-db fa26b4c
fix: remove undefined URLError from exception handling
hiroyukinakazato-db 6511e20
refactor: streamline SwitchInstaller deployment logic and update tests
hiroyukinakazato-db 33ea7de
refactor: simplify SwitchInstaller test structure and improve assertions
hiroyukinakazato-db d0c63c3
Merge remote-tracking branch 'origin/main' into feature/switch-instal…
hiroyukinakazato-db 7cb9ea9
feat: add Switch transpiler installer for Lakebridge integration
hiroyukinakazato-db 467dea9
fix: support case-insensitive config lookup in SwitchInstaller
hiroyukinakazato-db 57298b0
Merge branch 'main' into feature/switch-installer-integration
hiroyukinakazato-db 09c0eb8
Merge branch 'main' into feature/switch-installer-integration
hiroyukinakazato-db 8439314
refactor: separate Switch installation from workspace deployment
hiroyukinakazato-db 5f66f3f
Merge branch 'main' into feature/switch-installer-integration
hiroyukinakazato-db fae9880
feat: add llm-transpile command with Switch integration
hiroyukinakazato-db 2ee157f
refactor: encapsulate Switch package path resolution in SwitchDeployment
hiroyukinakazato-db 9dc4b04
refactor: encapsulate Switch package path resolution in SwitchDeployment
hiroyukinakazato-db 7637234
test: update Switch installation tests for refactored interface
hiroyukinakazato-db b736965
test: update Switch installation tests for refactored interface
hiroyukinakazato-db bacd5f6
fix: update error messages to include 'true' flag for install-transpi…
hiroyukinakazato-db 729cb0d
Merge branch 'main' into feature/switch-installer-integration
hiroyukinakazato-db 21b6629
Merge branch 'main' into feature/llm-transpile
hiroyukinakazato-db 42ce0df
fix: exclude wait_for_completion from Switch job parameters
hiroyukinakazato-db 81c32e5
fix: exclude wait_for_completion from Switch job parameters
hiroyukinakazato-db 13bcc15
chore: update Switch wheel with wait_for_completion fix
hiroyukinakazato-db 8dcf8f3
feat: add E2E test for Switch transpiler with environment variable co…
hiroyukinakazato-db 83678b8
feat: enhance E2E testing for Switch with resource management and uni…
hiroyukinakazato-db f698470
Merge branch 'main' into feature/switch-installer-integration
hiroyukinakazato-db 22cadc9
Defaults in `labs.yml` are strings.
asnare b3d2441
Update flag description to use placeholder syntax.
asnare ac7e2a4
Disable flag pending completion of integration.
asnare ee5c892
chore: merge main into feature/llm-transpile
hiroyukinakazato-db f0426e1
Leave pylint's max-args as-is.
asnare 934c2e8
Remove unnecessary include_llm arguments.
asnare 74923cc
Refactor Switch installation.
asnare 084f90f
upgrade to latest switch plugin
sundarshankar89 61f796f
Merge branch 'main' into feature/switch-installer-integration
sundarshankar89 0c1d1d5
fixed package dependencies
sundarshankar89 6aeea25
Merge branch 'main' into feature/switch-installer-integration
sundarshankar89 468f8de
added additional configuration for making switch
sundarshankar89 6a57570
Latest Switch
sundarshankar89 2c3d153
Sorted List for FMAPI
sundarshankar89 f41dee8
setting logging level
sundarshankar89 950c1b8
setting logging level
sundarshankar89 6ca78ed
setting logging level
sundarshankar89 ba65df4
setting logging level
sundarshankar89 2e2abcb
setting logging level
sundarshankar89 bcbe4df
make default as first choice
sundarshankar89 486250f
fix tests
sundarshankar89 fc1ddca
fix tests
sundarshankar89 42c9c4e
fixes few bugs
sundarshankar89 1831076
update databricks-switch-plugin dependency to version 0.1.4
hiroyukinakazato-db 01a0c87
Review Comments
sundarshankar89 ccce0f2
Review Comments
sundarshankar89 ac382d6
Merge branch 'main' into feature/llm-transpile
sundarshankar89 4600583
Merge branch 'feature/switch-installer-integration' into feature/llm-…
sundarshankar89 7f0eaa4
Rebased from switch installer integration
sundarshankar89 0e22abe
Rebased from switch installer integration
sundarshankar89 43cc0f5
Intermediate check in
sundarshankar89 bb7c3d6
Intermediate check in
sundarshankar89 bd70638
Intermediate check in
sundarshankar89 0eb1570
Intermediate check in
sundarshankar89 eae5997
Merge branch 'main' into feature/switch-installer-integration
sundarshankar89 9823201
Merge branch 'feature/switch-installer-integration' into feature/llm-…
sundarshankar89 eb46f24
initial tests
sundarshankar89 34c9f8f
Merge branch 'main' into feature/switch-installer-integration
sundarshankar89 c54e68f
Merge branch 'feature/switch-installer-integration' into feature/llm-…
sundarshankar89 23df37b
added tests for configurator
sundarshankar89 4a0bf49
added tests for installer
sundarshankar89 c49c5b3
added tests for installer
sundarshankar89 fab0e87
Merge branch 'feature/switch-installer-integration' into feature/llm-…
sundarshankar89 078a0bc
added tests for switch
sundarshankar89 1e10b60
added flag to fail if users use regular transpile after installing sw…
sundarshankar89 394ad9d
Merge branch 'feature/switch-installer-integration' into feature/llm-…
sundarshankar89 69f93b2
Merge branch 'main' into feature/switch-installer-integration
sundarshankar89 aeff475
Merge branch 'feature/switch-installer-integration' into feature/llm-…
sundarshankar89 7a336f9
added additional tests improved cuj
sundarshankar89 2602577
added user agent extra
sundarshankar89 b5bcbcd
Merge branch 'main' into feature/switch-installer-integration
asnare 6132f04
removed interactive prompt with include_llm_transpiler
sundarshankar89 fb2ae71
Merge branch 'feature/switch-installer-integration' into feature/llm-…
sundarshankar89 53077ec
removed interactive prompt with include_llm_transpiler
sundarshankar89 186ab59
Merge branch 'main' into feature/switch-installer-integration
sundarshankar89 508d6c1
Merge branch 'feature/switch-installer-integration' into feature/llm-…
sundarshankar89 363a31d
execute llm transpile
sundarshankar89 c01d605
execute llm transpile
sundarshankar89 ebf969a
execute llm transpile
sundarshankar89 eb1d497
Merge branch 'main' into feature/switch-installer-integration
asnare 52d6613
Log why the configuration questionnaire won't happen.
asnare 11903f9
Logging tweaks.
asnare a62194d
Fix incorrect comment.
asnare 24ad1b3
Style tweak, mark a method as static.
asnare 4083dd6
Merge branch 'feature/switch-installer-integration' into feature/llm-…
asnare 59fe56e
Merge branch 'main' into feature/switch-installer-integration
gueniai ab3851e
Merge branch 'feature/switch-installer-integration' into feature/llm-…
gueniai 5a26f86
addressed review comments
sundarshankar89 efc3c64
Merge remote-tracking branch 'origin/feature/llm-transpile' into feat…
sundarshankar89 1a64612
addressed review comments
sundarshankar89 6ac2e68
addressed review comments
sundarshankar89 9799a23
addressed review comments
sundarshankar89 ddf7f1a
update latest switch plugin
sundarshankar89 0323ae0
Merge branch 'feature/switch-installer-integration' into feature/llm-…
sundarshankar89 cff62b1
removed duplicate
sundarshankar89 a5e06e3
Avoid raising DatabricksError ourselves.
asnare 3ce1ffd
Avoid raising DatabricksError ourselves.
asnare df7b15c
Merge branch 'feature/switch-installer-integration' into feature/llm-…
asnare a659992
Update llm-transpile CLI description and help text.
asnare 459d941
Foundational -> Foundation
asnare 1aab9bf
Improve logging if there as issue locating the Switch job.
asnare c9bfda0
Simplify logic for computing the path where sources to convert are up…
asnare 8ec00ca
Merge branch 'main' into feature/llm-transpile
asnare eefda07
Remove unnecessary exception wrapping.
asnare e0814c3
Reformat the legal disclaimer for LLM use, and adjust grammar.
asnare
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -22,3 +22,4 @@ remorph_transpile/ | |
| /linter/src/main/antlr4/library/gen/ | ||
| .databricks-login.json | ||
| .mypy_cache | ||
| .env | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -36,9 +36,11 @@ | |
| from databricks.labs.lakebridge.transpiler.lsp.lsp_engine import LSPEngine | ||
| from databricks.labs.lakebridge.transpiler.repository import TranspilerRepository | ||
| from databricks.labs.lakebridge.transpiler.sqlglot.sqlglot_engine import SqlglotEngine | ||
| from databricks.labs.lakebridge.transpiler.switch_runner import SwitchRunner | ||
| from databricks.labs.lakebridge.transpiler.transpile_engine import TranspileEngine | ||
|
|
||
| from databricks.labs.lakebridge.transpiler.transpile_status import ErrorSeverity | ||
| from databricks.labs.switch.lsp import get_switch_dialects | ||
|
|
||
|
|
||
| # Subclass to allow controlled access to protected methods. | ||
|
|
@@ -92,7 +94,7 @@ def _remove_warehouse(ws: WorkspaceClient, warehouse_id: str): | |
|
|
||
|
|
||
| @lakebridge.command | ||
| def transpile( # pylint: disable=too-many-arguments | ||
| def transpile( | ||
| *, | ||
| w: WorkspaceClient, | ||
| transpiler_config_path: str | None = None, | ||
|
|
@@ -240,6 +242,14 @@ def use_transpiler_config_path(self, transpiler_config_path: str | None) -> None | |
| ) | ||
| self._config = dataclasses.replace(self._config, transpiler_config_path=transpiler_config_path) | ||
|
|
||
| # Switch is installed inside "/Users/<>/.lakebridge/transpilers/Switch/lsp/config.yml" | ||
| if ( | ||
| self._config.transpiler_config_path is not None | ||
| and Path(self._config.transpiler_config_path).parent.parent.name == "Switch" | ||
| ): | ||
| msg = "Switch transpiler is not supported through `transpile`; run `llm-transpile` instead." | ||
| raise RuntimeError(msg) | ||
|
|
||
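Review note: the guard added to `use_transpiler_config_path` relies on the installed layout `.../transpilers/Switch/lsp/config.yml`, so the grandparent directory name of the config file identifies the transpiler. A minimal standalone sketch of that check follows. `is_switch_config` is a hypothetical helper name (it is not part of this PR), and `PurePosixPath` is used instead of `Path` only so that the example behaves the same on any operating system.

```python
from pathlib import PurePosixPath
from typing import Optional


def is_switch_config(transpiler_config_path: Optional[str]) -> bool:
    """Return True when the path looks like .../Switch/lsp/config.yml.

    Hypothetical helper mirroring the condition in the diff above.
    """
    if transpiler_config_path is None:
        return False
    # .parent        -> .../Switch/lsp
    # .parent.parent -> .../Switch
    return PurePosixPath(transpiler_config_path).parent.parent.name == "Switch"
```

The comparison is exact and case-sensitive, so in this sketch a directory named `switch` would not trigger the guard.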
| def use_source_dialect(self, source_dialect: str | None) -> None: | ||
| if source_dialect is not None: | ||
| # Defer validation: depends on the transpiler config path, we'll deal with this later. | ||
|
|
@@ -730,6 +740,7 @@ def install_transpile( | |
| w: WorkspaceClient, | ||
| artifact: str | None = None, | ||
| interactive: str | None = None, | ||
| include_llm_transpiler: bool = False, | ||
| transpiler_repository: TranspilerRepository = TranspilerRepository.user_home(), | ||
| ) -> None: | ||
| """Install or upgrade the Lakebridge transpilers.""" | ||
|
|
@@ -738,9 +749,15 @@ def install_transpile( | |
| ctx.add_user_agent_extra("cmd", "install-transpile") | ||
| if artifact: | ||
| ctx.add_user_agent_extra("artifact-overload", Path(artifact).name) | ||
| if include_llm_transpiler: | ||
| ctx.add_user_agent_extra("include-llm-transpiler", "true") | ||
| # Do not prompt when include_llm_transpiler is set: users are expected to run llm-transpile and pass all the arguments. | ||
| is_interactive = False | ||
| user = w.current_user | ||
| logger.debug(f"User: {user}") | ||
| transpile_installer = installer(w, transpiler_repository, is_interactive=is_interactive) | ||
| transpile_installer = installer( | ||
| w, transpiler_repository, is_interactive=is_interactive, include_llm=include_llm_transpiler | ||
| ) | ||
| transpile_installer.run(module="transpile", artifact=artifact) | ||
|
|
||
|
|
||
|
|
@@ -818,6 +835,133 @@ def analyze( | |
| logger.debug(f"User: {ctx.current_user}") | ||
|
|
||
|
|
||
| def _validate_llm_transpile_args( | ||
| input_source: str | None, | ||
| output_ws_folder: str | None, | ||
| source_dialect: str | None, | ||
| prompts: Prompts, | ||
| ) -> tuple[str, str, str]: | ||
|
|
||
| _switch_dialects = get_switch_dialects() | ||
|
|
||
| # Prompt for any argument that was not supplied | ||
| if not input_source: | ||
| input_source = prompts.question("Enter input SQL path") | ||
| if not output_ws_folder: | ||
| output_ws_folder = prompts.question("Enter output workspace folder (must start with /Workspace/)") | ||
| if not source_dialect: | ||
| source_dialect = prompts.choice("Select the source dialect", sorted(_switch_dialects)) | ||
|
|
||
| # Validate input_source path exists (local path) | ||
| if not Path(input_source).exists(): | ||
| raise_validation_exception(f"Invalid path for '--input-source': Path '{input_source}' does not exist.") | ||
|
|
||
| # Validate output_ws_folder is a workspace path | ||
| if not str(output_ws_folder).startswith("/Workspace/"): | ||
| raise_validation_exception( | ||
| f"Invalid value for '--output-ws-folder': workspace output path must start with /Workspace/. Got: {output_ws_folder!r}" | ||
| ) | ||
|
|
||
| if source_dialect not in _switch_dialects: | ||
| raise_validation_exception( | ||
| f"Invalid value for '--source-dialect': {source_dialect!r} must be one of: {', '.join(sorted(_switch_dialects))}" | ||
| ) | ||
|
|
||
| return input_source, output_ws_folder, source_dialect | ||
|
|
||
|
|
||
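Review note: with the interactive prompts stripped away, `_validate_llm_transpile_args` reduces to three checks: the input path exists locally, the output folder starts with `/Workspace/`, and the dialect is one of the supported ones. The sketch below restates those checks in isolation. `validate_args` and `EXAMPLE_DIALECTS` are hypothetical names for illustration, the dialect values are placeholders (the PR obtains the real list from `get_switch_dialects()`), and `ValueError` stands in for `raise_validation_exception`, whose behaviour is not shown in this diff.

```python
from pathlib import Path
from typing import FrozenSet, Tuple

# Placeholder dialect names, assumed for illustration only.
EXAMPLE_DIALECTS: FrozenSet[str] = frozenset({"dialect_a", "dialect_b"})


def validate_args(
    input_source: str,
    output_ws_folder: str,
    source_dialect: str,
    dialects: FrozenSet[str] = EXAMPLE_DIALECTS,
) -> Tuple[str, str, str]:
    """Apply the three checks from the diff and return the validated values."""
    # 1. The input must be an existing local path.
    if not Path(input_source).exists():
        raise ValueError(f"Path '{input_source}' does not exist.")
    # 2. The output must be a workspace path.
    if not output_ws_folder.startswith("/Workspace/"):
        raise ValueError(f"Output path must start with /Workspace/. Got: {output_ws_folder!r}")
    # 3. The dialect must be one of the supported ones.
    if source_dialect not in dialects:
        raise ValueError(f"{source_dialect!r} must be one of: {', '.join(sorted(dialects))}")
    return input_source, output_ws_folder, source_dialect
```

Note that `"/Workspace"` without the trailing slash is rejected by the prefix check, both here and in the diff.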
| @lakebridge.command | ||
| def llm_transpile( | ||
| *, | ||
| w: WorkspaceClient, | ||
| input_source: str | None = None, | ||
| output_ws_folder: str | None = None, | ||
| source_dialect: str | None = None, | ||
| catalog_name: str | None = None, | ||
| schema_name: str | None = None, | ||
| volume: str | None = None, | ||
| foundational_model: str | None = None, | ||
| ctx: ApplicationContext | None = None, | ||
| ) -> None: | ||
| """Transpile source code to Databricks using LLM Transpiler (Switch)""" | ||
| if ctx is None: | ||
| ctx = ApplicationContext(w) | ||
| del w | ||
| ctx.add_user_agent_extra("cmd", "llm-transpile") | ||
| user = ctx.current_user | ||
| logger.debug(f"User: {user}") | ||
|
|
||
| logger.info( | ||
| """Please read and accept the following comments before proceeding:\n | ||
| This Feature leverages a large language model (LLM) to analyse and convert your provided content, code and data.\n | ||
| You consent to your content being transmitted to, processed by, and returned from the LLM hosted by Databricks foundational models or other external models you may configure during the runtime.\n | ||
| The outputs of the LLM are generated automatically without human review, and may contain inaccuracies or errors. \n | ||
| You are responsible for reviewing and validating all outputs before relying on them for any critical or production use. \n | ||
| By running this feature you accept these conditions. | ||
| """ | ||
| ) | ||
|
|
||
| prompts = ctx.prompts | ||
| resource_configurator = ctx.resource_configurator | ||
|
|
||
| # If CLI args are missing, prompt for them; then validate all three values | ||
| input_source, output_ws_folder, source_dialect = _validate_llm_transpile_args( | ||
| input_source, | ||
| output_ws_folder, | ||
| source_dialect, | ||
| prompts, | ||
| ) | ||
|
|
||
| if catalog_name is None: | ||
| catalog_name = resource_configurator.prompt_for_catalog_setup(default_catalog_name="lakebridge") | ||
|
|
||
| if schema_name is None: | ||
| schema_name = resource_configurator.prompt_for_schema_setup(catalog=catalog_name, default_schema_name="switch") | ||
|
|
||
| if volume is None: | ||
| volume = resource_configurator.prompt_for_volume_setup( | ||
| catalog=catalog_name, schema=schema_name, default_volume_name="switch_volume" | ||
| ) | ||
|
|
||
| resource_configurator.has_necessary_access(catalog_name, schema_name, volume) | ||
|
|
||
| if foundational_model is None: | ||
| foundational_model = resource_configurator.prompt_for_foundation_model_choice() | ||
|
|
||
| job_list = ctx.install_state.jobs | ||
| if "Switch" not in job_list: | ||
| raise RuntimeError( | ||
| "Switch Job ID not found. " | ||
| "Please run 'databricks labs lakebridge install-transpile --include-llm-transpiler true' first." | ||
| ) | ||
| logger.debug("Switch job ID found in InstallState") | ||
| job_id = int(job_list["Switch"]) | ||
|
|
||
| try: | ||
| ctx.add_user_agent_extra("transpiler_source_dialect", source_dialect) | ||
| job_runner = SwitchRunner(ctx.workspace_client, ctx.installation) | ||
| volume_input_path = job_runner.upload_to_volume( | ||
| local_path=Path(input_source), | ||
| catalog=catalog_name, | ||
| schema=schema_name, | ||
| volume=volume, | ||
| ) | ||
|
|
||
| response = job_runner.run( | ||
| volume_input_path=volume_input_path, | ||
| output_ws_folder=output_ws_folder, | ||
| source_tech=source_dialect, | ||
| catalog=catalog_name, | ||
| schema=schema_name, | ||
| volume=volume, | ||
| foundational_model=foundational_model, | ||
| job_id=job_id, | ||
| ) | ||
| json.dump(response, sys.stdout, indent=2) | ||
| except Exception as ex: | ||
| raise RuntimeError(ex) from ex | ||
|
||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| app = lakebridge | ||
| logger = app.get_logger() | ||
|
|
||
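Review note: `llm_transpile` refuses to run unless the install state already contains a job entry named `Switch`, and it converts the stored value with `int(...)`, which suggests the install state keeps job IDs as strings. A minimal sketch of that lookup follows. `resolve_switch_job_id` is a hypothetical helper name, and a plain mapping stands in for `ctx.install_state.jobs`.

```python
from typing import Mapping


def resolve_switch_job_id(jobs: Mapping[str, str]) -> int:
    """Return the Switch job ID, or fail with the install hint used in the diff."""
    if "Switch" not in jobs:
        raise RuntimeError(
            "Switch Job ID not found. "
            "Please run 'databricks labs lakebridge install-transpile --include-llm-transpiler true' first."
        )
    # Assumed: IDs are stored as strings, hence the conversion.
    return int(jobs["Switch"])
```

In this sketch a non-numeric stored value would surface as a `ValueError` from `int(...)` rather than as the install hint.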