Releases: databrickslabs/lakebridge
v0.10.13
Analyzer
- Added defensive code to prevent analyzer crashes on DataStage files with empty array references - Fixes an issue where the DataStage analyzer would crash when encountering empty array references
Converters
Morpheus
General
- Enhanced name representation consistency - Major refactoring that replaces String representations with Expression types for table names, column names, and constraints across IR nodes, improving SQL/PySpark code generation accuracy
- Fixed DBT parsing issues - Resolved template parsing problems by changing template markers to `!#Jinja0001#!` format and improving whitespace handling for proper tokenization
TSQL (Synapse/SQL Server)
- Support for dual OUTPUT clauses in TSQL INSERT/DELETE/UPDATE statements - Enhanced T-SQL parser to handle complex statements with multiple OUTPUT clauses (OUTPUT ... INTO ... OUTPUT ...) with comprehensive test coverage
- Fixed TSQL DECLARE statement handling - Refactored DECLARE statement processing by moving logic to dedicated visitor methods and properly marking unsupported statements for future implementation
- Improved BLOCK structure parsing for BEGIN and BEGIN TRY statements - Updated parser grammar to support flexible scripting blocks and transaction handling, allowing zero or more statements in control flow constructs
- Added comprehensive USE statement support - Introduced new IR representations (UseCatalog, UseSchema) with dialect-specific AST building logic and proper SQL generation
Snowflake
- Fixed Snowflake connection tests - Internal improvements for database connection test reliability
- Added comprehensive USE statement support - Introduced new IR representations (UseCatalog, UseSchema) with dialect-specific AST building logic and proper SQL generation
BladeBridge
General
- Automatically creates and cleans up temporary folders for embedded SQL conversion in wrapper scripts - Improves workflow management by implicitly creating temp folders and cleaning them up once conversion is complete
MSSQL (SQL Server)
- Enhanced table variable and temporary table conversion - Added support for table variable conversion to temporary tables and improved string handling with logic to convert double single quotes to double quotes
- Fixed semicolon placement in nested select statements - Resolved issue where semicolons appeared before comments in nested select statements
- Improved MS SQL procedure handling - Added LIMIT 1 for Set in select statements, enhanced function mappings, fixed string concatenation, and removed unsupported constraints
Reconcile
No updates in this release.
Documentation
No updates in this release.
Contributors: @gueniai, @sundarshankar89
v0.10.12
Analyzer
- New installation verification command - Introduced a new command to verify successful installation of the Lakebridge Analyzer, displaying usage and available flags for report file paths, source directories, and source technologies
Converters
General
- Enhanced transpile command - Updated `transpile` command to support `--overrides-path` and `--target-technology` arguments for greater flexibility and customization
- Improved error handling - Enhanced handling of parsing errors during code transpilation to output transpiled code instead of original input, providing clearer outcomes when issues arise
- Refactored naming conventions - Renamed transpiler `product_name` to `transpiler_id` throughout the codebase for improved consistency and clarity
Morpheus
TSQL
- Enhanced TSQL support - Added support for DENY statements, EXEC statement syntax improvements, COLLATION in CREATE TABLE column definitions, and WINDOW clause functionality
- Improved ALTER DATABASE support - Enhanced support for all options on ALTER DATABASE SET statements and multiple LOG file specifications in ALTER DATABASE ADD LOG
- Better JOIN functionality - Added support for all join hints (MERGE, HASH, LOOP, REDUCE, REPLICATE, REDISTRIBUTE) in JOIN constructs
- Enhanced COPY INTO support - Fixed syntax for COPY INTO commands and added extended column definitions support in TSQL mode
- Improved DELETE operations - Added transformation rule to translate `IN` to `EXISTS` when needed in DELETE statement WHERE clauses
Snowflake
- COPY INTO improvements - Refactored and standardized grammar rules for COPY INTO commands, consolidating stage location handling
- UPDATE FROM enhancements - Added tests for UPDATE FROM statements to verify correct transpilation to MERGE INTO statements
General
- Enhanced permission handling - Added support for column-specific privileges and improved handling of column-specific permissions
- Improved parser functionality - Allowed SCHEMAS keyword to be used as identifier and clarified warning messages for unrecognized functions
BladeBridge
MSSQL
- Fixed update_to_merge functionality - Improved WITH clause handling and script variable ordering for MSSQL dialects
- Table variable support - Implemented table variable conversion support for MSSQL dialects
- DDL operation fixes - Fixed and removed unsupported DDL operations including alter index, switch partitions, and drop constraints
Informatica
- Power Center improvements - Fixed hanging issue on Linux for Informatica PC conversion by improving block_subst patterns and output flushing
- Dataframe implementation fixes - Fixed dataframe implementation for pulling data from flat file unconnected lookups in Informatica Power Center
DataStage
- TRUNCATE TABLE support - Added spark.sql_template to resolve TRUNCATE TABLE statement generation when TRUNCATE flag is enabled in DataStage
Reconcile
- Enhanced Databricks schema queries - Fixed Databricks schema query to improve accuracy and reliability of schema reconciliation, with better column name consistency and filtering
Documentation
- Updated CLI documentation - Refreshed documentation to reflect latest changes in Command Line Interface menus, including new commands and flags such as `transpile`, `reconcile`, and `install-transpile` subcommands
- Enhanced command documentation - Added detailed documentation for transpile command usage and flags, including optional flags for catalog name, error file path, and source dialect
- Updated installation guides - Modified installation documentation to include verification examples and updated help flags for new command options
Dependency updates:
- Updated cryptography requirement from <45.1.0,>=44.0.2 to >=44.0.2,<46.1.0 (#2028).
- Bump databrickslabs/sandbox/acceptance@acceptance/v0.4.2 from 0.4.2 to 0.4.4 (#1833).
Contributors: @asnare, @sundarshankar89, @m-abulazm, @dependabot[bot], @gueniai
v0.10.11
Analyzer
No updates in this release
Converters
General
- Fixed special character handling in filenames by introducing from_uri() helper function for safer URI handling
- Ensured SQL converter returns UTF-8 encoded files for proper character encoding
- Fixed filename to correctly output databricks_conversion_supplements.py supplemental file
- Fixed broken splitter URL by updating directory naming conventions from "Downloads" to "downloads"
- Improved handling of encoding-related errors by catching UnicodeDecodeError and LookupError exceptions during file processing, creating TranspileError with specific encoding-error codes instead of stopping
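The encoding-error behaviour described in the last item can be pictured with a small sketch. This is only an illustration under assumptions: the `TranspileError` dataclass below is a hypothetical stand-in, and the error-code strings are invented for the example rather than taken from Lakebridge.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple


@dataclass
class TranspileError:
    # Hypothetical stand-in for the real error type; fields are assumptions.
    code: str
    path: Path
    message: str


def read_source(path: Path, encoding: str = "utf-8") -> Tuple[Optional[str], Optional[TranspileError]]:
    """Return (text, None) on success, or (None, error) instead of raising."""
    try:
        return path.read_text(encoding=encoding), None
    except UnicodeDecodeError as exc:
        # The bytes on disk are not valid in the requested encoding.
        return None, TranspileError("unsupported-file-encoding", path, str(exc))
    except LookupError as exc:
        # The encoding name itself is unknown to Python's codec registry.
        return None, TranspileError("unknown-encoding", path, str(exc))
```

Returning the error as a value lets a batch run record the failure for one file and continue with the next, which matches the "instead of stopping" wording above.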
Morpheus
Snowflake
- Added support for TRUNCATE TABLE statements with proper IR and translation support
- Correctly support $IDENTITY and $ROWGUID system variables
- Refactored and extended grammar and AST support for SQL procedure creation with improved handling of raw string literals
- Enhanced schema reconciliation functionality to support Snowflake arrays, addressing the corner case where Databricks arrays are typed and Snowflake arrays are untyped
TSQL
- Added support for TRUNCATE TABLE statements with proper IR and translation support
- Support full CREATE and ALTER INDEX statements in TSQL parsing, rejecting INDEX CREATE/ALTER statements sensibly instead of raising syntax errors
- Fixed implementation of IF scripting blocks with improvements to SQL parser, grammar enhancements, and enhanced scripting grammar for more robust handling of block statements and conditional branches
- Allow CLUSTERED to be an identifier to improve CREATE TABLE syntax as a CONSTRAINT qualifier
- Support percentage expressions in TSQL options (e.g., OPT = 42%) instead of raising parsing errors
- Added support for REVOKE statements, similar to existing GRANT statement implementation
- Ensure that ROWS and OBJECTS can be used as identifiers even with Jinja templates
- Correctly support $IDENTITY and $ROWGUID system variables
General (Multiple Dialects)
- Support comments on column declarations when generating SQL and renamed legacy builders for consistency
- Refactored IR around CREATE FUNCTION and CREATE PROCEDURE, unifying all ways to create stored procedures under a single CreateStoredProcedure IR node and all ways to create user defined functions under a single CreateUDF IR node
- Implemented grammar and IR placeholders for named windows, introducing initial support for the SQL standard WINDOW clause in parser grammar
BladeBridge
Oracle
- Removed unsupported Oracle DDL constraints (add/create constraint unique) and extraneous TBLPROPERTIES from converted output
MSSQL
- Added handle_xml_nodes function for MS SQL processing
- Fixed multiple MSSQL issues including CTEs in views/stored procedures, ADD CONSTRAINT problems, DEFAULT value handling, and parameter data types
Synapse
- Fixed multiple Synapse issues including CTEs in views/stored procedures, ADD CONSTRAINT problems, DEFAULT value handling, parameter data types, error handling in stored procedures, and Synapse-specific features (e.g., table distribution)
Teradata
- Added Teradata function mappings including ZEROIFNULL, TEMPORAL_TIMESTAMP, TRYCAST, ANY, FIRST, NULLIFZERO, DECODE with different parameter counts, and HASHAMP
- Removed collect statistics and lock table statements
DataStage
- Implemented DataStage Checksum component translation to SparkSQL equivalent and fixed PySpark checksum translation to use MD5() instead of SHA2()
Reconcile
- Added handling for special characters in reconcile aggregate, enhancing the library to handle special characters in column names by properly delimiting identifiers in SQL queries
- Fixed deploy reconcile jobs by updating wheel file handling, simplifying deployment process to use single wheel path, and fixing broken documentation links
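As a rough illustration of the identifier delimiting mentioned in the first Reconcile item, the sketch below wraps names in backticks and doubles any embedded backtick, which is how Databricks SQL escapes delimited identifiers. The function names and the query shape are hypothetical and are not the reconcile library's API.

```python
def delimit_identifier(name: str, quote: str = "`") -> str:
    """Wrap an identifier in quote characters, doubling any embedded quote."""
    return f"{quote}{name.replace(quote, quote * 2)}{quote}"


def build_sum_query(table: str, column: str) -> str:
    """Build a simple aggregate query with delimited identifiers (illustrative only)."""
    return (
        f"SELECT SUM({delimit_identifier(column)}) AS agg "
        f"FROM {delimit_identifier(table)}"
    )
```

For example, `build_sum_query("sales", "order total$")` produces a query that refers to the column as `` `order total$` ``, so the space and the dollar sign no longer break parsing.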
Documentation
- Fixed download link in docs (reconcile automation) by replacing broken markdown link with JSX link utilizing useBaseUrl hook
General
- Implemented new describe-transpile CLI subcommand that describes installed transpilers, including their versions, configuration paths, and supported source dialects
- Switched from urllib to requests library for making HTTP calls to PyPI and Maven Central, with default 60-second timeout and improved error handling
- Work around DATABRICKS_HOST normalization issue during install and uninstall by introducing new Lakebridge subclass with appropriate workspace client
Dependency updates
- Bump Databricks SDK Version to 0.67.0 by @goodwillpunning in #2062
- Bump sigstore/gh-action-sigstore-python from 3.0.0 to 3.0.1 by @dependabot[bot] in #1753
Special thanks to @BrianDeacon for his contribution to fix #1858
Contributors: @asnare, @m-abulazm, @ihor-ki, @goodwillpunning, @sundarshankar89, @dependabot[bot]
v0.10.10
Analyzer
- Large XML file chunking optimization: Now the analyzer is able to handle large XML files (up to 1TB in size)
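The release note does not say how the chunking is implemented. As a general illustration of processing XML without loading the whole document, the sketch below uses the standard library's incremental parser and discards each element once it has been seen; the function name and the counting task are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def count_elements(source, tag: str) -> int:
    """Count elements named `tag` while streaming through the document."""
    count = 0
    # iterparse yields each element when its closing tag is read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
        # Drop the element's children, text, and attributes to keep memory flat.
        elem.clear()
    return count
```

`source` can be a file path or a binary file object, so the same function works on files of any size as long as individual elements fit in memory.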
Converters
General
- Non-interactive transpiler installation: Introduced support for non-interactive installation mode with new `interactive` option that can detect environment context, enabling automated installations without user input while preserving existing configurations. Resolves #2013
Morpheus
- Enhanced GRANT statement support: Implemented comprehensive GRANT statement support by creating dedicated `permission.g4` grammar file with IR definitions and translation rules for permission-related statements
- Improved error handling: Rewrote print function to properly handle newlines and added extensive unit tests for error annotation, including block and FIXME comments. Resolves #2030
- Enhanced LSP server behavior: Improved LSP server to append original text to error messages when transpilation fails, eliminating need for client-side response manipulation
- Standardized dialect options: Aligned dialect options to present `synapse` and `mssql` to users for consistency with bladebridge
- Fixed Lateral Column Alias handling: Enhanced dealiasing for Lateral Column Aliases (LCAs) in WHERE clauses under CASE...WHEN expressions. Resolves #1767
- Enhanced GROUP BY/aggregation function dealiasing: Implemented dealiasing for Lateral Column Aliases in GROUP BY clauses and aggregation functions where LCA references are unsupported. Resolves (#956) and (#954)
- Optimized Snowflake transformations: Reordered transformation rules to ensure `TransformWithinGroup` processes all cases before the call mapper. Resolves #1231
BladeBridge
- Enhanced merge statement handlers: Improved merge statement processing to fix backtick handling, update operations without WHERE clauses, procedure conversions, IF-THEN-SET blocks, and various delimiter and mapping issues
- Fixed view creation with WITH clauses: Corrected CREATE VIEW functionality to properly handle WITH clause statements
- Oracle script improvements: Resolved variable declaration issues in Oracle scripts containing exception handling blocks
- SQL Server function mapping: Added function mappings for Microsoft SQL Server functions including GETUTCDATE, IS_MEMBER, SERVERPROPERTY variants, and QUOTENAME with one or two arguments
Reconcile
- Improved logging for aggregate reconciliation: Enhanced logging functionality to provide more accurate messages by replacing warning logs with informational messages when aggregate details rules are empty, indicating successful reconciliation with no details to store. Resolves #2040
- Refactored aggregate query building: Simplified code using `AggregateQueryBuilder` class to generate queries for both source and target in a more concise and efficient manner
Documentation
No updates in this release
Dependency updates:
- Bump actions/setup-python from 5 to 6 (#1988).
Contributors: @m-abulazm, @asnare, @dependabot[bot], @sundarshankar89
v0.10.9
Analyzer
- Fixed bug where Analyzer would crash with large DDL files
- Adjusted calculation of complexity for TSQL queries to make it more accurate
Transpilers
Morpheus
- T-SQL Updates
  - Advanced Statement Support: Added parsing for `CREATE CERTIFICATE`, `CREATE LOGIN`, `PRINT` commands, and `EXECUTE AS LOGIN` statements
  - SET Command Enhancements: Support for complex assignment operators (`+=`, `-=`, `*=`, `/=`, `%=`, `&=`, `^=`, `|=`) commonly used in T-SQL scripts
  - CREATE EXTERNAL TABLE: Improved parsing with flexible syntax for external table definitions and location specifications
  - GRANT/REVOKE Statements: Comprehensive support for T-SQL security statements with clear Unity Catalog migration guidance
  - DROP Commands: Enhanced handling of `DROP SENSITIVITY` and other specialized DROP variants
  - Improved Error Reporting: SQL output now includes `FIXME` comments with detailed error messages for unsupported constructs
- Snowflake Updates
  - Analytics Functions: Full parsing support for `MATCH_RECOGNIZE` clause with pattern analysis capabilities for complex analytical queries
  - Time Travel Queries: Enhanced handling of `CHANGES`, `AT`, and `BEFORE` clauses for historical data access patterns
  - REGEXP_INSTR Function: Complete implementation supporting all 7 parameters (vs Databricks' 2), providing accurate behavioral translation
  - Table-Valued Functions: Support for parsing inline table-valued functions commonly used in Snowflake
  - GRANT/REVOKE Statements: Full support for Snowflake's complex privilege management syntax including roles and shares
  - DROP Commands: Enhanced parsing for `DROP SENSITIVITY` and related data governance statements
  - Improved Error Reporting: SQL output now includes `FIXME` comments with detailed error messages for unsupported constructs
Dependency updates:
- Bump actions/checkout from 4 to 5 (#1928).
- Bump actions/upload-pages-artifact from 3 to 4 (#1964).
- Bump mermaid from 11.6.0 to 11.10.1 in /docs/lakebridge (#1956).
Contributors: @dependabot[bot], @asnare, @m-abulazm
v0.10.8
Transpilers
General
- SQL Validation Enhancement: Improved SQL validator to check only SQL outputs with enhanced error handling and support for various transpile results (#1949)
- Error Handling Improvements: Added static error lookups for specific cases like unresolved routines and columns, with more readable exception messages
- MIME Support: New functionality to support both MIME and non-MIME transpile results, including validation and output file management
- LSP Server Integration: Log level now passed to Language Server Protocol (LSP) server via environment variable for greater flexibility (#1967)
- Transpiler Auto-Upgrade: Enhanced installer to automatically upgrade existing Lakebridge transpilers during CLI upgrade process (#1978)
- Source Dialect Handling: Fixed missing transpile source dialect handling to ensure correct assignment in configuration objects (#1985)
Morpheus
- Enhanced Snowflake Conversion support:
  - Support for parsing ILIKE, EXCLUDE, REPLACE, RENAME with * LHS
  - Full support for EXCLUDE and RENAME clauses and all combinations
  - Fixed REPLACE function with optional third argument
  - Enhanced OBJECT_DELETE to accept 2 or more arguments
  - Accurate translation of Snowflake's REGEXP_REPLACE
- Parser Improvements:
  - Allow lists of generic options with optional commas
  - EXTERNAL can now be used as an ID despite being documented as reserved
  - Support for DROP RULE syntax in TSQL
  - Allow DBT Jinja macros within JSON literals
  - Fixed bugs around DBT elseif and comment nodes
- Error Handling: Upgraded SimpleError with support status and simplified user-facing parse error messages
- Integration Alignment: Updated error handling to align with BladeBridge, now returning `UNRESOLVED_ROUTINE` errors consistently (#1998)
BladeBridge
- XML Source Processing:
  - Automatic detection of XML sources with proper encoding preservation
  - Maintains UTF-8 encoding while respecting XML-specific encoding declarations
  - Prevents XML parser failures from encoding mismatches
- SQL Scripting Enhancements:
  - Fixed nested comment handling in SQL scripts
  - Improved custom configuration handling for first-match processing
  - Removed unnecessary begin/end enclosures in pre/post SQL blocks
- Teradata Updates: Enhanced `convert_update_to_merge` functionality
- Oracle Updates:
  - Replaced list partitioning with `CLUSTER BY` statements
  - Removed unsupported `CREATE INDEX` and `ALTER INDEX` statements
  - Fixed `CREATE PROCEDURE` signature generation with proper exception handling
- DataStage Updates:
  - Added support for `TRUNCATE TABLE` specifications (#1903)
  - Fixed column name handling when dataframe columns match job parameters
  - Enabled single-pass processing of shared containers
  - Resolved dataset component path issues for proper PySpark code generation
Reconcile
- Schema Normalization: Added feature flag for identifier normalization with optional `normalize` parameter in `get_schema` method for flexible handling of different data source configurations (#1953)
Enhanced Connection Support
- Snowflake Security: Added support for encrypted PEM private keys with `pem_private_key_password` field for secure authentication (#1869)
- JDBC URL Handling: Improved JDBC URL arguments handling with enhanced error handling and logging
- Connection Properties: Enhanced SecretsMixin class with new `_get_secret_or_none` method for better secret value retrieval
- Error Handling: Introduced new exceptions like `InvalidSnowflakePemPrivateKey` for better error management
Documentation
Comprehensive Documentation Updates
- MS SQL and Synapse: Enhanced documentation for reconcile connections including default secret naming conventions and required connection properties (#1954)
- Connection Configuration: Added clear YAML format examples for MS SQL connection properties covering user, password, host, port, database, encryption, and trust server certificate
- BladeBridge Updates: Minor naming correction from "Microsoft MS SQL Server" to "Microsoft SQL Server" while maintaining support for Oracle, Teradata, Netezza, Informatica, and DataStage
- SQL Splitter: Updated documentation to remove RCT references, relocated to main menu with revised terminology using "Lakebridge" consistently (#1952)
- Transpiler Discovery: Updated documentation for pluggable transpiler discovery and execution, introducing Morpheus and BladeBridge as Databricks-provided transpilers
- Installation Process: Updated installation processes from Maven Central and PyPi with new directory structure for manual installations
General
Installation and Maintenance Improvements
- Automated Upgrades: Streamlined installation process with automatic transpiler upgrades during CLI upgrade, eliminating need for separate upgrade commands
- Plugin Management: Improved installation process for plugins like Bladebridge and Morpheus
- Testing Enhancement: Added comprehensive test functions to validate SQL file transpilation with various scenarios including table creation and error handling
Contributors: @m-abulazm, @asnare, @sundarshankar89, @goodwillpunning, @gueniai
v0.10.7
Analyzer
- Improved CLI argument handling and validation - The analyzer's execution process has been significantly enhanced to improve flexibility and user experience. The analyzer now accepts a folder path and source technology type as inputs, generating an Excel report with analysis results for all files and subfolders. The command-line interface has been updated with optional arguments for source directory, report file, and source technology, with interactive prompts to guide users through the analysis process. Enhanced validation includes checks for input folder existence and write access validation for output locations, with better handling of files from cloud-sync folders (#1901).
- Enhanced Informatica analyzer - Improvements have been made to the Informatica analyzer as part of broader project enhancements.
Transpilers
General
- Consistent terminology updates - Updated to use `mssql` and `synapse` consistently throughout the codebase and documentation. The ReconcileConfig class and ReconSourceType class have been updated to reflect consistent terminology, with supported data sources now including "mssql", "synapse", "snowflake", "teradata", "oracle", and "databricks" (#1950).
- Enhanced transpiler detection during installation - The installation process now detects existing transpilers and notifies users if upgrades are needed, providing appropriate commands and guidance. Enhanced logging and user agent configuration improve the overall installation experience (#1917).
- Fixed transpiler backup handling - Improved `install-transpile` process to handle cases where transpiler backups already exist, introducing a new context manager for preserving and restoring paths with better error handling and reliability (#1893).
Morpheus
- Fixed parsing issues with row access policies - Resolved parsing problems with row access policies containing dot-qualified names like `AMC_TEC.RAP_CONT_AREA` while maintaining proper error handling for unsupported Databricks SQL features.
- Added comprehensive test coverage for CREATE VIEW statements - Enhanced test coverage for CREATE VIEW statements with ROW ACCESS POLICY clauses to ensure proper validation and error handling.
- Fixed randomization function translation - Corrected translation between Snowflake's `RANDOM()` (64-bit integer) and Databricks' `RAND()` (double) with proper seed handling and deterministic behavior.
- Enhanced temporal format translation - Improved `TO_CHAR`/`TO_VARCHAR` functions with automatic format conversion and `TO_CHAR` as `TO_VARCHAR` synonym support.
BladeBridge
Oracle
- Outer Join Conversion: Disabled the call to the subroutine responsible for Oracle outer join conversion due to invalid UNION SELECT syntax in PySpark outputs.
- Procedure Call Handling: New logic was added for generating ETL procedure calls, aligning with Oracle transformation run controls.
Informatica PowerCenter
- PySpark Output Improvements:
  - Added handling for pre/post source/target stored procedure calls.
  - Removed the `pyspark_data_action` column from target writing.
  - Improved mapping script generation, now automatically generating mapplets alongside mappings. Mapplet implementation files are placed in a dedicated `shared_functions` subfolder, and mapping scripts incorporate correct import statements for mapplet dependencies.
  - The converter now returns supplemental files (e.g., DatabricksConversionSupplements.py).
- Notebook Header: When converting from DataStage or Informatica PowerCenter to PySpark or SparkSQL, the output now begins with `# Databricks notebook source` for compatibility with Databricks notebook import.
DataStage
- Square Brackets Conversion: Changed logic so SQL statements with square brackets are now replaced with backticks for Databricks compatibility.
- Notebook Header: The PySpark/SparkSQL output now starts with the standard Databricks notebook header for DataStage-to-notebook conversion.
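The square-bracket change can be illustrated with a deliberately naive sketch. It is not the converter's actual rule: it assumes every bracketed token is an identifier, so it would also rewrite brackets inside string literals or array subscripts, and the function name is made up for the example.

```python
import re

# Matches [name] where name contains no nested brackets.
_BRACKETED = re.compile(r"\[([^\[\]]+)\]")


def brackets_to_backticks(sql: str) -> str:
    """Rewrite [identifier] as `identifier` (naive: ignores string literals)."""
    return _BRACKETED.sub(r"`\1`", sql)
```

Applied to `SELECT [first name] FROM [dbo].[orders]`, this yields a statement that uses backtick-delimited names throughout.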
General SQL/Databricks Compatibility
- Table/Column Name Sanitization: Configuration has been added to replace unsupported Databricks characters (`,;{}()\n\t=`) in table and column names with a valid character, defaulting to underscore.
- DELETE Statement Conversion: Fixed an endless loop caused by previous DELETE conversion rules, and updated logic so DELETE operations are now properly converted to MERGE statements.
- UPDATE Statement Enhancements:
  - Updates without a FROM clause are identified and safely converted to MERGE statements.
  - Improved handler logic now marks fragments with FROM/MERGE clauses as examined, adding more programmatic safety checks versus relying on regex only.
  - Additional patterns added to accurately convert various updates into MERGE.
  - Enhanced support for nested IN clauses in WHERE conditions on UPDATE statements, converting them into joins and then merging where appropriate.
- Sub-Selects in MERGE: Fixed handling for sub-selects within MERGE statements (e.g., `EXISTS (select ...)`), with temporary removal of comments for those cases.
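The table and column name sanitization described in this list amounts to a character-class substitution. The sketch below shows the idea with the character set quoted in the note and the underscore default; the constant and function names are hypothetical, and the real feature is driven by configuration rather than a Python call.

```python
import re

# Characters listed in the release note as unsupported in names.
UNSUPPORTED_CHARS = ",;{}()\n\t="
_UNSUPPORTED = re.compile("[" + re.escape(UNSUPPORTED_CHARS) + "]")


def sanitize_name(name: str, replacement: str = "_") -> str:
    """Replace each unsupported character with `replacement`."""
    return _UNSUPPORTED.sub(replacement, name)
```

Each offending character is replaced one-for-one, so `amount(usd)` becomes `amount_usd_`; collapsing repeated underscores would be a separate design choice.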
Reconcile
- Terminology standardization - Updated Lakebridge Recon tool documentation to replace `remorph` with `lakebridge` throughout, including catalog and schema names, table creation, links, and references to notebooks. Configuration documentation updated to reflect the config file location requirement in the `.lakebridge` directory within the Databricks Workspace (#1876).
Documentation
- SQL Splitter utility documentation - Added comprehensive documentation for the SQL Splitter utility, which facilitates processing of large SQL files by splitting them into individual files (one object per file). The tool supports stored procedures, functions, tables, views, and Oracle packages, and is available as a downloadable ZIP file containing executables for Windows, Linux, and MacOS (#1926).
- Consistent terminology updates - Updated documentation to use "MS SQL Server (incl. Synapse)" instead of "SQL Server (incl. Synapse)" and replaced `TSQL` with "MSSQL" for consistency (#1950).
Contributors: @asnare, @m-abulazm, @sundarshankar89, @andresgarciaf, @simone-dbx-labs
Release v0.10.6
Analyzer
- Informatica Workflow Variable Collection: The Informatica analyzer now collects workflow variables, enhancing downstream conversion and mapping flows.
Converters improvements
Morpheus
- Expanded SQL parser for Snowflake supports full `IF...ELSEIF...ELSE` and `ELSE IF` constructs, recognizes `ELSEIF` as a keyword, and strengthens test coverage.
- Improvements in Snowflake `CREATE PROCEDURE` grammar, including: simplified syntax, handling of optional queries, result set variables, and better exception handling.
- Support for `TEMPORARY` as an interchangeable keyword for temporary objects in Snowflake parsing.
BladeBridge
- Enhanced SQL Scripting for Oracle Procedures: Multiple fixes for procedure conversion, including quoted identifiers, Japanese character support, misplaced/duplicated keywords, improved `SELECT INTO`, and more.
- Datastage PXPivot Conversion: Datastage’s vertical pivot (PXPivot) can now be converted to Databricks SQL, broadening ETL migration.
- Synapse and MS SQL Configuration Improvements
  - Enhanced fragment breaker for standalone SELECTs
  - Improved logic and ordering for variable declarations and set operators
  - Bugfixes for `PROC_FINISH`, WITH statement handling, and universal ETL+SQL testing
- Overrides-file Prompt Update: More descriptive and clear prompt for the ‘overrides-file’ option in CLI and documentation.
- Bug Fixes & Minor Enhancements
  - Datastage IF/THEN/ELSE and header row handling improvements
  - TRY/CATCH and improved `SELECT INTO #table` conversion
  - Better handling of set operations in SELECT/WITH
  - Standardized JSON configuration naming: now uses `base_<source>2databricks_<sql|sparksql|pyspark>.json`
  - DELETE-to-MERGE conversion, more tests, correct semicolon placement, and expanded handling of SQL scripting features
Documentation
- Significant Improvements:
- Expanded BladeBridge and overall configuration docs, with clear instructions for extending logic, using overrides, managing outputs, and troubleshooting.
- Updated guide on reconciling config and leveraging new CLI options.
General
- Security & Infrastructure Enhancements
  - Addressed CVE-2025-7339 (HTTP header manipulation vulnerability) by updating the `on-headers` dependency.
  - Refined handling of output folders, error files, and configuration management for reliability.
  - Improved reconcile dashboard deployment reliability: folders without a `dashboard.yml` are no longer deployed.
  - Suppressed spurious warnings on initial installation; only debug messages are now logged for clean setups.
  - Improved encoding handling and end-to-end test coverage for non-UTF-8 files and edge-case encodings.
Contributors: @asnare, @sundarshankar89, @gueniai, @vijaypavann-db, @bishwajit-db, @simone-dbx-labs
v0.10.5
Converters improvements
General
- XML Encoding Support: The `_process_one_file` function now detects and correctly handles XML files with internally-specified encoding (e.g., Windows-1252), ensuring successful parsing and conversion of non-UTF-8 files in transformation pipelines. [#1828]
- Test Enhancements: Updates to test cases (`test_transpiles_informatica_with_sparksql`, `test_transpiles_all_dbt_project_files`) were made to increase reliability and provide better logging. [#1828]
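To make the XML encoding support above concrete, here is a minimal sketch of reading the encoding named in an XML declaration and decoding with it. It is not the `_process_one_file` implementation: both helper names are hypothetical, and the approach only works for ASCII-compatible encodings, so UTF-16 input and byte-order marks are out of scope.

```python
import re
from typing import Optional

# Matches <?xml ... encoding="name" ...?> at the start of the data.
_XML_DECLARATION = re.compile(
    rb'<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)


def declared_xml_encoding(data: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, if there is one."""
    match = _XML_DECLARATION.match(data[:200].lstrip())
    return match.group(1).decode("ascii") if match else None


def decode_source(data: bytes, default: str = "utf-8") -> str:
    """Decode with the declared XML encoding, falling back to `default`."""
    return data.decode(declared_xml_encoding(data) or default)
```

A file that declares `encoding="windows-1252"` is then decoded with that codec instead of failing as invalid UTF-8, while files without a declaration keep the UTF-8 default.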
Morpheus transpiler
- Temporary and Transient Table Support Across Dialects:
- Enhanced Support for T-SQL `SET` Statement Options:
- Fix: CTEs in Subqueries:
- IR Refinement for `CREATE` Commands:
  - Introduces a new `CreateCommand` node to better mirror SQL grammar, consolidating and simplifying previous IR structures (e.g., removing `ReplaceTable` and `ReplaceTableAsSelect`)
- CREATE VIEW Implementation:
BladeBridge Transpiler
- UPDATE to MERGE Logic:
  - Conversion logic for `UPDATE...FROM` to `MERGE` implemented
  - Post-processing Improvements: `convert_update_to_merge` function now ensures statement termination by checking for trailing semicolons.
- Oracle Data Type Mapping Fixes:
  - `NUMBER` without precision now maps to `DECIMAL(38,18)` instead of `DECIMAL(10,0)`.
  - Corrects `Timestamp` mapping and converts `Char(length)` to `STRING`.
  - `SYSTIMESTAMP` is now translated to `CURRENT_TIMESTAMP()`
- Datastage SET VARIABLE Handling:
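The Oracle data type mapping fixes listed above can be summarised as a small lookup. The sketch below covers only the three mappings the note spells out and passes everything else through unchanged; the function names are hypothetical, and BladeBridge itself applies such rules through its configuration files rather than Python code like this.

```python
import re


def map_oracle_type(oracle_type: str) -> str:
    """Map an Oracle column type to a Databricks type (three rules only)."""
    normalized = oracle_type.strip().upper()
    if normalized == "NUMBER":
        # NUMBER without precision: DECIMAL(38,18) rather than DECIMAL(10,0).
        return "DECIMAL(38,18)"
    if re.fullmatch(r"CHAR\(\d+\)", normalized):
        return "STRING"
    return oracle_type  # anything else is left untouched in this sketch


def map_oracle_expression(expression: str) -> str:
    """Rewrite SYSTIMESTAMP to CURRENT_TIMESTAMP() in an expression."""
    return re.sub(r"\bSYSTIMESTAMP\b", "CURRENT_TIMESTAMP()", expression, flags=re.IGNORECASE)
```

The word-boundary match keeps identifiers such as `MY_SYSTIMESTAMP_COL` intact, since the underscore counts as a word character.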
Reconcile Improvements
- Use of Existing Warehouse During Configure-Reconcile:
  - The reconcile configuration now checks for an existing `warehouse_id` in the user's Databricks config.
  - If present, it uses the existing SQL warehouse (with `CAN_USE` permission) instead of creating a new one.
  - Logs warehouse details and defers deletion for reusability. [#1825]
Documentation updates
- Databricks Auth Profiles and `--profile` Option:
  - Users can now specify which Databricks workspace to use with the `--profile` flag during installation.
  - Adds command to list available profiles. [#1813]
- Export Instructions for Microsoft SQL Server and Azure Synapse:
  - Step-by-step guides added for extracting view, table, and procedure DDLs using:
    - SQL Server Management Studio (SSMS),
    - Azure Synapse Studio,
    - PowerShell via `Export-AzSynapseSqlScript` for Synapse Serverless.
  - Screenshots and Microsoft documentation links included. [#1812]
Dependency Updates:
- Updated `databricks-labs-blueprint` version.
- Added `pytest-timeout` for improved test reliability. [[#1828]](https://github.com/databrickslabs/lakebridge/issues/1828)
Contributors: @eri-adepoju, @sundarshankar89, @asnare, @biswadeepupadhyay-db
v0.10.4
- Added Source Tech Override for Analyzer (#1806). The Analyzer command has been enhanced with a `source-tech` flag, allowing users to specify the Source System Technology to analyze directly in the command line call.
- Patch user agent for Infa (#1807). Improved user agent handling for dialects with spaces and added Informatica PC support.
Contributors: @sundarshankar89, @asnare