28 Oct 04:02

gueniai

51b2a05

v0.10.13 Latest

Analyzer

Added defensive code to prevent analyzer crashes on DataStage files with empty array references - The DataStage analyzer no longer crashes when it encounters an empty array reference

Converters

Morpheus

General

Enhanced name representation consistency - Major refactoring that replaces String representations with Expression types for table names, column names, and constraints across IR nodes, improving SQL/PySpark code generation accuracy
Fixed DBT parsing issues - Resolved template parsing problems by changing template markers to !#Jinja0001#! format and improving whitespace handling for proper tokenization
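
To illustrate the marker format mentioned above, here is a minimal sketch of masking Jinja fragments before SQL tokenization and restoring them afterwards. Only the !#Jinja0001#! marker shape comes from these notes; the function names, the regular expression, and the sequential numbering are assumptions for illustration and are not Morpheus internals.

```python
import re
from typing import Dict, Tuple

# Assumed pattern: Jinja expressions {{ ... }}, statements {% ... %}, comments {# ... #}.
JINJA_PATTERN = re.compile(r"\{\{.*?\}\}|\{%.*?%\}|\{#.*?#\}", re.DOTALL)


def mask_templates(sql: str) -> Tuple[str, Dict[str, str]]:
    """Replace each Jinja fragment with a numbered marker such as !#Jinja0001#!."""
    originals: Dict[str, str] = {}

    def substitute(match: re.Match) -> str:
        # Hypothetical numbering scheme: markers are numbered in order of appearance.
        marker = f"!#Jinja{len(originals) + 1:04d}#!"
        originals[marker] = match.group(0)
        return marker

    return JINJA_PATTERN.sub(substitute, sql), originals


def unmask_templates(sql: str, originals: Dict[str, str]) -> str:
    """Put the original Jinja fragments back in place of their markers."""
    for marker, fragment in originals.items():
        sql = sql.replace(marker, fragment)
    return sql
```

A marker of this shape contains no whitespace or quotes, so a SQL lexer can treat it as a single token.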

TSQL (Synapse/SQL Server)

Support for dual OUTPUT clauses in TSQL INSERT/DELETE/UPDATE statements - Enhanced T-SQL parser to handle complex statements with multiple OUTPUT clauses (OUTPUT ... INTO ... OUTPUT ...) with comprehensive test coverage
Fixed TSQL DECLARE statement handling - Refactored DECLARE statement processing by moving logic to dedicated visitor methods and properly marking unsupported statements for future implementation
Improved BLOCK structure parsing for BEGIN and BEGIN TRY statements - Updated parser grammar to support flexible scripting blocks and transaction handling, allowing zero or more statements in control flow constructs
Added comprehensive USE statement support - Introduced new IR representations (UseCatalog, UseSchema) with dialect-specific AST building logic and proper SQL generation

Snowflake

Fixed Snowflake connection tests - Internal improvements for database connection test reliability
Added comprehensive USE statement support - Introduced new IR representations (UseCatalog, UseSchema) with dialect-specific AST building logic and proper SQL generation

BladeBridge

General

Automatic temporary folders for embedded SQL conversion in wrapper scripts - Temporary folders are now created implicitly and cleaned up once conversion is complete, simplifying workflow management

MSSQL (SQL Server)

Enhanced table variable and temporary table conversion - Added support for table variable conversion to temporary tables and improved string handling with logic to convert double single quotes to double quotes
Fixed semicolon placement in nested select statements - Resolved issue where semicolons appeared before comments in nested select statements
Improved MS SQL procedure handling - Added LIMIT 1 for SET in SELECT statements, enhanced function mappings, fixed string concatenation, and removed unsupported constraints

Reconcile

No updates in this release.

Documentation

No updates in this release.

Contributors: @gueniai, @sundarshankar89

16 Oct 21:11

gueniai

v0.10.12

2bf8292

v0.10.12

Analyzer

New installation verification command - Introduced a new command to verify successful installation of the Lakebridge Analyzer, displaying usage and available flags for report file paths, source directories, and source technologies

Converters

General

Enhanced transpile command - Updated transpile command to support --overrides-path and --target-technology arguments for greater flexibility and customization
Improved error handling - Enhanced handling of parsing errors during code transpilation to output transpiled code instead of original input, providing clearer outcomes when issues arise
Refactored naming conventions - Renamed transpiler product_name to transpiler_id throughout the codebase for improved consistency and clarity

Morpheus

TSQL

Enhanced TSQL support - Added support for DENY statements, EXEC statement syntax improvements, COLLATION in CREATE TABLE column definitions, and WINDOW clause functionality
Improved ALTER DATABASE support - Enhanced support for all options on ALTER DATABASE SET statements and multiple LOG file specifications in ALTER DATABASE ADD LOG
Better JOIN functionality - Added support for all join hints (MERGE, HASH, LOOP, REDUCE, REPLICATE, REDISTRIBUTE) in JOIN constructs
Enhanced COPY INTO support - Fixed syntax for COPY INTO commands and added extended column definitions support in TSQL mode
Improved DELETE operations - Added transformation rule to translate IN to EXISTS when needed in DELETE statement WHERE clauses

Snowflake

COPY INTO improvements - Refactored and standardized grammar rules for COPY INTO commands, consolidating stage location handling
UPDATE FROM enhancements - Added tests for UPDATE FROM statements to verify correct transpilation to MERGE INTO statements

General

Enhanced permission handling - Added support for column-specific privileges and improved how they are handled
Improved parser functionality - Allowed SCHEMAS keyword to be used as identifier and clarified warning messages for unrecognized functions

BladeBridge

MSSQL

Fixed update_to_merge functionality - Improved WITH clause handling and script variable ordering for MSSQL dialects
Table variable support - Implemented table variable conversion support for MSSQL dialects
DDL operation fixes - Fixed and removed unsupported DDL operations including alter index, switch partitions, and drop constraints

Informatica

PowerCenter improvements - Fixed a hang on Linux during Informatica PowerCenter conversion by improving block_subst patterns and output flushing
Dataframe implementation fixes - Fixed dataframe implementation for pulling data from flat file unconnected lookups in Informatica PowerCenter

DataStage

TRUNCATE TABLE support - Added spark.sql_template to resolve TRUNCATE TABLE statement generation when TRUNCATE flag is enabled in DataStage

Reconcile

Enhanced Databricks schema queries - Fixed Databricks schema query to improve accuracy and reliability of schema reconciliation, with better column name consistency and filtering

Documentation

Updated CLI documentation - Refreshed documentation to reflect latest changes in Command Line Interface menus, including new commands and flags such as transpile, reconcile, and install-transpile subcommands
Enhanced command documentation - Added detailed documentation for transpile command usage and flags, including optional flags for catalog name, error file path, and source dialect
Updated installation guides - Modified installation documentation to include verification examples and updated help flags for new command options

Dependency updates:

Updated cryptography requirement from <45.1.0,>=44.0.2 to >=44.0.2,<46.1.0 (#2028).
Bump databrickslabs/sandbox/acceptance@acceptance/v0.4.2 from 0.4.2 to 0.4.4 (#1833).

Contributors: @asnare, @sundarshankar89, @m-abulazm, @dependabot[bot], @gueniai

03 Oct 21:53

gueniai

v0.10.11

b8259a7

v0.10.11

Analyzer

No updates in this release

Converters

General

Fixed special character handling in filenames by introducing from_uri() helper function for safer URI handling
Ensured SQL converter returns UTF-8 encoded files for proper character encoding
Fixed the filename of the supplemental output file so that databricks_conversion_supplements.py is emitted correctly
Fixed broken splitter URL by updating directory naming conventions from "Downloads" to "downloads"
Improved handling of encoding-related errors by catching UnicodeDecodeError and LookupError exceptions during file processing, creating TranspileError with specific encoding-error codes instead of stopping
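
The encoding-error handling described in the last item can be sketched roughly as follows. This is a minimal illustration, not the Lakebridge implementation: the TranspileError fields, the error-code strings, and the decode_source helper are hypothetical. Only the two exception types and the idea of recording an error instead of stopping come from the notes above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TranspileError:
    # Hypothetical stand-in for the real class; the field names are assumptions.
    code: str
    message: str


def decode_source(raw: bytes, encoding: str = "utf-8") -> Tuple[Optional[str], Optional[TranspileError]]:
    """Decode file content, turning encoding problems into an error record."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as e:
        # The bytes are not valid in the requested encoding.
        return None, TranspileError("decode-failed", str(e))
    except LookupError as e:
        # The encoding name itself is unknown to Python's codec registry.
        return None, TranspileError("unknown-encoding", str(e))
```

Returning the error alongside the result lets a caller continue with the remaining files and report all encoding problems at the end.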

Morpheus

Snowflake

Added support for TRUNCATE TABLE statements with proper IR and translation support
Correctly support $IDENTITY and $ROWGUID system variables
Refactored and extended grammar and AST support for SQL procedure creation with improved handling of raw string literals
Enhanced schema reconciliation functionality to support Snowflake arrays, addressing the corner case where Databricks arrays are typed and Snowflake arrays are untyped

TSQL

Added support for TRUNCATE TABLE statements with proper IR and translation support
Parse full CREATE INDEX and ALTER INDEX statements in TSQL and reject them cleanly as unsupported instead of raising syntax errors
Fixed the implementation of IF scripting blocks, with parser and scripting-grammar improvements for more robust handling of block statements and conditional branches
Allow CLUSTERED to be used as an identifier, and improve CREATE TABLE syntax where CLUSTERED appears as a CONSTRAINT qualifier
Support percentage expressions in TSQL options (e.g., OPT = 42%) instead of raising parsing errors
Added support for REVOKE statements, similar to existing GRANT statement implementation
Ensure that ROWS and OBJECTS can be used as identifiers even with Jinja templates
Correctly support $IDENTITY and $ROWGUID system variables

General (Multiple Dialects)

Support comments on column declarations when generating SQL and renamed legacy builders for consistency
Refactored IR around CREATE FUNCTION and CREATE PROCEDURE, unifying all ways to create stored procedures under a single CreateStoredProcedure IR node and all ways to create user defined functions under a single CreateUDF IR node
Implemented grammar and IR placeholders for named windows, introducing initial support for the SQL standard WINDOW clause in parser grammar

BladeBridge

Oracle

Removed unsupported Oracle DDL constraints (add/create constraint unique) and extraneous TBLPROPERTIES from converted output

MSSQL

Added handle_xml_nodes function for MS SQL processing
Fixed multiple MSSQL issues including CTEs in views/stored procedures, ADD CONSTRAINT problems, DEFAULT value handling, and parameter data types

Synapse

Fixed multiple Synapse issues including CTEs in views/stored procedures, ADD CONSTRAINT problems, DEFAULT value handling, parameter data types, error handling in stored procedures, and Synapse-specific features (e.g., table distribution)

Teradata

Added Teradata function mappings including ZEROIFNULL, TEMPORAL_TIMESTAMP, TRYCAST, ANY, FIRST, NULLIFZERO, DECODE with different parameter counts, and HASHAMP
Removed collect statistics and lock table statements

DataStage

Implemented DataStage Checksum component translation to SparkSQL equivalent and fixed PySpark checksum translation to use MD5() instead of SHA2()

Reconcile

Added handling for special characters in column names during aggregate reconciliation by properly delimiting identifiers in SQL queries
Fixed deploy reconcile jobs by updating wheel file handling, simplifying deployment process to use single wheel path, and fixing broken documentation links

Documentation

Fixed download link in docs (reconcile automation) by replacing broken markdown link with JSX link utilizing useBaseUrl hook

General

Implemented new describe-transpile CLI subcommand that describes installed transpilers, including their versions, configuration paths, and supported source dialects
Switched from urllib to requests library for making HTTP calls to PyPI and Maven Central, with default 60-second timeout and improved error handling
Work around DATABRICKS_HOST normalization issue during install and uninstall by introducing new Lakebridge subclass with appropriate workspace client

Dependency updates

Bump Databricks SDK Version to 0.67.0 by @goodwillpunning in #2062
Bump sigstore/gh-action-sigstore-python from 3.0.0 to 3.0.1 by @dependabot[bot] in #1753

Special thanks to @BrianDeacon for his contribution to fix #1858

Contributors: @asnare, @m-abulazm, @ihor-ki, @goodwillpunning, @sundarshankar89, @dependabot[bot]

25 Sep 03:56

gueniai

v0.10.10

3678301

v0.10.10

Analyzer

Large XML file chunking optimization: The analyzer can now handle large XML files (up to 1 TB in size)

Converters

General

Non-interactive transpiler installation: Introduced support for non-interactive installation mode with new interactive option that can detect environment context, enabling automated installations without user input while preserving existing configurations. Resolves #2013

Morpheus

Enhanced GRANT statement support: Implemented comprehensive GRANT statement support by creating dedicated permission.g4 grammar file with IR definitions and translation rules for permission-related statements
Improved error handling: Rewrote print function to properly handle newlines and added extensive unit tests for error annotation, including block and FIXME comments. Resolves #2030
Enhanced LSP server behavior: Improved LSP server to append original text to error messages when transpilation fails, eliminating need for client-side response manipulation
Standardized dialect options: Aligned dialect options to present synapse and mssql to users for consistency with bladebridge
Fixed Lateral Column Alias handling: Enhanced dealiasing for Lateral Column Aliases (LCAs) in WHERE clauses under CASE...WHEN expressions. Resolves #1767
Enhanced GROUP BY/aggregation function dealiasing: Implemented dealiasing for Lateral Column Aliases in GROUP BY clauses and aggregation functions where LCA references are unsupported. Resolves #956 and #954
Optimized Snowflake transformations: Reordered transformation rules to ensure TransformWithinGroup processes all cases before the call mapper. Resolves #1231

BladeBridge

Enhanced merge statement handlers: Improved merge statement processing to fix backtick handling, update operations without WHERE clauses, procedure conversions, IF-THEN-SET blocks, and various delimiter and mapping issues
Fixed view creation with WITH clauses: Corrected CREATE VIEW functionality to properly handle WITH clause statements
Oracle script improvements: Resolved variable declaration issues in Oracle scripts containing exception handling blocks
SQL Server function mappings: Added function mappings for Microsoft SQL Server functions including GETUTCDATE, IS_MEMBER, SERVERPROPERTY variants, and QUOTENAME with one or two arguments

Reconcile

Improved logging for aggregate reconciliation: Enhanced logging functionality to provide more accurate messages by replacing warning logs with informational messages when aggregate details rules are empty, indicating successful reconciliation with no details to store. Resolves #2040
Refactored aggregate query building: Simplified code using AggregateQueryBuilder class to generate queries for both source and target in a more concise and efficient manner

Documentation

No updates in this release

Dependency updates:

Bump actions/setup-python from 5 to 6 (#1988).

Contributors: @m-abulazm, @asnare, @dependabot[bot], @sundarshankar89

12 Sep 20:58

gueniai

v0.10.9

9332501

v0.10.9

Analyzer

Fixed bug where Analyzer would crash with large DDL files
Adjusted calculation of complexity for TSQL queries to make it more accurate

Transpilers

Morpheus

T-SQL Updates
- Advanced Statement Support: Added parsing for CREATE CERTIFICATE, CREATE LOGIN, PRINT commands, and EXECUTE AS LOGIN statements
- SET Command Enhancements: Support for complex assignment operators (+=, -=, *=, /=, %=, &=, ^=, |=) commonly used in T-SQL scripts
- CREATE EXTERNAL TABLE: Improved parsing with flexible syntax for external table definitions and location specifications
- GRANT/REVOKE Statements: Comprehensive support for T-SQL security statements with clear Unity Catalog migration guidance
- DROP Commands: Enhanced handling of DROP SENSITIVITY and other specialized DROP variants
- Improved Error Reporting: SQL output now includes FIXME comments with detailed error messages for unsupported constructs
Snowflake Updates
- Analytics Functions: Full parsing support for MATCH_RECOGNIZE clause with pattern analysis capabilities for complex analytical queries
- Time Travel Queries: Enhanced handling of CHANGES, AT, and BEFORE clauses for historical data access patterns
- REGEXP_INSTR Function: Complete implementation supporting all 7 parameters (vs Databricks' 2), providing accurate behavioral translation
- Table-Valued Functions: Support for parsing inline table-valued functions commonly used in Snowflake
- GRANT/REVOKE Statements: Full support for Snowflake's complex privilege management syntax including roles and shares
- DROP Commands: Enhanced parsing for DROP SENSITIVITY and related data governance statements
- Improved Error Reporting: SQL output now includes FIXME comments with detailed error messages for unsupported constructs

Dependency updates:

Bump actions/checkout from 4 to 5 (#1928).
Bump actions/upload-pages-artifact from 3 to 4 (#1964).
Bump mermaid from 11.6.0 to 11.10.1 in /docs/lakebridge (#1956).

Contributors: @dependabot[bot], @asnare, @m-abulazm

09 Sep 03:54

gueniai

v0.10.8

4c377cd

v0.10.8

Transpilers

General

SQL Validation Enhancement: Improved SQL validator to check only SQL outputs with enhanced error handling and support for various transpile results (#1949)
Error Handling Improvements: Added static error lookups for specific cases like unresolved routines and columns, with more readable exception messages
MIME Support: New functionality to support both MIME and non-MIME transpile results, including validation and output file management
LSP Server Integration: Log level now passed to Language Server Protocol (LSP) server via environment variable for greater flexibility (#1967)
Transpiler Auto-Upgrade: Enhanced installer to automatically upgrade existing Lakebridge transpilers during CLI upgrade process (#1978)
Source Dialect Handling: Fixed missing transpile source dialect handling to ensure correct assignment in configuration objects (#1985)

Morpheus

Enhanced Snowflake Conversion support:
- Support for parsing ILIKE, EXCLUDE, REPLACE, RENAME with * LHS
- Full support for EXCLUDE and RENAME clauses and all combinations
- Fixed REPLACE function with optional third argument
- Enhanced OBJECT_DELETE to accept 2 or more arguments
- Accurate translation of Snowflake's REGEXP_REPLACE
Parser Improvements:
- Allow lists of generic options with optional commas
- EXTERNAL can now be used as an ID despite being documented as reserved
- Support for DROP RULE syntax in TSQL
- Allow DBT Jinja macros within JSON literals
- Fixed bugs around DBT elseif and comment nodes
Error Handling: Upgraded SimpleError with support status and simplified user-facing parse error messages
Integration Alignment: Updated error handling to align with BladeBridge, now returning UNRESOLVED_ROUTINE errors consistently (#1998)

BladeBridge

XML Source Processing:
- Automatic detection of XML sources with proper encoding preservation
- Maintains UTF-8 encoding while respecting XML-specific encoding declarations
- Prevents XML parser failures from encoding mismatches
SQL Scripting Enhancements:
- Fixed nested comment handling in SQL scripts
- Improved custom configuration handling for first-match processing
- Removed unnecessary begin/end enclosures in pre/post SQL blocks
Teradata Updates: Enhanced convert_update_to_merge functionality
Oracle Updates:
- Replaced list partitioning with CLUSTER BY statements
- Removed unsupported CREATE INDEX and ALTER INDEX statements
- Fixed CREATE PROCEDURE signature generation with proper exception handling
DataStage Updates:
- Added support for TRUNCATE TABLE specifications (#1903)
- Fixed column name handling when dataframe columns match job parameters
- Enabled single-pass processing of shared containers
- Resolved dataset component path issues for proper PySpark code generation

Reconcile

Schema Normalization: Added feature flag for identifier normalization with optional normalize parameter in get_schema method for flexible handling of different data source configurations (#1953)

Enhanced Connection Support

Snowflake Security: Added support for encrypted PEM private keys with pem_private_key_password field for secure authentication (#1869)
JDBC URL Handling: Improved JDBC URL arguments handling with enhanced error handling and logging
Connection Properties: Enhanced SecretsMixin class with new _get_secret_or_none method for better secret value retrieval
Error Handling: Introduced new exceptions like InvalidSnowflakePemPrivateKey for better error management

Documentation

Comprehensive Documentation Updates

MS SQL and Synapse: Enhanced documentation for reconcile connections including default secret naming conventions and required connection properties (#1954)
Connection Configuration: Added clear YAML format examples for MS SQL connection properties covering user, password, host, port, database, encryption, and trust server certificate
BladeBridge Updates: Minor naming correction from "Microsoft MS SQL Server" to "Microsoft SQL Server" while maintaining support for Oracle, Teradata, Netezza, Informatica, and DataStage
SQL Splitter: Updated documentation to remove RCT references, relocated to main menu with revised terminology using "Lakebridge" consistently (#1952)
Transpiler Discovery: Updated documentation for pluggable transpiler discovery and execution, introducing Morpheus and BladeBridge as Databricks-provided transpilers
Installation Process: Updated installation processes from Maven Central and PyPI with new directory structure for manual installations

General

Installation and Maintenance Improvements

Automated Upgrades: Streamlined installation process with automatic transpiler upgrades during CLI upgrade, eliminating need for separate upgrade commands
Plugin Management: Improved installation process for plugins like BladeBridge and Morpheus
Testing Enhancement: Added comprehensive test functions to validate SQL file transpilation with various scenarios including table creation and error handling

Contributors: @m-abulazm, @asnare, @sundarshankar89, @goodwillpunning, @gueniai

21 Aug 19:02

gueniai

v0.10.7

9ea1d2c

v0.10.7

Analyzer

Improved CLI argument handling and validation - The analyzer's execution process has been significantly enhanced to improve flexibility and user experience. The analyzer now accepts a folder path and source technology type as inputs, generating an Excel report with analysis results for all files and subfolders. The command-line interface has been updated with optional arguments for source directory, report file, and source technology, with interactive prompts to guide users through the analysis process. Enhanced validation includes checks for input folder existence and write access validation for output locations, with better handling of files from cloud-sync folders (#1901).
Enhanced Informatica analyzer - Improvements have been made to the Informatica analyzer as part of broader project enhancements.

Transpilers

General

Consistent terminology updates - Updated to use mssql and synapse consistently throughout the codebase and documentation. The ReconcileConfig class and ReconSourceType class have been updated to reflect consistent terminology, with supported data sources now including "mssql", "synapse", "snowflake", "teradata", "oracle", and "databricks" (#1950).
Enhanced transpiler detection during installation - The installation process now detects existing transpilers and notifies users if upgrades are needed, providing appropriate commands and guidance. Enhanced logging and user agent configuration improve the overall installation experience (#1917).
Fixed transpiler backup handling - Improved install-transpile process to handle cases where transpiler backups already exist, introducing a new context manager for preserving and restoring paths with better error handling and reliability (#1893).

Morpheus

Fixed parsing issues with row access policies - Resolved parsing problems with row access policies containing dot-qualified names like AMC_TEC.RAP_CONT_AREA while maintaining proper error handling for unsupported Databricks SQL features.
Added comprehensive test coverage for CREATE VIEW statements - Enhanced test coverage for CREATE VIEW statements with ROW ACCESS POLICY clauses to ensure proper validation and error handling.
Fixed randomization function translation - Corrected translation between Snowflake's RANDOM() (64-bit integer) and Databricks' RAND() (double) with proper seed handling and deterministic behavior.
Enhanced temporal format translation - Improved TO_CHAR/TO_VARCHAR functions with automatic format conversion and TO_CHAR as TO_VARCHAR synonym support.

BladeBridge

Oracle

Outer Join Conversion: Disabled the call to the subroutine responsible for Oracle outer join conversion due to invalid UNION SELECT syntax in PySpark outputs.
Procedure Call Handling: New logic was added for generating ETL procedure calls, aligning with Oracle transformation run controls.

Informatica PowerCenter

PySpark Output Improvements:
- Added handling for pre/post source/target stored procedure calls.
- Removed the pyspark_data_action column from target writing.
- Improved mapping script generation, now automatically generating mapplets alongside mappings. Mapplet implementation files are placed in a dedicated shared_functions subfolder, and mapping scripts incorporate correct import statements for mapplet dependencies.
- The converter now returns supplemental files (e.g., DatabricksConversionSupplements.py).
Notebook Header: When converting from DataStage or Informatica PowerCenter to PySpark or SparkSQL, the output now begins with # Databricks notebook source for compatibility with Databricks notebook import.
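
As a small illustration of the notebook header item above, the sketch below prepends the header line to generated code. The helper name is hypothetical and the converter's own logic is not shown in these notes; only the header text itself is taken from them.

```python
# Header line quoted in the release note; Databricks uses it to recognize
# a Python source file as a notebook on import.
NOTEBOOK_HEADER = "# Databricks notebook source"


def with_notebook_header(code: str) -> str:
    """Return the code with the notebook header as its first line (idempotent)."""
    if code.startswith(NOTEBOOK_HEADER):
        return code
    return f"{NOTEBOOK_HEADER}\n{code}"
```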

DataStage

Square Brackets Conversion: Changed logic so SQL statements with square brackets are now replaced with backticks for Databricks compatibility.
Notebook Header: The PySpark/SparkSQL output now starts with the standard Databricks notebook header for DataStage-to-notebook conversion.
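
The square-bracket conversion can be pictured with the following sketch. It is a simplified, regex-only assumption of the behavior (the function name is hypothetical) and it does not distinguish brackets inside string literals or comments, which a real converter would need to consider.

```python
import re

# Matches a bracket-delimited identifier such as [dbo] or [order id].
BRACKETED = re.compile(r"\[([^\[\]]+)\]")


def brackets_to_backticks(sql: str) -> str:
    """Rewrite [name] identifiers as `name` for Databricks SQL."""
    return BRACKETED.sub(r"`\1`", sql)


print(brackets_to_backticks("SELECT [order id] FROM [dbo].[orders]"))
```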

General SQL/Databricks Compatibility

Table/Column Name Sanitization: Configuration has been added to replace unsupported Databricks characters (,;{}()\n\t=) in table and column names with a valid character, defaulting to underscore.
DELETE Statement Conversion: Fixed an endless loop caused by previous DELETE conversion rules, and updated logic so DELETE operations are now properly converted to MERGE statements.
UPDATE Statement Enhancements:
- Updates without a FROM clause are identified and safely converted to MERGE statements.
- Improved handler logic now marks fragments with FROM/MERGE clauses as examined, adding more programmatic safety checks rather than relying on regex alone.
- Additional patterns added to accurately convert various updates into MERGE.
- Enhanced support for nested IN clauses in WHERE conditions on UPDATE statements, converting them into joins and then merging where appropriate.
Sub-Selects in MERGE: Fixed handling for sub-selects within MERGE statements (e.g., EXISTS (select ...)), with temporary removal of comments for those cases.
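
For the table and column name sanitization item in this list, a minimal sketch of the replacement rule might look like this. The character set is the one listed above and the underscore default matches the note; the function name and the parameter are assumptions, and the actual BladeBridge configuration keys are not shown here.

```python
import re

# Characters listed in the release note as unsupported in Databricks names.
UNSUPPORTED_CHARS = re.compile(r"[,;{}()\n\t=]")


def sanitize_name(name: str, replacement: str = "_") -> str:
    """Replace every unsupported character with the configured replacement."""
    return UNSUPPORTED_CHARS.sub(replacement, name)
```

For example, under these assumptions `sanitize_name("amount(usd)")` yields `amount_usd_`.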

Reconcile

Terminology standardization - Updated Lakebridge Recon tool documentation to replace remorph with lakebridge throughout, including catalog and schema names, table creation, links, and references to notebooks. Configuration documentation updated to reflect the config file location requirement in the .lakebridge directory within the Databricks Workspace (#1876).

Documentation

SQL Splitter utility documentation - Added comprehensive documentation for the SQL Splitter utility, which facilitates processing of large SQL files by splitting them into individual files (one object per file). The tool supports stored procedures, functions, tables, views, and Oracle packages, and is available as a downloadable ZIP file containing executables for Windows, Linux, and MacOS (#1926).
Consistent terminology updates - Updated documentation to use "MS SQL Server (incl. Synapse)" instead of "SQL Server (incl. Synapse)" and replaced TSQL with "MSSQL" for consistency (#1950).

Contributors: @asnare, @m-abulazm, @sundarshankar89, @andresgarciaf, @simone-dbx-labs

01 Aug 21:55

gueniai

v0.10.6

91ee879

Release v0.10.6

Analyzer

Informatica Workflow Variable Collection
The Informatica analyzer now collects workflow variables, enhancing downstream conversion and mapping flows.

Converters improvements

Morpheus

Expanded SQL parser for Snowflake supports full IF...ELSEIF...ELSE and ELSE IF constructs, recognizes ELSEIF as a keyword, and strengthens test coverage.
Improvements in Snowflake CREATE PROCEDURE grammar, including: simplified syntax, handling of optional queries, result set variables, and better exception handling.
Support for TEMPORARY as an interchangeable keyword for temporary objects in Snowflake parsing.

BladeBridge

Enhanced SQL Scripting for Oracle Procedures
Multiple fixes for procedure conversion, including quoted identifiers, Japanese character support, misplaced/duplicated keywords, improved SELECT INTO, and more.
DataStage PXPivot Conversion
DataStage’s vertical pivot (PXPivot) can now be converted to Databricks SQL, broadening ETL migration coverage.
Synapse and MS SQL Configuration Improvements
- Enhanced fragment breaker for standalone SELECTs
- Improved logic and ordering for variable declarations and set operators
- Bugfixes for PROC_FINISH, WITH statement handling, and universal ETL+SQL testing
Overrides-file Prompt Update
More descriptive and clear prompt for the ‘overrides-file’ option in CLI and documentation.
Bug Fixes & Minor Enhancements
- DataStage IF/THEN/ELSE and header row handling improvements
- TRY/CATCH and improved SELECT INTO #table conversion
- Better handling of set operations in SELECT/WITH
- Standardized JSON configuration naming: now uses base_<source>2databricks_<sql|sparksql|pyspark>.json
- DELETE-to-MERGE conversion, more tests, correct semicolon placement, and expanded handling of SQL scripting features
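
The standardized JSON configuration naming mentioned in this list can be expressed as a small helper. This is only an illustration of the pattern base_<source>2databricks_<sql|sparksql|pyspark>.json; the function name is hypothetical, and the example source value "oracle" is an assumption about how source names are spelled in file names.

```python
# Target suffixes taken from the naming pattern in the release note.
VALID_TARGETS = {"sql", "sparksql", "pyspark"}


def base_config_name(source: str, target: str) -> str:
    """Build a config file name following the documented pattern."""
    if target not in VALID_TARGETS:
        raise ValueError(f"unknown target: {target!r}")
    return f"base_{source}2databricks_{target}.json"


print(base_config_name("oracle", "sql"))
```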

Documentation

Significant Improvements:
- Expanded BladeBridge and overall configuration docs, with clear instructions for extending logic, using overrides, managing outputs, and troubleshooting.
- Updated guide on reconciling config and leveraging new CLI options.

General

Security & Infrastructure Enhancements
- Addressed CVE-2025-7339 (HTTP header manipulation vulnerability) by updating the on-headers dependency.
- Refined handling of output folders, error files, and configuration management for reliability.
- Improved reconcile dashboard deployment reliability: folders without a dashboard.yml are no longer deployed.
- Suppressed spurious warnings on initial installation; only debug messages are now logged for clean setups.
- Improved encoding handling and end-to-end test coverage for non-UTF-8 files and edge-case encodings.

Contributors: @asnare, @sundarshankar89, @gueniai, @vijaypavann-db, @bishwajit-db, @simone-dbx-labs

16 Jul 17:02

gueniai

v0.10.5

031d8fe

v0.10.5

Converters improvements

General

XML Encoding Support: The _process_one_file function now detects and correctly handles XML files with internally-specified encoding (e.g., Windows-1252), ensuring successful parsing and conversion of non-UTF-8 files in transformation pipelines. [#1828]
Test Enhancements: Updates to test cases (test_transpiles_informatica_with_sparksql, test_transpiles_all_dbt_project_files) were made to increase reliability and provide better logging. [#1828]
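
The XML encoding support described above can be sketched as follows: read the encoding named in the XML declaration and use it to decode the file. This is a simplified assumption of what such detection involves, not the body of _process_one_file. The helper names are hypothetical, and byte order marks and UTF-16 input are deliberately not handled.

```python
import re

# Looks for encoding="..." inside a leading <?xml ... ?> declaration.
XML_DECLARATION = re.compile(
    rb"""^\s*<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z0-9._-]+)["']"""
)


def detect_xml_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return the encoding declared in the XML prolog, or the default."""
    match = XML_DECLARATION.match(raw[:200])
    return match.group(1).decode("ascii") if match else default


def decode_xml(raw: bytes) -> str:
    """Decode XML bytes using the declared encoding when one is present."""
    return raw.decode(detect_xml_encoding(raw))
```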

Morpheus transpiler

Temporary and Transient Table Support Across Dialects:
- Adds parsing and SQL generation for TEMPORARY, TRANSIENT, VOLATILE, and other table types.
- Databricks currently treats TRANSIENT tables as TEMPORARY (still in private preview); READ ONLY is not yet supported.
Enhanced Support for T-SQL SET Statement Options:
- Parsers now recognize SET OPTION ON|OFF and generate structured error messages for unsupported options.
- Adds support for finer-grained parsing of T-SQL options like SET ANSI_NULLS, SET ARITHABORT, etc.
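
The behavior described above can be sketched in Python as follows. This is only an illustration: Morpheus has its own parser, and the regular expression, the class names, and the contents of `KNOWN_OPTIONS` are assumptions made for the example, not a list of what Morpheus actually supports.

```python
import re
from dataclasses import dataclass
from typing import Union

# Assumption for illustration: the two options named in the release note.
KNOWN_OPTIONS = {"ANSI_NULLS", "ARITHABORT"}

_SET_RE = re.compile(r"^\s*SET\s+(\w+)\s+(ON|OFF)\s*;?\s*$", re.IGNORECASE)


@dataclass(frozen=True)
class SetOption:
    name: str
    enabled: bool


@dataclass(frozen=True)
class UnsupportedOption:
    name: str
    message: str


def parse_set_statement(sql: str) -> Union[SetOption, UnsupportedOption]:
    """Parse SET <option> ON|OFF into a structured result."""
    match = _SET_RE.match(sql)
    if match is None:
        raise ValueError(f"not a SET <option> ON|OFF statement: {sql!r}")
    name = match.group(1).upper()
    if name not in KNOWN_OPTIONS:
        # Unknown options yield a structured error instead of a crash.
        return UnsupportedOption(name, f"Unsupported SET option: {name}")
    return SetOption(name, match.group(2).upper() == "ON")


if __name__ == "__main__":
    print(parse_set_statement("SET ANSI_NULLS ON;"))
    print(parse_set_statement("SET SOME_OTHER_OPTION OFF"))
```

Returning an error value rather than raising lets a caller continue with the remaining statements and report every unsupported option at the end.
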
Fix: CTEs in Subqueries:
- Corrects an issue where WITH clauses inside DDL statements (e.g. CREATE TABLE AS) were ignored because the correct visitor was not invoked.
IR Refinement for CREATE Commands:
- Introduces a new CreateCommand node to better mirror SQL grammar, consolidating and simplifying previous IR structures (e.g., removing ReplaceTable and ReplaceTableAsSelect)
CREATE VIEW Implementation:
- Implements the createView grammar and logic with visitor methods and meaningful error messages for unsupported options.

BladeBridge Transpiler

UPDATE to MERGE Logic:
- Implemented conversion logic for UPDATE...FROM to MERGE.
- Post-processing Improvements: the convert_update_to_merge function now ensures statement termination by checking for trailing semicolons.
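
The trailing-semicolon check mentioned above can be sketched as follows. This is not the code of `convert_update_to_merge`. The helper `ensure_trailing_semicolon` is hypothetical, and it assumes the statement does not end in a comment.

```python
def ensure_trailing_semicolon(statement: str) -> str:
    """Return the statement terminated by exactly one trailing semicolon."""
    stripped = statement.rstrip()
    if stripped.endswith(";"):
        return stripped
    return stripped + ";"


if __name__ == "__main__":
    merge = "MERGE INTO t USING s ON t.id = s.id WHEN MATCHED THEN UPDATE SET t.v = s.v"
    print(ensure_trailing_semicolon(merge))
```
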
Oracle Data Type Mapping Fixes:
- NUMBER without precision now maps to DECIMAL(38,18) instead of DECIMAL(10,0).
- Corrects Timestamp mapping and converts Char(length) to STRING.
- SYSTIMESTAMP is now translated to CURRENT_TIMESTAMP()
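
As a rough illustration of the mappings listed above, the sketch below applies them with plain Python. BladeBridge drives these rules through its own configuration. The function names are hypothetical, and the regular expressions are simplifications that, for example, would also rewrite SYSTIMESTAMP inside a string literal.

```python
import re


def map_oracle_type(oracle_type: str) -> str:
    """Map an Oracle column type to a Databricks type for the cases above."""
    normalized = oracle_type.strip().upper()
    if normalized == "NUMBER":  # NUMBER without precision
        return "DECIMAL(38,18)"
    if re.fullmatch(r"CHAR\s*\(\s*\d+\s*\)", normalized):  # CHAR(length)
        return "STRING"
    return oracle_type.strip()  # other types are left to other rules


def translate_systimestamp(sql: str) -> str:
    """Replace SYSTIMESTAMP with CURRENT_TIMESTAMP()."""
    return re.sub(r"\bSYSTIMESTAMP\b", "CURRENT_TIMESTAMP()", sql, flags=re.IGNORECASE)


if __name__ == "__main__":
    print(map_oracle_type("NUMBER"))
    print(map_oracle_type("CHAR(10)"))
    print(translate_systimestamp("SELECT SYSTIMESTAMP FROM dual"))
```
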
DataStage SET VARIABLE Handling:
- Updates SET VARIABLE component transformation to behave like standard column expressions and prepends SELECT as required.

Reconcile Improvements

Use of Existing Warehouse During Configure-Reconcile:
- The reconcile configuration now checks for an existing warehouse_id in the user's Databricks config.
- If present, it uses the existing SQL warehouse (with CAN_USE permission) instead of creating a new one.
- Logs warehouse details and defers deletion for reusability. [#1825]
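
The decision described above can be sketched as follows. This is a simplified illustration, not the reconcile installer's code: `resolve_warehouse_id` and the `create_warehouse` callback are hypothetical, and permission checks and logging are omitted.

```python
from typing import Callable, Optional, Tuple


def resolve_warehouse_id(
    configured_id: Optional[str],
    create_warehouse: Callable[[], str],
) -> Tuple[str, bool]:
    """Return (warehouse_id, created_new).

    An existing warehouse_id from the user's configuration is reused;
    a new warehouse is created only when none is configured.
    """
    if configured_id:
        return configured_id, False
    return create_warehouse(), True


if __name__ == "__main__":
    # Hypothetical factory standing in for the real warehouse creation call.
    print(resolve_warehouse_id("abc123", lambda: "new-warehouse"))
    print(resolve_warehouse_id(None, lambda: "new-warehouse"))
```

In this sketch the `created_new` flag is an assumption that would let a caller distinguish a reused warehouse from one it created itself.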

Documentation updates

Databricks Auth Profiles and --profile Option:
- Users can now specify which Databricks workspace to use with the --profile flag during installation.
- Adds command to list available profiles. [#1813]
Export Instructions for Microsoft SQL Server and Azure Synapse:
- Step-by-step guides added for extracting view, table, and procedure DDLs using:
  - SQL Server Management Studio (SSMS),
  - Azure Synapse Studio,
  - PowerShell via Export-AzSynapseSqlScript for Synapse Serverless.
- Screenshots and Microsoft documentation links included. [#1812]

Dependency Updates:

- Updated `databricks-labs-blueprint` version.
- Added `pytest-timeout` for improved test reliability. [#1828](https://github.com/databrickslabs/lakebridge/issues/1828)

Contributors: @eri-adepoju, @sundarshankar89, @asnare, @biswadeepupadhyay-db

07 Jul 18:24

gueniai

v0.10.4

bc1a518

v0.10.4

Added Source Tech Override for Analyzer (#1806). The Analyzer command has been enhanced with a source-tech flag, allowing users to specify the Source System Technology to analyze directly in the command line call.
Patch user agent for Infa (#1807). Improved user agent handling for dialects with spaces and added Informatica PC support.

Contributors: @sundarshankar89, @asnare

Releases: databrickslabs/lakebridge