Open
Labels
bb converter (Issues related to BB converter)
Description
Is there an existing issue for this?
- I have searched the existing issues
Category of Bug / Issue
Converter bug
Current Behavior
Severity: CRITICAL ⚠️
Affected Jobs: ALL (100%)
Problem
DataStage job parameters written in the form #PARAM_NAME# are not substituted in:
- SQL statements (TRUNCATE, SELECT)
- Target table names in saveAsTable()
- Dynamic column names
Examples
Example 1: TRUNCATE statement (Job JOB_NAME)
DataStage XML:
<Property Name="SQLStatement">TRUNCATE TABLE #TGT_TABLE#</Property>
Transpiled (WRONG):
spark.sql(rf"""TRUNCATE TABLE #TGT_TABLE#""")
Root Cause
The transpiler does not recognize the DataStage parameter syntax #PARAM#, so it copies the token verbatim instead of converting it to a Python f-string placeholder {PARAM}.
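For illustration only, here is a minimal sketch of the kind of rewrite I would expect, written as standalone Python. It is not the converter's actual code: the names PARAM_PATTERN and to_fstring_body are hypothetical, and the pattern assumes parameters are plain identifiers wrapped in # characters. Other reference forms may need a wider pattern.

```python
import re

# Hypothetical pattern (an assumption, not taken from the converter):
# an identifier wrapped in '#', for example #TGT_TABLE#.
PARAM_PATTERN = re.compile(r"#([A-Za-z_][A-Za-z0-9_]*)#")


def to_fstring_body(text: str) -> str:
    """Rewrite #PARAM# tokens as {PARAM} for use inside an f-string.

    Literal braces are doubled first so they survive f-string evaluation.
    """
    escaped = text.replace("{", "{{").replace("}", "}}")
    return PARAM_PATTERN.sub(r"{\1}", escaped)


def find_params(text: str) -> list:
    """Return the sorted, de-duplicated parameter names found in text."""
    return sorted(set(PARAM_PATTERN.findall(text)))


sql = "TRUNCATE TABLE #TGT_TABLE#"
print(to_fstring_body(sql))  # TRUNCATE TABLE {TGT_TABLE}
print(find_params(sql))      # ['TGT_TABLE']
```

The same rewrite would presumably have to run on table names passed to saveAsTable() and on column names, not only on SQL text.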
Expected Behavior
Expected:
spark.sql(f"""TRUNCATE TABLE {TGT_TABLE}""")
Steps To Reproduce
1. Create a minimal DataStage job XML with parameters:
Save as test_job_params.xml:
<?xml version="1.0" encoding="UTF-8"?>
<DSExport>
<Job Identifier="TestJob" DateModified="2024-01-01" TimeModified="12:00:00">
<Record Identifier="V0" Type="CustomStage">
<Property Name="Name">TestJob</Property>
<Property Name="Parameters">
<Property Name="TGT_TABLE" Type="String" Default="MY_TABLE"/>
</Property>
<Property Name="BeforeSQL">TRUNCATE TABLE #TGT_TABLE#</Property>
<Collection Name="Stages">
<SubRecord>
<Property Name="Name">TARGET_STAGE</Property>
<Property Name="TableName">#TGT_TABLE#</Property>
</SubRecord>
</Collection>
</Record>
</Job>
</DSExport>
2. Run transpilation:
databricks labs lakebridge transpile --input-source test_job_params.xml --output-folder transpiled --debug
3. Observe output in generated .py file
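To make the last step less manual, a small script along these lines could flag leftover tokens in the output. This is a rough sketch under my own assumptions: find_unsubstituted and scan_folder are hypothetical helper names, the folder name transpiled comes from the command above, and the regex is a heuristic that can also match ordinary Python comments containing text like #word#.

```python
import re
from pathlib import Path

# Heuristic (assumption): a leftover DataStage parameter looks like #NAME#.
LEFTOVER = re.compile(r"#[A-Za-z_][A-Za-z0-9_]*#")


def find_unsubstituted(source: str) -> list:
    """Return (line_number, token) pairs for every #NAME# token in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in LEFTOVER.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def scan_folder(folder: str) -> dict:
    """Scan all .py files under folder and map file path to its hits."""
    results = {}
    for path in Path(folder).rglob("*.py"):
        hits = find_unsubstituted(path.read_text(encoding="utf-8"))
        if hits:
            results[str(path)] = hits
    return results


# Check the transpiled line quoted in this issue.
sample = 'spark.sql(rf"""TRUNCATE TABLE #TGT_TABLE#""")\n'
print(find_unsubstituted(sample))  # [(1, '#TGT_TABLE#')]

# After running the transpile command: print(scan_folder("transpiled"))
```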
I'm using DataStage as the source and PySpark as the target.
Relevant log output or Exception details
Logs Confirmation
- I ran the command line with --debug
- I have attached the lsp-server.log under USER_HOME/.databricks/labs/remorph-transpilers/<converter_name>/lib/lsp-server.log
Sample Query
Operating System
macOS
Version
latest via Databricks CLI