feat(ingest/tableau): log Initial SQL + lineage and warn when none parses #17881
Open
treff7es wants to merge 1 commit into
…rses

Add diagnostics to the Tableau Initial SQL path so a run with missing Initial SQL lineage can be debugged directly:

- Per data source, log (at DEBUG) the dataset URN, the raw Initial SQL (repr, so newline boundaries are visible), and the upstream URNs it produced. Run ingestion with --debug to see it.
- Emit a structured report warning + counter (num_initial_sql_connections_without_statements) when a non-empty Initial SQL splits into zero statements and therefore yields no lineage. This is otherwise silent (it is not a parse failure), and is the signal that the SQL lost its line breaks (e.g. `--` line comments with no terminating newline collapse the whole script into one comment).
- Add num_initial_sql_statements_parsed for visibility into how many statements were extracted.

No behavior change to lineage emission. Adds unit coverage for the zero-statements case and asserts statement counts on the existing multi-statement test.
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Connector Tests Results: All connector tests passed.
Summary
Adds diagnostics to the Tableau Initial SQL lineage path so a run that produces no Initial-SQL lineage can be debugged directly from the ingestion output, and so the most common silent-failure mode becomes visible.
Motivation: this came out of investigating a case where a data source's Initial SQL was captured as the `initialSql` custom property but produced no upstream lineage, with nothing in the report explaining why. The cause turned out to be Initial SQL whose line breaks were lost: a leading `--` line comment with no terminating newline collapses the entire script into a single comment, so `split_statements` yields zero statements. This is not a parse failure, so it incremented no counter and emitted no warning, leaving the failure completely silent.

Changes
- Per data source, log (at DEBUG) the dataset URN, the raw Initial SQL (via `repr()`, so newline boundaries are visible), and the upstream URNs it produced. Run ingestion with `--debug` to see exactly what the connector received and what lineage it derived.
- Emit a structured report warning and counter (`num_initial_sql_connections_without_statements`) when a non-empty Initial SQL splits into zero statements. The warning context reports `has_newline` / `has_line_comment`, which immediately tells you whether the SQL arrived with its line breaks intact.
- Add `num_initial_sql_statements_parsed` for visibility into how many statements were extracted across all connections.

No change to lineage emission behavior.
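To illustrate the failure mode the new warning targets, here is a minimal, self-contained sketch. It does not use the connector's code: `naive_split_statements` is a hypothetical stand-in for `split_statements` (it only strips `--` comments and splits on `;`, ignoring string literals), and `diagnose` is an assumed approximation of how `has_newline` / `has_line_comment` might be derived. The sample SQL is made up.

```python
from __future__ import annotations


def strip_line_comments(sql: str) -> str:
    """Drop everything from `--` to the end of each line (naive: ignores string literals)."""
    return "\n".join(line.split("--", 1)[0] for line in sql.splitlines())


def naive_split_statements(sql: str) -> list[str]:
    """Hypothetical stand-in for the real splitter: strip comments, then split on `;`."""
    body = strip_line_comments(sql)
    return [part.strip() for part in body.split(";") if part.strip()]


def diagnose(sql: str) -> dict[str, bool]:
    """Assumed shape of the warning context; the real computation may differ."""
    return {"has_newline": "\n" in sql, "has_line_comment": "--" in sql}


# Initial SQL as authored, with line breaks intact.
intact = "-- set session context\nUSE DATABASE analytics;\nSELECT * FROM orders;"
# The same script after its line breaks were lost somewhere upstream.
flattened = intact.replace("\n", " ")

intact_statements = naive_split_statements(intact)
flattened_statements = naive_split_statements(flattened)
flattened_flags = diagnose(flattened)

print(repr(flattened))        # repr() makes the missing "\n" boundaries obvious
print(intact_statements)      # two statements
print(flattened_statements)   # empty: the leading comment swallowed the script
print(flattened_flags)
```

In the flattened case the combination of no newline plus a line comment is the signature worth looking for, which is why logging the raw SQL through `repr()` helps: the absence of `\n` is visible at a glance.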
Testing
- `test_get_initial_sql_lineage_flattened_comment_warns_and_counts` covers the zero-statements case (raw SQL still captured, no upstreams, counter incremented, not counted as a parse failure).
- The existing multi-statement test now asserts `num_initial_sql_statements_parsed` and that the zero-statements counter stays 0.
- `metadata-ingestion:lintFix` and targeted `mypy` pass on the changed files; the full `tests/unit/tableau/test_tableau_initial_sql.py` suite passes (47 tests).