fix: enforce 255-character identifier length limit for Databricks relations#1391
fix: enforce 255-character identifier length limit for Databricks relations#1391psaikaushik wants to merge 4 commits intodatabricks:mainfrom
Conversation
|
HI @psaikaushik can you select |
sd-db
left a comment
There was a problem hiding this comment.
In addition to the issues raised, let us also add an entry in CHANGELOG.md on the fix. Thanks again for the PR.
Just allowed edit for maintainers and also rebased on the main. Thanks @sd-db |
…ations Add relation_max_name_length() returning 255 and a __post_init__ validation to DatabricksRelation, following the same pattern used by the Postgres adapter. This catches overly-long table names (commonly generated by store_failures for tests with verbose names) at relation creation time with a clear error message instead of a cryptic runtime DatabricksExecutionError from the SQL engine. Closes databricks#1309
8494a15 to
6688bce
Compare
|
@sd-db : Please take a look at the PR now. Addressed all the comments. Thanks! |
0c3add9 to
2e759c2
Compare
…ength validation (databricks#1309) Address review feedback: - Remove verbose comment block from MAX_CHARACTERS_IN_IDENTIFIER constant - Remove redundant inline comments from __post_init__ (implied by the code) - Simplify the error message (no repetition of the limit value) - Add TestIdentifierLengthValidation unit tests covering valid length, too-long raises DbtRuntimeError, no-type skips check, None identifier - Add CHANGELOG entry under 1.11.7 https://claude.ai/code/session_017ZAXMwLSnqz4FqvTt5H7D1
2e759c2 to
a8b866b
Compare
| f"is longer than {self.relation_max_name_length()} characters. " | ||
| f"Databricks has a maximum identifier length of " | ||
| f"{self.relation_max_name_length()} characters. " | ||
| f"If this is a store_failures table, consider using a shorter " |
There was a problem hiding this comment.
We have hardcoded store_failures here but the validator fires on any oversized identifier (models/seeds/snapshots)
|
|
||
|
|
||
| class TestIdentifierLengthValidation: | ||
| def test_relation_max_name_length(self): |
There was a problem hiding this comment.
This test is failing, also we should remove this test as this is not required.
| "Use a shorter test name or configure a custom alias." | ||
| ) | ||
|
|
||
| def relation_max_name_length(self): |
There was a problem hiding this comment.
imo best to make this a classmethod
There was a problem hiding this comment.
Also add a return type annotation -> int
| - Fix `workflow_job` Python model submission method failing with dictionary attribute error ([#1360](https://github.com/databricks/dbt-databricks/issues/1360)) | ||
| - Fix `TestWorkflowJob` functional test that was unreachable on all profiles due to incorrect skip list, wrong model fixture, and invalid `max_retries` parameter ([#1360](https://github.com/databricks/dbt-databricks/issues/1360)) | ||
| - Fix column order mismatch in microbatch and replace_where incremental strategies by using INSERT BY NAME syntax ([#1338](https://github.com/databricks/dbt-databricks/issues/1338)) | ||
| - Validate relation identifier length at creation time and raise a clear error when it exceeds Databricks' 255-character limit, preventing confusing runtime failures when `store_failures: true` generates long table names ([#1309](https://github.com/databricks/dbt-databricks/issues/1309)) |
| raise DbtRuntimeError( | ||
| f"Relation name '{self.identifier}' is longer than " | ||
| f"{self.relation_max_name_length()} characters. " | ||
| "Use a shorter test name or configure a custom alias." |
There was a problem hiding this comment.
post_init is valid for all relations and not just test. I think a better comment is probably something like - Databricks has a maximum identifier length of 255 characters. Use a shorter name or configure a custom alias.
- Add @classmethod decorator and -> int return type to relation_max_name_length() - Broaden error message to not mention store_failures specifically; the validator fires on any oversized identifier (models, seeds, snapshots, etc.) - Remove test_relation_max_name_length test (failing and not required per review) - Move CHANGELOG entry from 1.11.7 to new 1.11.8 section https://claude.ai/code/session_017ZAXMwLSnqz4FqvTt5H7D1
87ebdcf to
75a59a3
Compare
|
@sd-db : Addressed all the comments. Please take a look. thanks! |
Error message called relation_max_name_length() three times and stated the limit redundantly. Align with reviewer's exact suggested wording: "Databricks has a maximum identifier length of N characters. Use a shorter name or configure a custom alias." Update test match pattern accordingly. https://claude.ai/code/session_017ZAXMwLSnqz4FqvTt5H7D1
Summary
Enforces the 255-character identifier length limit for Databricks relations at relation creation time, preventing a cryptic runtime
DatabricksExecutionErrorwhenstore_failuresgenerates overly-long table names.Closes #1309
Problem
When
store_failures: trueis enabled, dbt generates table names by concatenating the test name, model name, and arguments. For deep schema tests with verbose names, this can exceed Databricks' 255-character limit for table names, causing:This error is unhelpful — it doesn't explain what the 255-character limit is, which test/model caused it, or how to fix it.
Fix
Added
relation_max_name_length()and__post_init__validation toDatabricksRelation, following the exact same pattern used by the Postgres adapter:Before:
After:
Checklist