@dataders
Contributor

Summary

  • Adds a new changeset rule that removes duplicate column definitions in YAML schema files
  • Keeps the last occurrence to match dbt's runtime behavior ("only the last definition will be used")
  • Gated behind the --behavior-change flag, since removing duplicates may discard config (descriptions, tests, masking_policy, etc.) that differs between the duplicate definitions
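
As an illustration of the "keep the last occurrence" rule, here is a minimal sketch, not the code in this PR. The function name `dedupe_columns_keep_last` and its return shape are hypothetical, and it assumes columns are plain dicts matched on an exact, case-sensitive `name` string:

```python
from typing import Any


def dedupe_columns_keep_last(
    columns: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[str]]:
    """Hypothetical helper: drop earlier duplicates, keep the last definition.

    Returns (kept_columns, removed_names). Entries without a string
    `name` are left untouched.
    """
    # First pass: remember the index of the last entry seen for each name.
    last_index: dict[str, int] = {}
    for i, col in enumerate(columns):
        name = col.get("name")
        if isinstance(name, str):
            last_index[name] = i

    # Second pass: keep an entry only if it is the last one for its name.
    kept: list[dict[str, Any]] = []
    removed: list[str] = []
    for i, col in enumerate(columns):
        name = col.get("name")
        if isinstance(name, str) and last_index[name] != i:
            removed.append(name)
            continue
        kept.append(col)
    return kept, removed


columns = [
    {"name": "id", "description": "first definition"},
    {"name": "email"},
    {"name": "id", "description": "second definition"},
]
kept, removed = dedupe_columns_keep_last(columns)
```

In this sketch the surviving `id` entry stays at the position of its last occurrence, and the description on the earlier entry is dropped, which is the kind of config loss the flag is meant to guard against.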

Closes #283

Changes

  • Add DUPLICATE_COLUMN_DEFINITION_DEPRECATION type to deprecations.py
  • Add changeset_remove_duplicate_column_definitions function and deduplicate_columns_list helper
  • Handle columns in models, seeds, snapshots, model versions, and source tables
  • Register in behavior_change_rules (requires --behavior-change flag)
  • Add comprehensive test suite (11 tests)
  • Update README with new deprecation coverage
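
The places where `columns` lists can appear, per the list above, could be walked roughly as follows. This is a sketch under assumptions: `iter_column_lists` is a hypothetical name, the schema file is assumed to be already parsed into plain dicts and lists, and the labels it yields are only for illustration:

```python
from typing import Any, Iterator


def iter_column_lists(schema: dict[str, Any]) -> Iterator[tuple[str, list]]:
    """Hypothetical walker: yield (label, columns) for each columns list."""
    for key in ("models", "seeds", "snapshots"):
        for node in schema.get(key) or []:
            label = f"{key}.{node.get('name')}"
            if isinstance(node.get("columns"), list):
                yield label, node["columns"]
            # Model versions can carry their own columns list.
            if key == "models":
                for version in node.get("versions") or []:
                    if isinstance(version.get("columns"), list):
                        yield f"{label}.v{version.get('v')}", version["columns"]
    # Source columns live one level deeper, under each source's tables.
    for source in schema.get("sources") or []:
        for table in source.get("tables") or []:
            if isinstance(table.get("columns"), list):
                yield f"sources.{source.get('name')}.{table.get('name')}", table["columns"]


schema = {
    "models": [
        {
            "name": "orders",
            "columns": [{"name": "id"}],
            "versions": [{"v": 1, "columns": [{"name": "id"}]}],
        }
    ],
    "sources": [
        {"name": "raw", "tables": [{"name": "events", "columns": [{"name": "ts"}]}]}
    ],
}
labels = [label for label, _ in iter_column_lists(schema)]
```

Each yielded list could then be passed to a deduplication helper. The real changeset also has to write the edited YAML back out, which this sketch does not attempt.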

Test plan

  • All existing tests pass (109 tests)
  • New tests pass (11 tests for duplicate column removal)
  • Manual test with YAML file containing duplicate columns
  • Verify the rule only runs with the --behavior-change flag

🤖 Generated with Claude Code

Addresses #283. Adds a new changeset rule that removes duplicate column
definitions in YAML schema files, keeping the last occurrence to match
dbt's runtime behavior ("only the last definition will be used").

The rule is gated behind the --behavior-change flag since removing
duplicates may lose config information (descriptions, tests,
masking_policy, etc.) that differs between duplicate definitions.

Changes:
- Add DUPLICATE_COLUMN_DEFINITION_DEPRECATION type
- Add changeset_remove_duplicate_column_definitions function
- Handle columns in models, seeds, snapshots, model versions, and
  source tables
- Add comprehensive test suite (11 tests)
- Update README with new deprecation coverage

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@dataders dataders requested a review from chayac as a code owner January 14, 2026 02:55

Development

Successfully merging this pull request may close these issues.

resolve redundancy of column keys from dbt yaml
