Duplicate tags in ChangeEvent notifications for Entity tag updates #23840

@manerow

Description

Affected module

Backend

Describe the bug

ChangeEvent notifications for table column tag updates report duplicate tags in the ChangeDescription. When tags on a table column are updated, the fieldsAdded array contains the same tag twice, even though the entity itself stores only one instance.

To Reproduce

  1. Update tags on a table column (e.g., change from PII.Sensitive to PII.None)
  2. Observe the ChangeDescription in the generated ChangeEvent
  3. The fieldsAdded array shows duplicate tag entries:
     fieldsAdded=[
       FieldChange[name=columns.xxx.tags, newValue=[
         {"tagFQN":"PII.None","name":"None",...},
         {"tagFQN":"PII.None","name":"None",...}  // duplicate
       ]]
     ]

Expected behavior

Each tag should appear only once in the ChangeDescription fieldsAdded array, matching the actual entity state.

Root cause

EntityRepository.updateTags() (lines 4020-4067) pre-calculates addedTags and deletedTags, then passes these pre-populated lists to recordListChange().
However, recordListChange() computes the diff itself and appends the results to the lists it is given, so every entry that was already present is recorded a second time.
Other implementations, such as GlossaryTermRepository, correctly pass empty lists to recordListChange().
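
The mechanism described above can be reproduced in isolation. The following is a minimal sketch, not the actual OpenMetadata code: the class ListChangeSketch, the simplified recordListChange signature, and the helpers buggyAdded and fixedAdded are all hypothetical stand-ins, and tags are reduced to plain strings. It assumes only what the root cause states, namely that the diff helper appends to whatever lists the caller hands it.

```java
import java.util.ArrayList;
import java.util.List;

public class ListChangeSketch {

    // Hypothetical stand-in for a diff helper that computes added/deleted
    // items itself and APPENDS them to the caller-supplied lists.
    static <T> boolean recordListChange(
            List<T> original, List<T> updated, List<T> added, List<T> deleted) {
        for (T item : updated) {
            if (!original.contains(item)) {
                added.add(item);
            }
        }
        for (T item : original) {
            if (!updated.contains(item)) {
                deleted.add(item);
            }
        }
        return !added.isEmpty() || !deleted.isEmpty();
    }

    // Mirrors the reported pattern: the caller pre-calculates the diff and
    // then passes the already populated lists to the helper.
    static List<String> buggyAdded(List<String> original, List<String> updated) {
        List<String> added = new ArrayList<>(updated);
        added.removeAll(original);
        List<String> deleted = new ArrayList<>(original);
        deleted.removeAll(updated);
        recordListChange(original, updated, added, deleted);
        return added;
    }

    // Mirrors the pattern the issue describes as correct: pass empty lists
    // and let the helper fill them.
    static List<String> fixedAdded(List<String> original, List<String> updated) {
        List<String> added = new ArrayList<>();
        List<String> deleted = new ArrayList<>();
        recordListChange(original, updated, added, deleted);
        return added;
    }

    public static void main(String[] args) {
        List<String> before = List.of("PII.Sensitive");
        List<String> after = List.of("PII.None");
        System.out.println(buggyAdded(before, after)); // [PII.None, PII.None]
        System.out.println(fixedAdded(before, after)); // [PII.None]
    }
}
```

If the real helper behaves as assumed here, a fix along these lines would either pass fresh empty lists from updateTags() or stop pre-calculating the diff before the call.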

Version

  • OpenMetadata version: main branch

Metadata

Assignees

Type

Projects

Status

In Review / QA 👀

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
