
[LFXV2-1223] Fix delete object ID corruption #47

Merged
mauriciozanettisalomao merged 3 commits into linuxfoundation:main from mauriciozanettisalomao:feat/lfxv2-1223-eventing-handlers-indexer-access on Mar 16, 2026
Conversation

mauriciozanettisalomao (Contributor) commented Mar 13, 2026

Overview

Jira ticket https://linuxfoundation.atlassian.net/browse/LFXV2-1223

  • Fix: Remove base64 decoding for delete actions — it was silently corrupting object IDs for certain object types
  • Observability: Add debug logging to decodeTransactionData to surface action/type context during base64 decoding

Bug Fix — Delete Action Corrupts Object ID

What happened

decodeTransactionData applied base64 decoding to all three action types (create, update, delete). For delete actions, transaction.Data is always a plain string object ID — not a base64-encoded payload.

When an object ID happened to consist entirely of characters from the standard base64 alphabet and its length was a multiple of 4, base64.StdEncoding.DecodeString succeeded without error and the decoded bytes overwrote the original ID. The delete then targeted the wrong document in OpenSearch.

UUIDs (which contain -) and IDs whose length is not a multiple of 4 (including most short numeric IDs) fail to decode, so they are immune by accident. This masked the bug during testing.

Observable impact

The corruption surfaced in janitor logs where object_ref contained garbled bytes instead of a valid ID:

{"time":"2026-03-12T16:40:12.651285-03:00","level":"INFO","msg":"Single document found, no janitor action needed",
 "component":"cleanup_repository","object_ref":"groupsio_member:׏|\ufffd\ufffd4","document_id":"groupsio_member:׏|<22>4"}

The janitor located a document using the corrupted ID, but it was never the intended target.

Which IDs are affected

| Format | Example | Vulnerable? |
| --- | --- | --- |
| UUID | 550e8400-e29b-41d4-a716-446655440000 | No: contains -, which is not in the standard base64 alphabet |
| Short numeric (length not a multiple of 4) | 15554981610 (11 chars) | No: DecodeString returns an error because the length is not a multiple of 4 |
| Alphanumeric, length a multiple of 4 | groupsio_member IDs | Yes: decodes silently, ID corrupted |
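
The three rows can be checked directly against the standard library. In this minimal sketch, decodesSilently is a hypothetical helper, and abcd1234 is a made-up stand-in for an affected ID; the other two values are the examples from the table.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodesSilently reports whether base64.StdEncoding.DecodeString accepts id,
// i.e. whether a blind decode would replace the ID instead of failing.
func decodesSilently(id string) bool {
	_, err := base64.StdEncoding.DecodeString(id)
	return err == nil
}

func main() {
	ids := []string{
		"550e8400-e29b-41d4-a716-446655440000", // UUID: '-' is outside the alphabet
		"15554981610",                          // 11 characters: not a multiple of 4
		"abcd1234",                             // made-up ID: 8 base64-legal characters
	}
	for _, id := range ids {
		fmt.Printf("%-40q decodes silently: %v\n", id, decodesSilently(id))
	}
}
```

A check like this is only useful for reasoning about which IDs were exposed; it is not a safe way to decide at runtime whether a value is base64, which is exactly the ambiguity that caused the bug.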

Fix

Removed the case s.isDeleteAction branch from decodeTransactionData. Base64 decoding now only runs for create and update actions, which are the only ones that carry a base64-encoded JSON payload. Delete actions pass through untouched.

// Before — delete branch silently corrupted plain-string IDs
switch {
case s.isCreateAction(transaction) || s.isUpdateAction(transaction):
    ...
    transaction.Data = data
case s.isDeleteAction(transaction):
    transaction.Data = string(decodedData) // BUG
}

// After — delete actions are not decoded
if s.isCreateAction(transaction) || s.isUpdateAction(transaction) {
    ...
    transaction.Data = data
}

…ehensive unit tests

Jira Ticket: https://linuxfoundation.atlassian.net/browse/LFXV2-1223

Assisted by [Claude Code](https://claude.ai/code)

Signed-off-by: Mauricio Zanetti Salomao <mauriciozanetti86@gmail.com>
mauriciozanettisalomao requested a review from a team as a code owner March 13, 2026 13:40
Copilot AI review requested due to automatic review settings March 13, 2026 13:40
coderabbitai bot commented Mar 13, 2026

Walkthrough

The changes modify transaction data decoding in the indexer service to handle base64 decoding only for create/update actions (removing delete action handling), and introduce helper functions in the GroupsIO enricher to extract group names and aliases. Comprehensive tests are added for both components.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Transaction Data Decoding: internal/domain/services/indexer_service.go, internal/domain/services/indexer_service_test.go | Refactored base64 decoding logic to only apply to create/update actions (removed delete-action path). Added comprehensive test suite covering decoded data handling scenarios, JSON parsing, and error conditions. |
| GroupsIO Enricher: internal/enrichers/groupsio_mailing_list_enricher.go, internal/enrichers/groupsio_mailing_list_enricher_test.go | Added helper functions extractGroupName and extractGroupNameAndAliases to extract and prioritize group names with fallback logic. Wired these providers into enricher configuration. Added test suite validating sort name, alias extraction, whitespace trimming, and error handling. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'LFXV2-1223 Fix delete object ID corruption' accurately reflects the primary fix in the changeset: removing base64 decoding for delete actions to prevent ID corruption. This is the main change across the indexer service modifications. |
| Description check | ✅ Passed | The pull request description comprehensively details the bug fix (delete action ID corruption), the observability improvement (debug logging), and the feature addition (GroupsIO enricher), all directly reflected in the changeset. |

Copilot AI left a comment

Pull request overview

This PR addresses a delete-action data decoding bug that could silently corrupt object IDs, enhances GroupsIO mailing list enrichment to prefer group_name for naming/sorting, and adds observability plus tests around transaction data decoding.

Changes:

  • Prevent delete transactions from mutating transaction.Data during base64 decode (and add decoding debug context).
  • Prefer group_name for SortName and NameAndAliases in GroupsIOMailingListEnricher.
  • Add unit tests covering decodeTransactionData behavior and new GroupsIO naming behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| internal/enrichers/groupsio_mailing_list_enricher.go | Adds custom sort-name and alias extraction using group_name. |
| internal/enrichers/groupsio_mailing_list_enricher_test.go | Adds tests validating group_name prioritization and trimming behavior. |
| internal/domain/services/indexer_service.go | Updates decodeTransactionData behavior and adds debug logging around decoding. |
| internal/domain/services/indexer_service_test.go | Adds unit tests for decodeTransactionData across create/update/delete scenarios. |

coderabbitai bot left a comment

🧹 Nitpick comments (1)
internal/enrichers/groupsio_mailing_list_enricher_test.go (1)

90-101: Consider adding NameAndAliases assertion for this case.

This test case verifies that SortName falls back to "Fallback" when group_name is empty, but it doesn't verify NameAndAliases. Based on extractGroupNameAndAliases logic, when group_name is empty and name is "Fallback", the expected NameAndAliases should be ["Fallback"].

The current assertion at lines 129-131 skips validation when expectedBody.NameAndAliases is nil, which could hide unexpected behavior.
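
The fallback behavior described here can be sketched as follows. This is an assumption-laden illustration, not the enricher's code: stringField, sortName, and nameAndAliases are hypothetical names, and the rule that aliases are the distinct non-empty values of group_name and name is a guess that happens to match the expected ["Fallback"] result.

```go
package main

import (
	"fmt"
	"strings"
)

// stringField returns data[key] as a trimmed string, or "" when the key is
// absent or the value is not a string.
func stringField(data map[string]any, key string) string {
	s, _ := data[key].(string)
	return strings.TrimSpace(s)
}

// sortName prefers group_name and falls back to name when it is empty.
func sortName(data map[string]any) string {
	if g := stringField(data, "group_name"); g != "" {
		return g
	}
	return stringField(data, "name")
}

// nameAndAliases returns the distinct non-empty values of group_name and
// name, with group_name first. The exact rule is assumed for illustration.
func nameAndAliases(data map[string]any) []string {
	var out []string
	seen := map[string]bool{}
	for _, key := range []string{"group_name", "name"} {
		if v := stringField(data, key); v != "" && !seen[v] {
			seen[v] = true
			out = append(out, v)
		}
	}
	return out
}

func main() {
	data := map[string]any{"uid": "ml-empty", "group_name": "", "name": "Fallback"}
	fmt.Println(sortName(data))
	fmt.Println(nameAndAliases(data))
}
```

Asserting on both fields in the test case would pin down this behavior instead of leaving the alias list unchecked.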

🔧 Suggested enhancement
 		{
 			name: "empty group_name falls back to name",
 			parsedData: map[string]any{
 				"uid":        "ml-empty",
 				"group_name": "",
 				"name":       "Fallback",
 			},
 			expectedBody: &contracts.TransactionBody{
-				ObjectID: "ml-empty",
-				SortName: "Fallback",
+				ObjectID:       "ml-empty",
+				SortName:       "Fallback",
+				NameAndAliases: []string{"Fallback"},
 			},
 		},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/enrichers/groupsio_mailing_list_enricher_test.go` around lines 90 -
101, The test case "empty group_name falls back to name" should also assert
NameAndAliases is populated; update the test vector for that case to set
expectedBody.NameAndAliases to []string{"Fallback"} and add an assertion in the
test's validation block (the same place that currently checks SortName/ObjectID)
to compare the actual body's NameAndAliases against expectedBody.NameAndAliases
so extractGroupNameAndAliases behavior is verified rather than implicitly
skipped when expectedBody.NameAndAliases is nil.

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between 63392bf and 19b67e1.

📒 Files selected for processing (4)
  • internal/domain/services/indexer_service.go
  • internal/domain/services/indexer_service_test.go
  • internal/enrichers/groupsio_mailing_list_enricher.go
  • internal/enrichers/groupsio_mailing_list_enricher_test.go

asithade previously approved these changes Mar 13, 2026
… functions and simplifying default enricher initialization

Jira Ticket: https://linuxfoundation.atlassian.net/browse/LFXV2-1223

Assisted by [Claude Code](https://claude.ai/code)

Signed-off-by: Mauricio Zanetti Salomao <mauriciozanetti86@gmail.com>
mauriciozanettisalomao changed the title from "[LFXV2-1223] Fix delete object ID corruption and enrich GroupsIO mailing list group name" to "[LFXV2-1223] Fix delete object ID corruption" on Mar 16, 2026
mauriciozanettisalomao merged commit b6b29a0 into linuxfoundation:main on Mar 16, 2026
5 checks passed
4 participants