Skip to content

fix:[NEXT-1906] Tagging inconsistency in ES#2382

Merged
arunpaladin merged 1 commit into
masterfrom
fix/shipper-tagging-issue
Nov 26, 2025
Merged

fix:[NEXT-1906] Tagging inconsistency in ES#2382
arunpaladin merged 1 commit into
masterfrom
fix/shipper-tagging-issue

Conversation

@arunpaladin
Copy link
Copy Markdown
Collaborator

@arunpaladin arunpaladin commented Nov 26, 2025

fix:NEXT-1906 Tagging inconsistency in ES

Summary by CodeRabbit

  • Refactor
    • Optimized internal data processing pipeline performance characteristics.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Nov 26, 2025

Walkthrough

Three instances of parallelStream() calls in EntityManager.java were replaced with sequential stream() calls: in overridesMap construction and in two locations within prepareDocs that process entities and tags. The sequential logic per element remains unchanged; only concurrency behavior is altered.

Changes

Cohort / File(s) Summary
Parallel to Sequential Stream Conversion
jobs/pacman-data-shipper/src/main/java/com/tmobile/cso/pacman/datashipper/entity/EntityManager.java
Replaced three parallelStream() calls with stream(): overridesMap construction, entities processing in prepareDocs, and tags processing in prepareDocs. Sequential execution now used instead of parallel; per-element logic unchanged.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

  • Verify that removing parallelism in overridesMap construction does not introduce performance bottlenecks for large override datasets
  • Confirm that serializing entity processing in prepareDocs aligns with downstream component expectations and does not affect correctness
  • Ensure tag processing serialization does not negatively impact throughput for data shipment operations

Poem

🐰 Hops through streams both old and new,

Where parallel once flew askew,

Now sequential, calm and true,

One thread at a time—what shall we do?

The data flows, the tasks are through! 🌊

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is largely incomplete, containing only the title and issue link without addressing required template sections like Problem, Solution, testing details, or checklist items. Expand the description to include Problem statement, Solution explanation, testing methodology, and completed checklist items from the template to provide proper context for reviewers.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly references the specific issue being fixed (NEXT-1906) and identifies the area of concern (tagging inconsistency in ES), directly summarizing the main change.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/shipper-tagging-issue

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5f71490 and fda0742.

📒 Files selected for processing (1)
  • jobs/pacman-data-shipper/src/main/java/com/tmobile/cso/pacman/datashipper/entity/EntityManager.java (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ershad-paladin
Repo: PaladinCloud/CE PR: 2335
File: jobs/pacman-data-shipper/src/main/java/com/tmobile/cso/pacman/datashipper/entity/EntityManager.java:170-175
Timestamp: 2024-10-23T12:07:34.286Z
Learning: In the shipper code, the `EntityManager` is not multi-threaded, so thread-safety concerns are not applicable.
📚 Learning: 2024-10-23T12:07:34.286Z
Learnt from: ershad-paladin
Repo: PaladinCloud/CE PR: 2335
File: jobs/pacman-data-shipper/src/main/java/com/tmobile/cso/pacman/datashipper/entity/EntityManager.java:170-175
Timestamp: 2024-10-23T12:07:34.286Z
Learning: In the shipper code, the `EntityManager` is not multi-threaded, so thread-safety concerns are not applicable.

Applied to files:

  • jobs/pacman-data-shipper/src/main/java/com/tmobile/cso/pacman/datashipper/entity/EntityManager.java
🧬 Code graph analysis (1)
jobs/pacman-data-shipper/src/main/java/com/tmobile/cso/pacman/datashipper/entity/EntityManager.java (1)
jobs/pacman-data-shipper/src/main/java/com/tmobile/cso/pacman/datashipper/util/Util.java (1)
  • Util (41-205)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Shipper-Build
  • GitHub Check: SonarCloud-Build
  • GitHub Check: Analyze (java)
  • GitHub Check: Analyze (javascript)
🔇 Additional comments (3)
jobs/pacman-data-shipper/src/main/java/com/tmobile/cso/pacman/datashipper/entity/EntityManager.java (3)

164-165: Sequential stream appropriately prevents potential ordering inconsistencies.

The change from parallelStream() to stream() for building the overridesMap is correct. While groupingBy is generally safe with parallel streams, sequential processing ensures deterministic behavior and consistency with the overall fix.


268-380: Critical fix: Sequential stream prevents race conditions on shared mutable state.

The change from parallelStream() to stream() correctly addresses concurrency issues:

  1. Line 324: The shared accountIdNameMap (declared as HashMap on line 46) is modified within the loop using putIfAbsent(). With parallel streams, concurrent modifications to this non-thread-safe HashMap would cause race conditions and data corruption.

  2. Lines 348-352: Multiple tags matching the same entity would result in concurrent modifications to the same entityInfo map's entries (via put()), which is not thread-safe.

  3. Lines 319-321: Concurrent RDS calls from parallel threads could cause database connection pool exhaustion or contention.

This fix directly addresses the tagging inconsistency issue mentioned in the PR objective by ensuring sequential, deterministic processing.


348-353: Essential fix: Sequential tag processing prevents concurrent map modifications.

The change from parallelStream() to stream() is crucial for preventing the tagging inconsistency:

With parallel streams, multiple tags matching the same entity could trigger concurrent entityInfo.put() calls (line 351) on the same map. Since Map.put() is not thread-safe, this causes:

  • Lost tag updates
  • Inconsistent tag data in Elasticsearch
  • Non-deterministic behavior depending on thread scheduling

Sequential processing ensures each tag is applied atomically and in a deterministic order, directly resolving the tagging inconsistency issue described in the PR objective.

Warning

Review ran into problems

🔥 Problems

Errors were encountered while retrieving linked issues.

Errors (1)
  • JIRA integration encountered authorization issues. Please disconnect and reconnect the integration in the CodeRabbit UI.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@arunpaladin arunpaladin merged commit 6e60f75 into master Nov 26, 2025
60 checks passed
@arunpaladin arunpaladin deleted the fix/shipper-tagging-issue branch November 26, 2025 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants