[HCMPRE-2741] Central instance support for MDMS V2 service #739

Open
wants to merge 3 commits into
base: master

Conversation

holashchand
Collaborator

@holashchand holashchand commented May 9, 2025

Summary by CodeRabbit

  • New Features

    • Introduced tenant-specific message publishing and schema resolution, enabling tenant-aware data operations and error handling for invalid tenant IDs.
  • Bug Fixes

    • Enhanced error reporting for invalid tenant IDs during data searches.
  • Refactor

    • Updated internal queries and repository logic to support dynamic schema prefixes and tenant context.
  • Chores

    • Improved database migration script to support migrations across multiple schemas.
    • Updated dependency versions for improved stability.


coderabbitai bot commented May 9, 2025

Walkthrough

This update introduces tenant-aware functionality to the MDMS service. It modifies repository and producer classes to handle tenant-specific topic publishing and schema resolution, adds error handling for invalid tenant IDs, and updates SQL queries to support dynamic schema prefixes. The migration script is enhanced to support multi-schema Flyway migrations.

Changes

| File(s) | Change Summary |
| --- | --- |
| core-services/mdms-v2/pom.xml | Updated project version to 2.9.1-SNAPSHOT; upgraded services-common dependency from 2.0.0-SNAPSHOT to 2.9.0-SNAPSHOT; removed commented duplicate dependency entry. |
| core-services/mdms-v2/src/main/java/org/egov/infra/mdms/errors/ErrorCodes.java | Added new constant INVALID_TENANT_ID_ERR_CODE for error handling of invalid tenant IDs. |
| core-services/mdms-v2/src/main/java/org/egov/infra/mdms/producer/Producer.java | Updated push method to include tenant ID parameter; uses MultiStateInstanceUtil to resolve tenant-specific Kafka topic; added logging and injected MultiStateInstanceUtil. |
| core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/impl/MdmsDataRepositoryImpl.java | Injected MultiStateInstanceUtil; updated create and update methods to pass tenant ID to producer; enhanced searchV2 and search methods to replace schema placeholders with tenant ID and handle invalid tenant exceptions. |
| core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/impl/SchemaDefinitionDbRepositoryImpl.java | Made fields final; injected MultiStateInstanceUtil; updated create to pass tenant ID to producer; enhanced search with tenant-aware schema replacement, exception handling, and improved logging. |
| core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/querybuilder/MdmsDataQueryBuilder.java, core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/querybuilder/MdmsDataQueryBuilderV2.java, core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/querybuilder/SchemaDefinitionQueryBuilder.java | Modified SQL query constants to use SCHEMA_REPLACE_STRING placeholder for dynamic schema prefixing instead of hardcoded schema names. |
| core-services/mdms-v2/src/main/resources/db/migrate.sh | Replaced single-schema Flyway migration script with a Bash script that iterates over multiple schemas from environment variable; dynamically constructs schema-specific URLs; added debug echo statements. |
| core-services/mdms-v2/CHANGELOG.md | Updated changelog with entries for versions 2.9.1, 2.9.0, and 1.3.3 describing tenant-specific topics, schema resolution, multi-schema migration, dependency upgrades, and central instance integration. |
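
To make the placeholder approach in the query builders concrete, here is a minimal sketch of how a query constant with a schema placeholder might be resolved per tenant. Everything in it is an assumption for illustration: the class name, the table name `example_table`, the `{schema}` token, and the rule that the schema is the second dot-separated segment of the tenant ID. The real logic lives in MultiStateInstanceUtil in services-common and may differ.

```java
/** Hypothetical stand-in for tenant-aware schema resolution; not the services-common code. */
final class SchemaPlaceholderSketch {

    // Assumed placeholder token standing in for SCHEMA_REPLACE_STRING.
    static final String SCHEMA_PLACEHOLDER = "{schema}";

    // A query constant written once, with the schema left as a placeholder.
    // "example_table" is an illustrative table name, not one from the PR.
    static final String SEARCH_QUERY =
            "SELECT * FROM " + SCHEMA_PLACEHOLDER + ".example_table WHERE tenantid = ?";

    /** Assumption: the schema is the second dot-separated segment, e.g. "in.statea" -> "statea". */
    static String schemaFor(String tenantId) {
        if (tenantId == null) {
            throw new IllegalArgumentException("tenantId must not be null");
        }
        String[] parts = tenantId.split("\\.");
        if (parts.length < 2 || parts[1].isEmpty()) {
            throw new IllegalArgumentException("Cannot derive a schema from tenantId: " + tenantId);
        }
        return parts[1];
    }

    /** Replaces the placeholder literally (no regex) with the tenant's schema name. */
    static String resolve(String query, String tenantId) {
        return query.replace(SCHEMA_PLACEHOLDER, schemaFor(tenantId));
    }
}
```

With these assumptions, `SchemaPlaceholderSketch.resolve(SEARCH_QUERY, "in.statea")` yields a query against `statea.example_table`, while a tenant ID without a second segment is rejected before any SQL is built.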

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant MdmsDataRepositoryImpl
    participant MultiStateInstanceUtil
    participant Producer
    participant DB

    Client->>MdmsDataRepositoryImpl: create(MdmsRequest)
    MdmsDataRepositoryImpl->>Producer: push(tenantId, topic, value)
    Producer->>MultiStateInstanceUtil: getStateSpecificTopicName(tenantId, topic)
    MultiStateInstanceUtil-->>Producer: tenantSpecificTopic
    Producer-->>MdmsDataRepositoryImpl: (message sent)
    MdmsDataRepositoryImpl-->>Client: (acknowledgement)

    Client->>MdmsDataRepositoryImpl: search(criteria)
    MdmsDataRepositoryImpl->>MultiStateInstanceUtil: replaceSchemaPlaceholder(query, tenantId)
    alt valid tenantId
        MultiStateInstanceUtil-->>MdmsDataRepositoryImpl: queryWithSchema
        MdmsDataRepositoryImpl->>DB: execute(queryWithSchema)
        DB-->>MdmsDataRepositoryImpl: results
        MdmsDataRepositoryImpl-->>Client: results
    else invalid tenantId
        MultiStateInstanceUtil-->>MdmsDataRepositoryImpl: throw InvalidTenantIdException
        MdmsDataRepositoryImpl-->>Client: CustomException(INVALID_TENANT_ID_ERR_CODE)
    end

Poem

🐇
Tenants now have their own space,
With schemas and topics in every case.
Queries adapt, and errors are clear,
Multi-schema migrations bring cheer!
In the warren of data, so tidy and bright,
Each tenant’s records now hop just right.
— A happy CodeRabbit at night!

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 375c775 and d252653.

📒 Files selected for processing (1)
  • core-services/mdms-v2/src/main/java/org/egov/infra/mdms/producer/Producer.java (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • core-services/mdms-v2/src/main/java/org/egov/infra/mdms/producer/Producer.java
@holashchand holashchand closed this May 9, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (3)
core-services/mdms-v2/src/main/resources/db/migrate.sh (1)

3-6: Quote variables and validate required inputs
Currently debug statements and assignments are unquoted and may break if values contain spaces or are unset. Consider quoting and validating SCHEMA_NAME. Remove or gate debug echo behind a verbose flag if not needed in production.

-echo "the baseurl : $DB_URL"
-schemasetter="?currentSchema="
-schemas=$SCHEMA_NAME
-echo "the schemas : $schemas"
+echo "Base URL: ${baseurl}"
+schemasetter="?currentSchema="
+schemas="${SCHEMA_NAME:?Environment variable SCHEMA_NAME must be set}"
+echo "Schemas to migrate: ${schemas}"
core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/impl/MdmsDataRepositoryImpl.java (2)

82-88: Duplicate try/catch blocks – extract to a helper

searchV2 and search contain identical try/catch logic for schema substitution. Duplicated code increases maintenance cost and risk of divergence.

+private String replaceSchema(String query, String tenantId) {
+    try {
+        return multiStateInstanceUtil.replaceSchemaPlaceholder(query, tenantId);
+    } catch (InvalidTenantIdException e) {
+        throw new CustomException(INVALID_TENANT_ID_ERR_CODE, e.getMessage());
+    }
+}

Then invoke:

query = replaceSchema(query, mdmsCriteriaV2.getTenantId());
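
For readers who want to try the extraction outside the service, the following is a self-contained sketch of the same helper. The exception classes, the `SchemaReplacer` interface, and the error-code value are hypothetical stand-ins declared locally so the example compiles without Spring or services-common; only the try/catch shape mirrors the suggestion.

```java
/** Self-contained sketch of the extracted helper; all nested types are local stand-ins. */
final class ReplaceSchemaHelperSketch {

    /** Stand-in for the checked exception thrown on an invalid tenant ID. */
    static class InvalidTenantIdException extends Exception {
        InvalidTenantIdException(String message) { super(message); }
    }

    /** Stand-in for the service's unchecked CustomException carrying an error code. */
    static class CustomException extends RuntimeException {
        final String code;
        CustomException(String code, String message) {
            super(message);
            this.code = code;
        }
    }

    // Hypothetical value; the real constant is defined in ErrorCodes.java.
    static final String INVALID_TENANT_ID_ERR_CODE = "INVALID_TENANT_ID";

    /** Stand-in for the one MultiStateInstanceUtil method the helper needs. */
    interface SchemaReplacer {
        String replaceSchemaPlaceholder(String query, String tenantId) throws InvalidTenantIdException;
    }

    private final SchemaReplacer replacer;

    ReplaceSchemaHelperSketch(SchemaReplacer replacer) {
        this.replacer = replacer;
    }

    /** Single place where the checked exception is translated, shared by every search method. */
    String replaceSchema(String query, String tenantId) {
        try {
            return replacer.replaceSchemaPlaceholder(query, tenantId);
        } catch (InvalidTenantIdException e) {
            throw new CustomException(INVALID_TENANT_ID_ERR_CODE, e.getMessage());
        }
    }
}
```

Because the translation happens in one method, a later change to the error code or message format only has to be made once.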

98-105: Consider lowering query log level & include parameters separately

Logging the full, resolved SQL at INFO can expose multi-tenant schema names and potentially sensitive literals in prod logs. A safer pattern is:

log.debug("MDMS query: {} | params: {}", query, preparedStmtList);

This keeps prod logs clean while retaining valuable data for debugging when DEBUG is enabled.
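
The effect of the level change can be demonstrated without the service's logging stack. The sketch below swaps in `java.util.logging` (where `FINE` plays roughly the role of DEBUG) instead of the SLF4J-style `log` used in the PR, and captures emitted records in memory; the logger name and class name are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Sketch using java.util.logging as a stand-in for the service's logger. */
final class QueryLogLevelSketch {

    /** Logs the query at FINE and returns the records the logger actually emitted. */
    static List<String> logQuery(Level loggerLevel, String query, List<Object> params) {
        Logger logger = Logger.getLogger("mdms.query.sketch"); // hypothetical logger name
        logger.setUseParentHandlers(false);                    // keep the console quiet
        logger.setLevel(loggerLevel);

        List<String> emitted = new ArrayList<>();
        Handler capture = new Handler() {
            @Override public void publish(LogRecord record) {
                emitted.add(record.getLevel() + ": " + record.getMessage());
            }
            @Override public void flush() { }
            @Override public void close() { }
        };
        logger.addHandler(capture);
        try {
            // Query text and parameters are passed separately rather than concatenated.
            logger.log(Level.FINE, "MDMS query: {0} | params: {1}", new Object[] {query, params});
        } finally {
            logger.removeHandler(capture);
        }
        return emitted;
    }
}
```

With the logger at `Level.INFO` the returned list is empty, so the resolved SQL never reaches the log; at `Level.FINE` exactly one record is captured.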

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa563a9 and 3f5d67a.

📒 Files selected for processing (9)
  • core-services/mdms-v2/pom.xml (1 hunks)
  • core-services/mdms-v2/src/main/java/org/egov/infra/mdms/errors/ErrorCodes.java (1 hunks)
  • core-services/mdms-v2/src/main/java/org/egov/infra/mdms/producer/Producer.java (1 hunks)
  • core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/impl/MdmsDataRepositoryImpl.java (5 hunks)
  • core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/impl/SchemaDefinitionDbRepositoryImpl.java (4 hunks)
  • core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/querybuilder/MdmsDataQueryBuilder.java (1 hunks)
  • core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/querybuilder/MdmsDataQueryBuilderV2.java (1 hunks)
  • core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/querybuilder/SchemaDefinitionQueryBuilder.java (2 hunks)
  • core-services/mdms-v2/src/main/resources/db/migrate.sh (1 hunks)
🔇 Additional comments (10)
core-services/mdms-v2/pom.xml (1)

90-90:

❓ Verification inconclusive

Ensure the new snapshot version is published and compatible
Double-check that org.egov.services:services-common:2.9.0-SNAPSHOT is published to your snapshot Nexus repository and resolves correctly without introducing breaking changes in downstream modules.


🏁 Script executed:

#!/usr/bin/env bash
# Verify artifact existence in Nexus snapshots
curl -sI https://nexus-repo.digit.org/nexus/content/repositories/snapshots/org/egov/services/services-common/2.9.0-SNAPSHOT/

Length of output: 336


Snapshot Artifact Verification Completed
The Nexus snapshot repository returns HTTP 200 for org.egov.services:services-common:2.9.0-SNAPSHOT, so the artifact is indeed published.

Please proceed to validate compatibility with downstream modules by:

  • Running local builds or integration tests against this new snapshot.
  • Checking for any API changes in services-common that might impact consumers.
core-services/mdms-v2/src/main/java/org/egov/infra/mdms/errors/ErrorCodes.java (1)

28-28: LGTM: Addition of error code for invalid tenant ID

The new error code constant follows the existing naming conventions and will be used for tenant validation in repository implementations.

core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/querybuilder/SchemaDefinitionQueryBuilder.java (2)

12-13: LGTM: Added import for schema placeholder

Appropriate import for the dynamic schema replacement constant.


22-22: LGTM: Replaced hardcoded schema with dynamic placeholder

Good implementation of tenant-aware schema resolution by replacing the hardcoded schema prefix with the dynamic placeholder.

core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/querybuilder/MdmsDataQueryBuilderV2.java (2)

13-14: LGTM: Added import for schema placeholder

Appropriate import for the dynamic schema replacement constant.


22-22: LGTM: Replaced hardcoded schema with dynamic placeholder

Good implementation of tenant-aware schema resolution by replacing the hardcoded schema reference with the dynamic placeholder.

core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/querybuilder/MdmsDataQueryBuilder.java (2)

9-10: LGTM: Added import for schema placeholder

Appropriate import for the dynamic schema replacement constant.


15-15: LGTM: Replaced hardcoded schema with dynamic placeholder

Good implementation of tenant-aware schema resolution by replacing the hardcoded schema reference with the dynamic placeholder.

core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/impl/SchemaDefinitionDbRepositoryImpl.java (1)

56-58: Propagation of tenant ID in create looks good
The call now includes the tenant while publishing, matching the new Producer.push signature.

core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/impl/MdmsDataRepositoryImpl.java (1)

63-72: Good alignment with tenant-aware producer
Both create and update now pass tenantId, so each message is published to the topic resolved for its own tenant rather than to a shared one.

Comment on lines +1 to +2
#!/bin/bash
baseurl=$DB_URL

🛠️ Refactor suggestion

Enforce strict mode and improve portability
Add a portable shebang and enable strict error handling at the top of the script to fail fast and catch unbound variables.
Apply this diff:

-#!/bin/bash
-baseurl=$DB_URL
+#!/usr/bin/env bash
+set -euo pipefail
+# Base URL for Flyway migrations (fail if unset)
+baseurl="${DB_URL:?Environment variable DB_URL must be set}"

Comment on lines +7 to +11
for schemaname in ${schemas//,/ }
do
echo "the schema name : ${baseurl}${schemasetter}${schemaname}"
flyway -url=${baseurl}${schemasetter}${schemaname} -table=$SCHEMA_TABLE -user=$FLYWAY_USER -password=$FLYWAY_PASSWORD -locations=$FLYWAY_LOCATIONS -baselineOnMigrate=true -outOfOrder=true migrate
done

🛠️ Refactor suggestion

Quote parameters in the loop and Flyway command
To avoid word-splitting and globbing issues, split the comma-separated list in a robust way, quote all expansions, and format the Flyway command for readability and safety.

-for schemaname in ${schemas//,/ }
-do
-    echo "the schema name : ${baseurl}${schemasetter}${schemaname}"
-    flyway -url=${baseurl}${schemasetter}${schemaname}  -table=$SCHEMA_TABLE -user=$FLYWAY_USER -password=$FLYWAY_PASSWORD -locations=$FLYWAY_LOCATIONS -baselineOnMigrate=true -outOfOrder=true migrate
-done
+# Split schemas on comma into an array
+IFS=',' read -r -a schema_array <<< "${schemas}"
+for schemaname in "${schema_array[@]}"; do
+  target_url="${baseurl}${schemasetter}${schemaname}"
+  echo "Migrating schema '${schemaname}' with URL: '${target_url}'"
+  flyway \
+    -url="${target_url}" \
+    -table="${SCHEMA_TABLE}" \
+    -user="${FLYWAY_USER}" \
+    -password="${FLYWAY_PASSWORD}" \
+    -locations="${FLYWAY_LOCATIONS}" \
+    -baselineOnMigrate=true \
+    -outOfOrder=true \
+    migrate
+done

Comment on lines +26 to 30
public void push(String tenantId, String topic, Object value) {
String updatedTopic = multiStateInstanceUtil.getStateSpecificTopicName(tenantId, topic);
log.info("The Kafka topic for the tenantId : {} is : {}", tenantId, updatedTopic);
this.kafkaTemplate.send(updatedTopic, value);
}

🛠️ Refactor suggestion

Harden the push contract (null / blank checks & async error handling)

  1. tenantId and topic are accepted blindly. If either arrives as null/blank, getStateSpecificTopicName may throw or silently produce an invalid topic.
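  2. kafkaTemplate.send(..) is asynchronous and returns a future (a ListenableFuture before Spring Kafka 3.0, a CompletableFuture from 3.0 onward). At present the returned future is ignored, so a failed send goes unnoticed.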

Consider:

+Assert.hasText(tenantId, "tenantId must not be blank");
+Assert.hasText(topic, "topic must not be blank");
+
+ListenableFuture<SendResult<String, Object>> future =
+        kafkaTemplate.send(updatedTopic, value);
+future.addCallback(
+        success -> log.debug("Message sent to {}", updatedTopic),
+        ex      -> log.error("Failed to publish to {}", updatedTopic, ex));

This keeps the producer robust and makes publish failures visible in the logs instead of silently dropping them.

Committable suggestion skipped: line range outside the PR's diff.
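
The same hardening can be sketched with only the JDK, which also shows one way to let callers observe a failure rather than just logging it. The `Sender` interface is a hypothetical stand-in for `KafkaTemplate.send`, and the topic resolver stands in for `getStateSpecificTopicName`; neither reflects the real signatures.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BinaryOperator;

/** JDK-only sketch of a hardened push; Sender and the resolver are illustrative stand-ins. */
final class ProducerHardeningSketch {

    /** Stand-in for an asynchronous Kafka send that completes with some acknowledgement. */
    interface Sender {
        CompletableFuture<String> send(String topic, Object value);
    }

    private final Sender sender;
    private final BinaryOperator<String> topicResolver; // (tenantId, topic) -> tenant-specific topic

    ProducerHardeningSketch(Sender sender, BinaryOperator<String> topicResolver) {
        this.sender = sender;
        this.topicResolver = topicResolver;
    }

    /** Validates inputs, resolves the topic, and returns the future so callers can react to failures. */
    CompletableFuture<String> push(String tenantId, String topic, Object value) {
        requireText(tenantId, "tenantId must not be blank");
        requireText(topic, "topic must not be blank");

        String updatedTopic = topicResolver.apply(tenantId, topic);
        return sender.send(updatedTopic, value)
                .whenComplete((ack, ex) -> {
                    if (ex != null) {
                        // In the service this would be log.error(...); stderr keeps the sketch dependency-free.
                        System.err.println("Failed to publish to " + updatedTopic + ": " + ex);
                    }
                });
    }

    private static void requireText(String value, String message) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(message);
        }
    }
}
```

Returning the future is a design choice of this sketch: the existing `push` returns `void`, so adopting it would change the method's contract for callers.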

Comment on lines 70 to 77
String query = schemaDefinitionQueryBuilder.getSchemaSearchQuery(schemaDefCriteria, preparedStatementList);
log.info("Schema definition search query: " + query);
try {
query = multiStateInstanceUtil.replaceSchemaPlaceholder(query, schemaDefCriteria.getTenantId());
} catch (InvalidTenantIdException e) {
throw new CustomException(INVALID_TENANT_ID_ERR_CODE, e.getMessage());
}
log.info("Schema definition search query: {}", query);

💡 Verification agent

🧩 Analysis chain

Verify placeholder replacement & guard against SQL-injection

replaceSchemaPlaceholder performs a string substitution of the schema name. If tenantId can be user-controlled, this opens the door to SQL injection via schema names (e.g., foo; DROP TABLE X;). Unless MultiStateInstanceUtil rigorously validates against a strict whitelist/regex, we should either:

  1. Restrict the schema segment derived from tenantId to [a-zA-Z0-9_]+ before substitution, or
  2. Set the schema on the connection with JDBC's Connection.setSchema() / setCatalog() instead of concatenating it into the SQL string. (A prepared-statement parameter is not an option here, because placeholders bind values, not identifiers such as schema names.)

Also, logging the fully rendered query at INFO (log.info) writes internal schema names to production logs. Consider downgrading to debug.


🏁 Script executed:

#!/bin/bash
# Locate the MultiStateInstanceUtil class and its implementation
rg -n "class MultiStateInstanceUtil" -A20 -C10

# Locate the replaceSchemaPlaceholder method to inspect tenantId validation
rg -n "replaceSchemaPlaceholder" -A10 -B5

Length of output: 66168


🏁 Script executed:

#!/bin/bash
# Display the implementation of replaceSchemaPlaceholder in MultiStateInstanceUtil
rg -n "replaceSchemaPlaceholder" -A20 -B5 core-services/libraries/services-common/src/main/java/org/egov/common/utils/MultiStateInstanceUtil.java

Length of output: 1138


Sanitize tenantId before schema substitution & lower log level
The current replaceSchemaPlaceholder implementation (MultiStateInstanceUtil.java:51–66) does an unchecked regex replace of "{schema}" using a substring of any tenantId. A malicious tenantId like
foo; DROP TABLE users; --.tenantx
could inject SQL into the query. At minimum, enforce a strict whitelist (e.g. [A-Za-z0-9_]+) on the schema segment before replacement, or, preferably, set the schema through JDBC’s setSchema()/setCatalog() API rather than string replacement.

Additionally, logging the fully rendered query at INFO level leaks internal schema names; change this to DEBUG in:
• core-services/mdms-v2/src/main/java/org/egov/infra/mdms/repository/impl/SchemaDefinitionDbRepositoryImpl.java (line 76).
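
A minimal sketch of the whitelist option follows, assuming the schema segment has already been extracted from the tenant ID and that the placeholder token is `{schema}`. The class and method names are hypothetical; this is not the MultiStateInstanceUtil implementation, and the real extraction rule may differ.

```java
import java.util.regex.Pattern;

/** Hypothetical guard that validates a schema name before it is spliced into SQL. */
final class SchemaNameGuardSketch {

    // Whitelist from the review comment: letters, digits, and underscore only.
    private static final Pattern SAFE_SCHEMA = Pattern.compile("[A-Za-z0-9_]+");

    /** Returns the schema unchanged if it matches the whitelist, otherwise throws. */
    static String requireSafeSchema(String schema) {
        if (schema == null || !SAFE_SCHEMA.matcher(schema).matches()) {
            throw new IllegalArgumentException("Unsafe schema name: " + schema);
        }
        return schema;
    }

    /** Substitutes the placeholder only after the schema name has passed validation. */
    static String applySchema(String query, String schema) {
        return query.replace("{schema}", requireSafeSchema(schema));
    }
}
```

Under these assumptions, `applySchema("SELECT * FROM {schema}.t", "statea")` succeeds, while a segment such as `foo; DROP TABLE users; --` is rejected before the query string is touched.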

@holashchand holashchand reopened this May 9, 2025
1 participant