Skip to content

fix: [NEXT-1924] Add metrics and fix issue of description being too long in some cases#2386

Merged
arunpaladin merged 1 commit into
masterfrom
NEXT-1924-add-metrics
Jan 26, 2026
Merged

fix: [NEXT-1924] Add metrics and fix issue of description being too long in some cases#2386
arunpaladin merged 1 commit into
masterfrom
NEXT-1924-add-metrics

Conversation

@kevin-paladin
Copy link
Copy Markdown
Collaborator

@kevin-paladin kevin-paladin commented Jan 22, 2026

Description

In addition to adding metrics to track performance, an issue with a far too long description has been fixed.
This description issue prevented the event from being sent - that description is now truncated to 1000 characters.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Datadog metrics integration to monitor notification service invocations, success rates, and errors.
    • Implemented automatic description validation to prevent oversized payloads from failing.
  • Chores

    • Updated AWS SDK and added Micrometer monitoring dependencies for enhanced observability.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jan 22, 2026

Walkthrough

Integrates Datadog metrics observability into the notification service by adding AWS SDK and Micrometer dependencies, introducing a new PaladinMetrics utility class that retrieves Datadog credentials from AWS Secrets Manager, and instrumenting SendNotification to emit counters for invocation, success, and error events.

Changes

Cohort / File(s) Summary
Maven Configuration
lambda-functions/notification-invoke-service/pom.xml
Adds aws.java.sdk.version property and dependencyManagement block importing AWS SDK BOM and Micrometer BOM; introduces secretsmanager and micrometer-registry-datadog dependencies with slf4j-api exclusion.
Metrics Infrastructure
lambda-functions/notification-invoke-service/src/main/java/com/paladincloud/PaladinMetrics.java
New utility class for Datadog metric emission; provides initialize() to configure DatadogMeterRegistry with credentials from AWS Secrets Manager, close() for cleanup, and incrementCount() to emit counters with optional tags.
Metrics Integration
lambda-functions/notification-invoke-service/src/main/java/com/paladincloud/SendNotification.java
Instruments request handler with PaladinMetrics; adds counters for invoke, success, and error events; introduces trimDescription() helper to limit description field length; enhances error handling and logging for metric tracking.

Sequence Diagram

sequenceDiagram
    participant Handler as SendNotification
    participant Metrics as PaladinMetrics
    participant SecretsManager as AWS Secrets Manager
    participant Datadog as DatadogMeterRegistry
    
    Handler->>Metrics: initialize(tags)
    Metrics->>SecretsManager: getSecret(datadog-credentials)
    SecretsManager-->>Metrics: JSON credentials
    Metrics->>Metrics: parseJSON & extract apiKey
    Metrics->>Datadog: configure registry with apiKey & tags
    Datadog-->>Metrics: ready
    
    Handler->>Metrics: incrementCount("invoke", tags)
    Metrics->>Datadog: emit counter
    
    Handler->>Handler: sendPayload with trimDescription()
    
    alt Success
        Handler->>Metrics: incrementCount("send-success", tags)
    else Error
        Handler->>Metrics: incrementCount("send-error", tags)
    end
    
    Metrics->>Datadog: emit counter
    
    Handler->>Metrics: close()
    Metrics->>Datadog: shutdown registry
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • arunpaladin

Poem

🐰 Hop, hop, metrics flow,
Datadog tags that brightly glow,
Secrets fetched from AWS deep,
Counters increment, promises keep!
📊✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The description is incomplete; it lacks several required template sections including Problem, Solution, Type of change, How Has This Been Tested, and Checklist items. Complete the PR description template by adding Problem, Solution, explicit type of change selection, test details, and verification checklist items.
Docstring Coverage ⚠️ Warning Docstring coverage is 8.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main changes: adding metrics and fixing the description length issue.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Warning

Review ran into problems

🔥 Problems

Errors were encountered while retrieving linked issues.

Errors (1)
  • JIRA integration encountered authorization issues. Please disconnect and reconnect the integration in the CodeRabbit UI.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In
`@lambda-functions/notification-invoke-service/src/main/java/com/paladincloud/PaladinMetrics.java`:
- Around line 25-67: The initialize method can throw (getDatadogSecret, JSON
parsing, missing keys) and close()/incrementCount() assume registry is non-null;
make metrics best-effort by catching Throwable inside initialize (around
getDatadogSecret and registry construction), log the error and leave registry
null instead of propagating; in close() guard registry with if (registry !=
null) before calling close and then set registry = null; in
incrementCount(String name, String ...tags) return early if registry is null (or
catch exceptions from registry.counter()/increment and log them) so metric
failures never throw back into the Lambda; update references to registry,
initialize, close, and incrementCount accordingly.

In
`@lambda-functions/notification-invoke-service/src/main/java/com/paladincloud/SendNotification.java`:
- Around line 82-83: The logging call uses obj.toString() which throws NPE if
obj is null; update the block around PaladinMetrics.incrementCount and
LOGGER.error to guard against null by converting obj safely (e.g., use
String.valueOf(obj) or Objects.toString(obj, "null")) and compute length only
from that safe string (or use (obj==null?0:obj.toString().length())); replace
direct obj.toString() usages in the LOGGER.error call with the safe
representation to avoid NPE in SendNotification.
- Around line 102-117: The trimDescription method currently casts and calls Gson
methods unsafely; wrap the initial parse in a try-catch (catch
JsonSyntaxException/RuntimeException) and return the original jsonStr on any
failure, and replace direct casts with safe checks: after parsing check
JsonElement.isJsonObject() before calling getAsJsonObject() (refer to JsonParser
and jo), check that jo.get("payload") exists and isJsonObject() before using
payloadObj, and verify descObj.isJsonPrimitive() and
getAsJsonPrimitive().isString() before calling getAsString(); only then truncate
using maxDescriptionLength and LOGGER.warn, otherwise return jsonStr unchanged.

Comment on lines +25 to +67
public static void initialize(String ...tags) {
JsonObject secret = getDatadogSecret();

DatadogConfig config = new DatadogConfig() {
@Override
public String apiKey() {
return secret.get("apiKey").getAsString();
}

@Override
public String applicationKey() {
return secret.get("applicationKey").getAsString();
}

@Override
public Duration step() {
return Duration.ofSeconds(10);
}

@Override
public String get(String k) {
return null;
}
};

registry = new DatadogMeterRegistry(config, Clock.SYSTEM);

Tags commonTags = Tags.of(tags);
if (secret.get("env") != null) {
commonTags = commonTags.and("env", secret.get("env").getAsString());
}
registry.config().commonTags(commonTags);
}

public static void close() {
registry.close();
registry = null;
}

public static void incrementCount(String name, String ...tags) {
Counter counter = registry.counter(name, tags);
counter.increment();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Prevent metrics failures from breaking the notification flow.

initialize() can throw (Secrets Manager, JSON parsing, missing keys), and close()/incrementCount() assume registry is non‑null. This can fail the Lambda or throw NPEs, even though metrics should be best‑effort.

🐛 Proposed fix (best‑effort metrics with null guards)
 public static void initialize(String ...tags) {
-    JsonObject secret = getDatadogSecret();
-
-    DatadogConfig config = new DatadogConfig() {
-        `@Override`
-        public String apiKey() {
-            return secret.get("apiKey").getAsString();
-        }
-        `@Override`
-        public String applicationKey() {
-            return secret.get("applicationKey").getAsString();
-        }
-        `@Override`
-        public Duration step() {
-            return Duration.ofSeconds(10);
-        }
-        `@Override`
-        public String get(String k) {
-            return null;
-        }
-    };
-
-    registry = new DatadogMeterRegistry(config, Clock.SYSTEM);
-
-    Tags commonTags = Tags.of(tags);
-    if (secret.get("env") != null) {
-        commonTags = commonTags.and("env", secret.get("env").getAsString());
-    }
-    registry.config().commonTags(commonTags);
+    try {
+        JsonObject secret = getDatadogSecret();
+        DatadogConfig config = new DatadogConfig() {
+            `@Override`
+            public String apiKey() { return secret.get("apiKey").getAsString(); }
+            `@Override`
+            public String applicationKey() { return secret.get("applicationKey").getAsString(); }
+            `@Override`
+            public Duration step() { return Duration.ofSeconds(10); }
+            `@Override`
+            public String get(String k) { return null; }
+        };
+
+        DatadogMeterRegistry newRegistry = new DatadogMeterRegistry(config, Clock.SYSTEM);
+        Tags commonTags = Tags.of(tags);
+        if (secret.has("env") && !secret.get("env").isJsonNull()) {
+            commonTags = commonTags.and("env", secret.get("env").getAsString());
+        }
+        newRegistry.config().commonTags(commonTags);
+        registry = newRegistry;
+    } catch (RuntimeException ex) {
+        logger.error("Metrics initialization failed; continuing without metrics.", ex);
+        registry = null;
+    }
 }
 
 public static void close() {
-    registry.close();
-    registry = null;
+    if (registry != null) {
+        registry.close();
+        registry = null;
+    }
 }
 
 public static void incrementCount(String name, String ...tags) {
-    Counter counter = registry.counter(name, tags);
-    counter.increment();
+    if (registry == null) {
+        return;
+    }
+    Counter counter = registry.counter(name, tags);
+    counter.increment();
 }
🤖 Prompt for AI Agents
In
`@lambda-functions/notification-invoke-service/src/main/java/com/paladincloud/PaladinMetrics.java`
around lines 25 - 67, The initialize method can throw (getDatadogSecret, JSON
parsing, missing keys) and close()/incrementCount() assume registry is non-null;
make metrics best-effort by catching Throwable inside initialize (around
getDatadogSecret and registry construction), log the error and leave registry
null instead of propagating; in close() guard registry with if (registry !=
null) before calling close and then set registry = null; in
incrementCount(String name, String ...tags) return early if registry is null (or
catch exceptions from registry.counter()/increment and log them) so metric
failures never throw back into the Lambda; update references to registry,
initialize, close, and incrementCount accordingly.

Comment on lines +82 to +83
PaladinMetrics.incrementCount(NOTIFICATION_INVOKE_SVC + "-send-error");
LOGGER.error("Failed sending message: length={}; {}", obj.toString().length(), obj, ex);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Avoid NPE when logging failed list items.

obj.toString() will throw if the list contains a null element, masking the real error.

💡 Suggested fix
-    LOGGER.error("Failed sending message: length={}; {}", obj.toString().length(), obj, ex);
+    String objStr = String.valueOf(obj);
+    LOGGER.error("Failed sending message: length={}; {}", objStr.length(), objStr, ex);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
PaladinMetrics.incrementCount(NOTIFICATION_INVOKE_SVC + "-send-error");
LOGGER.error("Failed sending message: length={}; {}", obj.toString().length(), obj, ex);
PaladinMetrics.incrementCount(NOTIFICATION_INVOKE_SVC + "-send-error");
String objStr = String.valueOf(obj);
LOGGER.error("Failed sending message: length={}; {}", objStr.length(), objStr, ex);
🤖 Prompt for AI Agents
In
`@lambda-functions/notification-invoke-service/src/main/java/com/paladincloud/SendNotification.java`
around lines 82 - 83, The logging call uses obj.toString() which throws NPE if
obj is null; update the block around PaladinMetrics.incrementCount and
LOGGER.error to guard against null by converting obj safely (e.g., use
String.valueOf(obj) or Objects.toString(obj, "null")) and compute length only
from that safe string (or use (obj==null?0:obj.toString().length())); replace
direct obj.toString() usages in the LOGGER.error call with the safe
representation to avoid NPE in SendNotification.

Comment on lines +102 to 117
private String trimDescription(String jsonStr) {
JsonObject jo = (JsonObject) JsonParser.parseString(jsonStr);
JsonObject payloadObj = (JsonObject) jo.get("payload");
if (payloadObj != null) {
JsonElement descObj = payloadObj.get("description");
if (descObj != null && descObj.getAsJsonPrimitive().isString()) {
String description = descObj.getAsString();
if (description.length() > maxDescriptionLength) {
LOGGER.warn("Truncating description length={}", description.length());
payloadObj.addProperty("description", description.substring(0, maxDescriptionLength));
return jo.toString();
}
}
}
return jsonStr;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Gson JsonParser.parseString invalid JSON exception behavior

💡 Result:

Short answer
JsonParser.parseString is lenient by default and will accept some non‑standard input (e.g., bare values, single quotes, unquoted names in lenient mode), so it does not always throw on "invalid" JSON. It will throw com.google.gson.JsonSyntaxException / JsonParseException (or JsonIOException for I/O problems) when it detects malformed JSON that cannot be parsed (multiple top‑level elements or trailing data, badly formed structure, etc.). (javadoc.io)

Details / practical notes

  • Javadoc: parseString declares it throws JsonSyntaxException/JsonParseException; the doc explicitly says parsing is performed in lenient mode and that exceptions are thrown for multiple top‑level elements or trailing data. (javadoc.io)
  • Troubleshooting: older Gson defaults to lenient parsing (so malformed JSON may be accepted). Newer Gson releases add APIs to enforce strictness (GsonBuilder.setStrictness / JsonReader.setStrictness) to make parsing strict. If you need strict validation, use a strict JsonReader or configure strictness. (google.github.io)
  • Community guidance: many recommend using JsonReader (in strict mode) to validate JSON rather than relying on JsonParser.parseString. (stackoverflow.com)

If you want, tell me your Gson version and example input and I’ll show whether parseString will accept it and how to force strict parsing.

Citations:


🌐 Web query:

Gson JsonElement type checking methods isJsonObject isJsonPrimitive

💡 Result:

Short explanation and examples:

  • JsonElement.isJsonObject() returns true if the element is a JsonObject; use it before calling getAsJsonObject() (otherwise getAsJsonObject() throws IllegalStateException). [1]
  • JsonElement.isJsonPrimitive() returns true if the element is a JsonPrimitive (string, number, boolean); use it before getAsJsonPrimitive() or getAsString()/getAsInt() etc. [1][2]

Example:

JsonElement el = JsonParser.parseString(jsonString);

if (el.isJsonObject()) {
    JsonObject obj = el.getAsJsonObject();
    // use obj
} else if (el.isJsonPrimitive()) {
    JsonPrimitive prim = el.getAsJsonPrimitive();
    if (prim.isBoolean()) { boolean b = prim.getAsBoolean(); }
    else { String s = prim.getAsString(); }
}

References: Gson JsonElement Javadoc. [1] [2]

Sources:
[1] Gson JsonElement Javadoc (com.google.gson.JsonElement).
[2] Gson JsonPrimitive / JsonElement methods in Javadoc.


🏁 Script executed:

# Find the actual file and examine the context around trimDescription
find . -name "SendNotification.java" -type f

Repository: PaladinCloud/CE

Length of output: 158


🏁 Script executed:

# Search for usage of trimDescription to understand how it's called
rg "trimDescription" --type java -B 2 -A 2

Repository: PaladinCloud/CE

Length of output: 2469


🏁 Script executed:

# Check error handling around trimDescription calls - get more context
rg "trimDescription" --type java -B 5 -A 5

Repository: PaladinCloud/CE

Length of output: 5274


🏁 Script executed:

# Look for the full method containing trimDescription calls to understand try-catch blocks
sed -n '100,200p' lambda-functions/notification-invoke-service/src/main/java/com/paladincloud/SendNotification.java

Repository: PaladinCloud/CE

Length of output: 870


Guard trimDescription against non‑JSON or non‑object payloads.

Unchecked casts and unsafe Gson method calls will throw exceptions on malformed JSON or non-object payloads. While both call sites wrap trimDescription() in try-catch, exceptions prevent the SNS message from being published—the notification silently fails to send.

Root causes:

  • (JsonObject) JsonParser.parseString(jsonStr) throws JsonSyntaxException on malformed JSON; throws ClassCastException if the root is not an object.
  • (JsonObject) jo.get("payload") throws ClassCastException if the payload is not a JSON object.
  • descObj.getAsJsonPrimitive().isString() throws IllegalStateException if the element is not a primitive.

Use defensive type checks (isJsonObject(), isJsonPrimitive()) and wrap parsing in try-catch to handle edge cases gracefully, returning the original string on failure.

Proposed fix (defensive parsing)
 private String trimDescription(String jsonStr) {
-    JsonObject jo = (JsonObject) JsonParser.parseString(jsonStr);
-    JsonObject payloadObj = (JsonObject) jo.get("payload");
-    if (payloadObj != null) {
-        JsonElement descObj = payloadObj.get("description");
-        if (descObj != null && descObj.getAsJsonPrimitive().isString()) {
-            String description = descObj.getAsString();
-            if (description.length() > maxDescriptionLength) {
-                LOGGER.warn("Truncating description length={}", description.length());
-                payloadObj.addProperty("description", description.substring(0, maxDescriptionLength));
-                return jo.toString();
-            }
-        }
-    }
-    return jsonStr;
+    try {
+        JsonElement root = JsonParser.parseString(jsonStr);
+        if (!root.isJsonObject()) {
+            return jsonStr;
+        }
+        JsonObject jo = root.getAsJsonObject();
+        JsonElement payloadEl = jo.get("payload");
+        if (payloadEl != null && payloadEl.isJsonObject()) {
+            JsonObject payloadObj = payloadEl.getAsJsonObject();
+            JsonElement descObj = payloadObj.get("description");
+            if (descObj != null && descObj.isJsonPrimitive() && descObj.getAsJsonPrimitive().isString()) {
+                String description = descObj.getAsString();
+                if (description.length() > maxDescriptionLength) {
+                    LOGGER.warn("Truncating description length={}", description.length());
+                    payloadObj.addProperty("description", description.substring(0, maxDescriptionLength));
+                    return jo.toString();
+                }
+            }
+        }
+    } catch (RuntimeException ex) {
+        LOGGER.warn("Skipping description trimming due to invalid JSON payload", ex);
+    }
+    return jsonStr;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
private String trimDescription(String jsonStr) {
JsonObject jo = (JsonObject) JsonParser.parseString(jsonStr);
JsonObject payloadObj = (JsonObject) jo.get("payload");
if (payloadObj != null) {
JsonElement descObj = payloadObj.get("description");
if (descObj != null && descObj.getAsJsonPrimitive().isString()) {
String description = descObj.getAsString();
if (description.length() > maxDescriptionLength) {
LOGGER.warn("Truncating description length={}", description.length());
payloadObj.addProperty("description", description.substring(0, maxDescriptionLength));
return jo.toString();
}
}
}
return jsonStr;
}
private String trimDescription(String jsonStr) {
try {
JsonElement root = JsonParser.parseString(jsonStr);
if (!root.isJsonObject()) {
return jsonStr;
}
JsonObject jo = root.getAsJsonObject();
JsonElement payloadEl = jo.get("payload");
if (payloadEl != null && payloadEl.isJsonObject()) {
JsonObject payloadObj = payloadEl.getAsJsonObject();
JsonElement descObj = payloadObj.get("description");
if (descObj != null && descObj.isJsonPrimitive() && descObj.getAsJsonPrimitive().isString()) {
String description = descObj.getAsString();
if (description.length() > maxDescriptionLength) {
LOGGER.warn("Truncating description length={}", description.length());
payloadObj.addProperty("description", description.substring(0, maxDescriptionLength));
return jo.toString();
}
}
}
} catch (RuntimeException ex) {
LOGGER.warn("Skipping description trimming due to invalid JSON payload", ex);
}
return jsonStr;
}
🤖 Prompt for AI Agents
In
`@lambda-functions/notification-invoke-service/src/main/java/com/paladincloud/SendNotification.java`
around lines 102 - 117, The trimDescription method currently casts and calls
Gson methods unsafely; wrap the initial parse in a try-catch (catch
JsonSyntaxException/RuntimeException) and return the original jsonStr on any
failure, and replace direct casts with safe checks: after parsing check
JsonElement.isJsonObject() before calling getAsJsonObject() (refer to JsonParser
and jo), check that jo.get("payload") exists and isJsonObject() before using
payloadObj, and verify descObj.isJsonPrimitive() and
getAsJsonPrimitive().isString() before calling getAsString(); only then truncate
using maxDescriptionLength and LOGGER.warn, otherwise return jsonStr unchanged.

@arunpaladin arunpaladin merged commit fae36d9 into master Jan 26, 2026
60 checks passed
@arunpaladin arunpaladin deleted the NEXT-1924-add-metrics branch January 26, 2026 23:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants