CMR-11272: Replace Clojurewerkz#2418
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR removes the Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
indexer-app/src/cmr/indexer/data/index_set_elasticsearch.clj (1)
150-155:⚠️ Potential issue | 🟠 MajorMerge connection HTTP options into the index deletion request.
Lines 151-155 make a raw
client/deletecall without merging the connection's HTTP options, inconsistent with howelasticsearch.cljhandles the same operation. This omits configured timeouts, retry policies, and connection managers, reducing reliability during load.Proposed fix
(when (esi-helper/exists? conn index-name) (let [admin-token (:admin-token config) + http-opts (:http-opts conn) response (client/delete (str (:uri conn) "/" index-name) - {:headers {"Authorization" admin-token - "Confirm-delete-action" "true" - :client-id t-config/cmr-client-id} - :throw-exceptions false}) + (merge http-opts + {:headers (merge (:headers http-opts) + {"Authorization" admin-token + "Confirm-delete-action" "true" + :client-id t-config/cmr-client-id}) + :throw-exceptions false})) status (:status response)]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@indexer-app/src/cmr/indexer/data/index_set_elasticsearch.clj` around lines 150 - 155, The delete call is missing the connection's HTTP options so timeouts/retries/connection managers are not applied; update the client/delete invocation in index_set_elasticsearch.clj to merge (:http-options conn) into the request map (while keeping :headers with admin-token, "Confirm-delete-action", :client-id and :throw-exceptions false) so the final request combines connection-level options with the per-call options—mirror the pattern used in elasticsearch.clj when calling client/delete and ensure you reference conn, :http-options, index-name and client/delete.
🧹 Nitpick comments (2)
elastic-utils-lib/src/cmr/elastic_utils/es_helper.clj (2)
234-236: 💤 Low valueConsider using
when-notfor clarity.The current
if-notwithout an else branch relies onthrow-service-errorto never return. Usingwhen-notwould make the intent clearer that this is a guard clause.Proposed refactor
- (if-not index-found-in-description - (errors/throw-service-error :internal-error (format "Given task id %s is not tracking the given index %s because description in task [%s] did not include the index. Mismatch on task id with index error." reindex-task-id index description))) + (when-not index-found-in-description + (errors/throw-service-error :internal-error (format "Given task id %s is not tracking the given index %s because description in task [%s] did not include the index. Mismatch on task id with index error." reindex-task-id index description))) full-status)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@elastic-utils-lib/src/cmr/elastic_utils/es_helper.clj` around lines 234 - 236, Replace the guard written with if-not by using when-not to make intent explicit: in the block that checks index-found-in-description and calls errors/throw-service-error with the formatted message (which currently mentions reindex-task-id, index, and description), change the form "(if-not index-found-in-description (errors/throw-service-error ...))" to "(when-not index-found-in-description (errors/throw-service-error ...))" so the code clearly expresses a guard/early-exit before returning full-status.
110-137: 💤 Low valueNo status validation for
putanddeleteresponses.Both functions have
:throw-exceptions falsebut return the decoded response regardless of HTTP status. On write failures (e.g., version conflicts, mapping errors), callers will receive an error body rather than having an exception thrown. If this matches the old Elastisch behavior, consider adding a code comment to document this intentional design choice.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@elastic-utils-lib/src/cmr/elastic_utils/es_helper.clj` around lines 110 - 137, The put and delete functions currently return decoded responses even on non-2xx HTTP status because they set :throw-exceptions false; update both functions (put and delete) to validate the HTTP response status after the http call: if the response status is not in the 200–299 range, raise/throw an informative exception (e.g., ex-info) that includes the status and response body so callers get a clear failure instead of an error payload, otherwise return the decoded response; alternatively, if this behavior is intentional to preserve legacy Elastisch semantics, add a clear code comment inside put and delete explaining that non-2xx responses are intentionally returned (and why) so reviewers understand it is deliberate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@elastic-utils-lib/src/cmr/elastic_utils/connect.clj`:
- Around line 62-78: get-elastic-health currently swallows exceptions and
returns a string on failure which makes wait-for-healthy-elastic (which only
checks :timed_out) assume success; change the catch in get-elastic-health to
propagate a structured error (e.g. call errors/internal-error! with a
descriptive message and the exception details or rethrow the exception) so
callers always get a failure instead of a plain string, and ensure
wait-for-healthy-elastic (which calls get-elastic-health) only treats a
successful map response as valid (i.e. validate the response is a map and
contains :timed_out) and raise an internal-error! when the health call fails or
returns an unexpected type.
In `@elastic-utils-lib/src/cmr/elastic_utils/es_helper.clj`:
- Around line 95-108: doc-get currently treats any non-404 status as a
successful document and decodes the body, which can make 500s look like missing
docs; update doc-get to explicitly branch on the HTTP status: if status is 200
(or 2xx) call decode-response and return the document, if status is 404 return
nil, otherwise surface the error to callers (e.g., throw an ex-info or return an
error map containing status and decoded body) so functions like get-document and
tests cannot mistake server errors for "not found"; use the existing
url-with-path and decode-response helpers when decoding the error body so the
payload is included in the raised error/returned error map.
- Around line 82-93: The scroll function currently posts without disabling
clj-http exceptions and without checking status; update the http/post call in
scroll to merge {:throw-exceptions false} into the request options (same pattern
used in search/count-query/delete-by-query), capture the returned response map
(e.g., resp), check the :status for non-2xx cases and return a structured error
response rather than letting clj-http throw, and pass that response into
decode-response so callers get consistent behavior; reference the scroll
function and the http/post call and mirror the status handling used by
search/count-query/delete-by-query.
In `@elastic-utils-lib/src/cmr/elastic_utils/es_index_helper.clj`:
- Around line 48-50: The current predicate treats any non-200 as "missing" by
returning false; update the logic around the client/head call (using
url-with-path, client/head, (:http-opts conn) and index-name) so that it returns
true for status 200, returns false only for status 404, and for any other status
throws or propagates an error (e.g. throw or ex-info with the response) so
transient 401/403/5xx failures are surfaced; apply the same change to the
analogous helper at the other block (lines ~121-130).
In `@indexer-app/src/cmr/indexer/data/elasticsearch.clj`:
- Around line 324-327: The error handler currently calls json/decode on any
string body (in the let binding that defines body/status/parsed-body) which will
throw for non-JSON responses; replace that direct json/decode call with the
existing safe parsing pattern used elsewhere (e.g., call the safe-parse-json
helper from transmit/http_helper.clj or wrap json/decode in a try/catch that
returns the original string on failure) so parsed-body never throws on plain
text/HTML; update the let where parsed-body is computed to use safe-parse-json
(or equivalent try/catch) instead of (json/decode body true).
---
Outside diff comments:
In `@indexer-app/src/cmr/indexer/data/index_set_elasticsearch.clj`:
- Around line 150-155: The delete call is missing the connection's HTTP options
so timeouts/retries/connection managers are not applied; update the
client/delete invocation in index_set_elasticsearch.clj to merge (:http-options
conn) into the request map (while keeping :headers with admin-token,
"Confirm-delete-action", :client-id and :throw-exceptions false) so the final
request combines connection-level options with the per-call options—mirror the
pattern used in elasticsearch.clj when calling client/delete and ensure you
reference conn, :http-options, index-name and client/delete.
---
Nitpick comments:
In `@elastic-utils-lib/src/cmr/elastic_utils/es_helper.clj`:
- Around line 234-236: Replace the guard written with if-not by using when-not
to make intent explicit: in the block that checks index-found-in-description and
calls errors/throw-service-error with the formatted message (which currently
mentions reindex-task-id, index, and description), change the form "(if-not
index-found-in-description (errors/throw-service-error ...))" to "(when-not
index-found-in-description (errors/throw-service-error ...))" so the code
clearly expresses a guard/early-exit before returning full-status.
- Around line 110-137: The put and delete functions currently return decoded
responses even on non-2xx HTTP status because they set :throw-exceptions false;
update both functions (put and delete) to validate the HTTP response status
after the http call: if the response status is not in the 200–299 range,
raise/throw an informative exception (e.g., ex-info) that includes the status
and response body so callers get a clear failure instead of an error payload,
otherwise return the decoded response; alternatively, if this behavior is
intentional to preserve legacy Elastisch semantics, add a clear code comment
inside put and delete explaining that non-2xx responses are intentionally
returned (and why) so reviewers understand it is deliberate.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fe4bf198-cff5-4883-b81b-ee04103e29f5
📒 Files selected for processing (17)
bootstrap-app/src/cmr/bootstrap/data/rebalance_util.cljelastic-utils-lib/project.cljelastic-utils-lib/src/cmr/elastic_utils/connect.cljelastic-utils-lib/src/cmr/elastic_utils/es_helper.cljelastic-utils-lib/src/cmr/elastic_utils/es_index_helper.cljelastic-utils-lib/src/cmr/elastic_utils/index_util.cljelastic-utils-lib/src/cmr/elastic_utils/search/es_debug.cljelastic-utils-lib/src/cmr/elastic_utils/search/es_index.cljelastic-utils-lib/src/cmr/elastic_utils/search/es_query_to_elastic.cljelastic-utils-lib/src/cmr/elastic_utils/search/es_wrapper.cljindexer-app/int-test/cmr/indexer/test/services/index_fields.cljindexer-app/int-test/cmr/indexer/test/utility.cljindexer-app/int-test/cmr/indexer/test/valid_data_crud_tests.cljindexer-app/src/cmr/indexer/data/collection_granule_aggregation_cache.cljindexer-app/src/cmr/indexer/data/elasticsearch.cljindexer-app/src/cmr/indexer/data/index_set_elasticsearch.cljsystem-int-test/test/cmr/system_int_test/search/misc/autocomplete_collection_facet_search_test.clj
| (try | ||
| (let [url (str (:uri conn) "/_cluster/health") | ||
| response (client/get url (merge (:http-opts conn) | ||
| {:query-params params | ||
| :accept :json}))] | ||
| (json/decode (:body response) true)) | ||
| (catch Exception e | ||
| (format "Unable to get elasticsearch cluster health, caught exception: %s" | ||
| (.getMessage e)))))) | ||
|
|
||
| (defn wait-for-healthy-elastic | ||
| "Waits for the elasticsearch cluster health to reach yellow. Pass in a elasticsearch store that | ||
| has a :conn key with the elastisch connection" | ||
| [elastic-store] | ||
| (when (:timed_out (admin/cluster-health | ||
| (:conn elastic-store) {:wait_for_status "yellow" :timeout "3s"})) | ||
| (when (:timed_out (get-elastic-health (:conn elastic-store) | ||
| {:wait_for_status "yellow" :timeout "3s"})) | ||
| (errors/internal-error! "Timed out waiting for elasticsearch to reach a healthy state"))) |
There was a problem hiding this comment.
Don’t swallow cluster health failures during startup.
Lines 68-70 return a string on request failure, and Lines 76-78 only inspect :timed_out. If the health call fails entirely, wait-for-healthy-elastic will treat that as success and continue with an unavailable cluster.
Suggested fix
(defn wait-for-healthy-elastic
"Waits for the elasticsearch cluster health to reach yellow. Pass in a elasticsearch store that
has a :conn key with the elastisch connection"
[elastic-store]
- (when (:timed_out (get-elastic-health (:conn elastic-store)
- {:wait_for_status "yellow" :timeout "3s"}))
- (errors/internal-error! "Timed out waiting for elasticsearch to reach a healthy state")))
+ (let [health (get-elastic-health (:conn elastic-store)
+ {:wait_for_status "yellow" :timeout "3s"})]
+ (cond
+ (string? health)
+ (errors/internal-error! health)
+
+ (:timed_out health)
+ (errors/internal-error! "Timed out waiting for elasticsearch to reach a healthy state"))))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@elastic-utils-lib/src/cmr/elastic_utils/connect.clj` around lines 62 - 78,
get-elastic-health currently swallows exceptions and returns a string on failure
which makes wait-for-healthy-elastic (which only checks :timed_out) assume
success; change the catch in get-elastic-health to propagate a structured error
(e.g. call errors/internal-error! with a descriptive message and the exception
details or rethrow the exception) so callers always get a failure instead of a
plain string, and ensure wait-for-healthy-elastic (which calls
get-elastic-health) only treats a successful map response as valid (i.e.
validate the response is a map and contains :timed_out) and raise an
internal-error! when the health call fails or returns an unexpected type.
There was a problem hiding this comment.
Code rabbit found an existing issue, not something that you created. Do we care about this? Is this just used for testing or really connecting upon startup? I see, this is called from reset. Not sure that we really care here.
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find all usages of wait-for-healthy-elastic
rg -n "wait-for-healthy-elastic" --type clj -C3
Length of output: 28
🏁 Script executed:
#!/bin/bash
# Find all usages of wait-for-healthy-elastic
rg -n "wait-for-healthy-elastic" -C3
Length of output: 2271
🏁 Script executed:
#!/bin/bash
# Check the index_util.clj context around the call
rg -n "wait-for-healthy-elastic" -A5 -B10 elastic-utils-lib/src/cmr/elastic_utils/index_util.clj
# Also check dev-system control.clj broader context
rg -n "reset" -C5 dev-system/src/cmr/dev_system/control.clj | head -60
Length of output: 3391
@eereiter, good question — it's actually called in both places:
-
Dev/test path —
dev-system/control.cljLines 122–123, inside thePOST /resethandler used during integration testing. You're right that this is relatively low-risk: if ES is unreachable the tests will fail regardless, and an incorrect 200 from/resetwould be obvious. -
Production/startup path —
elastic-utils-lib/src/cmr/elastic_utils/index_util.cljLine 181, called after creating a new index during index initialization. This one is more meaningful — if the health check silently swallows an exception here, the code continues to set up aliases and refresh the index even if ES is in a bad state at that moment.
So the bug does touch production-relevant code via index_util.clj. Whether it's worth fixing now depends on how likely a health-check failure is in practice right after a successful create call. If the team is comfortable with the current behavior (pre-existing risk), it can be deferred. But if you want a quick guard, the suggested fix in my earlier comment would cover both call sites cleanly.
Happy to leave this as a follow-up issue if you'd prefer not to address it in this PR.
| (defn scroll | ||
| "Performs a scroll query, fetching the next page of results from a query given a scroll id" | ||
| [conn scroll-id opts] | ||
| (doc/scroll conn scroll-id opts)) | ||
| (let [url (url-with-path conn "_search" "scroll") | ||
| body (merge {:scroll_id scroll-id} | ||
| (select-keys opts [:scroll]))] | ||
| (decode-response | ||
| (http/post url | ||
| (merge (:http-opts conn) | ||
| {:content-type :json | ||
| :body (json/generate-string body) | ||
| :accept :json}))))) |
There was a problem hiding this comment.
Missing error handling for scroll requests.
Unlike search, count-query, and delete-by-query, this function lacks :throw-exceptions false and status checking. If ES returns a non-2xx status (e.g., 404 for expired scroll), clj-http will throw an exception rather than returning a structured error response. This inconsistency may cause different error behavior than callers expect.
Proposed fix to add consistent error handling
(defn scroll
"Performs a scroll query, fetching the next page of results from a query given a scroll id"
[conn scroll-id opts]
(let [url (url-with-path conn "_search" "scroll")
body (merge {:scroll_id scroll-id}
- (select-keys opts [:scroll]))]
- (decode-response
- (http/post url
- (merge (:http-opts conn)
- {:content-type :json
- :body (json/generate-string body)
- :accept :json})))))
+ (select-keys opts [:scroll]))
+ response (http/post url
+ (merge (:http-opts conn)
+ {:content-type :json
+ :body (json/generate-string body)
+ :accept :json
+ :throw-exceptions false}))
+ status (:status response)]
+ (if (some #{status} [200 201])
+ (decode-response response)
+ (throw (ex-info (str "Scroll failed with status " status)
+ {:status status :body (:body response)})))))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@elastic-utils-lib/src/cmr/elastic_utils/es_helper.clj` around lines 82 - 93,
The scroll function currently posts without disabling clj-http exceptions and
without checking status; update the http/post call in scroll to merge
{:throw-exceptions false} into the request options (same pattern used in
search/count-query/delete-by-query), capture the returned response map (e.g.,
resp), check the :status for non-2xx cases and return a structured error
response rather than letting clj-http throw, and pass that response into
decode-response so callers get consistent behavior; reference the scroll
function and the http/post call and mirror the status handling used by
search/count-query/delete-by-query.
There was a problem hiding this comment.
will make the change
There was a problem hiding this comment.
@jmaeng72, sounds good! Once the change is in, the scroll function will have consistent error handling aligned with search, count-query, and delete-by-query. The proposed fix in the review comment should serve as a reference when you make the update.
| (defn doc-get | ||
| "Fetches and returns a document by id or `nil` if it does not exist." | ||
| ([conn index mapping-type id] | ||
| (doc-get conn index mapping-type id nil)) | ||
| ([conn index _mapping-type id opts] | ||
| (let [result (if (empty? opts) | ||
| (rest/get conn (rest/record-url conn index "_doc" id)) | ||
| (rest/get conn (rest/record-url conn index "_doc" id) {:query-params opts}))] | ||
| (when-not (not-found? result) | ||
| result)))) | ||
| (let [url (url-with-path conn index "_doc" id) | ||
| response (http/get url | ||
| (merge (:http-opts conn) | ||
| {:query-params opts | ||
| :accept :json | ||
| :throw-exceptions false})) | ||
| status (:status response)] | ||
| (when-not (= 404 status) | ||
| (decode-response response))))) |
There was a problem hiding this comment.
Error statuses other than 404 are not handled.
The function returns nil on 404 and decodes the response for all other statuses. If Elasticsearch returns a 500 or other error status, the error body will be returned as if it were a valid document. Callers (e.g., get-document in elasticsearch.clj and tests checking the :found field) may silently misinterpret server errors as "document not found."
Proposed fix to handle error statuses
(defn doc-get
"Fetches and returns a document by id or `nil` if it does not exist."
([conn index mapping-type id]
(doc-get conn index mapping-type id nil))
([conn index _mapping-type id opts]
(let [url (url-with-path conn index "_doc" id)
response (http/get url
(merge (:http-opts conn)
{:query-params opts
:accept :json
:throw-exceptions false}))
status (:status response)]
- (when-not (= 404 status)
- (decode-response response)))))
+ (cond
+ (= 404 status) nil
+ (< status 300) (decode-response response)
+ :else (throw (ex-info (str "Document get failed with status " status)
+ {:status status :body (:body response)}))))))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@elastic-utils-lib/src/cmr/elastic_utils/es_helper.clj` around lines 95 - 108,
doc-get currently treats any non-404 status as a successful document and decodes
the body, which can make 500s look like missing docs; update doc-get to
explicitly branch on the HTTP status: if status is 200 (or 2xx) call
decode-response and return the document, if status is 404 return nil, otherwise
surface the error to callers (e.g., throw an ex-info or return an error map
containing status and decoded body) so functions like get-document and tests
cannot mistake server errors for "not found"; use the existing url-with-path and
decode-response helpers when decoding the error body so the payload is included
in the raised error/returned error map.
| (let [url (url-with-path conn index-name) | ||
| response (client/head url (merge (:http-opts conn) {:throw-exceptions false}))] | ||
| (= 200 (:status response)))) |
There was a problem hiding this comment.
Only treat 404 as “missing”.
These helpers currently convert any non-200 response into false/[], so 401/403/5xx failures look the same as “index or alias does not exist”. Callers use these predicates to decide whether to create indexes and aliases, so a transient ES failure can trigger the wrong mutating path instead of surfacing the error.
Suggested fix
(defn exists?
"Return true if the given index exists"
[conn index-name]
(let [url (url-with-path conn index-name)
- response (client/head url (merge (:http-opts conn) {:throw-exceptions false}))]
- (= 200 (:status response))))
+ response (client/head url (merge (:http-opts conn) {:throw-exceptions false}))
+ status (:status response)]
+ (case status
+ 200 true
+ 404 false
+ (throw (ex-info (str "Index exists check failed with status " status)
+ {:status status :body (:body response)})))))
(defn get-aliases
"Get index aliases"
[conn index-name]
(let [url (url-with-path conn index-name "_alias")
response (client/get url (merge (:http-opts conn)
{:accept :json
:throw-exceptions false}))
status (:status response)]
- (if (= 404 status)
- []
- (let [resp (decode-response response)
- aliases (keys (get-in resp [(keyword index-name) :aliases]))]
- (mapv name aliases)))))
+ (case status
+ 200 (let [resp (decode-response response)
+ aliases (keys (get-in resp [(keyword index-name) :aliases]))]
+ (mapv name aliases))
+ 404 []
+ (throw (ex-info (str "Get aliases failed with status " status)
+ {:status status :body (:body response)})))))Also applies to: 121-130
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@elastic-utils-lib/src/cmr/elastic_utils/es_index_helper.clj` around lines 48
- 50, The current predicate treats any non-200 as "missing" by returning false;
update the logic around the client/head call (using url-with-path, client/head,
(:http-opts conn) and index-name) so that it returns true for status 200,
returns false only for status 404, and for any other status throws or propagates
an error (e.g. throw or ex-info with the response) so transient 401/403/5xx
failures are surfaced; apply the same change to the analogous helper at the
other block (lines ~121-130).
| (let [body (get-in (ex-data e) [:body]) | ||
| status (:status (ex-data e)) | ||
| parsed-body (if (string? body) (json/decode body true) body)] | ||
| {:error parsed-body :status status}))))) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, locate and examine the elasticsearch.clj file
find . -name "elasticsearch.clj" -type fRepository: nasa/Common-Metadata-Repository
Length of output: 187
🏁 Script executed:
# Read the file to see the actual code at lines 324-327 and surrounding context
cat -n indexer-app/src/cmr/indexer/data/elasticsearch.clj | sed -n '310,340p'Repository: nasa/Common-Metadata-Repository
Length of output: 1738
🏁 Script executed:
# Check what json library is imported in this file
grep -n "require\|import" indexer-app/src/cmr/indexer/data/elasticsearch.clj | head -20Repository: nasa/Common-Metadata-Repository
Length of output: 882
🏁 Script executed:
# Get the beginning of the file to see the requires/imports
head -50 indexer-app/src/cmr/indexer/data/elasticsearch.cljRepository: nasa/Common-Metadata-Repository
Length of output: 2035
🏁 Script executed:
# Search specifically for json library usage and import
grep -n "json" indexer-app/src/cmr/indexer/data/elasticsearch.cljRepository: nasa/Common-Metadata-Repository
Length of output: 267
🏁 Script executed:
# Look for other usages of json/decode in this file to see if there's error handling elsewhere
grep -B 3 -A 3 "json/decode\|json/encode" indexer-app/src/cmr/indexer/data/elasticsearch.cljRepository: nasa/Common-Metadata-Repository
Length of output: 362
🏁 Script executed:
# Check the test file for exception handling tests
cat indexer-app/test/cmr/indexer/test/data/elasticsearch.cljRepository: nasa/Common-Metadata-Repository
Length of output: 10884
🏁 Script executed:
# Look for other exception handling patterns in the codebase involving json/decode
rg -A 5 -B 5 "json/decode" indexer-app/src/cmr/indexer/Repository: nasa/Common-Metadata-Repository
Length of output: 2196
🏁 Script executed:
# Search for any existing error handling or try-catch patterns around json operations in the broader codebase
rg -A 3 -B 3 "json/decode.*catch|catch.*json/decode" .Repository: nasa/Common-Metadata-Repository
Length of output: 57
🏁 Script executed:
# Check if there are any comments or documentation about exception handling with elasticsearch in the file
grep -B 5 -A 5 "ExceptionInfo\|exception\|error" indexer-app/src/cmr/indexer/data/elasticsearch.clj | head -50Repository: nasa/Common-Metadata-Repository
Length of output: 1971
🏁 Script executed:
# Check what cheshire.core json/decode does with invalid JSON - search for any handling in the codebase
rg "json/decode" -A 3 -B 3 . | grep -A 5 -B 5 "catch\|try"Repository: nasa/Common-Metadata-Repository
Length of output: 16464
🏁 Script executed:
# Look for documentation or comments about json/decode behavior
rg -i "cheshire\|json.*decode" . --type md --type txtRepository: nasa/Common-Metadata-Repository
Length of output: 57
🏁 Script executed:
# Check the clj-http dependency since exceptions might come from there
grep -r "clj-http" indexer-app/project.cljRepository: nasa/Common-Metadata-Repository
Length of output: 57
🏁 Script executed:
# Look at the elastic-utils pattern more closely - it has the same code!
cat -n elastic-utils-lib/src/cmr/elastic_utils/index_util.clj | sed -n '1,100p'Repository: nasa/Common-Metadata-Repository
Length of output: 4182
🏁 Script executed:
# Check if there are any integration tests showing non-JSON error responses from Elasticsearch
rg -i "elasticsearch.*error\|es.*error\|html\|text/plain" indexer-app/test/ -A 3 -B 3Repository: nasa/Common-Metadata-Repository
Length of output: 57
🏁 Script executed:
# Search for documentation or comments about error response formats
rg "body.*string\|response.*format\|non-json\|plain text" . -iRepository: nasa/Common-Metadata-Repository
Length of output: 57
🏁 Script executed:
# Find the exact location of the json/decode in elastic-utils index_util.clj
rg -n "parsed-body.*json/decode" elastic-utils-lib/src/cmr/elastic_utils/index_util.clj -B 5 -A 5Repository: nasa/Common-Metadata-Repository
Length of output: 571
🏁 Script executed:
# Look at the broader context of that function
grep -n "catch\|try\|defn" elastic-utils-lib/src/cmr/elastic_utils/index_util.clj | tail -20Repository: nasa/Common-Metadata-Repository
Length of output: 619
🏁 Script executed:
# Get the complete function containing the json/decode in index_util.clj
sed -n '130,160p' elastic-utils-lib/src/cmr/elastic_utils/index_util.cljRepository: nasa/Common-Metadata-Repository
Length of output: 1539
🏁 Script executed:
# Check if cheshire's json/decode throws on invalid JSON
rg -A 10 "cheshire" . --type md | head -30Repository: nasa/Common-Metadata-Repository
Length of output: 57
🏁 Script executed:
# Look at what kind of responses clj-http can return in the body when there's an error
rg "clj-http\|ExceptionInfo" indexer-app/src/cmr/indexer/data/elasticsearch.clj -B 3 -A 3Repository: nasa/Common-Metadata-Repository
Length of output: 57
🏁 Script executed:
# Search for any comments or documentation about what bodies might look like
grep -r "HTML\|text/plain\|content-type\|response.*format" indexer-app/src/cmr/indexer/data/elasticsearch.cljRepository: nasa/Common-Metadata-Repository
Length of output: 57
🏁 Script executed:
# Let me check what cheshire documentation says and look for any evidence of json/decode behavior
# First, let's see if there are comments or code showing what happens with malformed JSON
rg "JsonParseException\|ParseException\|json.*error\|decode.*error" . -i --type java --type clj | head -20Repository: nasa/Common-Metadata-Repository
Length of output: 103
🏁 Script executed:
# Check how json/decode is being called in transmit/http_helper.clj since it already has the pattern
cat -n transmit-lib/src/cmr/transmit/http_helper.clj | sed -n '1,50p'Repository: nasa/Common-Metadata-Repository
Length of output: 2008
🏁 Script executed:
# Look at the specific pattern used in search_util.clj more carefully
grep -n "json/decode" system-int-test/src/cmr/system_int_test/utils/search_util.clj -B 5 -A 5Repository: nasa/Common-Metadata-Repository
Length of output: 5175
🏁 Script executed:
# Check if there are any tests for the try-elastic-operation function with non-JSON bodies
find . -path "*/test*" -name "*.clj" -type f -exec grep -l "try-elastic-operation\|try_elastic_operation" {} \;Repository: nasa/Common-Metadata-Repository
Length of output: 57
🏁 Script executed:
# Look for any real examples of ExceptionInfo being thrown with non-JSON bodies
rg "ExceptionInfo.*body\|ex-info.*:body" . -B 3 -A 3 | head -50Repository: nasa/Common-Metadata-Repository
Length of output: 57
Wrap json/decode to handle non-JSON error response bodies.
Line 326 assumes any string body is valid JSON. If Elasticsearch returns plain text, HTML, or other non-JSON content, json/decode throws and breaks the error handling path. This pattern is also present in elastic-utils-lib/src/cmr/elastic_utils/index_util.clj (lines 195-198, in a macro).
The codebase already handles this correctly elsewhere (e.g., transmit-lib/src/cmr/transmit/http_helper.clj with safe-parse-json). Apply the same pattern here.
Proposed fix
(catch clojure.lang.ExceptionInfo e
(let [body (get-in (ex-data e) [:body])
status (:status (ex-data e))
- parsed-body (if (string? body) (json/decode body true) body)]
+ parsed-body (if (string? body)
+ (try
+ (json/decode body true)
+ (catch Exception _
+ body))
+ body)]
{:error parsed-body :status status})))))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@indexer-app/src/cmr/indexer/data/elasticsearch.clj` around lines 324 - 327,
The error handler currently calls json/decode on any string body (in the let
binding that defines body/status/parsed-body) which will throw for non-JSON
responses; replace that direct json/decode call with the existing safe parsing
pattern used elsewhere (e.g., call the safe-parse-json helper from
transmit/http_helper.clj or wrap json/decode in a try/catch that returns the
original string on failure) so parsed-body never throws on plain text/HTML;
update the let where parsed-body is computed to use safe-parse-json (or
equivalent try/catch) instead of (json/decode body true).
There was a problem hiding this comment.
🧩 Analysis chain
@eereiter, I'll apply the safe-parsing fix to both affected files now.
- Fix
json/decodeinindexer-app/src/cmr/indexer/data/elasticsearch.cljto wrap it in try/catch - Fix the same unsafe pattern in
elastic-utils-lib/src/cmr/elastic_utils/index_util.clj
🏁 Script executed:
#!/bin/bash
# Get the context around the json/decode in index_util.clj
sed -n '185,210p' elastic-utils-lib/src/cmr/elastic_utils/index_util.clj
Length of output: 1160
@eereiter, here are the exact diffs for both files:
1. indexer-app/src/cmr/indexer/data/elasticsearch.clj
(catch clojure.lang.ExceptionInfo e
(let [body (get-in (ex-data e) [:body])
status (:status (ex-data e))
- parsed-body (if (string? body) (json/decode body true) body)]
+ parsed-body (if (string? body)
+ (try
+ (json/decode body true)
+ (catch Exception _
+ body))
+ body)]
{:error parsed-body :status status})))))2. elastic-utils-lib/src/cmr/elastic_utils/index_util.clj (macro, uses gensym'd names)
(catch clojure.lang.ExceptionInfo e#
(let [body# (get-in (ex-data e#) [:body])
status# (:status (ex-data e#))
- parsed-body# (if (string? body#) (json/decode body# true) body#)]
+ parsed-body# (if (string? body#)
+ (try
+ (json/decode body# true)
+ (catch Exception _
+ body#))
+ body#)]
{:error parsed-body# :status status#}))))Both changes wrap the json/decode call in a try/catch that falls back to the raw string on any parse failure, so plain-text or HTML error responses from Elasticsearch no longer break the error-handling path.
The requested changes have been implemented and a pull request has been created: View PR
There was a problem hiding this comment.
Oh, I didn't mean for code rabbit to fix it. My comment was meant for Jyna to look at it.
There was a problem hiding this comment.
hahahah i love that it responded to you with a rocket emoji
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## CMR-10369-es8-feature #2418 +/- ##
=========================================================
- Coverage 56.55% 56.51% -0.05%
=========================================================
Files 1068 1069 +1
Lines 73580 73646 +66
Branches 2057 2057
=========================================================
+ Hits 41613 41618 +5
- Misses 30051 30112 +61
Partials 1916 1916 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| [nasa-cmr/cmr-common-lib "0.1.1-SNAPSHOT"] | ||
| [nasa-cmr/cmr-transmit-lib "0.1.0-SNAPSHOT"] | ||
| [org.clojure/clojure] | ||
| [ring/ring-jetty-adapter "1.14.2"] |
There was a problem hiding this comment.
are you not going to have to do some merge with your other change for this?
| (esri/refresh (context->conn context es-config/elastic-name)) | ||
| (esri/refresh (context->conn context es-config/gran-elastic-name))) | ||
| (esi-helper/refresh (context->conn context es-config/elastic-name) nil) | ||
| (esi-helper/refresh (context->conn context es-config/gran-elastic-name) nil)) |
There was a problem hiding this comment.
The nil in these two calls refreshes all of the indexes? That is what I think is happening - I just want to confirm.
| (es-util/decode-response | ||
| (client/get url (merge (:http-opts conn) | ||
| {:accept :json | ||
| :throw-exceptions false}))))) No newline at end of file |
There was a problem hiding this comment.
Can you add a new line here.
| (map join-names) | ||
| (filter identity) | ||
| (string/join "/"))] | ||
| (str (:uri conn) "/" path))) No newline at end of file |
| (es-util/decode-response | ||
| (http/post url | ||
| (merge (:http-opts conn) | ||
| {:body (-> (map json/encode operations) |
There was a problem hiding this comment.
can you explain in a comment what we are doing with the \n and why es needs it?
| (when (esi-helper/exists? conn index-name) | ||
| (let [admin-token (:admin-token config) | ||
| response (client/delete (esr/index-url conn index-name) | ||
| response (client/delete (str (:uri conn) "/" index-name) |
There was a problem hiding this comment.
do we have a helper for this?
| documents (map #(esd/create conn "1_autocomplete" "_doc" %) test-values)] | ||
| (let [root (url/elastic-root es-config/elastic-name) | ||
| url (str root "/1_autocomplete/_doc") | ||
| documents (map #(client/post url {:body (json/generate-string %) |
There was a problem hiding this comment.
use mapv here, map is lazy and mapv is eager
| "Performs a bulk operation" | ||
| ([conn operations] (bulk conn operations nil)) | ||
| ([conn operations params] | ||
| (when (not-empty operations) |
| (defn decode-response | ||
| "Decodes the response body from the given response" | ||
| [response] | ||
| (-> response |
There was a problem hiding this comment.
(parse-safely (:body response))) is more succinct.
eereiter
left a comment
There was a problem hiding this comment.
Just fix the code rabbit finding in indexer-app/src/cmr/indexer/data/elasticsearch.clj
|
Overview
What is the objective?
Replace clojurewerkz because it does not work well with elastic 8 and updated versions of the dependency do not support 8
What are the changes?
Replace clojurewerkz with clj-http
What areas of the application does this impact?
All
Required Checklist
Additional Checklist
Summary by CodeRabbit