Add Retry Logic for `ElasticsearchInternalService#chunkedInfer` #127812

jimczi · 2025-05-07T09:44:10Z

This PR introduces basic retry functionality for the internal inference service (`ElasticsearchInternalService`), which runs on ML nodes. We already use an exponential backoff strategy for retrying failed inference requests to external services. This change extends the same retry mechanism to the internal service, allowing it to automatically retry transient failures. To maintain consistency and reduce complexity, this implementation reuses the existing retry configuration settings: `xpack.inference.http.retry.*`.

Note: This PR is still a draft. Additional tests are needed, but I’d like to gather feedback on the approach before proceeding further.
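For readers unfamiliar with the pattern, here is a minimal sketch of retrying transient failures with exponential backoff. It is not the code in this PR: `BackoffRetrySketch`, `RetryConfig`, `delayFor`, and `callWithRetry` are hypothetical names used only for illustration, the config fields are assumptions rather than the actual `xpack.inference.http.retry.*` settings, and the sketch blocks the calling thread for simplicity.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.function.Predicate;

final class BackoffRetrySketch {

    /** Hypothetical settings holder; not the real Elasticsearch retry settings. */
    record RetryConfig(Duration initialDelay, Duration maxDelay, int maxAttempts) {}

    /**
     * Delay to wait after the given failed attempt (1-based).
     * The delay starts at initialDelay, doubles per attempt, and is capped at maxDelay.
     */
    static Duration delayFor(RetryConfig config, int failedAttempt) {
        long cap = config.maxDelay().toMillis();
        long delay = config.initialDelay().toMillis();
        for (int i = 1; i < failedAttempt && delay < cap; i++) {
            delay *= 2;
        }
        return Duration.ofMillis(Math.min(delay, cap));
    }

    /**
     * Runs the request, retrying only failures that the caller classifies as transient.
     * Gives up and rethrows once maxAttempts is reached or the failure is not transient.
     */
    static <T> T callWithRetry(Callable<T> request, Predicate<Exception> isTransient, RetryConfig config)
        throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return request.call();
            } catch (Exception e) {
                if (attempt >= config.maxAttempts() || !isTransient.test(e)) {
                    throw e;
                }
                // Blocking sleep keeps the sketch short; a real service would schedule the retry instead.
                Thread.sleep(delayFor(config, attempt).toMillis());
            }
        }
    }
}
```

The key design choice is the `isTransient` predicate: only failures expected to clear on their own are worth retrying, while anything else should surface to the caller immediately.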

elasticsearchmachine · 2025-05-07T09:44:36Z

Hi @jimczi, I've created a changelog YAML for you.

davidkyle

Looks good. This is a straightforward and sensible way to add retry

jimczi requested review from davidkyle and jonathan-buttner May 7, 2025 09:44

jimczi added the `>feature`, `:ml` (Machine learning), `v8.19.0`, and `v9.1.0` labels May 7, 2025

Update docs/changelog/127812.yaml

90f1bfb

github-actions bot deployed to docs-preview May 7, 2025 09:45

Merge branch 'main' into retry_elasticsearch_inference_service

d102b0f

github-actions bot deployed to docs-preview May 7, 2025 11:34

Merge branch 'main' into retry_elasticsearch_inference_service

7f3e9fd

github-actions bot deployed to docs-preview May 13, 2025 18:13

davidkyle reviewed May 20, 2025
