Add metrics for tracking OpenSearch query time and investigate default timeout changes #10

@digitaldogsbody

Description

Some OpenSearch queries are timing out before returning, but we have no good way to measure the impact of this. The default timeout is currently 10s, and it could be that this is a perfectly fine value and the failing queries complete in a fraction of that time when rerun.

Indeed, it might even make sense to reduce the timeout. For example, if most queries return in <1s, including the rerun failed ones, then it might be better to time out sooner and just retry until success.
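To make the "time out sooner and retry" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `retry_on_timeout` and `run_query` are hypothetical names, the built-in `TimeoutError` stands in for whatever exception the OpenSearch client actually raises, and the 1s timeout and attempt cap are placeholder values rather than recommendations.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def retry_on_timeout(
    run_query: Callable[[float], T],
    timeout_s: float = 1.0,   # placeholder value, to be chosen from real metrics
    max_attempts: int = 5,    # cap so a persistently slow query cannot loop forever
) -> Tuple[T, int]:
    """Call run_query(timeout_s) until it succeeds or attempts run out.

    run_query is a hypothetical callable that performs one query with the
    given timeout and raises TimeoutError if the timeout is exceeded.
    Returns (result, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query(timeout_s), attempt
        except TimeoutError:
            # Re-raise on the final attempt so the caller sees the failure.
            if attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")
```

Returning the attempt count alongside the result means the number of retries can itself be recorded, which helps judge whether a shorter timeout is actually paying off.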

It could also be that many queries are pushing 10s, in which case raising the timeout a little would be a sensible way to prevent most failures.

Basically, we need a way to measure query time and failure rate so that we can make an informed decision instead of a guess. Adding some instrumentation and recording metrics for a couple of runs would be very useful.
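As a starting point for discussion, the instrumentation could look something like the sketch below. It is not tied to the project's code: `QueryMetrics`, `timed`, and `run_query` are hypothetical names, the built-in `TimeoutError` stands in for the client's real timeout exception, and metrics are simply held in memory rather than written anywhere.

```python
import math
import time
from dataclasses import dataclass, field
from typing import Callable, List, TypeVar

T = TypeVar("T")


@dataclass
class QueryMetrics:
    """In-memory record of query durations and timeouts for one run."""

    durations_s: List[float] = field(default_factory=list)  # successful queries only
    timeouts: int = 0

    def timed(self, run_query: Callable[[], T]) -> T:
        """Run one query, recording its duration or counting its timeout."""
        start = time.perf_counter()
        try:
            result = run_query()
        except TimeoutError:  # stand-in for the client's timeout exception
            self.timeouts += 1
            raise
        self.durations_s.append(time.perf_counter() - start)
        return result

    def failure_rate(self) -> float:
        """Fraction of all recorded queries that timed out."""
        total = len(self.durations_s) + self.timeouts
        return self.timeouts / total if total else 0.0

    def percentile(self, p: float) -> float:
        """Nearest-rank percentile (0 < p <= 100) of successful durations."""
        if not self.durations_s:
            raise ValueError("no successful queries recorded")
        ordered = sorted(self.durations_s)
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]
```

After a couple of runs, comparing `percentile(50)` and `percentile(99)` against the 10s default, together with `failure_rate()`, would indicate whether the timeout should be lowered, raised, or left alone.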

Metadata

Labels

- enhancement: New feature or request
- internal qol: This would make our lives easier for development / analysis / etc
- investigation: Work that will require investigation before/during implementation
