[FEATURE] Scheduling for running evaluations regularly #213

@ajleong623

Is your feature request related to a problem?

It would help users to run search evaluations regularly as the data in the search engine changes. That way, a degradation in search results could be detected quickly.

What solution would you like?

I would like to integrate the job-scheduler plugin into search relevance so that a search evaluation can run regularly as a scheduled task. For example, the search experiment API could accept an additional parameter that specifies a schedule for running the evaluation.

Defining a schedule

There are a couple of ways to define a schedule, as documented in job-scheduler: a cron schedule or an interval schedule. We could add two optional object parameters to the request, interval and cron, and have users specify the schedule the same way job-scheduler specifies it.

The interval schedule below runs the job every hour, starting at the new year. (The behavior might be strange if start_time is set to a time in the past. This is something I should look into more.)

{
  "start_time": "2026-01-01T00:00:00.000",
  "unit": "MINUTES",
  "period": 60
}
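To make the open question about a past start_time concrete, here is a minimal Python sketch of one possible rule: align every run to start_time and pick the first slot after the current time. This is an assumption for discussion, not a description of how job-scheduler actually computes the next execution, and the function name `next_interval_run` is hypothetical.

```python
from datetime import datetime, timedelta


def next_interval_run(start_time: datetime, period_minutes: int, now: datetime) -> datetime:
    """Return the first run time strictly after `now`, aligned to `start_time`.

    Assumption: runs happen at start_time, start_time + period, and so on.
    job-scheduler may use a different rule.
    """
    period = timedelta(minutes=period_minutes)
    if now < start_time:
        # The schedule has not started yet, so the first run is start_time itself.
        return start_time
    # Whole periods already elapsed since start_time; the next slot is one further.
    elapsed_periods = (now - start_time) // period
    return start_time + (elapsed_periods + 1) * period
```

Under this rule, a start_time in the past does not trigger a burst of catch-up runs; the job simply fires at the next aligned slot.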

We could also use the cron schedule. The example below runs the job at the start of every hour, evaluated in the given time zone.

{
  "expression": "0 * * * *",
  "time_zone": "America/Los_Angeles"
}
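As a rough illustration of what the expression above means, the Python sketch below finds the next matching minute for a five-field cron expression. It is deliberately simplified: it only understands `*` and plain integers, it combines day-of-month and day-of-week with AND, and it is not the parser job-scheduler uses. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def _field_matches(field: str, value: int) -> bool:
    # Simplified: a field is either "*" or a single integer.
    return field == "*" or int(field) == value


def next_cron_run(expression: str, now: datetime) -> datetime:
    """Return the first minute strictly after `now` that matches `expression`."""
    minute, hour, day_of_month, month, day_of_week = expression.split()
    # Start from the next whole minute and step forward one minute at a time.
    candidate = now.replace(second=0, microsecond=0) + timedelta(minutes=1)
    for _ in range(366 * 24 * 60):  # search at most one year ahead
        cron_weekday = candidate.isoweekday() % 7  # cron convention: 0 = Sunday
        if (
            _field_matches(minute, candidate.minute)
            and _field_matches(hour, candidate.hour)
            and _field_matches(day_of_month, candidate.day)
            and _field_matches(month, candidate.month)
            and _field_matches(day_of_week, cron_weekday)
        ):
            return candidate
        candidate += timedelta(minutes=1)
    raise ValueError(f"no matching time within a year for {expression!r}")
```

To honor a time zone such as America/Los_Angeles, `now` could be passed as a timezone-aware datetime (for example one built with Python's `zoneinfo.ZoneInfo`); the returned value keeps the same tzinfo.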

Registering jobs

So far, while experimenting, I have found that by implementing the job runner, we can run a request against the cluster. The job runner would have to be initialized while initializing the SearchRelevancePlugin.

Jobs are registered through an IndexingOperationListener. There will be an index for the scheduled jobs, and whenever a job is added as a document to that index, the task that runs the job is added to the scheduled thread pool.
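The registration flow described above can be modeled in a few lines of Python. This is only an in-memory sketch of the idea, not the OpenSearch or job-scheduler API: the index name, the class `JobIndexListener`, and the `schedule` callback are all hypothetical stand-ins.

```python
from typing import Callable, Dict

# Hypothetical name for the index that stores scheduled jobs.
JOB_INDEX = ".scheduled-evaluation-jobs"


class JobIndexListener:
    """Toy model of a listener that reacts to documents written to the job index."""

    def __init__(self, schedule: Callable[[str, dict], None]) -> None:
        # `schedule` stands in for handing the task to the scheduled thread pool.
        self._schedule = schedule
        self.registered: Dict[str, dict] = {}

    def post_index(self, index: str, doc_id: str, source: dict) -> None:
        # Ignore writes to every index except the job index.
        if index != JOB_INDEX:
            return
        self.registered[doc_id] = source
        self._schedule(doc_id, source)
```

Indexing a document into any other index leaves the listener untouched; only documents in the job index result in a scheduled task.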

Deletion

Finally, we will have to handle deletion. When an evaluation is deleted, the corresponding job for that evaluation should be deleted as well. Another option is a separate API that lets users delete jobs directly. Since the jobs are stored in an index, we can simply delete them by job_id. job-scheduler already has this functionality.
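Both deletion paths can be sketched against a plain dictionary that stands in for the job index. The field name `evaluation_id` and both function names are assumptions made for illustration; the real job document layout is still to be decided.

```python
from typing import Dict, List


def delete_jobs_for_evaluation(jobs: Dict[str, dict], evaluation_id: str) -> List[str]:
    """Cascade: remove every job that belongs to the deleted evaluation."""
    doomed = [
        job_id
        for job_id, job in jobs.items()
        if job.get("evaluation_id") == evaluation_id  # hypothetical field
    ]
    for job_id in doomed:
        del jobs[job_id]
    return doomed


def delete_job(jobs: Dict[str, dict], job_id: str) -> bool:
    """Direct deletion by job_id; returns False when the job does not exist."""
    return jobs.pop(job_id, None) is not None
```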

Potential Concerns

Two potential concerns are limiting the number of schedules that can be added and enforcing a minimum interval between scheduled runs.
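One way to enforce both limits is a validation step that runs before a schedule is stored. The Python sketch below assumes an interval schedule shaped like the earlier example; the limit values, the supported units, and the function name are placeholders, not proposed defaults.

```python
# Placeholder limits; real values would need to be agreed on and made configurable.
MAX_SCHEDULED_JOBS = 10
MIN_PERIOD_MINUTES = 15

# Assumed set of units, for illustration only.
UNIT_TO_MINUTES = {"MINUTES": 1, "HOURS": 60, "DAYS": 24 * 60}


def validate_interval(interval: dict, existing_job_count: int) -> None:
    """Raise ValueError if the schedule would exceed either limit."""
    if existing_job_count >= MAX_SCHEDULED_JOBS:
        raise ValueError(f"at most {MAX_SCHEDULED_JOBS} scheduled jobs are allowed")
    unit = str(interval["unit"]).upper()
    if unit not in UNIT_TO_MINUTES:
        raise ValueError(f"unsupported unit: {interval['unit']!r}")
    period_minutes = int(interval["period"]) * UNIT_TO_MINUTES[unit]
    if period_minutes < MIN_PERIOD_MINUTES:
        raise ValueError(f"period must be at least {MIN_PERIOD_MINUTES} minutes")
```

A cron schedule would need a different check, since the spacing between runs is not stated directly in the expression.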

Applying the change

To apply this change and have users notified about potential abnormalities in their search evaluation, we could implement alerting or anomaly-detection.
