By default, SM batches token ranges (with ranges_parallelism) sent during vnode repair in order to improve its performance.
Batching is disabled if any execution in the task's execution chain ended because it ran out of its maintenance window or because of an error:
func shouldBatchRanges(session gocqlx.Session, clusterID, taskID, runID uuid.UUID) (bool, error) {
	prevIDs, err := getAllPrevRunIDs(session, clusterID, taskID, runID)
	...
	var status string
	for _, id := range prevIDs {
		err := q.BindMap(qb.M{
			"cluster_id": clusterID,
			"type":       "repair",
			"task_id":    taskID,
			"id":         id,
		}).Scan(&status)
		if err != nil {
			return false, errors.Wrap(err, "get prev run status")
		}
		// Fall back to no-batching when any of the previous runs:
		// - finished with an error
		// - ran out of the scheduler window
		if status == "WAITING" || status == "ERROR" {
			return false, nil
		}
	}
	return true, nil
}

The problem with batching is that it can hurt granularity, and therefore the progress a repair task can make in a single, short execution. For example, it is theoretically possible for a repair task scheduled with short maintenance-window spans to make progress without batching, yet fail to make any progress with batching. Here is the initial conversation about this topic.
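To make the granularity trade-off concrete, here is a minimal, self-contained sketch. The `tokenRange` type and `batchRanges` helper are hypothetical stand-ins for illustration, not SM's actual planner code; it only shows how grouping ranges into batches of `ranges_parallelism` reduces the number of repair jobs while coarsening the unit of progress that can be recorded per job:

```go
package main

import "fmt"

// tokenRange is a simplified stand-in for a vnode token range.
type tokenRange struct{ start, end int64 }

// batchRanges groups ranges into batches of at most n ranges each.
// With n == 1 this degenerates to no batching: one job per range.
func batchRanges(ranges []tokenRange, n int) [][]tokenRange {
	if n < 1 {
		n = 1
	}
	var batches [][]tokenRange
	for len(ranges) > 0 {
		k := n
		if len(ranges) < k {
			k = len(ranges)
		}
		batches = append(batches, ranges[:k])
		ranges = ranges[k:]
	}
	return batches
}

func main() {
	ranges := make([]tokenRange, 10)
	for i := range ranges {
		ranges[i] = tokenRange{int64(i * 100), int64((i + 1) * 100)}
	}
	// Without batching: 10 jobs, progress is saved per range.
	// With batching (n=4): 3 jobs, so an interrupted job can lose
	// the progress of up to 4 ranges at once.
	fmt.Println(len(batchRanges(ranges, 1)), len(batchRanges(ranges, 4))) // 10 3
}
```

A short execution window that fits a few single-range jobs but not a whole batch is exactly the (theoretical) scenario where batching could stall progress.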
In general, we should really prefer batching, as:
- we gain a lot of performance from it
- re-repairing already repaired data is faster
- nobody schedules maintenance windows like that
So in general, there are some reasons to believe that not batching might be better in the case of retries or running out of the maintenance window, but it's difficult to say whether they hold in practice.
One thing is certain: the current implementation stops batching too often - it simply checks whether any of the previous runs finished with an error. That error might be unrelated to a failed vnode repair (e.g. repair might fail at the initial check that no other repairs are running on the cluster). Also, failing a single batch disables batching for completely different token ranges, which isn't optimal either.
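As one hypothetical direction, the fallback could be narrowed to failures that actually occurred while repairing ranges, and only for the most recent run. This is a sketch under assumptions: the `prevRun` type and its `rangesStarted` flag are invented for illustration and are not fields SM currently stores:

```go
package main

import "fmt"

type runStatus string

const (
	statusDone    runStatus = "DONE"
	statusError   runStatus = "ERROR"
	statusWaiting runStatus = "WAITING"
)

// prevRun is a hypothetical record of a previous task execution.
type prevRun struct {
	status        runStatus
	rangesStarted bool // false if the run failed before repairing any range
}

// shouldBatch is a narrower heuristic than the current one: instead of
// disabling batching when ANY previous run errored, it looks only at the
// most recent run, and ignores errors raised before range repair began
// (e.g. the "repair already running on the cluster" precheck).
func shouldBatch(prev []prevRun) bool {
	if len(prev) == 0 {
		return true
	}
	last := prev[len(prev)-1]
	// An early failure says nothing about batching, so keep it enabled.
	if last.status == statusError && !last.rangesStarted {
		return true
	}
	return last.status != statusWaiting && last.status != statusError
}

func main() {
	fmt.Println(shouldBatch([]prevRun{{statusError, false}})) // true: precheck failure
	fmt.Println(shouldBatch([]prevRun{{statusWaiting, true}})) // false: ran out of window
}
```

This still doesn't address the second problem - one failed batch disabling batching for unrelated token ranges - which would require tracking failures per range rather than per run.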