[Admin API] Add new rolling restart health endpoints #1019

Open
wants to merge 2 commits into
base: api
77 changes: 77 additions & 0 deletions modules/ROOT/attachments/admin-api.yaml
@@ -775,6 +775,41 @@ paths:
        200:
          description: Recommission broker success
          content: {}
  /v1/broker/pre_restart_probe:
    get:
      tags:
        - Brokers
      summary: Check broker pre-restart
      description: Check if it is safe to restart this broker.
      operationId: pre_restart_probe
      parameters:
        - name: limit
          in: query
          required: false
          schema:
            type: integer
          description: "Limit the number of partitions listed for each risk type (default: 128)."
      responses:
        200:
          description: Pre-restart check result. Returns risks associated with restarting the broker, and partitions affected, if any.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/pre_restart_check_result'
  /v1/broker/post_restart_probe:
    get:
      tags:
        - Brokers
      summary: Check broker post-restart
      description: Check if the broker has recovered after a restart.
      operationId: post_restart_probe
      responses:
        200:
          description: Post-restart check result
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/post_restart_check_result'
  /v1/cluster_view:
    get:
      tags:
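For reviewers who want to exercise the two new paths, the following is a minimal sketch of how a client might build the request URLs, including the optional `limit` query parameter on the pre-restart probe. The helper names and the base address `http://localhost:9644` are assumptions for illustration only and are not part of this PR.

```python
from typing import Optional
from urllib.parse import urlencode

# Paths as declared in this diff.
PRE_RESTART_PATH = "/v1/broker/pre_restart_probe"
POST_RESTART_PATH = "/v1/broker/post_restart_probe"


def pre_restart_probe_url(base_url: str, limit: Optional[int] = None) -> str:
    """Build the pre-restart probe URL (hypothetical helper).

    `limit` is optional; when omitted, the server-side default applies
    (128 according to the parameter description in the spec).
    """
    url = base_url.rstrip("/") + PRE_RESTART_PATH
    if limit is not None:
        url += "?" + urlencode({"limit": limit})
    return url


def post_restart_probe_url(base_url: str) -> str:
    """Build the post-restart probe URL (hypothetical helper; no parameters)."""
    return base_url.rstrip("/") + POST_RESTART_PATH


if __name__ == "__main__":
    # Assumed admin API address; adjust to the broker being restarted.
    base = "http://localhost:9644"
    print(pre_restart_probe_url(base, limit=10))
    print(post_restart_probe_url(base))
```

Both probes are `GET` requests addressed to the broker that is about to be (or has just been) restarted, so the base URL should point at that specific broker rather than at a load balancer.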
@@ -2249,6 +2284,48 @@ components:
          type: string
        port:
          type: integer
    pre_restart_check_result:
      type: object
      description: Pre-restart check result.
      properties:
        risks:
          $ref: '#/components/schemas/restart_risks'
    restart_risks:
      description: Partitions affected by the current broker restart, grouped by risk type. Each partition list is truncated according to the optional limit specified in the request.
      properties:
        rf1_offline:
          type: array
          items:
            type: string
            description: Namespace, topic, partition ID
          description: Partitions with a replication factor of 1 that have a replica on the current broker.
        full_acks_produce_unavailable:
          type: array
          items:
            type: string
            description: Namespace, topic, partition ID
          description: Partitions that may reject produce requests (with `acks=-1`) if the current broker is restarted.
        unavailable:
          type: array
          items:
            type: string
            description: Namespace, topic, partition ID
          description: Partitions that may reject consume and produce requests if the current broker is restarted.
        acks1_data_loss:
          type: array
          items:
            type: string
            description: Namespace, topic, partition ID
          description: Partitions that may lose data produced (with `acks=1`) if the current broker is restarted.
    post_restart_check_result:
      type: object
      description: Post-restart check result.
      properties:
        load_reclaimed_pc:
          type: integer
          description: The load that the broker has reclaimed after restarting, as a percentage of in-sync replicas.
          minimum: 0
          maximum: 100
    broker_locator:
      type: object
      properties:
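To illustrate how the two response schemas above might be consumed during a rolling restart, here is a minimal sketch that interprets already-decoded JSON bodies. The function names, the recovery threshold, and the sample partition string `kafka/orders/0` are assumptions for illustration; the spec only says each item carries namespace, topic, and partition ID and does not fix the string format or a recovery policy.

```python
from typing import Any, Dict, List

# Risk types declared in the restart_risks schema of this diff.
RISK_TYPES = (
    "rf1_offline",
    "full_acks_produce_unavailable",
    "unavailable",
    "acks1_data_loss",
)


def restart_risks(pre_result: Dict[str, Any]) -> Dict[str, List[str]]:
    """Return only the risk types that list at least one partition.

    Lists may be truncated by the request's `limit`, so the lengths are a
    lower bound; a non-empty list still means the risk is present.
    """
    risks = pre_result.get("risks") or {}
    return {name: list(risks[name]) for name in RISK_TYPES if risks.get(name)}


def safe_to_restart(pre_result: Dict[str, Any]) -> bool:
    """Assumed policy: restart only when no risk type reports partitions."""
    return not restart_risks(pre_result)


def has_recovered(post_result: Dict[str, Any], threshold_pc: int = 100) -> bool:
    """Assumed policy: recovered once load_reclaimed_pc reaches the threshold."""
    return post_result.get("load_reclaimed_pc", 0) >= threshold_pc


if __name__ == "__main__":
    # Hypothetical response bodies shaped like the schemas in this diff.
    pre = {"risks": {"rf1_offline": ["kafka/orders/0"], "unavailable": []}}
    print(restart_risks(pre))
    print(safe_to_restart(pre))
    print(has_recovered({"load_reclaimed_pc": 80}))
```

An orchestrator could poll the pre-restart probe until `safe_to_restart` holds, restart the broker, and then poll the post-restart probe until `has_recovered` holds before moving to the next broker. Whether a threshold below 100 is acceptable is a deployment decision that this PR does not address.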