Migrate NodeHealthAPI to JAX-RS, flip delegation so V2 owns logic, add replication lag monitoring docs#22
Draft
Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Migrate NodeHealthAPI to JAX-RS annotations" to "Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS annotations" on Feb 22, 2026
Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
Copilot AI changed the title from "Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS annotations" to "Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS; add mock-free integration tests" on Feb 22, 2026
…ef guide Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
Copilot AI changed the title from "Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS; add mock-free integration tests" to "Migrate NodeHealthAPI from homegrown @EndPoint to JAX-RS; add ref guide link" on Feb 22, 2026
Owner
Superseded by apache#4171, retargeted to the upstream repo.
I used Claude for this regression test and I don't love how verbose it is. I tried a mock approach first and it was worse.
…he#4170) Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* startSolr don't specify temp dir
* newCollection don't specify collection1
* getSolrClient don't specify collection1
* withConfigSet use Path if possible

org.apache.solr.SolrTestCaseJ4.getFile should return an absolute file to reduce ambiguity
…wn (apache#4220) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…pache#4196) Renames bench/MiniClusterState.java to bench/SolrBenchState.java, and flattens its structure, which had an inner class. Two lifecycle methods containing "miniCluster" in the name were replaced with "solr" to be generic, and I improved javadocs slightly. This is a preparatory refactoring step on a short journey to solr/benchmark supporting multiple backends (not just MiniSolrCloudCluster). Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…rted (apache#4224) This is mostly for tests. It makes sure a replica cannot be elected leader for a very short time while all nodes are shutting down.
- NodeHealthApi: add @QueryParam("maxGenerationLag") Integer maxGenerationLag with @Parameter description to healthcheck()
- NodeHealth: update healthcheck() to accept and forward maxGenerationLag; remove now-redundant checkNodeHealth() bridge method
- HealthCheckHandler: call healthcheck() directly (no more checkNodeHealth())
- NodeApi (generated SolrJ): regenerated - Healthcheck gains setMaxGenerationLag() setter and includes the param in getParams()
- NodeHealthStandaloneTest: remove FIXME; test negative-maxGenerationLag via the real V2 HTTP path using NodeApi.Healthcheck.setMaxGenerationLag(-1)

Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
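The call shape this commit describes (the V1 handler extracting both parameters and calling healthcheck() directly) might look roughly like the sketch below. Every type here is a hypothetical stand-in, not the real Solr class: the real NodeHealth takes a CoreContainer, and its checks are replaced by a placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative stand-ins only; none of these are the real Solr classes. */
final class HealthBridgeSketch {

  /** Hypothetical, trimmed-down response model. */
  static final class NodeHealthResponse {
    String status;
    String message;
  }

  /** V2 side: receives both parameters and owns the decision. */
  static final class NodeHealth {
    NodeHealthResponse healthcheck(Boolean requireHealthyCores, Integer maxGenerationLag) {
      NodeHealthResponse rsp = new NodeHealthResponse();
      // Placeholder logic (assumption): the real cloud-mode and
      // replication-lag checks are omitted from this sketch.
      rsp.status = "OK";
      rsp.message =
          "requireHealthyCores=" + requireHealthyCores + ", maxGenerationLag=" + maxGenerationLag;
      return rsp;
    }
  }

  /** V1 side: extract params, delegate, flatten the typed response into a map. */
  static Map<String, Object> handleV1(Map<String, String> params) {
    String healthy = params.get("requireHealthyCores");
    String lag = params.get("maxGenerationLag");
    // Absent parameters stay null so the V2 side can tell "omitted" from "set".
    Boolean requireHealthyCores = healthy == null ? null : Boolean.valueOf(healthy);
    Integer maxGenerationLag = lag == null ? null : Integer.valueOf(lag);

    NodeHealthResponse rsp = new NodeHealth().healthcheck(requireHealthyCores, maxGenerationLag);

    Map<String, Object> flat = new LinkedHashMap<>();
    flat.put("status", rsp.status);
    flat.put("message", rsp.message);
    return flat;
  }

  public static void main(String[] args) {
    System.out.println(handleV1(Map.of("maxGenerationLag", "2")));
  }
}
```

The point of the sketch is only the direction of the call: the bridge holds no health logic of its own, so there is no intermediate checkNodeHealth()-style method left to keep in sync.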
- Add new "Monitoring Follower Replication Lag" section to user-managed-index-replication.adoc with V1+V2 API examples, example responses (success and failure), and a warning about using maxGenerationLag=0 in production.
- Update implicit-requesthandlers.adoc Health entry: remove the inaccurate "available only in SolrCloud mode" qualifier; add a concise description of both SolrCloud params (requireHealthyCores) and legacy-mode params (maxGenerationLag) with a cross-reference to the new monitoring section.

Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
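For readers without the ref guide at hand, a monitoring client could build the two health-check requests along these lines. This is a sketch under assumptions: the host and port, the V1 path /solr/admin/info/health, and the /api prefix for the V2 path are not stated in this PR (only @Path("/node/health") is) and may differ per Solr version or deployment. The requests are built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds (but does not send) health-check requests; URL layout is an assumption. */
final class HealthCheckRequestSketch {

  /** V1-style URL. The "/solr/admin/info/health" path is assumed, not taken from this PR. */
  static URI v1Uri(String baseUrl, int maxGenerationLag) {
    return URI.create(baseUrl + "/solr/admin/info/health?maxGenerationLag=" + maxGenerationLag);
  }

  /** V2-style URL. "/node/health" comes from the PR; the "/api" prefix is assumed. */
  static URI v2Uri(String baseUrl, int maxGenerationLag) {
    return URI.create(baseUrl + "/api/node/health?maxGenerationLag=" + maxGenerationLag);
  }

  /** Plain GET asking for JSON. */
  static HttpRequest get(URI uri) {
    return HttpRequest.newBuilder(uri).GET().header("Accept", "application/json").build();
  }

  public static void main(String[] args) {
    String base = "http://localhost:8983"; // hypothetical follower node
    System.out.println(get(v1Uri(base, 2)).uri());
    System.out.println(get(v2Uri(base, 2)).uri());
  }
}
```

A probe would send either request with java.net.http.HttpClient and alert on a non-OK status; the exact response body is documented in the new ref guide section, not reproduced here.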
…lr into copilot/migrate-node-health-api
Migrates org.apache.solr.handler.admin.api.NodeHealthAPI from the homegrown @EndPoint annotation to standard JAX-RS annotations, following the established V2 API pattern. Includes a full inversion of the delegation model, an enum serialization fix, and new reference-guide coverage of maxGenerationLag monitoring.

NodeHealthAPI → JAX-RS
- NodeHealthApi interface added to solr/api with @Path("/node/health"), @GET, @Operation(tag: node)
- NodeHealthResponse model added with status (NodeStatus enum), message, num_cores_unhealthy fields
- NodeHealth implements NodeHealthApi via JerseyResource; registered through HealthCheckHandler.getJerseyResources()
- NodeApi.Healthcheck SolrJ request class

Delegation inverted: V2 owns logic, V1 bridges
Previously NodeHealthAPI delegated to HealthCheckHandler. Now reversed:
- NodeHealth owns all business logic: cloud-mode check (ZK liveness, live-nodes, requireHealthyCores), legacy-mode replication-lag check (isWithinGenerationLag, findUnhealthyCores), and UNHEALTHY_STATES, all using NodeHealthResponse/NodeStatus throughout, with no NamedList in business logic
- HealthCheckHandler is now a thin V1 bridge: extracts params, calls new NodeHealth(coreContainer).healthcheck(requireHealthyCores, maxGenerationLag), squashes result via V2ApiUtils
- findUnhealthyCores moved to NodeHealth as a public static utility; HealthCheckHandler retains a @Deprecated delegation shim

Enum serialization fix (Utils.getReflectWriter)
NodeStatus.OK was serialized as "org.apache.…NodeStatus:OK" through the NamedList/javabin path because enums have no @JsonProperty fields and fell through to the string-representation fallback. Added an early exit for Enum instances that returns ((Enum<?>) o).name() so V1 consumers (e.g., HealthCheckHandlerTest comparing against "OK") continue to work.

Bug fix in isWithinGenerationLag
Pre-existing logic error: the condition generationDiff < maxGenerationLag was inverted, so healthy cores were flagged as lagging. Corrected to > maxGenerationLag; return values adjusted to true = within acceptable lag, false = lagging too far. Also fixed missing slf4j format arguments in the negative-diff warn call.

Tests
- NodeHealthAPITest: mock-based unit tests for cloud and legacy paths
- NodeHealthAPITest2: integration tests using real CoreContainer (no mocks)
- HealthCheckHandlerTest updated to call NodeHealth.findUnhealthyCores() directly

Reference guide
- user-managed-index-replication.adoc: new "== Monitoring Follower Replication Lag" section:
  - maxGenerationLag semantics and omission behavior
  - warning about using maxGenerationLag=0 in production
- implicit-requesthandlers.adoc: Health handler entry: requireHealthyCores (cloud) and maxGenerationLag (legacy) with cross-reference to new monitoring section
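The generation-lag fix and the enum early exit described above can be sketched in isolation. This is not the code from the PR: the class name, the coreName parameter, the FAILURE constant, the java.util.logging logger, the treatment of a negative diff as "within lag", and the simplified fallback string are all assumptions made so the sketch runs on its own.

```java
import java.util.logging.Logger;

/** Standalone illustration; names mirror the description, bodies are assumptions. */
final class NodeHealthSketch {
  private static final Logger log = Logger.getLogger(NodeHealthSketch.class.getName());

  /** Stand-in enum. Only OK appears in the description; FAILURE is hypothetical. */
  enum NodeStatus { OK, FAILURE }

  /**
   * Returns true when the follower is within the acceptable lag and false when it
   * lags too far, matching the corrected "> maxGenerationLag" condition.
   */
  static boolean isWithinGenerationLag(
      String coreName, long leaderGeneration, long followerGeneration, int maxGenerationLag) {
    long generationDiff = leaderGeneration - followerGeneration;
    if (generationDiff < 0) {
      // Unexpected state. Both format arguments are supplied here; treating
      // this case as "within lag" is an assumption of the sketch.
      log.warning(
          String.format("core:[%s], generation lag:[%d] is negative", coreName, generationDiff));
      return true;
    }
    if (generationDiff > maxGenerationLag) {
      return false; // lagging too far
    }
    return true; // within acceptable lag, including diff == maxGenerationLag
  }

  /** Early exit for enums, then a simplified stand-in for the string fallback. */
  static Object toWritable(Object o) {
    if (o instanceof Enum<?>) {
      return ((Enum<?>) o).name();
    }
    return o.getClass().getName() + ":" + o;
  }

  public static void main(String[] args) {
    System.out.println(isWithinGenerationLag("core1", 10, 8, 2));
    System.out.println(isWithinGenerationLag("core1", 10, 9, 0));
    System.out.println(toWritable(NodeStatus.OK));
  }
}
```

With maxGenerationLag=0, a follower that is a single generation behind already counts as lagging, which is why the ref guide warns against that value in production.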