feat: /health rpc endpoint #928
@distractedm1nd looks great so far - maybe we can extend the /health
response to also have a map of services -> running/not running
    type HealthResponse struct {
        StateService  bool             `json:"state_service"`
        ShareService  bool             `json:"share_service"`
        DASStatus     DasStateResponse `json:"das_status"`
        HeaderSyncing bool             `json:"header_syncing"`
    }

@renaynay were you thinking something like this? Thoughts:
|
Codecov Report
@@ Coverage Diff @@
## main #928 +/- ##
==========================================
- Coverage 58.74% 58.61% -0.13%
==========================================
Files 128 129 +1
Lines 7623 7650 +27
==========================================
+ Hits 4478 4484 +6
- Misses 2676 2703 +27
+ Partials 469 463 -6
|
@distractedm1nd let's do something like (excuse formatting):
So basically call all of these things services (we also don't need share service bc das covers that, and share service is kind of an under the hood thing). We also don't need state service "health" bc those endpoints are accessible via RPC always as long as the core connection has been provided. It's really only a proxy rather than a service. The only time endpoints are limited is if a fraud proof has been received, in which case state update endpoints are blocked, but state reads are fine. Bool would be fine but I think it's better to have Wdyt about this? |
It would also be nice to have a list of available endpoints on the node the way that tendermint does: https://rpc-mamaki.pops.one/ |
Yes, makes sense - two things:
|
@distractedm1nd yes to both 1 and 2 |
I am good with this, however:
- Whether the Share and State services are running gives the node operator no information: they start and stop only when the node itself starts and stops. The exception is the State service, which can be stopped at runtime in the fraud proof case, but more on this later.
- Also, why only those services, and why not Fraud?
- Call this a status endpoint rather than health (extending on the point above):
  - A status endpoint is more about the data being returned and implies that the node operator decides for themselves whether the node is healthy, based on that data.
  - A health check implies some additional "smart" logic that evaluates node health itself and returns only a simple true or false. In our case, this logic could be a combination of:
    - Checking that headers are constantly synced
    - Checking that headers are constantly sampled
    - Checking whether any FraudProof exists or has been received
  - The health endpoint might be extracted as a separate issue.
Note that this will need to be reworked with newer PublicAPI over RPC, so I am fine with not making this super clean and changes are only requested for the comments below.
Thank you!
    StateService  bool             `json:"state_service"`
    ShareService  bool             `json:"share_service"`
    DASStatus     DasStateResponse `json:"das_status"`
    HeaderSyncing bool             `json:"header_syncing"`
We should return the Syncer state here as well
    dasState := new(DasStateResponse)
    dasState.SampleRoutine = h.das.SampleRoutineState()
    dasState.CatchUpRoutine = h.das.CatchUpRoutineState()
    availResp.DASStatus = *dasState
We don't need to allocate response and then copy it again
    - dasState := new(DasStateResponse)
    - dasState.SampleRoutine = h.das.SampleRoutineState()
    - dasState.CatchUpRoutine = h.das.CatchUpRoutineState()
    - availResp.DASStatus = *dasState
    + availResp.DASStatus = DasStateResponse{
    +     SampleRoutine:  h.das.SampleRoutineState(),
    +     CatchUpRoutine: h.das.CatchUpRoutineState(),
    + }
Not worth pursuing until the refactorings, so removing from the roadmap for now |
Would close #739?

If I've understood correctly, the /health endpoint is just for checking if the rpc is running. Not sure if this is still wanted/necessary though, feel free to delete.

Should more health endpoints be created for any other services? For example, with the new DASState, the /daser endpoint seems like a great "health check" to me.