docs(validate): operator runbook pages (MTN-116) #340
Draft
JohnnyWyles wants to merge 6 commits into
The Osmosis gas_price of 0.0026uosmo predated the dynamic fee market and would get the relayer's txs rejected; bump to 0.03 and note that it must track the base fee (query with `osmosisd query txfees base-fee`). Update the stale `hermes 1.0.0` version example.
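The check this commit describes can be sketched as follows. This is a minimal illustration, not part of the PR: the helper names are hypothetical, and the 0.03 base fee is simply the figure quoted above rather than a value fetched from the chain (in practice it would come from `osmosisd query txfees base-fee`).

```python
from decimal import Decimal
import re

# Assumption: 0.03 is the base fee quoted in this PR, hard-coded for illustration.
ASSUMED_BASE_FEE = Decimal("0.03")


def parse_gas_price(value: str) -> tuple[Decimal, str]:
    """Split a string such as '0.03uosmo' into (amount, denom)."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)([a-zA-Z][a-zA-Z0-9/]*)\s*", value)
    if match is None:
        raise ValueError(f"not a gas price: {value!r}")
    return Decimal(match.group(1)), match.group(2)


def meets_base_fee(
    value: str,
    base_fee: Decimal = ASSUMED_BASE_FEE,
    denom: str = "uosmo",
) -> bool:
    """Return True when the configured price is in the expected denom
    and is not below the base fee."""
    amount, got_denom = parse_gas_price(value)
    return got_denom == denom and amount >= base_fee


if __name__ == "__main__":
    for candidate in ("0.0026uosmo", "0.025uosmo", "0.03uosmo"):
        print(candidate, meets_base_fee(candidate))
```

Using `Decimal` rather than `float` keeps the comparison exact for values such as 0.03 that have no exact binary representation.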
59ae08a to 1fa1db5
Apply verified findings from a full audit of the Validate section:

- node-configuration: correct the minimum-gas-prices example to 0.03uosmo (the old 0.025uosmo was below the dynamic base fee and contradicted its own instruction); point operators at the fee-market base-fee query.
- validating-mainnet/testnet: fix the create-validator narration (the example uses 400 OSMO, the bullets said 500); correct the invalid wosmongton@osmosis.labs contact domain; describe gas-prices as price-per-gas, not gas amount; fix the malformed `query staking validators` flag order; align KEY_NAME usage; lowercase the H1; drop --chain-id from read-only signing-info queries; fix the testnet chain-id note.
- joining-mainnet/testnet: add --chain-id to init; reconcile the RAM recommendation (32 GB minimum, 64 GB recommended for validators); add a caution that testnet seeds/peers rotate and must be confirmed against the chain registry.
- performance: fix the benchstat example (step 2 checked out master instead of the feature branch); sanitize a leaked host/home path; drop the invalid heap ?seconds param; convert en-dash glyphs to markdown bullets; remove a duplicate.
- relayer-guide: match the hermes sample output (1.13.3) to the build tag; replace filler memo_prefix with a self-documenting placeholder.
- index: extend the landing prose to cover the operations guides (sync options, node configuration, upgrades, monitoring, security).

Intentional TODO(operator) markers (snapshot providers, seeds, sentry topology, backup/DR) are left in place pending operator input.
Fills the two operator stubs that can be sourced authoritatively.

- sync-options: list the official snapshots.osmosis.zone and Polkachu snapshot providers; narrow the remaining caution to state-sync RPC servers only.
- node-configuration: document the official seeds that `osmosisd init` writes (seed.osmosis.zone, seeds.polkachu.com) and point at the Cosmos chain registry for the current full seed/peer set.

The deployment-specific runbooks (monitoring dashboards, sentry topology, backup/failover, upgrade recovery, state-sync RPC servers) remain marked TODO(operator); they are safety-critical and await confirmation from ops.
Validate section: node/validator operational runbooks (MTN-116, under the MTN-111 content umbrella).
Operator review required. These are safety-critical (downtime and double-signing are slashable). I drafted the content that can be grounded in canonical sources (Cosmovisor, pruning/config, Prometheus instrumentation) and left explicit `TODO(operator)` markers + `:::caution`/`:::danger` admonitions for the validator-specific bits that must be confirmed before publishing. Search the diff for `TODO(operator)` to find every spot needing your input.

Review links point at this PR's Vercel preview build.
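For reviewers who prefer a checked-out branch to the diff view, the same search can be sketched in a few lines. This is only an illustration: the function name and the assumption that the pages are `*.md` files under a single root directory are hypothetical, not taken from the PR.

```python
from pathlib import Path

MARKER = "TODO(operator)"


def find_markers(root: Path, marker: str = MARKER) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for every line
    under root's *.md files that contains the marker."""
    hits = []
    for path in sorted(root.rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if marker in line:
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_markers.py <docs-directory>
    for rel_path, number, text in find_markers(Path(sys.argv[1])):
        print(f"{rel_path}:{number}: {text}")
```

An empty result would mean no markers are left in the scanned tree; it says nothing about whether the filled-in values are correct.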
New pages
The Validate sidebar was reordered into a logical operator flow (install -> join -> upgrades -> sync -> config -> monitoring -> performance -> security -> tmkms -> validating -> relayer).
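One way to sanity-check a reordering like this is to read each page's `sidebar_position` front-matter value and sort by it. The sketch below assumes Docusaurus-style front matter delimited by `---` lines; the function names and the page keys used in any call are hypothetical.

```python
import re


def sidebar_position(markdown: str):
    """Return the integer sidebar_position from a leading front-matter
    block, or None when the page has no such field."""
    front = re.match(r"\A---\n(.*?)\n---", markdown, flags=re.DOTALL)
    if front is None:
        return None
    found = re.search(
        r"^sidebar_position:\s*(\d+)\s*$", front.group(1), flags=re.MULTILINE
    )
    return int(found.group(1)) if found else None


def sidebar_order(pages: dict[str, str]) -> list[str]:
    """Return page names sorted by sidebar_position, skipping pages without one."""
    positioned = {name: sidebar_position(text) for name, text in pages.items()}
    return sorted(
        (name for name, pos in positioned.items() if pos is not None),
        key=lambda name: positioned[name],
    )
```

Comparing the returned list against the intended flow (install, join, upgrades, and so on) would surface a page whose position was missed.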
Safety-critical items to confirm before merge
`priv_validator_state.json` handling (security)

Cross-branch note

`node-configuration` references the network base fee but links the Integrate section generally rather than `/integrate/fees` (that page is on the unmerged MTN-114 branch, #338). It can be pointed at `/integrate/fees` once #338 merges.

Accuracy sweep (added after initial review)
A full multi-agent, adversarially-verified audit of the Validate section followed the initial runbooks (live-chain checks where relevant). 8 files changed.
Verified fixes:
- `node-configuration.md`: the `minimum-gas-prices` example was `0.025uosmo`, below the 0.03 base fee and contradicting its own instruction → `0.03uosmo` with a fee-market query pointer.
- validating-mainnet/testnet: `wosmongton@osmosis.labs` contact domain → `osmosis.team`; `gas-prices` described as price-per-gas; malformed `query staking validators` flag order; `KEY_NAME` consistency; H1 casing; `--chain-id` dropped from read-only `signing-info` queries; testnet chain-id note.
- joining-mainnet/testnet: added `--chain-id` to `init`; reconciled RAM (32 GB minimum, 64 GB recommended for validators); added a caution that testnet seeds/peers rotate and must be confirmed against the chain registry.
- performance: fixed the benchstat example (step 2 checked out `master` instead of the feature branch); sanitized a leaked host/home path; dropped the invalid heap `?seconds` param; converted en-dash glyphs to markdown bullets; removed a duplicate line.
- relayer-guide: matched the hermes sample output (1.13.3) to the build tag; replaced the filler `memo_prefix` with a self-documenting placeholder.

Still needs operator input: 6 `TODO(operator)` markers remain (snapshot providers, seeds/peers, sentry topology, backup/DR runbook, monitoring dashboards, missed-upgrade recovery).

Build passes.
Note
Low Risk
Documentation-only changes; no runtime or application code. Residual risk is publishing unverified operator guidance until TODO(operator) items are filled in.
Overview
Adds five new Validate operator runbooks (MTN-116): Chain Upgrades and Cosmovisor, Sync Options, Node Configuration and Maintenance, Monitoring and Alerting, and Validator Security and Recovery. Together they cover governance upgrades, sync/pruning choices, `app.toml`/`config.toml` tuning, Prometheus alerting, and double-signing / sentry / DR guidance, with cross-links into existing install, TMKMS, and performance pages.

Sidebar order on existing Validate docs is updated (`sidebar_position` only on performance, TMKMS, validating mainnet/testnet, relayer) so the section reads as an operator flow: install → join → upgrades → sync → config → monitoring → performance → security → TMKMS → validating → relayer.

Several spots are explicitly draft / operator-verify: `TODO(operator)` for live seeds, snapshot URLs, Grafana rules, sentry topology, and backup runbooks; `:::caution`/`:::danger` on values that must not be guessed before publish.

Reviewed by Cursor Bugbot for commit ad87df8. Bugbot is set up for automated code reviews on this repo.