Skip to content

[KLC-2387] validator CPU preflight + benchmark crypto category#43

Draft
fbsobreira wants to merge 6 commits into
developfrom
feature/KLC-2387-validator-cpu-preflight
Draft

[KLC-2387] validator CPU preflight + benchmark crypto category#43
fbsobreira wants to merge 6 commits into
developfrom
feature/KLC-2387-validator-cpu-preflight

Conversation

@fbsobreira
Copy link
Copy Markdown
Member

@fbsobreira fbsobreira commented May 7, 2026

Summary

  • Validator startup CPU preflight (cmd/node/preflight.go): refuses to start validators on hosts that fail a 200 ms SHA-256 self-bench at < 800 MB/s — the floor below which leader-mode TX processing exceeds the protocol's 425 ms hardware-tolerance window.
  • Crypto category for klever-benchmark (cmd/benchmark/crypto.go): SHA-256 (1 KiB / 16 KiB), Blake2b, Keccak-256, Ed25519 verify, plus CPU feature attestation (SHA-NI / AVX-512 IFMA / VAES / GFNI). Hard grade-F veto below 500 MB/s SHA-256.
  • Operator controls: preferences.enforceCpuPreflight (default true, including when omitted from existing configs); KLEVER_SKIP_CPU_CHECK=1 for emergency bypass.
  • Documentation: cmd/node/PREFLIGHT.md (rationale, behavior, override, migration plan) + updates to cmd/benchmark/CLI.md.

Why

A field investigation across the validator fleet found a ~5× spread in smart-contract TX processing time between SHA-NI-equipped and SHA-NI-deficient hosts. Skylake-class Intel Xeon guests took ~600 ms per representative SC TX vs ~120 ms on AMD EPYC peers — well above the 425 ms protocol hardware-tolerance lower bound. The preflight catches this at startup so operators discover the issue before consensus-time outliers manifest in production.

The gate is grounded in measured SHA-256 throughput, not the SHA-NI feature flag — it correctly handles nested virtualisation that hides feature bits while still exposing the underlying ISA, and it does not assume SHA-NI absence is the sole cause of low throughput.

Behavior

State Outcome
Throughput ≥ 800 MB/s Pass — log measurement Info, continue startup
Throughput < 800 MB/s + enforceCpuPreflight: true (default) Refuse to start; clear error with override instructions
Throughput < 800 MB/s + enforceCpuPreflight: false Warn-only, continue startup (fleet-migration escape hatch)
KLEVER_SKIP_CPU_CHECK=1 Skip the gate entirely; loud Warn line on every startup
Unsupported arch (riscv64, 386, …) No-op

The bench is run twice and the maximum is reported, so a single transient throttle event (thermal, hypervisor noisy neighbor) does not falsely refuse a host that would otherwise pass.

Field validation

Cross-compiled the standalone benchmark to amd64 Linux and ran it across 5 fleet hosts:

Host CPU SHA-NI SHA-256 16K MB/s Grade Vetoed Startup gate (≥ 800)?
94.130.96.243 Intel Xeon Skylake-IBRS 244 F ❌ refuse
51.195.253.156 Intel Haswell (no TSX) 249 F ❌ refuse
34.176.51.113 Intel Xeon @ 2.20 GHz 283 F ❌ refuse
65.21.181.98 AMD EPYC 1,474 B ✅ pass
91.99.66.173 AMD EPYC-Genoa 1,674 A ✅ pass

Real-world SHA-NI ratio: ~5.5× faster on EPYC vs Intel guests without SHA-NI — matches the field-investigation findings.

Test plan

  • go test ./cmd/node/... ./cmd/benchmark/... — all passing (12 preflight tests + 9 score tests)
  • go test -race ./cmd/node ./cmd/benchmark — clean for changed packages
  • go vet ./cmd/node/... ./cmd/benchmark/... ./config/... — clean
  • go build ./cmd/node/... ./cmd/benchmark/... ./config/... — clean
  • Live cross-compiled benchmark validated on 5 fleet hosts (3 SHA-NI-deficient correctly vetoed, 2 EPYC correctly passing)
  • JSON output exposes new vetoed / veto_reason fields, verified end-to-end
  • Default config rolls out safely: *bool semantics treat absent YAML key as true (enforce by default) so upgrading operators are not silently downgraded to warn-only

Out of scope / follow-ups

  • BLS pairing (MCL/herumi) is intentionally not measured directly — wiring MCL into the standalone benchmark would force a CGO dependency on operators just grading hardware. AVX-512 IFMA is reported as a proxy for the BLS fast path.
  • Pre-existing gopsutil CGO macOS-SDK deprecation warning is unrelated.

This PR introduces an operator-facing validator CPU preflight at node startup and a new crypto category in the klever-benchmark. Both are operational/diagnostic and do not change consensus rules, transaction processing, state management, KVM, or networking logic.

Blockchain Impact

  • Consensus / Data integrity: Unchanged. No protocol rules, transaction validation, state-machine logic, on-disk formats, or cryptographic consensus parameters were modified.
  • Node stability / availability: The validator CPU preflight can block validator startup on under-provisioned hosts (measured SHA‑256 throughput < 800 MB/s by default). This affects validator availability (uptime/leadership participation), but not ledger correctness or consensus semantics.
  • Cross-cutting concerns:
    • Error handling: Preflight failures are surfaced as startup errors when preferences.enforceCpuPreflight is true (default when YAML key is absent). If enforceCpuPreflight is false the failure is downgraded to a warning and startup proceeds. The env override KLEVER_SKIP_CPU_CHECK=1 bypasses the check and logs a prominent warning.
    • Concurrency: No changes to core concurrency, consensus, or networking code. Benchmark code reduces GC noise (avoid per-iteration allocations) but does not alter runtime concurrency semantics.
    • Observability/operability: Preflight emits a single Info measurement log with architecture, measured sha256_mbps, and detected CPU feature flags. Benchmark JSON output adds score.vetoed and score.veto_reason and exposes cpu feature booleans (sha accel, AVX-512 IFMA, VAES, GFNI). The codebase uses an architecture-aware field rename (HasSHAAccel / sha_accel) and produces architecture-appropriate accel names in messages.

Key components affected

  • Validator CPU Preflight

    • Files: cmd/node/preflight.go, cmd/node/PREFLIGHT.md, cmd/node/preflight_test.go, cmd/node/startup.go, config/prefsConfig.go, config/node/config.yaml.
    • Behavior: For supported architectures (amd64, arm64) runs a 200 ms SHA‑256 sustained-throughput self-benchmark (16 KiB blocks). The bench runs twice and takes the maximum. Default enforcement threshold: 800 MB/s; measured < 800 MB/s produces a startup error when enforcement is enabled.
    • Placement: executed immediately before loading the validator BLS private key.
    • Feature detection: reports architecture-appropriate SHA accel (SHA‑NI on amd64, ARMv8 SHA2 on arm64) and other CPU flags (AVX‑512 IFMA, VAES, GFNI). Missing accel flags are informational; throughput determines pass/fail.
    • Controls: preferences.enforceCpuPreflight (pointer bool; ShouldEnforceCPUPreflight() returns true when absent) and KLEVER_SKIP_CPU_CHECK=1 (explicitly bypasses check and logs warning).
    • Tests: unit tests cover fast/slow bench, env bypass, unsupported arch, messaging, and benchSHA256 behavior.
  • Crypto Benchmark & Scoring

    • Files: cmd/benchmark/crypto.go, main.go, runner.go, report.go, score.go, score_test.go, CLI.md.
    • Measurements: SHA‑256 (1 KiB and 16 KiB MB/s), Blake2b‑512 (16 KiB), Keccak‑256 (16 KiB), Ed25519 verify (ops/sec), plus CPU feature attestation.
    • Scoring: new Crypto / Hashing category (weight 200) with per-metric ceilings and a crypto normalizer. Benchmark JSON includes report.crypto and score.crypto, and exposes cpu feature booleans.
    • Hard veto in benchmark tool: if SHA256LargeMBps < 500 MB/s, the benchmark sets Vetoed=true, sets VetoedReason, and forces overall Grade="F". (Note: the validator preflight uses the stricter 800 MB/s threshold.)
    • CLI: new --skip-crypto / -skip-crypto flag.
    • Tests: unit tests validate scoring, rebalancing, and veto logic.
    • Implementation note: CryptoResult field rename/semantics: HasSHAAccel (architecture-aware) and related messages updated to render accel names per-arch.

Configuration & operator guidance

  • New config key: preferences.enforceCpuPreflight (pointer bool) with default true when omitted to preserve safe upgrade semantics.
  • Emergency override: KLEVER_SKIP_CPU_CHECK=1 (explicit "1" value required) bypasses preflight and logs a warn.
  • Operational impact: Measured author data shows many Intel hosts may be below the validator threshold (Intel ~244–283 MB/s vetoed) while some AMD EPYC hosts exceed it (~1,474–1,674 MB/s passed). Operators should evaluate host CPU capabilities, use the benchmark tool to validate hardware, or disable enforcement / use the env override for emergency recovery.

Testing & quality

  • Benchmarks run twice to reduce transient-throttle false negatives; deterministic buffer seeding avoids blocking on entropy.
  • Unit tests added for preflight and benchmark scoring; author reports go test, race tests, vet, and builds for affected packages pass.
  • No changes to persisted state, consensus message formats, networking protocols, transaction verification, or KVM/state-machine code.

Summary judgement

  • Protocol/correctness risk: none—no changes to consensus, state, transaction processing, KVM, or networking.
  • Operational risk: moderate—validator startup availability can be blocked on low-throughput CPUs unless operators opt out or use the emergency bypass. Administrators should plan hardware validation or adjust preferences.enforceCpuPreflight prior to rolling upgrades.

Adds a startup-time SHA-256 throughput gate for validator nodes plus a
crypto benchmark category in klever-benchmark. The preflight refuses to
start a validator measuring below 800 MB/s on a 200 ms self-bench (the
floor below which leader-mode TX processing exceeds the 425 ms protocol
hardware-tolerance window). Operators control enforcement via
preferences.enforceCpuPreflight (defaults to true, including for
existing configs that omit the key); KLEVER_SKIP_CPU_CHECK=1 is an
emergency bypass.

The standalone klever-benchmark gains a Crypto category (SHA-256 1K/16K,
Blake2b, Keccak-256, Ed25519 verify) with CPU feature attestation and a
hard grade-F veto below 500 MB/s SHA-256. JSON output exposes vetoed
and veto_reason for fleet tooling.

Field validation across the validator fleet confirms the SHA-NI gap:
non-SHA-NI guests measure 244-283 MB/s while EPYC hosts measure
1474-1674 MB/s, matching the ~5x ratio that motivated the gate.
Copilot AI review requested due to automatic review settings May 7, 2026 23:58
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

Review Change Stack

Walkthrough

This PR adds a crypto benchmarking suite to the benchmark CLI (SHA‑256/Blake2b/Keccak MB/s and Ed25519 verify ops/sec, CPU feature flags), integrates Crypto into scoring with a SHA‑256 hard veto, wires a --skip-crypto CLI flag, extends reporting/JSON, adds tests, and introduces a validator startup CPU preflight gate with configurable enforcement.

Changes

Benchmark Crypto Suite Addition

Layer / File(s) Summary
Data Contracts
cmd/benchmark/crypto.go, cmd/benchmark/report.go, cmd/benchmark/score.go
Adds CryptoResult, extends BenchmarkResults and BenchmarkScore with crypto fields and veto metadata; adds JSON helper types and crypto ceiling/threshold constants.
Benchmark Implementation
cmd/benchmark/crypto.go
Implements RunCryptoBenchmark() and helpers: CPU feature detection, benchHashMBps (SHA‑256/Blake2b/Keccak), and Ed25519 verify bench using deadline-bounded loops and RNG fallback.
Reporting & JSON
cmd/benchmark/report.go
Adds cryptoVerdict() and printCryptoSection(); includes CPU feature flags in text/JSON; expands JSON schema with jsonCrypto and jsonCPUFeatures; folds crypto verdict into overall verdict and score table.
Scoring & Hard Veto
cmd/benchmark/score.go
Adds cryptoCatScore() and integrates Crypto into ComputeScore; introduces minLeaderSHA256MBps hard-veto that sets Vetoed, VetoedReason, and forces grade "F" when triggered.
CLI, Runner & Tests
cmd/benchmark/CLI.md, cmd/benchmark/main.go, cmd/benchmark/runner.go, cmd/benchmark/score_test.go
Documents -skip-crypto/--skip-crypto; adds CLI flag and Config.SkipCrypto; runner conditionally executes RunCryptoBenchmark() and assigns results.CryptoResult; score_test.go adds fixtures and tests for weighting, veto behavior, and grade boundaries.
Dependencies
go.mod
Adds github.com/klauspost/cpuid/v2 dependency for CPU feature detection.

Validator CPU Preflight Gate

Layer / File(s) Summary
CPU Detection & Benchmarking
cmd/node/preflight.go
Adds detectCPU() and benchSHA256() (deterministic 16 KiB deadline-bounded SHA‑256 bench); validatorCPUPreflightWithInfo() runs bench twice, logs measurement and CPU feature flags, warns for missing flags, and errors when throughput < configured minimum unless bypassed via env or config.
Configuration
config/node/config.yaml, config/prefsConfig.go, config/prefsConfig_test.go
Adds preferences.enforceCpuPreflight setting; PreferencesConfig.EnforceCPUPreflight *bool and ShouldEnforceCPUPreflight() compute effective enforcement; unit test verifies default/explicit behavior.
Startup Integration
cmd/node/startup.go
startNode() runs validatorCPUPreflight before loading validator key and honors ShouldEnforceCPUPreflight() (abort on failure if enforced, else log a warning and continue).
Tests & Documentation
cmd/node/preflight_test.go, cmd/node/PREFLIGHT.md
Adds recordingLogger test helper and comprehensive preflight tests (happy paths, missing features, slow-bench failure, env bypass, unsupported arch), bench unit tests, a startup smoke test, and operator-facing PREFLIGHT.md documentation.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested labels

security

🚥 Pre-merge checks | ✅ 5 | ❌ 3

❌ Failed checks (3 warnings)

Check name Status Explanation Resolution
Title check ⚠️ Warning The pull request title correctly follows the required format [KLC-XXXX] type: description, but lacks an explicit type prefix (feat/fix/refactor/perf/test/docs/chore/ci). Add a type prefix to the title, e.g., 'KLC-2387 feat: validator CPU preflight + benchmark crypto category' or 'KLC-2387 chore: validator CPU preflight + benchmark crypto category'.
Docstring Coverage ⚠️ Warning Docstring coverage is 37.04% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Error Handling ⚠️ Warning JSON encoding error silently discarded in cmd/benchmark/report.go. printJSON() logs enc.Encode() failures to stderr but continues, causing exit code 0 on failure. Propagate JSON encoding errors: modify printJSON() to return error or call os.Exit(1) when enc.Encode fails, so failures signal properly to calling processes.
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Concurrency Safety ✅ Passed All new code is concurrency-safe. No goroutines spawned in production, no shared mutable state, proper mutex pairing in tests, value-semantic structures, thread-safe external library access.
State Consistency ✅ Passed Check not applicable: PR adds CPU preflight validation and benchmark infrastructure with no blockchain state modifications. All changes are measurement-only and configuration-based.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/KLC-2387-validator-cpu-preflight

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a CPU performance gate for validator startup and expands the klever-benchmark tool with a new Crypto/Hashing category, aiming to prevent validators from running on hosts that are too slow at SHA-256 to meet consensus timing constraints.

Changes:

  • Add validator startup CPU preflight based on a short SHA-256 throughput self-benchmark, with enforcement controlled by preferences.enforceCpuPreflight and an env-var bypass.
  • Add a Crypto/Hashing benchmark category (SHA-256 small/large, Blake2b, Keccak-256, Ed25519 verify) plus CPU feature attestation, and incorporate results into scoring + JSON/text reports.
  • Add/adjust operator-facing documentation for the preflight and benchmark CLI output/grade semantics.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
config/prefsConfig.go Adds EnforceCPUPreflight *bool and ShouldEnforceCPUPreflight() to default enforcement to true when YAML key is absent.
config/node/config.yaml Documents and enables enforceCpuPreflight in the sample node config.
cmd/node/startup.go Runs the CPU preflight early in startup and conditionally blocks/warns based on preferences.
cmd/node/PREFLIGHT.md New documentation explaining validator CPU preflight rationale, behavior, and overrides.
cmd/node/preflight.go New preflight implementation: CPU feature detection + SHA-256 throughput bench + gating logic.
cmd/node/preflight_test.go Adds tests for preflight branching and a smoke test for the bench function.
cmd/benchmark/score.go Adds Crypto category scoring, rebalances weights, and introduces a SHA-256 throughput-based veto cap.
cmd/benchmark/score_test.go New tests for weight totals, crypto weight, and throughput veto behavior.
cmd/benchmark/runner.go Adds SkipCrypto option and runs the crypto benchmark when enabled.
cmd/benchmark/report.go Adds crypto verdicting, text + JSON report output for crypto metrics/features, and score veto fields.
cmd/benchmark/main.go Adds -skip-crypto CLI flag wiring into the runner config.
cmd/benchmark/crypto.go New crypto benchmark implementation (hash throughput + Ed25519 verify + CPU features).
cmd/benchmark/CLI.md Updates CLI documentation to include crypto category, weights, and the SHA-256 hard-veto explanation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cmd/benchmark/crypto.go
Comment thread cmd/node/preflight.go
Comment thread cmd/node/PREFLIGHT.md Outdated
Comment thread cmd/benchmark/report.go
Comment thread cmd/benchmark/score.go Outdated
Comment thread cmd/benchmark/crypto.go Outdated
Comment thread cmd/node/preflight.go Outdated
Comment thread cmd/benchmark/crypto.go
Comment thread cmd/node/preflight.go
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/benchmark/crypto.go`:
- Around line 151-163: The hot loop is allocating each iteration via h.Sum(nil);
preallocate a reusable slice with capacity h.Size() (or max digest length)
before the loops and call h.Sum(prealloc[:0]) inside the inner loop so Sum
reuses that buffer instead of allocating; allocate the buffer once (e.g., out :=
make([]byte, 0, newHash().Size())) before deadline/start and replace h.Sum(nil)
with h.Sum(out[:0]) while keeping h.Reset(), h.Write(buf) and the bytes
accounting unchanged.

In `@cmd/benchmark/score.go`:
- Around line 48-55: Add a clarifying comment next to the minLeaderSHA256MBps
constant that references the other SHA-256 threshold (minSHA256ThroughputMBps in
preflight.go) and explains their distinct roles: minLeaderSHA256MBps (500.0) is
a benchmark scoring veto / characterization floor, while minSHA256ThroughputMBps
(800) is a stricter preflight validator startup requirement; include a short
rationale for the different values (softer gate vs. hard startup gate) and a
pointer to preflight.go for maintainers to find the stricter threshold.

In `@cmd/node/PREFLIGHT.md`:
- Around line 3-5: The sentence in PREFLIGHT.md that reads "The validator binary
runs a CPU preflight check at startup, immediately after loading the BLS signing
key." is incorrect; update that wording to reflect the actual startup order by
changing "after loading the BLS signing key" to "before loading the BLS signing
key" (or similar phrasing) so the doc matches the implementation and preserves
the intended security rationale about preflight running prior to key material
being loaded.
- Around line 106-107: Update the phrase "same SHA-NI veto" to explicitly state
the benchmark uses a throughput-based veto (measured SHA-256 MB/s) rather than a
CPU feature-bit check; reference the operator-facing doc "cmd/benchmark/CLI.md"
and change the sentence to clarify that the benchmark gates on measured SHA-256
throughput and will veto even on CPUs that report SHA-NI if measured MB/s falls
below the threshold, so operators won't mistake feature-bit presence for
adequate performance.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 47edf1b0-3e5b-46b5-b81f-5dccf8537b98

📥 Commits

Reviewing files that changed from the base of the PR and between 7e0b164 and 2a691af.

📒 Files selected for processing (13)
  • cmd/benchmark/CLI.md
  • cmd/benchmark/crypto.go
  • cmd/benchmark/main.go
  • cmd/benchmark/report.go
  • cmd/benchmark/runner.go
  • cmd/benchmark/score.go
  • cmd/benchmark/score_test.go
  • cmd/node/PREFLIGHT.md
  • cmd/node/preflight.go
  • cmd/node/preflight_test.go
  • cmd/node/startup.go
  • config/node/config.yaml
  • config/prefsConfig.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Agent
  • GitHub Check: Analyze (go)
🧰 Additional context used
📓 Path-based instructions (1)
**/*_test.go

⚙️ CodeRabbit configuration file

**/*_test.go: Test files. Review for: - Adequate coverage of edge cases and error paths - Proper use of test helpers and assertions - Race condition coverage (tests should use -race flag patterns) - No hardcoded sleep for synchronization (use channels or sync primitives) - Test isolation (no shared mutable state between tests)

Files:

  • cmd/benchmark/score_test.go
  • cmd/node/preflight_test.go
🧠 Learnings (1)
📚 Learning: 2026-04-21T20:12:22.959Z
Learnt from: phcarneirobc
Repo: klever-io/klever-go PR: 38
File: indexer/eventsProcessor.go:188-211
Timestamp: 2026-04-21T20:12:22.959Z
Learning: In Go structs that are JSON-marshaled, if a field is a `bool` and has the `json:"...,omitempty"` tag, then leaving that field at its zero value (`false`) is functionally equivalent (in the resulting JSON) to explicitly setting `Foundation: false`. Reviewers should not flag struct literals that omit such `bool` fields as an inconsistency; they will serialize identically because `omitempty` suppresses `false` values.

Applied to files:

  • cmd/node/startup.go
  • cmd/benchmark/runner.go
  • cmd/benchmark/main.go
  • config/prefsConfig.go
  • cmd/benchmark/score_test.go
  • cmd/node/preflight.go
  • cmd/benchmark/crypto.go
  • cmd/benchmark/score.go
  • cmd/node/preflight_test.go
  • cmd/benchmark/report.go
🔇 Additional comments (16)
config/prefsConfig.go (1)

10-31: Default-on preflight semantics are correctly encoded.

Using *bool plus ShouldEnforceCPUPreflight() cleanly distinguishes omitted config from explicit false and keeps the default safe.

config/node/config.yaml (1)

63-73: Config documentation is clear and operationally actionable.

The comments explain threshold, default behavior, and emergency override in a way operators can apply directly.

cmd/node/startup.go (1)

252-261: Preflight placement and enforcement branching look correct.

Running the gate before key loading and downgrading only when enforceCpuPreflight=false is the right startup control flow.

cmd/benchmark/runner.go (1)

106-114: Crypto benchmark integration is consistent with the existing runner pattern.

Skip gating, stderr progress output, wrapped errors, and result assignment follow the established flow cleanly.

cmd/benchmark/main.go (1)

34-35: --skip-crypto flag wiring is complete and correct.

The new flag is declared, parsed, and propagated into Config as expected.

Also applies to: 100-101

cmd/benchmark/CLI.md (1)

20-21: CLI and scoring docs for crypto category are clear and complete.

The new flag, category weight, veto behavior, and JSON fields are documented in a practical operator-focused way.

Also applies to: 64-104

cmd/benchmark/score_test.go (1)

1-181: LGTM!

The test file demonstrates good practices:

  • Thread-safe test fixtures with no shared mutable state between tests
  • No hardcoded sleeps for synchronization
  • Table-driven tests used appropriately (TestComputeScore_RebalancedWeights, TestScoreGrade_Boundaries)
  • Good coverage of edge cases: nil results, empty results, veto boundaries, and category weights
  • The excellentResults() helper cleanly isolates test fixtures
cmd/benchmark/score.go (2)

159-204: LGTM — scoring logic and veto implementation are correct.

The ComputeScore function:

  • Correctly aggregates all seven categories including the new Crypto category
  • Applies the SHA-256 veto after computing the score, ensuring Total still reflects actual performance while Grade is capped to "F"
  • The veto fires on measured throughput rather than CPU flags, matching the documented design rationale

277-288: LGTM — cryptoCatScore follows established patterns.

The implementation mirrors other category score functions with proper nil handling and uses mean() to weight all five crypto metrics equally.

cmd/node/preflight.go (2)

134-164: LGTM — benchmark implementation is sound.

benchSHA256:

  • Amortizes time.Now() syscall overhead by checking deadline only once per 256 iterations
  • Falls back to deterministic buffer content if crypto/rand.Read fails, ensuring the benchmark remains representative
  • Correctly handles edge cases (non-positive duration returns 0)
  • The inner-loop batch approach means the final measurement may slightly exceed the deadline, but this is standard practice and produces consistent results

76-124: LGTM — preflight logic is well-structured.

The implementation:

  • Uses dependency injection for CPU info and bench function, enabling full test coverage without mocking globals
  • Runs the benchmark twice and takes the maximum, mitigating false negatives from transient throttling (thermal, noisy neighbor)
  • Logs feature flags as informational even when they don't hard-fail, giving operators diagnostic context
  • Produces actionable error messages with migration guidance and override instructions
cmd/node/preflight_test.go (2)

14-66: LGTM — thread-safe test logger implementation.

recordingLogger:

  • Uses sync.Mutex correctly for all read/write operations
  • hasInfo and hasWarn helpers hold the lock for the entire iteration, preventing races if a future test invokes logging concurrently

74-244: LGTM — comprehensive test coverage.

The tests cover:

  • Happy paths for both amd64 and arm64 architectures
  • Feature-warning paths (missing AVX-512 IFMA, missing SHA-NI)
  • Throughput enforcement (slow bench → error, fast bench → pass with warn)
  • Environment bypass (KLEVER_SKIP_CPU_CHECK=1)
  • Unsupported architecture handling (no-op)
  • Edge cases for benchSHA256 (non-positive duration)
  • Production wiring smoke test

The use of t.Setenv ensures proper test isolation, and fastBench enables deterministic testing without timing dependencies.

cmd/benchmark/report.go (3)

268-287: LGTM — cryptoVerdict follows established patterns.

The function correctly checks all five crypto metrics against their respective fail/pass thresholds in the expected order (fail conditions first, then warn conditions).


731-745: LGTM — JSON schema additions are consistent.

The jsonScore struct correctly includes:

  • Vetoed bool without omitempty — always present, providing explicit status in JSON output
  • VetoedReason string with omitempty — only present when there's a reason to report

This design ensures consumers always know the veto status without checking for field existence.


907-938: LGTM — crypto results and score correctly wired into JSON output.

The crypto result population includes all measurement fields and CPU feature flags, and the score section correctly includes the crypto category alongside the veto fields.

Comment thread cmd/benchmark/crypto.go
Comment thread cmd/benchmark/score.go
Comment thread cmd/node/PREFLIGHT.md Outdated
Comment thread cmd/node/PREFLIGHT.md Outdated
- Preallocate digest buffer in benchSHA256 / benchHashMBps to remove
  per-iteration allocations from the hash hot loop (verified: 2039
  allocs/op -> 0 allocs/op). Improves measurement stability on hosts
  where GC pressure could mask SHA-NI throughput.
- Update PREFLIGHT.md: preflight runs immediately *before* loading the
  BLS key (not after), matching the implementation.
- Fix doc drift: BenchmarkScore.Vetoed and CryptoResult.HasSHA_NI now
  correctly describe throughput-based gating rather than SHA-NI feature
  flag. HasSHA_NI doc clarifies the cross-platform shorthand (SHA-NI on
  amd64, ARMv8 SHA2 on arm64).
- Document the relationship between cryptoSHA256LargeFailMBps (600,
  per-metric verdict) and minLeaderSHA256MBps (500, hard grade veto).
- Surface preferences.enforceCpuPreflight=false escape hatch in the
  preflight error message so operators discover the warn-only mode
  without needing to find the docs first.
- go mod tidy: promote klauspost/cpuid/v2 from indirect to direct dep
  since cmd/node/preflight.go imports it directly.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/benchmark/crypto.go`:
- Around line 65-76: Rename the misleading CryptoResult field HasSHA_NI to an
architecture-neutral name (e.g. HasSHA256Accel or HasSHAAccel) in the
CryptoResult type and update every usage site to the new identifier
(constructors, serializers, JSON tags, CLI output, tests, and any consumer code)
so the CLI/JSON no longer exposes the x86-specific term; ensure the struct
field's JSON tag (if present) and any documentation/comments are updated to the
new name to keep external API and docs consistent.

In `@cmd/benchmark/score.go`:
- Around line 48-54: The comment for minLeaderSHA256MBps is stale and claims it
“matches the existing fail floor for SHA-256 16 KiB blocks” while the actual
threshold constant cryptoSHA256LargeFailMBps is 600 MB/s in report.go; update
the comment in the minLeaderSHA256MBps block to reflect the true relationship
(either change the 500 MB/s wording to note the 600 MB/s fail floor or remove
the “matches” sentence) and ensure the rationale cites cryptoSHA256LargeFailMBps
and the calibrated numbers consistently (refer to minLeaderSHA256MBps and
cryptoSHA256LargeFailMBps when editing).

In `@cmd/node/preflight.go`:
- Around line 141-147: The preflight benchmark currently calls
crypto/rand.Read(buf) which can block; remove that call and always initialize
the buffer deterministically (e.g., fill buf with a repeating or index-based
pattern such as buf[i] = byte(i)) so the SHA-256 throughput measurement uses
stable, non-blocking input; update the code around the buf variable and the
existing fallback loop to perform the deterministic fill unconditionally and
remove the crypto/rand dependency in this setup.

In `@cmd/node/PREFLIGHT.md`:
- Around line 97-102: Change the absolutist phrasing that "CCX (dedicated AMD
EPYC) and CPX (shared AMD EPYC) instances always satisfy the preflight" to a
softer, observational statement (e.g., "have been observed to satisfy the
preflight") and likewise clarify that the CX series may include Skylake-class
instances that do not; keep the existing concrete guidance to run the
klever-benchmark command (klever-benchmark --skip-disk --skip-network --skip-kv
--skip-memory --skip-goroutine --skip-bignum) as the definitive check before
deploying a validator, and update the surrounding text for consistency so
operators rely on the benchmark rather than fixed SKUs (references: CCX, CPX,
CX, Skylake-class, and klever-benchmark).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 54329bbc-0ea6-479d-ac38-03fbaea65380

📥 Commits

Reviewing files that changed from the base of the PR and between 2a691af and 2a056dc.

📒 Files selected for processing (6)
  • cmd/benchmark/crypto.go
  • cmd/benchmark/report.go
  • cmd/benchmark/score.go
  • cmd/node/PREFLIGHT.md
  • cmd/node/preflight.go
  • go.mod
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test
  • GitHub Check: Analyze (go)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-04-21T20:12:22.959Z
Learnt from: phcarneirobc
Repo: klever-io/klever-go PR: 38
File: indexer/eventsProcessor.go:188-211
Timestamp: 2026-04-21T20:12:22.959Z
Learning: In Go structs that are JSON-marshaled, if a field is a `bool` and has the `json:"...,omitempty"` tag, then leaving that field at its zero value (`false`) is functionally equivalent (in the resulting JSON) to explicitly setting `Foundation: false`. Reviewers should not flag struct literals that omit such `bool` fields as an inconsistency; they will serialize identically because `omitempty` suppresses `false` values.

Applied to files:

  • cmd/node/preflight.go
  • cmd/benchmark/crypto.go
  • cmd/benchmark/score.go
  • cmd/benchmark/report.go
🔇 Additional comments (3)
cmd/node/preflight.go (1)

81-126: Preflight gating flow is clean and operationally actionable.

The bypass/unsupported-arch/measure/fail branches are well partitioned, and the failure message includes concrete operator escape hatches.

cmd/node/PREFLIGHT.md (1)

3-84: Doc-to-implementation alignment looks solid.

Startup ordering, enforcement modes, override behavior, and threshold semantics are clearly documented and consistent with the code path.

cmd/benchmark/crypto.go (1)

157-168: Good call reusing the digest buffer in the hot loop.

This removes per-iteration allocations from the benchmark path, so the throughput numbers are much less exposed to GC noise.

Comment thread cmd/benchmark/crypto.go
Comment thread cmd/benchmark/score.go Outdated
Comment thread cmd/node/preflight.go Outdated
Comment thread cmd/node/PREFLIGHT.md Outdated
fbsobreira added 3 commits May 7, 2026 21:27
Sonar flagged the printf template "  %-32s  %7.1f MB/s  %s
(pass≥%.0f, fail<%.0f MB/s)\n" as duplicated 5 times across
printNetworkSection and printCryptoSection. Centralised as
metricThroughputMBpsRowFmt so future column-width or label-format
changes stay consistent across sections.
Sonar flagged the new safe-default accessor as uncovered. The function
is load-bearing for the rollout safety guarantee (absent YAML key must
default to enforce=true so upgrading operators are not silently
downgraded to warn-only), so the absent-key branch is the most
important one to lock down. Test covers all three states: nil pointer,
explicit *true, explicit *false.
…docs

- preflight.go: replace crypto/rand seed with deterministic init.
  rand.Read can block on entropy starvation during early boot, which
  could delay validator startup before preflight even runs. SHA-256
  amd64/arm64 fast paths are data-independent so the buffer contents
  do not affect measured throughput.
- score.go: stale comment claimed minLeaderSHA256MBps "matches the
  existing fail floor"; actual fail floor is 600 MB/s vs 500 veto.
  Document the intentional gap and explain why behavior stays
  consistent (category-fail path in gradeToVerdict).
- PREFLIGHT.md: soften absolute "always satisfy" claim about Hetzner
  CCX/CPX SKUs to "typically" + emphasise running the benchmark to
  confirm. Cloud SKU hardware can change over time.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/benchmark/report.go`:
- Around line 372-375: The CPU feature reporting currently hardcodes the label
and field for SHA-NI (using results.CryptoResult and c.HasSHA_NI), which is
incorrect on arm64; change the report and serialization to be architecture-aware
by checking SystemInfo.GOARCH (or equivalent) and selecting the appropriate
label/field name (e.g., "SHA-NI" for amd64 vs "ARMv8 SHA2" for arm64) before
calling yesNo(c.HasSHA_NI) (or the corresponding bit/flag on CryptoResult);
update all occurrences (the printf block around results.CryptoResult, and the
similar blocks at the other listed ranges) so the printed label and exported
JSON key match the selected arch-neutral or arch-specific name.

In `@cmd/benchmark/score.go`:
- Around line 194-208: The veto message currently always suggests "missing
SHA-NI on amd64"; update the SHA-256 throughput veto in ComputeScore (the block
that checks r.CryptoResult and c.SHA256LargeMBps < minLeaderSHA256MBps and sets
s.VetoedReason) to branch on the host architecture available on the
BenchmarkResults (the r value) and produce an architecture-appropriate
remediation string: for amd64 keep the existing "missing SHA-NI" wording, for
arm64 mention missing ARM SHA extensions or use a generic "missing SHA-256
acceleration / hardware support" message, while preserving the formatted
throughput numbers (c.SHA256LargeMBps and minLeaderSHA256MBps) and still setting
s.Vetoed = true and s.Grade = "F".

In `@cmd/node/preflight.go`:
- Around line 96-100: The preflight logs and error messages use the
arch-specific label "sha_ni" and remediation text tied to AMD64 even when
info.arch may be arm64; update logging and error text to be architecture-aware
or arch-neutral: replace the hardcoded "sha_ni" label with an arch-neutral key
(e.g., "sha256_accel") or compute the label from info.arch, and when emitting
messages or errors that reference missing SHA support, derive the remediation
string from info.arch (e.g., mention "SHA-NI on amd64" vs "ARMv8 SHA2 on arm64")
so both log.Info calls (the one using info.hasSHA and the other similar blocks
later) and any failure/error messages reference the correct feature for
info.arch.

In `@cmd/node/PREFLIGHT.md`:
- Around line 3-5: The current text overstates the guarantee by claiming a host
that fails the preflight "never unlocks the BLS key" without qualifying
enforcement; update the wording in PREFLIGHT.md to scope the guarantee to the
enforced default path (e.g., mention that the key is not loaded when the CPU
preflight is enabled/enforced), and reference the configuration options
preferences.enforceCpuPreflight and environment variable KLEVER_SKIP_CPU_CHECK
to note that disabling these allows startup to continue; ensure the sentence
around the preflight description and the key-loading guarantee explicitly states
"when CPU preflight is enforced (default)" or similar to match implementation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f09a930b-a26b-43dc-9a68-e77e25213e8f

📥 Commits

Reviewing files that changed from the base of the PR and between 2a056dc and 8fc73ba.

📒 Files selected for processing (5)
  • cmd/benchmark/report.go
  • cmd/benchmark/score.go
  • cmd/node/PREFLIGHT.md
  • cmd/node/preflight.go
  • config/prefsConfig_test.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test
  • GitHub Check: Analyze (go)
🧰 Additional context used
📓 Path-based instructions (1)
**/*_test.go

⚙️ CodeRabbit configuration file

**/*_test.go: Test files. Review for: - Adequate coverage of edge cases and error paths - Proper use of test helpers and assertions - Race condition coverage (tests should use -race flag patterns) - No hardcoded sleep for synchronization (use channels or sync primitives) - Test isolation (no shared mutable state between tests)

Files:

  • config/prefsConfig_test.go
🧠 Learnings (1)
📚 Learning: 2026-04-21T20:12:22.959Z
Learnt from: phcarneirobc
Repo: klever-io/klever-go PR: 38
File: indexer/eventsProcessor.go:188-211
Timestamp: 2026-04-21T20:12:22.959Z
Learning: In Go structs that are JSON-marshaled, if a field is a `bool` and has the `json:"...,omitempty"` tag, then leaving that field at its zero value (`false`) is functionally equivalent (in the resulting JSON) to explicitly setting `Foundation: false`. Reviewers should not flag struct literals that omit such `bool` fields as an inconsistency; they will serialize identically because `omitempty` suppresses `false` values.

Applied to files:

  • config/prefsConfig_test.go
  • cmd/benchmark/score.go
  • cmd/node/preflight.go
  • cmd/benchmark/report.go
🔇 Additional comments (3)
config/prefsConfig_test.go (3)

1-4: LGTM!

Package declaration and imports are appropriate for a unit test file.


5-15: LGTM!

The nil pointer test correctly validates the critical safe-default behavior. When operators upgrade and their existing YAML lacks the new EnforceCPUPreflight key, the system will enforce the CPU preflight by default, preventing silent security/performance downgrades.


17-31: LGTM!

Both explicit true/false cases are correctly tested. The local variable pattern (v := true; &v) is a standard Go idiom for creating bool pointers. Test coverage is complete for this boolean pointer accessor.

Comment thread cmd/benchmark/report.go
Comment thread cmd/benchmark/score.go
Comment thread cmd/node/preflight.go
Comment thread cmd/node/PREFLIGHT.md Outdated
CodeRabbit's third pass surfaced a real correctness issue: text
report, JSON output, log keys, and remediation messages all hardcoded
"SHA-NI" — which is the Intel/AMD x86 brand. On arm64 the equivalent
ISA is ARMv8 SHA2, so an operator running on a slow arm64 validator
was being misdirected to migrate from SHA-NI-deficient x86 hardware
when the actual root cause is missing ARMv8 SHA2.

Rename CryptoResult.HasSHA_NI -> HasSHAAccel; JSON tag sha_ni ->
sha_accel; preflight log key sha_ni -> sha_accel. Add shaAccelName(arch)
helper that returns "SHA-NI" / "ARMv8 SHA2" / generic per architecture
and use it in: text CPU section, post-crypto warning, JSON surface,
score VetoedReason, preflight measurement log, preflight Warn line,
preflight failure error.

Qualify the PREFLIGHT.md key-loading guarantee to scope it to the
enforced default path, since enforceCpuPreflight=false and
KLEVER_SKIP_CPU_CHECK=1 deliberately allow startup past failure.

Verified end-to-end:
- amd64 Skylake host: text shows "SHA-NI=no", JSON has "sha_accel":
  false, veto_reason cites "missing SHA-NI on amd64 (Skylake-X /
  Cascade Lake / Haswell)".
- arm64 (M4 Max) local: text shows "ARMv8 SHA2=yes" instead of the
  previous misleading "SHA-NI=yes".
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/benchmark/crypto.go`:
- Around line 223-237: The ed25519 verification loop repeatedly calls
time.Now().Before(deadline) for every op causing excessive syscall overhead on
fast hosts; modify the loop in the benchmark function that performs
ed25519.Verify (the block using variables deadline, start, count and calling
ed25519.Verify(pub, msg, sig)) to use an inner batch loop: choose a small batch
size (e.g., 64 or 256), run that many ed25519.Verify calls incrementing count,
then check time.Now().After(deadline) to break; ensure you stop mid-batch if
deadline passed and compute elapsed and ops/sec the same way so semantics and
final count remain correct.

In `@cmd/benchmark/score_test.go`:
- Around line 98-137: Add a boundary unit test that sets
r.CryptoResult.SHA256LargeMBps = minLeaderSHA256MBps and calls ComputeScore to
assert the throughput veto does not trigger; specifically create a
TestComputeScore_ThroughputVeto_ExactBoundary test (analogous to the existing
tests) that verifies s.Vetoed is false (and optionally that s.Grade != "F") to
confirm the comparison is strict (<) rather than <= at the minLeaderSHA256MBps
boundary.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: cb10c2e6-9aec-42c5-aeb1-2f60c07c8a07

📥 Commits

Reviewing files that changed from the base of the PR and between 8fc73ba and 9bdbaea.

📒 Files selected for processing (6)
  • cmd/benchmark/crypto.go
  • cmd/benchmark/report.go
  • cmd/benchmark/score.go
  • cmd/benchmark/score_test.go
  • cmd/node/PREFLIGHT.md
  • cmd/node/preflight.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test
  • GitHub Check: Analyze (go)
🧰 Additional context used
📓 Path-based instructions (1)
**/*_test.go

⚙️ CodeRabbit configuration file

**/*_test.go: Test files. Review for: - Adequate coverage of edge cases and error paths - Proper use of test helpers and assertions - Race condition coverage (tests should use -race flag patterns) - No hardcoded sleep for synchronization (use channels or sync primitives) - Test isolation (no shared mutable state between tests)

Files:

  • cmd/benchmark/score_test.go
🧠 Learnings (1)
📚 Learning: 2026-04-21T20:12:22.959Z
Learnt from: phcarneirobc
Repo: klever-io/klever-go PR: 38
File: indexer/eventsProcessor.go:188-211
Timestamp: 2026-04-21T20:12:22.959Z
Learning: In Go structs that are JSON-marshaled, if a field is a `bool` and has the `json:"...,omitempty"` tag, then leaving that field at its zero value (`false`) is functionally equivalent (in the resulting JSON) to explicitly setting `Foundation: false`. Reviewers should not flag struct literals that omit such `bool` fields as an inconsistency; they will serialize identically because `omitempty` suppresses `false` values.

Applied to files:

  • cmd/benchmark/score_test.go
  • cmd/benchmark/score.go
  • cmd/benchmark/crypto.go
  • cmd/node/preflight.go
  • cmd/benchmark/report.go
🔇 Additional comments (11)
cmd/benchmark/crypto.go (2)

1-113: LGTM — crypto benchmark entry point and result structure are well-designed.

The RunCryptoBenchmark function properly sequences sub-benchmarks, handles errors, and returns a complete CryptoResult. The architecture-aware shaAccelName and hasSHAAcceleration helpers correctly distinguish SHA-NI (amd64) from ARMv8 SHA2 (arm64).


180-206: LGTM — hash benchmark loop is well-optimized.

The preallocation of digest (line 188) eliminates per-iteration allocations, and the innerLoop batching (256 iterations before deadline check) minimizes time.Now() syscall overhead. The rand.Read fallback for buffer seeding is appropriate here since this is an operator-run tool, not the boot-path preflight.

cmd/node/preflight.go (2)

64-128: LGTM — preflight logic is robust and well-documented.

The test seam via validatorCPUPreflightWithInfo enables thorough testing without mocking globals. Running the benchmark twice and taking the maximum handles transient throttling. The arch-aware error messages (shaAccelName, shaAccelArchSuffix) correctly guide operators on both amd64 and arm64.


167-198: LGTM — benchSHA256 is correctly implemented for the startup path.

Deterministic buffer initialization (line 176-178) avoids crypto/rand entropy starvation during early boot. The innerLoop batching and deadline-based termination match the benchmark tool's approach. The preallocation of digest (line 180) eliminates GC pressure during measurement.

cmd/benchmark/score_test.go (1)

162-181: LGTM — grade boundary tests are thorough.

Table-driven approach cleanly covers all grade thresholds (S/A/B/C/D/F). Test isolation is good — each case is independent.

cmd/benchmark/score.go (2)

167-214: LGTM — scoring with crypto category and hard veto is correctly implemented.

The veto logic properly:

  1. Checks CryptoResult != nil before accessing fields
  2. Uses strict < comparison against the 500 MB/s floor
  3. Sets both Vetoed and VetoedReason with arch-aware messaging via shaAccelName(r.SystemInfo.GOARCH)
  4. Forces Grade = "F" regardless of point total

The category weight sum is correct (200+200+200+150+100+100+50 = 1000).


287-298: LGTM — crypto category score normalizer.

The cryptoCatScore function correctly averages five normalized metrics (SHA-256 small/large, Blake2b, Keccak-256, Ed25519 verify). Consistent with other category normalizers in this file.

cmd/node/PREFLIGHT.md (1)

1-118: LGTM — comprehensive and accurate operator documentation.

The document properly addresses:

  • Key-loading guarantee scoped to enforced mode (lines 4-10)
  • Measured-throughput veto vs feature-bit check distinction (lines 27-31, 115-117)
  • Threshold relationship: startup 800 MB/s vs benchmark 500 MB/s (lines 85-90)
  • Cloud SKU guidance qualified as "based on current fleet observations" (lines 103-110)
  • Emergency bypass semantics with fail-closed typo protection (lines 72-73)
cmd/benchmark/report.go (3)

372-376: LGTM — CPU feature header is now architecture-aware.

The text report correctly uses shaAccelName(si.GOARCH) to display "SHA-NI" on amd64 and "ARMv8 SHA2" on arm64, addressing the previous review concern about hardcoded x86 terminology.


574-609: LGTM — crypto section with arch-aware remediation.

The printCryptoSection function:

  1. Renders all five metrics with consistent formatting via metricThroughputMBpsRowFmt
  2. Shows SHA acceleration warning only on supported architectures (amd64 or arm64)
  3. Uses shaAccelName(runtime.GOARCH) for the correct feature name per platform

817-936: LGTM — JSON output correctly exposes crypto results and veto fields.

The jsonCrypto and jsonCPUFeatures structs use appropriate field names (sha_accel instead of sha_ni). The score object includes vetoed and veto_reason fields for programmatic consumption. All crypto metrics are exposed with clear naming conventions (sha256_1k_mbps, sha256_16k_mbps, etc.).

Comment thread cmd/benchmark/crypto.go
Comment on lines +223 to +237
deadline := time.Now().Add(cryptoBenchDuration)
start := time.Now()
var count int64
for time.Now().Before(deadline) {
if !ed25519.Verify(pub, msg, sig) {
return 0, fmt.Errorf("ed25519 self-verify failed unexpectedly")
}
count++
}
elapsed := time.Since(start).Seconds()
if elapsed <= 0 {
return 0, nil
}
return float64(count) / elapsed, nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Ed25519 verify loop lacks batching — potential time.Now() overhead on fast hosts.

Unlike benchHashMBps, this loop calls time.Now().Before(deadline) on every single verification. On a host achieving 25K ops/s, that's 50K+ syscalls over 2 seconds. While the impact is less severe than in hashing (verify is ~40µs vs ~300ns per hash), adding an inner batch loop would improve measurement accuracy on high-end hardware.

♻️ Optional: batch the deadline check
 func benchEd25519Verify() (float64, error) {
 	...
 	deadline := time.Now().Add(cryptoBenchDuration)
 	start := time.Now()
 	var count int64
+	const innerLoop = 64
 	for time.Now().Before(deadline) {
-		if !ed25519.Verify(pub, msg, sig) {
-			return 0, fmt.Errorf("ed25519 self-verify failed unexpectedly")
+		for i := 0; i < innerLoop; i++ {
+			if !ed25519.Verify(pub, msg, sig) {
+				return 0, fmt.Errorf("ed25519 self-verify failed unexpectedly")
+			}
 		}
-		count++
+		count += innerLoop
 	}
 	...
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
deadline := time.Now().Add(cryptoBenchDuration)
start := time.Now()
var count int64
for time.Now().Before(deadline) {
if !ed25519.Verify(pub, msg, sig) {
return 0, fmt.Errorf("ed25519 self-verify failed unexpectedly")
}
count++
}
elapsed := time.Since(start).Seconds()
if elapsed <= 0 {
return 0, nil
}
return float64(count) / elapsed, nil
}
deadline := time.Now().Add(cryptoBenchDuration)
start := time.Now()
var count int64
const innerLoop = 64
for time.Now().Before(deadline) {
for i := 0; i < innerLoop; i++ {
if !ed25519.Verify(pub, msg, sig) {
return 0, fmt.Errorf("ed25519 self-verify failed unexpectedly")
}
}
count += innerLoop
}
elapsed := time.Since(start).Seconds()
if elapsed <= 0 {
return 0, nil
}
return float64(count) / elapsed, nil
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/benchmark/crypto.go` around lines 223 - 237, The ed25519 verification
loop repeatedly calls time.Now().Before(deadline) for every op causing excessive
syscall overhead on fast hosts; modify the loop in the benchmark function that
performs ed25519.Verify (the block using variables deadline, start, count and
calling ed25519.Verify(pub, msg, sig)) to use an inner batch loop: choose a
small batch size (e.g., 64 or 256), run that many ed25519.Verify calls
incrementing count, then check time.Now().After(deadline) to break; ensure you
stop mid-batch if deadline passed and compute elapsed and ops/sec the same way
so semantics and final count remain correct.

Comment on lines +98 to +137
func TestComputeScore_ThroughputVeto_CapsGradeAtF(t *testing.T) {
r := excellentResults()
// Bench measured below the leader-mode floor (e.g., a Skylake/Haswell
// without SHA-NI typically lands around 250 MB/s).
r.CryptoResult.SHA256LargeMBps = 250

s := ComputeScore(r)
if !s.Vetoed {
t.Fatal("expected Vetoed=true when SHA-256 16K throughput below the floor")
}
if s.Grade != "F" {
t.Fatalf("Grade = %q, want F when veto applies", s.Grade)
}
if s.VetoedReason == "" {
t.Fatal("expected VetoedReason to be populated")
}
// The numeric score should still be substantial — the veto is a grade-cap,
// not a silent zero. Operators get to see how the rest of the system performs.
// (The Crypto category itself will score low because of the bad throughput,
// but the other six categories were set to excellent in this fixture.)
if s.Total < 700 {
t.Fatalf("Total = %d, expected non-veto categories to still score normally", s.Total)
}
}

func TestComputeScore_ThroughputVeto_DoesNotApply_AboveFloor(t *testing.T) {
r := excellentResults()
// Throughput just above the floor — veto must not trigger even though
// the host could in principle be a non-SHA-NI amd64.
r.CryptoResult.SHA256LargeMBps = minLeaderSHA256MBps + 1
r.CryptoResult.HasSHAAccel = false

s := ComputeScore(r)
if s.Vetoed {
t.Fatalf("Vetoed must be false when throughput is above floor; reason: %s", s.VetoedReason)
}
if s.Grade == "F" {
t.Fatalf("Grade = F unexpectedly when throughput is above floor")
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider adding a test case at the exact veto boundary.

The tests cover 250 (below) and minLeaderSHA256MBps + 1 (above), but not the exact boundary value 500.0. Floating-point boundary conditions can be subtle — a test at exactly 500.0 would confirm whether < vs <= is intended.

♻️ Optional: add boundary test
func TestComputeScore_ThroughputVeto_ExactBoundary(t *testing.T) {
	r := excellentResults()
	r.CryptoResult.SHA256LargeMBps = minLeaderSHA256MBps // exactly 500.0

	s := ComputeScore(r)
	if s.Vetoed {
		t.Fatal("Vetoed must be false at exact boundary (< is used, not <=)")
	}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/benchmark/score_test.go` around lines 98 - 137, Add a boundary unit test
that sets r.CryptoResult.SHA256LargeMBps = minLeaderSHA256MBps and calls
ComputeScore to assert the throughput veto does not trigger; specifically create
a TestComputeScore_ThroughputVeto_ExactBoundary test (analogous to the existing
tests) that verifies s.Vetoed is false (and optionally that s.Grade != "F") to
confirm the comparison is strict (<) rather than <= at the minLeaderSHA256MBps
boundary.

Comment thread cmd/benchmark/crypto.go
func shaCommonCauseSuffix(arch string) string {
switch arch {
case "amd64":
return " on amd64 (Skylake-X / Cascade Lake / Haswell)"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

every amd64 is one of thoose 3?
i would replace with something like "Likely a x,y,z"

@fbsobreira fbsobreira marked this pull request as draft May 12, 2026 01:26
@fbsobreira fbsobreira added the WIP work in progress label May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

security WIP work in progress

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants