feat(coprocessor): otel cardinality cleanup, dfg pattern attribution, spanmetrics #2019
🧪 CI Insights: Here's what we observed from your CI run for 5f61c98. 🟢 All jobs passed! But CI Insights is watching 👀
@claude /pr-review
Claude finished @Eikix's task in 3m 20s
[AUTOMATED] Code Review: OTel Cardinality Cleanup, DFG Pattern Attribution, Spanmetrics
Summary
This is a high-quality, well-structured PR that delivers comprehensive OTel observability improvements. The cardinality cleanup is thorough and consistent, the pattern attribution algorithm is clever and well-tested, and the infrastructure changes are production-ready. The previous review's P2 (…
Verdict: No blocking issues found. A few observations and one minor nit below.
Findings
- Nit: The …
- Observation: The LRU dedup cache (…
- Observation: The …
- Observation: Pattern stability depends on node insertion order. The encoding (…
Acceptance Criteria Matrix
Architecture Highlights
Minor Residual Risks
Overall: This is a thorough, well-tested PR that significantly improves the coprocessor's observability posture. No blocking issues. Ready for merge.
Summary
This PR finishes the coprocessor OTEL cleanup in four parts:
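- span cardinality cleanup across `host-listener`, `tfhe-worker`, `sns-worker`, `zkproof-worker`, and `transaction-sender`
- scheduler `operation_pattern_id` and `transaction_pattern_id` span attributes
- a local spanmetrics pipeline
- tfhe-worker pattern integration test hardening

Closes zama-ai/fhevm-internal#1044, zama-ai/fhevm-internal#1013, zama-ai/fhevm-internal#1012.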
What changed
1. Coprocessor span cardinality cleanup
- Removed `handle`, `txn_id`, `request_id`, and similar per-item fields from runtime spans
- Replaced `parent: &span` call sites with `span.in_scope()` / equivalent scoped flow so parent-child relationships stay correct
- `coprocessor/fhevm-engine/.sqlx`

Main files:
- `coprocessor/fhevm-engine/host-listener/src/database/tfhe_event_propagate.rs`
- `coprocessor/fhevm-engine/tfhe-worker/src/tfhe_worker.rs`
- `coprocessor/fhevm-engine/sns-worker/src/aws_upload.rs`
- `coprocessor/fhevm-engine/sns-worker/src/executor.rs`
- `coprocessor/fhevm-engine/sns-worker/src/squash_noise.rs`
- `coprocessor/fhevm-engine/transaction-sender/src/ops/*.rs`
- `coprocessor/fhevm-engine/zkproof-worker/src/verifier.rs`

2. Scheduler pattern attribution
- `operation_pattern_id` and `transaction_pattern_id` attribution in the scheduler
- `scheduler/src/dfg/pattern/encoding.rs`
- `scheduler/src/dfg/pattern/grouping.rs`
- `scheduler/src/dfg/pattern/types.rs`
- `scheduler/src/dfg/pattern/tests.rs`
- Updated `scheduler/src/dfg/scheduler.rs` to emit the new pattern attributes while keeping `transaction_id` on the transaction root span

Main files:
- `coprocessor/fhevm-engine/scheduler/src/dfg.rs`
- `coprocessor/fhevm-engine/scheduler/src/dfg/scheduler.rs`
- `coprocessor/fhevm-engine/scheduler/src/dfg/pattern/*`

3. Local spanmetrics pipeline
Main files:
- `test-suite/fhevm/config/otel-collector/otel-collector-config.yaml`
- `test-suite/fhevm/docker-compose/tracing-docker-compose.yml`
- `test-suite/fhevm/config/prometheus/prometheus.yml`
- `.gitignore`

4. tfhe-worker pattern integration test hardening
- `pattern_integration` test module for realistic encrypted-transfer shapes
- `transaction_id` emitted on `execute_transaction`, so the assertions ignore cross-test span pollution without serializing the suite

Main files:
- `coprocessor/fhevm-engine/tfhe-worker/src/tests/pattern_integration.rs`
- `coprocessor/fhevm-engine/tfhe-worker/src/tests/utils.rs`
- `coprocessor/fhevm-engine/tfhe-worker/src/tests/mod.rs`

Follow-up
Commit map
- `feat(telemetry): add JSON log-trace correlation`
- `refactor(coprocessor): standardize low-cardinality span attribution`
- `feat(infra): add OTEL collector spanmetrics pipeline`
- `feat(scheduler): add DFG pattern attribution with scalable encoding`
- `fix(deps): pin tfhe-cuda-backend to 0.13.0 in lockfile`
- `test(tfhe-worker): harden pattern integration tracing assertions`
- `fix(tfhe-worker): align compressed ciphertext plumbing with scheduler`
- `refactor(coprocessor): keep tx ids on root spans only`

Validation
- `cargo check` and clippy passed while rewriting the branch history
- `SQLX_OFFLINE=true cargo check -p fhevm-engine-common -p scheduler -p tfhe-worker -p transaction-sender -p sns-worker -p zkproof-worker -p host-listener --tests --quiet`
- `cargo test -p scheduler pattern::tests --quiet`
- `SQLX_OFFLINE=true cargo test -p tfhe-worker test_erc20_transaction_pattern_ids --no-run`

Review notes
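For reviewers unfamiliar with the idea behind `operation_pattern_id` and `transaction_pattern_id`, here is a minimal sketch of shape-only pattern hashing. It is not the code in `scheduler/src/dfg/pattern/`; the `Node` type, the `pattern_id` and `fnv1a` functions, the op labels, and the choice of FNV-1a are all assumptions made for illustration. The point it shows is that the ID is derived from operations and edges only, so per-item values such as handles never reach the span attribute.

```rust
/// Hypothetical DFG node, for illustration only.
struct Node {
    /// Operation label (assumed names such as "ge" or "select").
    op: &'static str,
    /// Per-item identifier. Deliberately NOT part of the pattern.
    #[allow(dead_code)]
    handle: u64,
    /// Indices of the producer nodes this node reads from.
    inputs: Vec<usize>,
}

/// 64-bit FNV-1a, used here because its output is fixed by the algorithm
/// (an assumption of this sketch, not necessarily what the PR uses).
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Encode ops and edges in node order, then hash the encoding.
/// Graphs with the same shape get the same low-cardinality ID.
fn pattern_id(nodes: &[Node]) -> String {
    let mut encoded = String::new();
    for node in nodes {
        encoded.push_str(node.op);
        encoded.push('(');
        for (i, input) in node.inputs.iter().enumerate() {
            if i > 0 {
                encoded.push(',');
            }
            encoded.push_str(&input.to_string());
        }
        encoded.push_str(");");
    }
    format!("{:016x}", fnv1a(encoded.as_bytes()))
}
```

Two transactions with the same operation shape but different handles map to the same ID in this sketch, which is what keeps the attribute safe to use as a metric dimension. Like the encoding discussed in the automated review, the sketch depends on node order: the same graph built in a different insertion order would hash differently unless the nodes are canonicalized first.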