feat(coprocessor): otel cardinality cleanup, dfg pattern attribution, spanmetrics#2019
feat(coprocessor): otel cardinality cleanup, dfg pattern attribution, spanmetrics#2019
Conversation
ceb3dff to
a4f023f
Compare
🧪 CI InsightsHere's what we observed from your CI run for e9579f1. 🟢 All jobs passed!But CI Insights is watching 👀 |
762d58e to
4d407b1
Compare
c750e92 to
5f4e105
Compare
0e498f2 to
256b446
Compare
011fb1d to
81ed9e8
Compare
fe28ad9 to
efbcbd1
Compare
82f366a to
d0b9ad0
Compare
There was a problem hiding this comment.
[AUTOMATED]
Code Review Summary
This is a well-structured PR that accomplishes four distinct goals: OTEL cardinality cleanup, DFG pattern attribution, spanmetrics plumbing, and test hardening. The code quality is high — the pattern encoding module is well-documented, the union-find implementation is correct, the binary layout is sound, and the tracing refactors (replacing parent: &span with span.in_scope() and .instrument()) are semantically correct.
What was reviewed
| Agent | Focus | Findings |
|---|---|---|
| Bug Hunter ×2 (opus) | Runtime bugs, security | No high-confidence bugs found |
| Error Handling Auditor | Silent failures | 1 minor observability nit (see inline) |
| Guideline Compliance | Project conventions | No violations — new code follows established patterns |
Highlights
- Pattern encoding (
encoding.rs): Clean binary layout with proper bounds checking. Encode/decode round-trips are verified by extensive tests. The two-tier strategy (v1 compact encoding → v2 Keccak hash for large groups) is well-designed. - Union-find grouping (
grouping.rs): Correct path-splitting implementation. The cone-tracing and merge algorithm is well-documented with worked examples. - Tracing refactors: The
parent: &span→span.in_scope()/.instrument()changes are semantically correct and properly propagate span context across async boundaries. - Test coverage: Comprehensive unit tests for pattern encoding/decoding, plus an end-to-end integration test (
pattern_integration.rs) that validates real encrypted-transfer shapes.
One minor observability nit is noted inline. Overall this is a solid, well-tested change.
|
The review has been posted successfully. Here's a summary of what was done: Review Complete ✅PR #2019: Review Process
Results
Filtered Issues
Posted
|
68a953c to
e9579f1
Compare
antoniupop
left a comment
There was a problem hiding this comment.
Looks good - and very well tested. Pattern encoding looks nice!
|
@Mergifyio queue |
Merge Queue Status
Required conditions to enter a queue
|
ed1ef45 to
b369da0
Compare
|
CI seems to pass now. |
…vents and fix span correlation gap
f8bf5f8 to
4519d5f
Compare
Summary
This PR finishes the coprocessor OTEL cleanup in four parts:
host-listener,tfhe-worker,sns-worker,zkproof-worker, andtransaction-senderoperation_pattern_idandtransaction_pattern_idspan attributesCloses zama-ai/fhevm-internal#1044, zama-ai/fhevm-internal#1013, zama-ai/fhevm-internal#1012.
What changed
1. Coprocessor span cardinality cleanup
handle,txn_id,request_id, and similar per-item fields from runtime spansparent: &spancall sites withspan.in_scope()/ equivalent scoped flow so parent-child relationships stay correctcoprocessor/fhevm-engine/.sqlxMain files:
coprocessor/fhevm-engine/host-listener/src/database/tfhe_event_propagate.rscoprocessor/fhevm-engine/tfhe-worker/src/tfhe_worker.rscoprocessor/fhevm-engine/sns-worker/src/aws_upload.rscoprocessor/fhevm-engine/sns-worker/src/executor.rscoprocessor/fhevm-engine/sns-worker/src/squash_noise.rscoprocessor/fhevm-engine/transaction-sender/src/ops/*.rscoprocessor/fhevm-engine/zkproof-worker/src/verifier.rs2. Scheduler pattern attribution
operation_pattern_idandtransaction_pattern_idattribution in the schedulerscheduler/src/dfg/pattern/encoding.rsscheduler/src/dfg/pattern/grouping.rsscheduler/src/dfg/pattern/types.rsscheduler/src/dfg/pattern/tests.rsscheduler/src/dfg/scheduler.rsto emit the new pattern attributes while keepingtransaction_idon the transaction root spanMain files:
coprocessor/fhevm-engine/scheduler/src/dfg.rscoprocessor/fhevm-engine/scheduler/src/dfg/scheduler.rscoprocessor/fhevm-engine/scheduler/src/dfg/pattern/*3. Local spanmetrics pipeline
Main files:
test-suite/fhevm/config/otel-collector/otel-collector-config.yamltest-suite/fhevm/docker-compose/tracing-docker-compose.ymltest-suite/fhevm/config/prometheus/prometheus.yml.gitignore4. tfhe-worker pattern integration test hardening
pattern_integrationtest module for realistic encrypted-transfer shapestransaction_idemitted onexecute_transaction, so the assertions ignore cross-test span pollution without serializing the suiteMain files:
coprocessor/fhevm-engine/tfhe-worker/src/tests/pattern_integration.rscoprocessor/fhevm-engine/tfhe-worker/src/tests/utils.rscoprocessor/fhevm-engine/tfhe-worker/src/tests/mod.rsFollow-up
Commit map
feat(telemetry): add JSON log-trace correlationrefactor(coprocessor): standardize low-cardinality span attributionfeat(infra): add OTEL collector spanmetrics pipelinefeat(scheduler): add DFG pattern attribution with scalable encodingfix(deps): pin tfhe-cuda-backend to 0.13.0 in lockfiletest(tfhe-worker): harden pattern integration tracing assertionsfix(tfhe-worker): align compressed ciphertext plumbing with schedulerrefactor(coprocessor): keep tx ids on root spans onlyValidation
cargo checkand clippy passed while rewriting the branch historySQLX_OFFLINE=true cargo check -p fhevm-engine-common -p scheduler -p tfhe-worker -p transaction-sender -p sns-worker -p zkproof-worker -p host-listener --tests --quietcargo test -p scheduler pattern::tests --quietSQLX_OFFLINE=true cargo test -p tfhe-worker test_erc20_transaction_pattern_ids --no-runReview notes