| 1 | +# Phase 5: Integration Test Task List |
| 2 | + |
| 3 | +> **Goal**: End-to-end verification of the complete telemetry pipeline using a |
| 4 | +> 6-node consensus network. Proves that RPC, transaction, and consensus spans |
| 5 | +> flow through the observability stack (otel-collector, Jaeger, Prometheus, |
| 6 | +> Grafana) under realistic conditions. |
| 7 | +> |
| 8 | +> **Scope**: Integration test script, manual testing plan, 6-node local network |
| 9 | +> setup, Jaeger/Prometheus/Grafana verification. |
| 10 | +> |
| 11 | +> **Branch**: `pratik/otel-phase5-docs-deployment` |
| 12 | +
|
| 13 | +### Related Plan Documents |
| 14 | + |
| 15 | +| Document | Relevance | |
| 16 | +| ---------------------------------------------------------------- | ------------------------------------------ | |
| 17 | +| [07-observability-backends.md](./07-observability-backends.md) | Jaeger, Grafana, Prometheus setup | |
| 18 | +| [05-configuration-reference.md](./05-configuration-reference.md) | Collector config, Docker Compose | |
| 19 | +| [06-implementation-phases.md](./06-implementation-phases.md) | Phase 5 tasks, definition of done | |
| 20 | +| [Phase5_taskList.md](./Phase5_taskList.md) | Phase 5 main task list (5.6 = integration) | |
| 21 | + |
| 22 | +--- |
| 23 | + |
| 24 | +## Task IT.1: Create Integration Test Script |
| 25 | + |
| 26 | +**Objective**: Automated bash script that stands up a 6-node xrpld network |
| 27 | +with telemetry, exercises all span categories, and verifies data in |
| 28 | +Jaeger/Prometheus. |
| 29 | + |
| 30 | +**What to do**: |
| 31 | + |
| 32 | +- Create `docker/telemetry/integration-test.sh`: |
| 33 | + - Prerequisites check (docker, xrpld binary, curl, jq) |
| 34 | + - Start observability stack via `docker compose` |
| 35 | + - Generate 6 validator key pairs via temp standalone xrpld |
| 36 | + - Generate 6 node configs + shared `validators.txt` |
| 37 | + - Start 6 xrpld nodes in consensus mode (`--start`, no `-a`) |
| 38 | + - Wait for all nodes to reach `"proposing"` state (120s timeout) |
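
The last step above, waiting up to 120 s for every node to report `"proposing"`, could be sketched as the polling loop below. This is a Python illustration rather than the bash the script itself would use. The function names are hypothetical, and the `result.info.server_state` path assumes the usual `server_info` JSON-RPC response shape.

```python
import json
import time
import urllib.request


def fetch_server_state(port, host="127.0.0.1"):
    """Return the server_state string from one node's server_info RPC."""
    body = json.dumps({"method": "server_info", "params": [{}]}).encode()
    req = urllib.request.Request(
        f"http://{host}:{port}/",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)["result"]["info"]["server_state"]


def wait_for_proposing(ports, fetch=fetch_server_state, timeout=120.0,
                       interval=2.0, clock=time.monotonic, sleep=time.sleep):
    """Poll until every node reports "proposing".

    Returns the set of ports that had not reached "proposing" when the
    timeout expired; an empty set means success.
    """
    deadline = clock() + timeout
    pending = set(ports)
    while pending:
        for port in sorted(pending):
            try:
                if fetch(port) == "proposing":
                    pending.discard(port)
            except (OSError, KeyError, ValueError):
                pass  # node not listening yet, or response incomplete; retry
        if not pending or clock() >= deadline:
            break
        sleep(interval)
    return pending
```

Calling `wait_for_proposing([5005, ...])` with the six RPC ports would return the ports that never became ready. Only node1's port 5005 is given in this plan; the others depend on the generated node configs.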
| 39 | + |
| 40 | +**Key new file**: `docker/telemetry/integration-test.sh` |
| 41 | + |
| 42 | +**Verification**: |
| 43 | + |
| 44 | +- [ ] Script starts without errors |
| 45 | +- [ ] All 6 nodes reach "proposing" state |
| 46 | +- [ ] Observability stack is healthy (otel-collector, Jaeger, Prometheus, Grafana) |
| 47 | + |
| 48 | +--- |
| 49 | + |
| 50 | +## Task IT.2: RPC Span Verification (Phase 2) |
| 51 | + |
| 52 | +**Objective**: Verify RPC spans flow through the telemetry pipeline. |
| 53 | + |
| 54 | +**What to do**: |
| 55 | + |
| 56 | +- Send `server_info`, `server_state`, `ledger` RPCs to node1 (port 5005) |
| 57 | +- Wait for batch export (5s) |
| 58 | +- Query Jaeger API for: |
| 59 | + - `rpc.request` spans (ServerHandler::onRequest) |
| 60 | + - `rpc.process` spans (ServerHandler::processRequest) |
| 61 | + - `rpc.command.server_info` spans (callMethod) |
| 62 | + - `rpc.command.server_state` spans (callMethod) |
| 63 | + - `rpc.command.ledger` spans (callMethod) |
| 64 | +- Verify `xrpl.rpc.command` attribute present on `rpc.command.*` spans |
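
The Jaeger lookups above might be sketched as follows. This is a Python illustration, assuming the Jaeger query service's HTTP JSON endpoint `/api/traces` and a service name of `rippled` (the name used in Task IT.7). The helper names are hypothetical, and the sketch relies on Jaeger exposing span attributes as `tags`.

```python
import json
import urllib.parse
import urllib.request

JAEGER = "http://localhost:16686"  # Jaeger URL listed in Task IT.7


def traces_url(service, operation, limit=20, lookback="1h"):
    """Build the /api/traces query for one service and span name."""
    query = urllib.parse.urlencode(
        {"service": service, "operation": operation,
         "limit": limit, "lookback": lookback}
    )
    return f"{JAEGER}/api/traces?{query}"


def fetch_traces(service, operation):
    """Fetch matching traces from Jaeger as parsed JSON."""
    with urllib.request.urlopen(traces_url(service, operation), timeout=10) as resp:
        return json.load(resp)


def matching_spans(response, operation):
    """Return every span in the response whose operationName matches."""
    return [
        span
        for trace in response.get("data") or []
        for span in trace.get("spans", [])
        if span.get("operationName") == operation
    ]


def tag_value(span, key):
    """Return the value of the tag (span attribute) `key`, or None if absent."""
    for tag in span.get("tags", []):
        if tag.get("key") == key:
            return tag.get("value")
    return None
```

A check for one command span would then be: fetch `rpc.command.server_info`, assert that `matching_spans` is non-empty, and assert that `tag_value(span, "xrpl.rpc.command")` is not `None` for at least one span.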
| 65 | + |
| 66 | +**Verification**: |
| 67 | + |
| 68 | +- [ ] Jaeger shows `rpc.request` traces |
| 69 | +- [ ] Jaeger shows `rpc.process` traces |
| 70 | +- [ ] Jaeger shows `rpc.command.*` traces with correct attributes |
| 71 | + |
| 72 | +--- |
| 73 | + |
| 74 | +## Task IT.3: Transaction Span Verification (Phase 3) |
| 75 | + |
| 76 | +**Objective**: Verify transaction spans flow through the telemetry pipeline. |
| 77 | + |
| 78 | +**What to do**: |
| 79 | + |
| 80 | +- Get genesis account sequence via `account_info` RPC |
| 81 | +- Submit Payment transaction using genesis seed (`snoPBrXtMeMyMHUVTgbuqAfg1SUTb`) |
| 82 | +- Wait for consensus inclusion (10s) |
| 83 | +- Query Jaeger API for: |
| 84 | + - `tx.process` spans (NetworkOPsImp::processTransaction) on submitting node |
| 85 | + - `tx.receive` spans (PeerImp::handleTransaction) on peer nodes |
| 86 | +- Verify `xrpl.tx.hash` attribute on `tx.process` spans |
| 87 | +- Verify `xrpl.peer.id` attribute on `tx.receive` spans |
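
The first two steps above (read the genesis sequence, then submit a Payment) could look roughly like this in Python. The sketch makes several assumptions: the genesis address is the account normally derived from the seed quoted above, the node accepts sign-and-submit requests (a `secret` alongside `tx_json`) on its RPC port, and the destination and amount are chosen by the caller. All function names are hypothetical.

```python
import json
import urllib.request

GENESIS_SEED = "snoPBrXtMeMyMHUVTgbuqAfg1SUTb"          # seed from the task above
GENESIS_ACCOUNT = "rHb9CJAWyB4rj91VRWn96DkukG4bwdtyTh"  # assumed: account for that seed


def account_info_request(account):
    """JSON-RPC body asking for the account's current state."""
    return {"method": "account_info",
            "params": [{"account": account, "ledger_index": "current"}]}


def sequence_from(response):
    """Extract the next usable Sequence from an account_info response."""
    return response["result"]["account_data"]["Sequence"]


def payment_request(destination, drops, sequence):
    """JSON-RPC body for a sign-and-submit Payment from the genesis account."""
    return {"method": "submit", "params": [{
        "secret": GENESIS_SEED,
        "tx_json": {
            "TransactionType": "Payment",
            "Account": GENESIS_ACCOUNT,
            "Destination": destination,
            "Amount": str(drops),  # XRP amounts are strings of drops
            "Sequence": sequence,
        },
    }]}


def rpc(port, payload, host="127.0.0.1"):
    """POST one JSON-RPC payload to a node and return the parsed reply."""
    req = urllib.request.Request(
        f"http://{host}:{port}/",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

Against node1 this would be `seq = sequence_from(rpc(5005, account_info_request(GENESIS_ACCOUNT)))` followed by `rpc(5005, payment_request(dest, drops, seq))`. The amount needs to be large enough to fund the destination if that account does not exist yet.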
| 88 | + |
| 89 | +**Verification**: |
| 90 | + |
| 91 | +- [ ] Jaeger shows `tx.process` traces with `xrpl.tx.hash` |
| 92 | +- [ ] Jaeger shows `tx.receive` traces with `xrpl.peer.id` |
| 93 | + |
| 94 | +--- |
| 95 | + |
| 96 | +## Task IT.4: Consensus Span Verification (Phase 4) |
| 97 | + |
| 98 | +**Objective**: Verify consensus spans flow through the telemetry pipeline. |
| 99 | + |
| 100 | +**What to do**: |
| 101 | + |
| 102 | +- Consensus runs automatically in the 6-node network |
| 103 | +- Query Jaeger API for: |
| 104 | + - `consensus.proposal.send` (Adaptor::propose) |
| 105 | + - `consensus.ledger_close` (Adaptor::onClose) |
| 106 | + - `consensus.accept` (Adaptor::onAccept) |
| 107 | + - `consensus.validation.send` (Adaptor::validate) |
| 108 | +- Verify attributes: |
| 109 | + - `xrpl.consensus.mode` on `consensus.ledger_close` |
| 110 | + - `xrpl.consensus.proposers` on `consensus.accept` |
| 111 | + - `xrpl.consensus.ledger.seq` on `consensus.validation.send` |
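
One way to express the span and attribute checks above is a small table-driven function. The Python sketch below assumes its input is a mapping from span name to the parsed JSON that Jaeger's `/api/traces` endpoint returns for that name. `REQUIRED`, `span_tag_keys`, and `consensus_failures` are hypothetical names.

```python
# Span name -> attribute that must be present on at least one such span.
# None means the presence of the span alone is enough.
REQUIRED = {
    "consensus.proposal.send": None,
    "consensus.ledger_close": "xrpl.consensus.mode",
    "consensus.accept": "xrpl.consensus.proposers",
    "consensus.validation.send": "xrpl.consensus.ledger.seq",
}


def span_tag_keys(response, operation):
    """Yield the set of tag keys for each span named `operation`."""
    for trace in response.get("data") or []:
        for span in trace.get("spans", []):
            if span.get("operationName") == operation:
                yield {tag.get("key") for tag in span.get("tags", [])}


def consensus_failures(responses):
    """Return readable failure messages; an empty list means all checks pass."""
    failures = []
    for operation, attribute in REQUIRED.items():
        tag_sets = list(span_tag_keys(responses.get(operation, {}), operation))
        if not tag_sets:
            failures.append(f"{operation}: no spans found")
        elif attribute and not any(attribute in keys for keys in tag_sets):
            failures.append(f"{operation}: missing attribute {attribute}")
    return failures
```

Keeping the expectations in one dictionary means a new consensus span only needs a new entry rather than a new block of checks.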
| 112 | + |
| 113 | +**Verification**: |
| 114 | + |
| 115 | +- [ ] Jaeger shows `consensus.ledger_close` traces with `xrpl.consensus.mode` |
| 116 | +- [ ] Jaeger shows `consensus.accept` traces with `xrpl.consensus.proposers` |
| 117 | +- [ ] Jaeger shows `consensus.proposal.send` traces |
| 118 | +- [ ] Jaeger shows `consensus.validation.send` traces |
| 119 | + |
| 120 | +--- |
| 121 | + |
| 122 | +## Task IT.5: Spanmetrics Verification (Phase 5) |
| 123 | + |
| 124 | +**Objective**: Verify spanmetrics connector derives RED metrics from spans. |
| 125 | + |
| 126 | +**What to do**: |
| 127 | + |
| 128 | +- Query Prometheus for `traces_span_metrics_calls_total` |
| 129 | +- Query Prometheus for `traces_span_metrics_duration_milliseconds_count` |
| 130 | +- Verify Grafana loads at `http://localhost:3000` |
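
The two Prometheus queries above can be sketched against the instant-query endpoint `/api/v1/query`. This is an illustrative Python version with hypothetical helper names. A metric counts as present when the query succeeds and returns at least one series.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS = "http://localhost:9090"  # Prometheus URL listed in Task IT.7

METRICS = [
    "traces_span_metrics_calls_total",
    "traces_span_metrics_duration_milliseconds_count",
]


def query_url(expr):
    """Build an instant-query URL for a PromQL expression."""
    return f"{PROMETHEUS}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def run_query(expr):
    """Run the query and return Prometheus's parsed JSON reply."""
    with urllib.request.urlopen(query_url(expr), timeout=10) as resp:
        return json.load(resp)


def has_samples(response):
    """True if the query succeeded and returned at least one series."""
    result = response.get("data", {}).get("result")
    return response.get("status") == "success" and bool(result)
```

The script would then report any name in `METRICS` for which `has_samples(run_query(name))` is false. An empty result shortly after startup may only mean the collector has not flushed its first spanmetrics batch yet.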
| 131 | + |
| 132 | +**Verification**: |
| 133 | + |
| 134 | +- [ ] Prometheus returns non-empty results for `traces_span_metrics_calls_total` |
| 135 | +- [ ] Prometheus returns non-empty results for duration histogram |
| 136 | +- [ ] Grafana UI accessible with dashboards visible |
| 137 | + |
| 138 | +--- |
| 139 | + |
| 140 | +## Task IT.6: Manual Testing Plan |
| 141 | + |
| 142 | +**Objective**: Document how to run tests manually for future reference. |
| 143 | + |
| 144 | +**What to do**: |
| 145 | + |
| 146 | +- Create `docker/telemetry/TESTING.md` with: |
| 147 | + - Prerequisites section |
| 148 | + - Single-node standalone test (quick verification) |
| 149 | + - 6-node consensus test (full verification) |
| 150 | + - Expected span catalog (all 11 span names with attributes) |
| 151 | + - Verification queries (Jaeger API, Prometheus API) |
| 152 | + - Troubleshooting guide |
| 153 | + |
| 154 | +**Key new file**: `docker/telemetry/TESTING.md` |
| 155 | + |
| 156 | +**Verification**: |
| 157 | + |
| 158 | +- [ ] Document covers both single-node and multi-node testing |
| 159 | +- [ ] All 11 span names documented with source file and attributes |
| 160 | +- [ ] Troubleshooting section covers common failure modes |
| 161 | + |
| 162 | +--- |
| 163 | + |
| 164 | +## Task IT.7: Run and Verify |
| 165 | + |
| 166 | +**Objective**: Execute the integration test and validate results. |
| 167 | + |
| 168 | +**What to do**: |
| 169 | + |
| 170 | +- Run `docker/telemetry/integration-test.sh` locally |
| 171 | +- Debug any failures |
| 172 | +- Leave stack running for manual verification |
| 173 | +- Share URLs: |
| 174 | + - Jaeger: `http://localhost:16686` |
| 175 | + - Grafana: `http://localhost:3000` |
| 176 | + - Prometheus: `http://localhost:9090` |
| 177 | + |
| 178 | +**Verification**: |
| 179 | + |
| 180 | +- [ ] Script completes with all checks passing |
| 181 | +- [ ] Jaeger UI shows the `rippled` service with all expected span names |
| 182 | +- [ ] Grafana dashboards load and show data |
| 183 | + |
| 184 | +--- |
| 185 | + |
| 186 | +## Task IT.8: Commit |
| 187 | + |
| 188 | +**Objective**: Commit all new files to the Phase 5 branch. |
| 189 | + |
| 190 | +**What to do**: |
| 191 | + |
| 192 | +- Run `pcc` (pre-commit checks) |
| 193 | +- Commit 3 new files to `pratik/otel-phase5-docs-deployment` |
| 194 | + |
| 195 | +**Verification**: |
| 196 | + |
| 197 | +- [ ] `pcc` passes |
| 198 | +- [ ] Commit created on Phase 5 branch |
| 199 | + |
| 200 | +--- |
| 201 | + |
| 202 | +## Summary |
| 203 | + |
| 204 | +| Task | Description | New Files | Depends On | |
| 205 | +| ---- | ----------------------------- | --------- | ---------- | |
| 206 | +| IT.1 | Integration test script | 1 | Phase 5 | |
| 207 | +| IT.2 | RPC span verification | 0 | IT.1 | |
| 208 | +| IT.3 | Transaction span verification | 0 | IT.1 | |
| 209 | +| IT.4 | Consensus span verification | 0 | IT.1 | |
| 210 | +| IT.5 | Spanmetrics verification | 0 | IT.1 | |
| 211 | +| IT.6 | Manual testing plan | 1 | -- | |
| 212 | +| IT.7 | Run and verify | 0 | IT.1-IT.6 | |
| 213 | +| IT.8 | Commit | 0 | IT.7 | |
| 214 | + |
| 215 | +**Exit Criteria**: |
| 216 | + |
| 217 | +- [ ] All 6 xrpld nodes reach "proposing" state |
| 218 | +- [ ] All 11 expected span names visible in Jaeger |
| 219 | +- [ ] Spanmetrics available in Prometheus |
| 220 | +- [ ] Grafana dashboards show data |
| 221 | +- [ ] Manual testing plan document complete |