Commit b12d10c

Cofson and claude committed
Clarify retry responsibility model in error handling
Added clear separation between:

- System-level retry: automatic peer rotation, transparent to caller
- Caller-level retry: when all peers exhausted, caller decides

Addresses reviewer comment about fail-fast vs queued-retry contradiction.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
1 parent c8db338 commit b12d10c

1 file changed: +49 -3 lines changed

codex/raw/codex-block-exchange.md

Lines changed: 49 additions & 3 deletions
@@ -1561,6 +1561,52 @@ The Block Exchange protocol defines error handling for common failure scenarios:
 
 ### Recovery Strategies
 
+#### Retry Responsibility Model
+
+The protocol defines a clear separation between system-level and caller-level
+retry responsibilities:
+
+**System-Level Retry (Automatic):**
+
+The Block Exchange module automatically retries in these scenarios:
+
+- **Peer failure**: If a peer disconnects or times out, the system
+  transparently tries alternative peers from the discovery set
+- **Transient errors**: Network glitches, temporary unavailability
+- **Peer rotation**: Automatic failover to next available peer
+
+The caller's `requestBlock` call remains pending during system-level retries.
+This is transparent to the caller.
+
+**Caller-Level Retry (Manual):**
+
+The caller is responsible for retry decisions when:
+
+- **All peers exhausted**: No more peers available from discovery
+- **Permanent failures**: Block doesn't exist in the network
+- **Timeout exceeded**: Request timeout (300s) expired
+- **Verification failures**: All peers provided invalid data
+
+In these cases, `requestBlock` returns an error and the caller decides
+whether to retry, perhaps after waiting or refreshing the peer list
+via discovery.
+
+**Retry Flow:**
+
+```text
+requestBlock(address)
+
+    ├─► System tries Peer A ──► Fails
+    │       │
+    │       └─► System tries Peer B ──► Fails (automatic, transparent)
+    │               │
+    │               └─► System tries Peer C ──► Success ──► Return block
+
+    └─► All peers failed ──► Return error to caller
+
+            └─► Caller decides: retry? wait? abort?
+```
+
 **Peer Rotation:**
 
 When a peer fails to deliver blocks:
@@ -1574,14 +1620,14 @@ When a peer fails to deliver blocks:
 
 - If verification fails, request block from alternative peer
 - If all peers fail, propagate error to caller
-- Maintain request queue for later retry
 - Clean up resources (memory, pending requests) on unrecoverable failures
 
 **Error Propagation:**
 
-- Service interface functions (`requestBlock`, `cancelRequest`) return errors to callers
+- Service interface functions (`requestBlock`, `cancelRequest`) return errors
+  to callers only after system-level retries are exhausted
 - Internal errors logged for debugging
-- Network errors result in peer disconnection and retry
+- Network errors trigger automatic peer rotation before surfacing to caller
 - Verification errors result in block rejection and peer reputation impact
 
 ## Security Considerations
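
The split between system-level and caller-level retry described in this change can be illustrated with a minimal Python sketch. It is not the Codex implementation: `request_block`, `fetch_with_caller_retry`, `BlockRequestError`, the `fetch`/`verify`/`discover` callables, the error reason strings, and the backoff policy are all hypothetical names and choices made for illustration. Only the 300s request timeout and the two-layer responsibility model come from the spec text.

```python
import time

REQUEST_TIMEOUT_S = 300.0  # request timeout stated in the spec text


class BlockRequestError(Exception):
    """Hypothetical error surfaced once system-level retries are exhausted."""

    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason


def request_block(address, peers, fetch, verify,
                  timeout_s=REQUEST_TIMEOUT_S, clock=time.monotonic):
    """System-level retry: rotate through peers until one returns a valid block.

    `fetch(peer, address)` and `verify(address, data)` are assumed callables.
    """
    deadline = clock() + timeout_s
    tried = invalid = 0
    for peer in peers:
        if clock() >= deadline:
            raise BlockRequestError("timeout")
        tried += 1
        try:
            data = fetch(peer, address)
        except (ConnectionError, TimeoutError):
            continue  # peer failure or transient error: rotate to the next peer
        if verify(address, data):
            return data
        invalid += 1  # invalid data: reject the block and rotate
    # Every peer was tried without success; only now does the caller see an error.
    if tried > 0 and invalid == tried:
        raise BlockRequestError("verification_failed")
    raise BlockRequestError("peers_exhausted")


def fetch_with_caller_retry(address, discover, fetch, verify,
                            attempts=3, backoff_s=1.0, sleep=time.sleep):
    """Caller-level retry: refresh the peer list and back off between attempts.

    Assumes attempts >= 1; the exponential backoff is an illustrative choice.
    """
    for attempt in range(attempts):
        try:
            return request_block(address, discover(), fetch, verify)
        except BlockRequestError:
            if attempt + 1 == attempts:
                raise  # the caller decides to abort
            sleep(backoff_s * 2 ** attempt)
```

In this sketch the peer loop never queues a request for later; it either returns a block or raises, which mirrors the removal of the "Maintain request queue for later retry" bullet. Any waiting or peer-list refresh happens only in the caller-level wrapper.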
