Four layers, each with a dedicated Lava error code range.
Retryable means that retrying the same relay (with the same parameters) against a different endpoint/provider has a chance of succeeding.
Yes = retry on a different provider. No = do not retry.
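The retry rule above can be sketched as a small loop over providers. This is a minimal illustration, not the actual relay code: `relayWithRetry` and the `sendRelay` callback are hypothetical names, and `LavaError` is reduced to the fields the loop needs.

```go
package main

import "fmt"

// LavaError is a pared-down stand-in for a registry entry; only the fields
// needed to illustrate retry handling are included.
type LavaError struct {
	Code      uint32
	Name      string
	Retryable bool
}

// relayWithRetry (hypothetical) tries each provider in order. It stops at the
// first success, or at the first error whose Retryable flag is false.
// sendRelay is an assumed callback that returns nil on success.
// Callers are expected to handle an empty provider list before calling.
func relayWithRetry(providers []string, sendRelay func(provider string) *LavaError) (string, *LavaError) {
	var lastErr *LavaError
	for _, p := range providers {
		lastErr = sendRelay(p)
		if lastErr == nil {
			return p, nil // relay served by this provider
		}
		if !lastErr.Retryable {
			return "", lastErr // terminal: another provider would fail the same way
		}
	}
	return "", lastErr // every provider returned a retryable error
}

func main() {
	timeout := &LavaError{Code: 1001, Name: "PROTOCOL_CONNECTION_TIMEOUT", Retryable: true}
	served, err := relayWithRetry([]string{"provider-a", "provider-b"}, func(p string) *LavaError {
		if p == "provider-a" {
			return timeout
		}
		return nil
	})
	fmt.Println(served, err == nil)
}
```

A non-retryable classification such as `CHAIN_EXECUTION_REVERTED` ends the loop after the first attempt, which is the behavior the Retryable column is meant to drive.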
| Code | Name | Description | Retryable |
|---|---|---|---|
| 0 | UNKNOWN_ERROR | Unclassified error: no matcher matched | Yes |
Errors raised from within the Lava protocol itself — not from nodes or chains.
| Code | Name | Description | Retryable |
|---|---|---|---|
| 1001 | PROTOCOL_CONNECTION_TIMEOUT | Network operation timed out connecting to provider | Yes |
| 1002 | PROTOCOL_CONNECTION_REFUSED | Provider connection refused | Yes |
| 1003 | PROTOCOL_DNS_FAILURE | DNS resolution failed | Yes |
| 1004 | PROTOCOL_TLS_MISMATCH | HTTP/HTTPS protocol mismatch | No |
| 1005 | PROTOCOL_CONNECTION_RESET | Connection reset by peer | Yes |
| 1006 | PROTOCOL_CONNECTION_CLOSED | Connection closed (EOF) | Yes |
| 1007 | PROTOCOL_CONTEXT_DEADLINE | Caller's context.Context deadline expired before the relay completed | Yes |
| 1008 | PROTOCOL_CONTEXT_CANCELED | Request context was canceled (client disconnect or relay race resolved) | No |
| 1009 | PROTOCOL_NETWORK_UNREACHABLE | Network or host unreachable (no route) | Yes |
| 1010 | PROTOCOL_NO_PROVIDERS | No providers/pairings available | No |
| 1011 | PROTOCOL_ALL_ENDPOINTS_DISABLED | All provider endpoints disabled | No |
| 1012 | PROTOCOL_PROVIDER_UNAVAILABLE | Provider service unavailable (gRPC UNAVAILABLE) | Yes |
| 1013 | PROTOCOL_PROVIDER_ABORTED | Provider aborted (gRPC ABORTED) | Yes |
| 1014 | PROTOCOL_PROVIDER_DATA_LOSS | Provider data loss (gRPC DATA_LOSS) | Yes |
| 1020 | PROTOCOL_RATE_LIMITED | Lava-side rate limit exceeded (SubCategoryRateLimit) | No |
| 1021 | PROTOCOL_MAX_CU_EXCEEDED | Maximum compute units exceeded for session | No |
| 1022 | PROTOCOL_BATCH_SIZE_EXCEEDED | Batch request size exceeded limit | No |
| 1030 | PROTOCOL_SESSION_NOT_FOUND | Session does not exist | No |
| 1031 | PROTOCOL_EPOCH_MISMATCH | Epoch mismatch or too old | No |
| 1032 | PROTOCOL_CONSUMER_BLOCKED | Consumer is blocklisted | No |
| 1033 | PROTOCOL_CONSUMER_NOT_REGISTERED | Consumer not registered | No |
| 1034 | PROTOCOL_RELAY_NUMBER_MISMATCH | Relay number mismatch | No |
| 1035 | PROTOCOL_SESSION_OUT_OF_SYNC | Session out of sync | No |
| 1040 | PROTOCOL_FINALIZATION_ERROR | Provider finalization data incorrect | Yes |
| 1041 | PROTOCOL_CONSISTENCY_ERROR | Response consistency validation failed | Yes |
| 1042 | PROTOCOL_HASH_CONSENSUS_ERROR | Conflicting response hashes detected | Yes |
| 1043 | PROTOCOL_NO_RESPONSE_TIMEOUT | Relay race timeout: no provider returned a response within the protocol deadline | Yes |
| 1050 | PROTOCOL_SUBSCRIPTION_NOT_FOUND | Subscription not found | No |
| 1051 | PROTOCOL_SUBSCRIPTION_INIT_FAILED | Failed to initialize subscription | No |
| 1052 | PROTOCOL_WEBSOCKET_IDLE_TIMEOUT | WebSocket idle timeout | No |
| 1053 | PROTOCOL_SUBSCRIPTION_ALREADY_EXISTS | Subscription already exists for this consumer/key | No |
Errors returned by the blockchain node itself (not execution/state errors).
| Code | Name | Description | Retryable | Standard Code |
|---|---|---|---|---|
| Generic Node Errors (2000-2099) | | | | |
| 2001 | NODE_METHOD_NOT_FOUND | Method does not exist on this node (unknown to the API surface); non-retryable (SubCategoryUnsupportedMethod) | No | JSON-RPC -32601 |
| 2002 | NODE_METHOD_NOT_SUPPORTED | Method exists but is DISABLED on this specific node (provider tier / policy / admin config). Retryable on a different provider. | Yes | JSON-RPC -32004 |
| 2003 | NODE_INTERNAL_ERROR | Internal node error | Yes | JSON-RPC -32603 |
| 2004 | NODE_SERVER_ERROR | Generic server error | Yes | JSON-RPC -32000 |
| 2005 | NODE_RATE_LIMITED | Rate limited by node (SubCategoryRateLimit) | Yes | HTTP 429 / MessageContains("rate limit") |
| 2006 | NODE_SERVICE_UNAVAILABLE | Node temporarily unavailable | Yes | HTTP 503 |
| 2007 | NODE_SYNCING | Node is syncing/catching up | Yes | MessageContains("node is syncing" / "catching up to the chain") |
| 2008 | NODE_UNIMPLEMENTED | gRPC method unimplemented (SubCategoryUnsupportedMethod) | No | gRPC 12 |
| 2009 | NODE_ENDPOINT_NOT_FOUND | REST endpoint not found (SubCategoryUnsupportedMethod) | No | HTTP 404 |
| 2010 | NODE_METHOD_NOT_ALLOWED | REST method not allowed (SubCategoryUnsupportedMethod) | No | HTTP 405 |
| 2011 | NODE_LIMIT_EXCEEDED | Request exceeds node limit (e.g., eth_getLogs range) (SubCategoryRateLimit) | No | JSON-RPC -32005 |
| 2012 | NODE_RESOURCE_NOT_FOUND | Resource not found at node level | Yes | JSON-RPC -32001 |
| 2013 | NODE_RESOURCE_UNAVAILABLE | Resource exists but unavailable | Yes | JSON-RPC -32002 |
| 2014 | NODE_GATEWAY_TIMEOUT | Gateway timeout (HTTP 504 from provider) | Yes | HTTP 504 |
| 2015 | NODE_BAD_GATEWAY | Bad gateway (HTTP 502 from provider) | Yes | HTTP 502 |
| Bitcoin/UTXO Node Errors (2100-2149) | | | | |
| 2101 | NODE_BITCOIN_WARMUP | Node still warming up (Bitcoin -28 / RPC_IN_WARMUP) | Yes | - |
| 2102 | NODE_BITCOIN_INITIAL_DOWNLOAD | Node in initial block download (Bitcoin -10 / RPC_CLIENT_IN_INITIAL_DOWNLOAD) | Yes | - |
| 2103 | NODE_BITCOIN_NOT_CONNECTED | Node has no peers (Bitcoin -9 / RPC_CLIENT_NOT_CONNECTED) | Yes | - |
| Solana Node Errors (2150-2169) | | | | |
| 2150 | NODE_SOLANA_UNHEALTHY | Solana node behind/unhealthy (-32005) | Yes | - |
Errors from the blockchain execution/state layer — transaction failures, state queries, etc.
Uses a tiered classification system (see Section 3 for details):
- Tier 1 (Generic): Semantic codes covering all chains (e.g., `CHAIN_NONCE_TOO_LOW`)
- Tier 2 (Chain-specific): Distinct codes ONLY where retryability differs from the generic pattern
| Code | Name | Description | Retryable | Chains |
|---|---|---|---|---|
| Transaction Errors (3000-3099) | | | | |
| 3001 | CHAIN_NONCE_TOO_LOW | Nonce/sequence too low | No | EVM, Cosmos, Starknet, XRP, NEAR |
| 3002 | CHAIN_NONCE_TOO_HIGH | Nonce too high | No | EVM |
| 3003 | CHAIN_INSUFFICIENT_FUNDS | Insufficient funds for transfer/gas | No | Universal |
| 3004 | CHAIN_GAS_TOO_LOW | Intrinsic gas too low | No | EVM |
| 3005 | CHAIN_GAS_LIMIT_EXCEEDED | Exceeds block gas limit | No | EVM |
| 3006 | CHAIN_TX_UNDERPRICED | Transaction gas price too low | No | EVM |
| 3007 | CHAIN_TX_ALREADY_KNOWN | Transaction already in mempool | No | EVM, Starknet, XRP |
| 3008 | CHAIN_TX_REPLACEMENT_UNDERPRICED | Replacement tx gas too low | No | EVM |
| 3009 | CHAIN_MEMPOOL_FULL | Mempool/tx pool is full | No | EVM, Cosmos |
| 3010 | CHAIN_TX_TOO_LARGE | Transaction exceeds size limit | No | EVM, Solana |
| 3011 | CHAIN_MAX_FEE_BELOW_BASE | Max fee per gas below base fee | No | EVM (EIP-1559) |
| 3012 | CHAIN_INVALID_SEQUENCE | Invalid sequence (Cosmos nonce equivalent) | No | Cosmos |
| 3013 | CHAIN_INSUFFICIENT_FEE | Insufficient fee | No | Cosmos |
| 3014 | CHAIN_TX_REJECTED | Transaction rejected by network rules | No | Universal |
| 3015 | CHAIN_DOUBLE_SPEND | Double spend / UTXO already spent | No | Bitcoin/UTXO |
| 3016 | CHAIN_INVALID_SIGNATURE | Invalid transaction signature | No | Universal |
| Execution Errors (3100-3199) | | | | |
| 3101 | CHAIN_EXECUTION_REVERTED | Smart contract execution reverted | No | EVM, Starknet, NEAR, TON |
| 3102 | CHAIN_OUT_OF_GAS | Out of gas during execution | No | EVM, Cosmos, TON |
| 3103 | CHAIN_STACK_OVERFLOW | Stack limit reached | No | EVM, TON |
| 3104 | CHAIN_INVALID_OPCODE | Invalid opcode encountered | No | EVM, TON |
| 3105 | CHAIN_WRITE_PROTECTION | Write in STATICCALL context | No | EVM |
| 3106 | CHAIN_CONTRACT_SIZE_EXCEEDED | Contract bytecode exceeds 24KB EIP-170 limit (Geth: 'max code size exceeded') | No | EVM |
| 3107 | CHAIN_ACCOUNT_NOT_FOUND | Account/contract does not exist | No | Cosmos |
| 3108 | CHAIN_ZKEVM_OUT_OF_COUNTERS | Polygon zkEVM prover exceeded circuit counter budget | No | EVM (Polygon zkEVM) |
| State/Data Errors (3200-3299) | | | | |
| 3201 | CHAIN_BLOCK_NOT_FOUND | Block not found | Yes | Universal |
| 3202 | CHAIN_TX_NOT_FOUND | Transaction not found | Yes | Universal |
| 3203 | CHAIN_RECEIPT_NOT_FOUND | Transaction receipt not found | Yes | EVM (Cosmos-EVM variant) |
| 3204 | CHAIN_STATE_PRUNED | State pruned/missing trie node | Yes | EVM |
| 3205 | CHAIN_DATA_NOT_AVAILABLE | Historical data not available | Yes | Universal |
| 3206 | CHAIN_BLOCK_TOO_OLD | Block results only for recent blocks | Yes | Cosmos |
| 3207 | CHAIN_LOG_RESPONSE_TOO_LARGE | Log query returned too many results | No | EVM |
| Solana-Specific (3300-3319), Tier 2 | | | | |
| 3302 | CHAIN_SOLANA_MISSING_LONG_TERM | Slot missing in long-term storage (-32009) | No | Solana |
| 3303 | CHAIN_SOLANA_LEDGER_JUMP | Missing due to ledger jump/snapshot (-32007) | Yes | Solana |
| 3304 | CHAIN_SOLANA_BLOCKHASH_NOT_FOUND | Blockhash not found/expired | No | Solana |
| 3305 | CHAIN_SOLANA_SIMULATION_FAILED | Transaction simulation failed (-32002) | No | Solana |
| 3306 | CHAIN_SOLANA_SIGNATURE_VERIFY_FAILED | Signature verification failure (-32003) | No | Solana |
| 3307 | CHAIN_SOLANA_EXCLUDED_FROM_INDEX | Excluded from account secondary indexes (-32010) | No | Solana |
| 3308 | CHAIN_SOLANA_SIGNATURE_LENGTH_MISMATCH | Signature length mismatch (-32013) | No | Solana |
| 3309 | CHAIN_SOLANA_BLOCK_STATUS_UNAVAILABLE | Block status unavailable (-32014) | No | Solana |
| 3310 | CHAIN_SOLANA_TX_VERSION_UNSUPPORTED | Transaction version not supported (-32015) | No | Solana |
| 3311 | CHAIN_SOLANA_MIN_CONTEXT_SLOT_NOT_REACHED | Minimum context slot not reached (-32016) | No | Solana |
| Starknet-Specific (3320-3339), Tier 2 | | | | |
| 3320 | CHAIN_STARKNET_FAILED_TO_RECEIVE_TX | Failed to receive tx (code 1) | No | Starknet |
| 3321 | CHAIN_STARKNET_CLASS_NOT_FOUND | Class hash not found (code 28) | No | Starknet |
| 3322 | CHAIN_STARKNET_COMPILATION_FAILED | Sierra to CASM compilation failed (code 56) | No | Starknet |
| 3323 | CHAIN_STARKNET_CLASS_ALREADY_DECLARED | Class already declared (code 51) | No | Starknet |
| 3324 | CHAIN_STARKNET_CONTRACT_ERROR | Contract error during execution (code 40) | No | Starknet |
| 3325 | CHAIN_STARKNET_TX_EXEC_ERROR | Tx exec error (code 41) | No | Starknet |
| 3326 | CHAIN_STARKNET_INVALID_NONCE | Invalid nonce (code 52) | No | Starknet |
| 3327 | CHAIN_STARKNET_INSUFFICIENT_FEE | Insufficient fee (code 53) | No | Starknet |
| 3328 | CHAIN_STARKNET_INSUFFICIENT_BALANCE | Insufficient balance (code 54) | No | Starknet |
| 3329 | CHAIN_STARKNET_VALIDATION_FAILURE | Validation failure (code 55) | No | Starknet |
| 3330 | CHAIN_STARKNET_CONTRACT_NOT_FOUND | Contract not found (code 20) | No | Starknet |
| 3331 | CHAIN_STARKNET_BLOCK_NOT_FOUND | Block not found (code 24) | No | Starknet |
| 3332 | CHAIN_STARKNET_TX_HASH_NOT_FOUND | Tx hash not found (code 29) | No | Starknet |
| 3333 | CHAIN_STARKNET_DUPLICATE_TX | Duplicate tx (code 59) | No | Starknet |
| 3334 | CHAIN_STARKNET_TX_VERSION_UNSUPPORTED | Unsupported tx version (code 61) | No | Starknet |
| 3335 | CHAIN_STARKNET_UNEXPECTED_ERROR | Unexpected error (code 63) | No | Starknet |
| Bitcoin/UTXO-Specific (3340-3359), Tier 2 | | | | |
| 3341 | CHAIN_BITCOIN_VERIFY_ERROR | Transaction verification failed (-25 / RPC_VERIFY_ERROR) | No | Bitcoin/UTXO |
| 3342 | CHAIN_BITCOIN_VERIFY_REJECTED | Transaction rejected by rules (-26 / RPC_VERIFY_REJECTED) | No | Bitcoin/UTXO |
| 3343 | CHAIN_BITCOIN_ALREADY_IN_CHAIN | Transaction already confirmed (-27 / RPC_VERIFY_ALREADY_IN_CHAIN) | No | Bitcoin/UTXO |
| 3344 | CHAIN_BITCOIN_WALLET_INSUFFICIENT_FUNDS | Wallet UTXO coin selection failed (-6 / RPC_WALLET_INSUFFICIENT_FUNDS). Distinct from EVM CHAIN_INSUFFICIENT_FUNDS (3003) which is a tx submission failure. | No | Bitcoin/UTXO |
| NEAR-Specific (3360-3379), Tier 2 | | | | |
| 3360 | CHAIN_NEAR_UNKNOWN_BLOCK | Block not found or garbage-collected (UNKNOWN_BLOCK) | Yes | NEAR |
| 3361 | CHAIN_NEAR_UNKNOWN_CHUNK | Chunk not found (UNKNOWN_CHUNK) | Yes | NEAR |
| 3362 | CHAIN_NEAR_INVALID_SHARD_ID | Shard ID does not exist (INVALID_SHARD_ID) | No | NEAR |
| 3363 | CHAIN_NEAR_NOT_SYNCED_YET | Node still syncing (NOT_SYNCED_YET) | Yes | NEAR |
Errors caused by malformed or invalid client requests — classified by nature of error, regardless of where caught (pre-forwarding by Lava or returned by node).
Layer D codes are non-retryable but charge normal CU — the provider does real work on every call because the response is not cached (the next request from the same client may carry valid input). Only SubCategoryUnsupportedMethod errors get the zero-CU carve-out, because those responses are cached and the provider won't be hit again.
| Code | Name | Description | Retryable | Standard Code |
|---|---|---|---|---|
| 4001 | USER_PARSE_ERROR | Invalid JSON in request | No | JSON-RPC -32700 |
| 4002 | USER_INVALID_REQUEST | Request is not a valid JSON-RPC/REST/gRPC object | No | JSON-RPC -32600 |
| 4003 | USER_INVALID_PARAMS | Invalid method parameters | No | JSON-RPC -32602 |
| 4004 | USER_INVALID_BLOCK_FORMAT | Invalid block number format (e.g., non-hex) | No | MessageContains("hex string without 0x prefix" / "invalid block number" / "invalid block hash") |
| 4005 | USER_INVALID_ADDRESS | Invalid address format | No | MessageContains("bad address checksum" / "invalid address") |
| 4006 | USER_REQUEST_TOO_LARGE | Request body exceeds size limit | No | HTTP 413 |
| 4007 | USER_INVALID_HEX | Invalid hex encoding | No | MessageContains("hex string has odd length" / "invalid hex") |
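The CU rule for Layer D can be sketched as a one-branch helper. This is an illustration under stated assumptions: `cuToCharge` and its `methodCU` parameter are hypothetical names, and the real accounting path may look quite different.

```go
package main

import "fmt"

// ErrorSubCategory mirrors the subcategory enum described in this document.
type ErrorSubCategory int

const (
	SubCategoryNone ErrorSubCategory = iota
	SubCategoryUnsupportedMethod
	SubCategoryRateLimit
)

// cuToCharge (hypothetical) returns the compute units to charge for a failed
// relay. Only unsupported-method errors are free, because their responses are
// cached and the provider is not hit again. Everything else, including
// USER_* errors, is charged the method's normal CU.
func cuToCharge(sub ErrorSubCategory, methodCU uint64) uint64 {
	if sub == SubCategoryUnsupportedMethod {
		return 0
	}
	return methodCU
}

func main() {
	fmt.Println(cuToCharge(SubCategoryNone, 10))              // e.g. USER_INVALID_PARAMS: prints 10
	fmt.Println(cuToCharge(SubCategoryUnsupportedMethod, 10)) // e.g. NODE_METHOD_NOT_FOUND: prints 0
}
```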
Each Lava error has:
- Lava Code (uint32): Internal code for logging, metrics, counting (1001, 2001, etc.)
- Name (string): Human-readable constant name (`PROTOCOL_CONNECTION_TIMEOUT`)
- Category (enum): `Internal` (Lava-introduced) or `External` (pass-through)
- SubCategory (enum): Finer classification (e.g., `SubCategoryUnsupportedMethod`)
- Retryable (bool): Whether the error warrants retry on a different provider
- Description (string): Human-readable explanation
Lava error codes are internal only — they live within the Lava protocol layer:
- Smart Router path: Between router and endpoint (user and node never see Lava codes)
- Decentralized path: Between consumer and provider (user and node never see Lava codes)
- External responses always use standard protocol codes (JSON-RPC, gRPC, HTTP)
- Lava codes appear in logs and metrics only
Errors are classified using a two-level lookup with chain-family awareness.
The authoritative map lives in protocol/common/error_registry.go (chainFamilyMap). GetChainFamilyOrDefault returns ChainFamilyUnknown (a dedicated sentinel, NOT EVM) when a chain ID is not registered — Tier-2 lookups against the sentinel intentionally miss so classification falls through to transport-scoped Tier-1 matchers rather than silently inheriting another family's semantics.
Two helper functions (common.IsSolanaFamily etc.) delegate to this map so there is a single source of truth.
| ChainFamily | Chain IDs |
|---|---|
| EVM | ETH1, SEP1, HOL1, ARBITRUM, POLYGON, BASE, OPTM, AVAX, BSC, BLAST, FTM250, SONIC, ... |
| Solana | SOLANA, SOLANAT, KOII, KOIIT |
| Bitcoin | BTC, BTCT, LTC, LTCT, DOGE, DOGET, BCH, BCHT |
| CosmosSDK | COSMOSHUB, LAVA, LAV1, AXELAR, EVMOS, OSMOSIS, JUN1, CELESTIA, ... |
| Starknet | STRK, STRKS |
| Aptos | APT1 |
| Sui | SUIT |
| NEAR | NEAR, NEART |
| XRP | XRP, XRPT |
| Stellar | XLM, XLMT |
| TON | TON, TONT |
| Tron | TRX, TRXT |
| Cardano | CARDANO, CARDANOT |
| PolygonZkEVM | (no chain IDs currently map to this family; matchers live in Tier-1 EVM) |
| Unknown | sentinel: Tier-2 lookups miss and fall through to Tier-1 |
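The lookup behavior described above can be sketched with a small excerpt of the table. The constant names (`ChainFamilyEVM` and so on), the position of the sentinel in the enum, and the exact body of `IsSolanaFamily` are assumptions made for illustration; only the function names and the sentinel semantics come from this document.

```go
package main

import "fmt"

type ChainFamily int

const (
	ChainFamilyUnknown ChainFamily = iota // sentinel; its enum position is an assumption
	ChainFamilyEVM
	ChainFamilySolana
	ChainFamilyBitcoin
)

// chainFamilyMap holds a small excerpt of the chain ID table.
var chainFamilyMap = map[string]ChainFamily{
	"ETH1":   ChainFamilyEVM,
	"BASE":   ChainFamilyEVM,
	"SOLANA": ChainFamilySolana,
	"KOII":   ChainFamilySolana,
	"BTC":    ChainFamilyBitcoin,
	"LTC":    ChainFamilyBitcoin,
}

// GetChainFamily reports ok=false when the chain ID is not registered.
func GetChainFamily(chainID string) (ChainFamily, bool) {
	family, ok := chainFamilyMap[chainID]
	return family, ok
}

// GetChainFamilyOrDefault returns the Unknown sentinel, never EVM, for
// unregistered chain IDs so Tier-2 lookups miss instead of borrowing
// another family's semantics.
func GetChainFamilyOrDefault(chainID string) ChainFamily {
	if family, ok := GetChainFamily(chainID); ok {
		return family
	}
	return ChainFamilyUnknown
}

// IsSolanaFamily delegates to the map so there is one source of truth.
func IsSolanaFamily(chainID string) bool {
	return GetChainFamilyOrDefault(chainID) == ChainFamilySolana
}

func main() {
	fmt.Println(IsSolanaFamily("KOII"))
	fmt.Println(GetChainFamilyOrDefault("NOT_A_CHAIN") == ChainFamilyUnknown)
}
```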
func ClassifyError(connectionError *LavaError, chainFamily ChainFamily, transport TransportType, errorCode int, errorMessage string) *LavaError {
// Step 0: If caller already identified a connection-level error, use it directly
if connectionError != nil {
return connectionError
}
// Step 1: Check chain-specific mappings (Tier 2)
// These override generic mappings where retryability differs
if chainMappings, ok := ChainErrorMappings[chainFamily]; ok {
for _, mapping := range chainMappings {
if mapping.Matcher.Matches(errorCode, errorMessage) {
return mapping.LavaError
}
}
}
// Step 2: Fall back to generic semantic mappings (Tier 1)
// Only check matchers applicable to this transport type
// (e.g., EVM/JSON-RPC chains skip gRPC matchers, Cosmos/gRPC chains skip HTTP matchers)
for _, mapping := range GenericErrorMappings[transport] {
if mapping.Matcher.Matches(errorCode, errorMessage) {
return mapping.LavaError
}
}
// Step 3: Unknown error
return UNKNOWN_ERROR
}
// ClassifyMessage is a convenience wrapper for when the transport is genuinely unknown.
// It tries all transports in order (JsonRPC → REST → gRPC) and returns the first
// non-unknown classification. Prefer ClassifyError with an explicit transport when
// the transport is known, to avoid false matches across transport boundaries.
func ClassifyMessage(code int, message string) *LavaError {
for _, transport := range []TransportType{TransportJsonRPC, TransportREST, TransportGRPC} {
if c := ClassifyError(nil, -1, transport, code, message); c != LavaErrorUnknown {
return c
}
}
return LavaErrorUnknown
}

type TransportType int
const (
TransportJsonRPC TransportType = iota // EVM, Solana, Bitcoin, etc. (includes WebSocket subscriptions)
TransportREST // Aptos, Stellar, some Cosmos endpoints
TransportGRPC // Cosmos SDK chains
)

Generic matchers are partitioned by transport so that `ClassifyError` evaluates only the matchers relevant to the given chain's protocol.
type ErrorMatcher interface {
Matches(errorCode int, errorMessage string) bool
}
// Concrete matchers:
// CodeEquals(-32009): exact error code match
// MessageContains("nonce too low"): substring match in error message
// MessageRegex(`missing.*storage`): regex match
// HTTPStatusEquals(429): HTTP status code match
// GRPCCodeEquals(codes.Unavailable): gRPC status code match

Before calling `ClassifyError`, callers must call `DetectConnectionError(err)` on the raw Go error and pass the result as `connectionError`. This handles errors that can't be detected via code/message matching alone (e.g. `context.Canceled`, `context.DeadlineExceeded`, `net.Error` timeouts, `ECONNREFUSED`). It also falls back to string matching for errors wrapped without `%w`, where `errors.Is` can't traverse the chain.
func DetectConnectionError(err error) *LavaError {
// errors.Is checks (properly wrapped errors)
// net.Error timeout check
// String fallback for non-%w wrapped errors ("context deadline exceeded", "context canceled")
// ECONNREFUSED via net.OpError
return nil // checks elided in this outline; nil means no connection-level error was detected
}

`ClassifyError` Step 0 returns `connectionError` immediately if it is non-nil, so a detected connection error always takes precedence and can never produce `UNKNOWN_ERROR`.
`GenericErrorMappings` is evaluated in declaration order: first match wins. Matchers MUST be ordered most-specific first. A broader matcher placed before a narrower one will shadow it silently.

- Safety nets (required in Phase 1):
  - Shadow detection test: A unit test that iterates every pair of matchers and fails if a broader matcher appears before a narrower one it would shadow.
  - Real-world fixture tests: A `testdata/` directory containing actual error responses captured from each node client (Geth, Erigon, Nethermind, Solana validator, Bitcoin Core, etc.). Table-driven tests run every fixture through `ClassifyError` and assert the expected Lava error code. When a new client variant surfaces in production as `UNKNOWN_ERROR`, its message is added to the fixtures.
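For substring matchers, the shadow check has a simple form: an earlier pattern shadows a later one whenever the later pattern contains the earlier one, since any message matching the later pattern also matches the earlier. The sketch below covers only that case; `shadowedPairs` is a hypothetical helper, and code or regex matchers would need their own comparison logic.

```go
package main

import (
	"fmt"
	"strings"
)

// shadowedPairs (hypothetical) reports index pairs (i, j), i < j, where the
// substring pattern at i is contained in the pattern at j. With first-match-wins
// evaluation, the matcher at j can then never fire. Exact duplicates are
// reported too, since the second copy is unreachable.
func shadowedPairs(patterns []string) [][2]int {
	var pairs [][2]int
	for i := range patterns {
		for j := i + 1; j < len(patterns); j++ {
			if strings.Contains(patterns[j], patterns[i]) {
				pairs = append(pairs, [2]int{i, j})
			}
		}
	}
	return pairs
}

func main() {
	broadFirst := []string{"not found", "block not found"}  // broad pattern shadows the narrow one
	narrowFirst := []string{"block not found", "not found"} // correct order
	fmt.Println(len(shadowedPairs(broadFirst)), len(shadowedPairs(narrowFirst)))
}
```

In a unit test, a non-empty result for the declared `MessageContains` patterns of a transport would be the failure condition.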
All mappings live in the central registry file:
// Tier 2: Chain-specific (checked first, overrides generic)
var ChainErrorMappings = map[ChainFamily][]ChainErrorMapping{
Solana: {
{CodeEquals(-32009), CHAIN_SOLANA_MISSING_LONG_TERM}, // non-retryable
{CodeEquals(-32007), CHAIN_SOLANA_LEDGER_JUMP}, // retryable
{CodeEquals(-32005), NODE_SOLANA_UNHEALTHY}, // retryable
},
Bitcoin: {
{CodeEquals(-28), NODE_BITCOIN_WARMUP}, // retryable
{CodeEquals(-10), NODE_BITCOIN_INITIAL_DOWNLOAD}, // retryable
{CodeEquals(-25), CHAIN_BITCOIN_VERIFY_ERROR}, // non-retryable
{CodeEquals(-26), CHAIN_BITCOIN_VERIFY_REJECTED}, // non-retryable
{CodeEquals(-6), CHAIN_BITCOIN_WALLET_INSUFFICIENT_FUNDS}, // non-retryable
},
Starknet: {
{CodeEquals(28), CHAIN_STARKNET_CLASS_NOT_FOUND},
{CodeEquals(51), CHAIN_STARKNET_CLASS_ALREADY_DECLARED},
{CodeEquals(56), CHAIN_STARKNET_COMPILATION_FAILED},
{CodeEquals(40), CHAIN_STARKNET_CONTRACT_ERROR},
},
// ... other chains
}
// Tier 1: Generic semantic (fallback), partitioned by transport type
var GenericErrorMappings = map[TransportType][]GenericMapping{
TransportJsonRPC: {
{MessageContains("nonce too low"), CHAIN_NONCE_TOO_LOW},
{MessageContains("insufficient funds"), CHAIN_INSUFFICIENT_FUNDS},
{MessageContains("execution reverted"), CHAIN_EXECUTION_REVERTED},
{MessageContains("already known"), CHAIN_TX_ALREADY_KNOWN},
{MessageContains("missing trie node"), CHAIN_STATE_PRUNED},
{MessageContains("historical state"), CHAIN_STATE_PRUNED},
{MessageContains("block not found"), CHAIN_BLOCK_NOT_FOUND},
{MessageRegex(`(?i)block #?\w+ not found`), CHAIN_BLOCK_NOT_FOUND},
{CodeEquals(-32601), NODE_METHOD_NOT_FOUND},
{CodeEquals(-32602), USER_INVALID_PARAMS},
{CodeEquals(-32700), USER_PARSE_ERROR},
{CodeEquals(-32000), NODE_SERVER_ERROR}, // generic fallback — specific messages matched above
// ...
},
TransportREST: {
{HTTPStatusEquals(429), NODE_RATE_LIMITED},
{HTTPStatusEquals(503), NODE_SERVICE_UNAVAILABLE},
{HTTPStatusEquals(404), NODE_ENDPOINT_NOT_FOUND},
{HTTPStatusEquals(405), NODE_METHOD_NOT_ALLOWED},
// ...
},
TransportGRPC: {
{GRPCCodeEquals(codes.Unavailable), NODE_SERVICE_UNAVAILABLE},
{GRPCCodeEquals(codes.Unimplemented), NODE_UNIMPLEMENTED},
// HTTP status message matchers also appended at init() — HTTP status strings
// can appear in gRPC error messages when the underlying transport is HTTP
// (e.g. provider relay errors arriving via ClassifyLegacyError)
// ...
},
}

- If it uses standard protocols (EVM, JSON-RPC): Assign a `ChainFamily` → generic mappings handle it automatically. Zero code changes.
- If it has unique errors with different retryability: Add entries to `ChainErrorMappings` and define new `LavaError` constants in the registry.
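The second case can be seen in a miniature registry where a Tier-2 entry takes precedence for one family while every other family falls through to Tier 1. This is a simplified sketch: the transport partition is omitted, the matcher types are minimal re-implementations, and the lowercase variable names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

type LavaError struct {
	Code      uint32
	Name      string
	Retryable bool
}

type ErrorMatcher interface {
	Matches(errorCode int, errorMessage string) bool
}

// CodeEquals matches an exact error code.
type CodeEquals int

func (c CodeEquals) Matches(code int, _ string) bool { return code == int(c) }

// MessageContains matches a case-sensitive substring of the message.
type MessageContains string

func (m MessageContains) Matches(_ int, msg string) bool { return strings.Contains(msg, string(m)) }

type mapping struct {
	Matcher   ErrorMatcher
	LavaError *LavaError
}

type ChainFamily int

const (
	ChainFamilyUnknown ChainFamily = iota
	Solana
)

var (
	unknownError    = &LavaError{Code: 0, Name: "UNKNOWN_ERROR", Retryable: true}
	serverError     = &LavaError{Code: 2004, Name: "NODE_SERVER_ERROR", Retryable: true}
	blockNotFound   = &LavaError{Code: 3201, Name: "CHAIN_BLOCK_NOT_FOUND", Retryable: true}
	missingLongTerm = &LavaError{Code: 3302, Name: "CHAIN_SOLANA_MISSING_LONG_TERM", Retryable: false}
)

// Tier 2: only families with distinct retryability need entries here.
var chainMappings = map[ChainFamily][]mapping{
	Solana: {{CodeEquals(-32009), missingLongTerm}},
}

// Tier 1: generic fallback, most-specific matcher first.
var genericMappings = []mapping{
	{MessageContains("block not found"), blockNotFound},
	{CodeEquals(-32000), serverError},
}

func classify(family ChainFamily, code int, msg string) *LavaError {
	for _, m := range chainMappings[family] { // nil slice for unregistered families
		if m.Matcher.Matches(code, msg) {
			return m.LavaError
		}
	}
	for _, m := range genericMappings {
		if m.Matcher.Matches(code, msg) {
			return m.LavaError
		}
	}
	return unknownError
}

func main() {
	fmt.Println(classify(Solana, -32009, "").Name)
	fmt.Println(classify(ChainFamilyUnknown, -32000, "block not found").Name)
}
```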
Single file: protocol/common/error_registry.go
// ErrorCategory — top-level grouping: internal (Lava-introduced) vs external (pass-through)
type ErrorCategory int
const (
Internal ErrorCategory = iota // Errors introduced by Lava — user would never see these without Lava
External // Pass-through errors — user would get the same error talking to the node directly
)
// ErrorSubCategory — finer classification within each category.
// Subcategories carry behavioral implications the consumer hot path branches on
// (retries, CU charging, caching, endpoint health scoring).
type ErrorSubCategory int
const (
SubCategoryNone ErrorSubCategory = iota
SubCategoryUnsupportedMethod // zero retries, zero CU, cached response, no provider scoring
SubCategoryRateLimit // endpoint is healthy but busy; apply backoff, do not mark unhealthy
)
func (sc ErrorSubCategory) IsUnsupportedMethod() bool { return sc == SubCategoryUnsupportedMethod }
func (sc ErrorSubCategory) IsRateLimit() bool { return sc == SubCategoryRateLimit }
// LavaError is the central error definition
type LavaError struct {
Code uint32
Name string
Category ErrorCategory
SubCategory ErrorSubCategory
Description string
Retryable bool
}
// Registry: all errors defined in one place (unexported — access via lookup
// helpers). Populated at package-init time and never mutated at runtime, so
// readers access it lock-free on the hot classification path.
var errorRegistry = map[uint32]*LavaError{...}
// Internal lookup helpers (unexported — callers should use ClassifyError or
// the subcategory predicates rather than raw registry lookups):
func getLavaError(code uint32) *LavaError
func getLavaErrorByName(name string) *LavaError
// Public chain-family helpers
func GetChainFamily(chainID string) (ChainFamily, bool) // ok=false when unknown
func GetChainFamilyOrDefault(chainID string) ChainFamily // returns ChainFamilyUnknown sentinel when unknown
// Public classification entry points
func ClassifyError(connErr *LavaError, family ChainFamily, transport TransportType, code int, msg string) *LavaError
func ClassifyMessage(code int, msg string) *LavaError // transport + chain unknown
// Retry-policy predicates. IsUnsupportedMethodError keys off SubCategory
// (zero-CU carve-out + caching). IsNonRetryableNodeError keys off
// LavaError.Retryable directly so every terminal classification short-circuits
// retries, not only unsupported methods.
func IsUnsupportedMethodError(chainID string, statusCode int, message string) bool
func IsNonRetryableNodeError(chainID string, statusCode int, message string) bool
func IsNonRetryableNodeErrorWithContext(family ChainFamily, transport TransportType, statusCode int, message string) bool
// NodeErrorClassification aggregates the retry-related flags derived from a
// single ClassifyError lookup. IsNonRetryable is the authoritative retry signal;
// IsUnsupportedMethod is a strict subset that also drives the zero-CU carve-out
// and caching policy.
type NodeErrorClassification struct {
IsNonRetryable bool
IsUnsupportedMethod bool
}
// ClassifyNodeErrorForRetry is the preferred entry point on hot error paths —
// one lookup produces both flags, avoiding the JSON-RPC→REST→gRPC scan
// the individual predicates perform when the caller doesn't know transport.
func ClassifyNodeErrorForRetry(family ChainFamily, transport TransportType, errorCode int, message string) NodeErrorClassification
// Metrics callback registration (single-writer, atomic-pointer reads on hot path)
func SetErrorMetricsCallback(cb ErrorMetricsCallback)
func EmitErrorMetric(lavaError *LavaError, chainID string) // metric only, no log
// Structured logging entry points (fire metric + emit log)
func LogCodedError(description string, err error, lavaError *LavaError, chainID string, chainErrorCode int, chainErrorMessage string, attrs ...utils.Attribute) error
func LogCodedWarning(description string, err error, lavaError *LavaError, chainID string, chainErrorCode int, chainErrorMessage string, attrs ...utils.Attribute) error

Extend the existing `LavaFormatError` with a coded error helper:
// Dedicated coded error helper — auto-populates structured fields
utils.LavaFormatCodedError(PROTOCOL_CONNECTION_TIMEOUT, err,
utils.LogAttr("provider", providerAddr),
)

Log output automatically includes:
- `error_code`: numeric code (1001)
- `error_name`: string name (PROTOCOL_CONNECTION_TIMEOUT)
- `error_category`: layer (protocol/node/blockchain/user)
- `retryable`: bool
- `chain_error_code`: original chain error code (e.g., -32009), for Tier 1 generic codes
- `chain_error_message`: original chain error message, for debugging
- Standard structured attributes (provider, chainId, method, etc.)
- Create `protocol/common/error_registry.go` with `LavaError` struct, `ErrorCategory` enum, and all error code constants
- Define all error codes from the taxonomy (Layers A-D) in the registry
- Implement `ChainFamily` enum and chain ID → family mapping
- Implement `TransportType` enum and chain ID → transport mapping
- Implement `ErrorMatcher` interface with concrete matchers (`CodeEquals`, `MessageContains`, `MessageRegex`, `HTTPStatusEquals`, `GRPCCodeEquals`)
- Implement `ClassifyError` function with two-tier lookup (chain-specific first, generic fallback)
- Define all `ChainErrorMappings` (Tier 2) and `GenericErrorMappings` (Tier 1)
- Add lookup helpers (`GetError`, `GetErrorByName`, `IsRetryable`, `GetCategory`)
- Write unit tests for the registry and classification logic
- Write shadow detection test to verify no broader matcher shadows a narrower one in `GenericErrorMappings`
- Create `testdata/` directory with real error response fixtures from each node client (Geth, Erigon, Nethermind, Solana validator, Bitcoin Core, Starknet, etc.)
- Write table-driven fixture tests that run every fixture through `ClassifyError` and assert expected Lava error code
- Add `LavaFormatCodedError` helper to `utils/lavalog.go` that takes a `LavaError` code
- Ensure coded errors emit `error_code`, `error_name`, `error_category`, `retryable`, `chain_error_code`, `chain_error_message` fields in structured logs
- Add Prometheus counter that auto-increments per error code (`lava_errors_total{code, name, category, retryable, chain_id}`)
- Write unit tests for coded error logging
- Map existing `protocol/lavaprotocol/protocolerrors/errors.go` codes to new registry
- Map existing `protocol/lavasession/errors.go` (consumer + provider) to new registry
- Map existing `protocol/chaintracker/errors.go` to new registry
- Map existing `protocol/common/errors.go` to new registry
- Map existing `protocol/chainlib/common.go` errors to new registry
- Map existing `protocol/performance/errors.go` to new registry
- Map existing `ecosystem/cache/handlers.go` errors to new registry (intentionally deferred: cache layer has no production call site for ClassifyError; revisit if cache errors need structured metrics)
- Update `protocol/chainlib/node_error_handler.go` to use `ClassifyError` and registry codes
- Replace `IsUnsupportedMethodError()` pattern matching with `LavaError.SubCategory.IsUnsupportedMethod()` check
- Replace `IsUnsupportedMethodMessage()` in `protocol/common/errors.go` with registry-based classification
- Update `protocol/rpcsmartrouter/error_mapper.go` to use `ClassifyError` and registry codes
- Migrate `relayInnerDirect()` in `protocol/rpcsmartrouter/rpcsmartrouter_server.go` to use `LavaError` classification for endpoint health decisions (replace ad-hoc 5xx/429/timeout checks with `LavaError.Category` and `LavaError.Retryable`)
- Update `protocol/common/return_errors.go` to use registry for JSON-RPC/REST error responses
- Update JSON-RPC error handler to classify and log with codes
- Update REST error handler to classify and log with codes
- Update gRPC error handler to classify and log with codes
- Update TendermintRPC error handler to classify and log with codes
- Add `LavaError *LavaError` field to `RelayError` struct in `protocol/relaycore/relay_errors.go`
- Call `ClassifyError` when creating `RelayError` in `results_manager.go` (`setErrorResponse` and `setValidResponse`) on the decentralized path
- Populate `RelayError.LavaError` in the smart-router path (`direct_rpc_relay.go` → pass classification from `ClassifyDirectRPCError` into the relay response flow)
- Update `GetBestErrorMessageForUser` to prefer external errors (`CHAIN_*`, `NODE_*`) over internal (`PROTOCOL_*`) when selecting the best error for the user
- Update `protocol/relaycore/relay_processor.go` to propagate codes
- Populate `RelayResult.IsNonRetryable` from `ClassifyError` on both consumer and smart-router paths so `relay_processor.HasNonRetryableUserFacingErrors` honors `LavaError.Retryable=false` for every terminal classification, not only `SubCategoryUnsupportedMethod`. Prior to this, node errors like `CHAIN_EXECUTION_REVERTED`, `CHAIN_OUT_OF_GAS`, and `CHAIN_DOUBLE_SPEND` were retried across providers despite `Retryable=false` in the registry.
- Update consumer server (`rpcconsumer/rpcconsumer_server.go`) to log with codes
- Update provider server (`rpcprovider/rpcprovider_server.go`) to log with codes
- Verify Prometheus counter `lava_errors_total{code, name, category, retryable}` works end-to-end
- Update `protocol/metrics/consumer_metrics_manager.go` to use error codes (lava_errors_total auto-fires via LogCodedError; existing incident metrics kept for backwards compat)
- Update `protocol/metrics/rpcconsumer_logs.go` to use error codes (same: LogCodedError handles it)
- Verify error codes appear in existing dashboards/alerts (lava_errors_total emits all labels needed for dashboards)
- Remove `UnsupportedMethodError`/`SolanaNonRetryableError` custom types, replace with `LavaWrappedError` + `LavaError.SubCategory`/`LavaError.Retryable`
- Update `ShouldRetryError()` in `node_error_handler.go` to use registry's `Retryable` field
- Make `LavaError` implement `error` interface with `Error()`, `Is()`, `ABCICode()` for drop-in replacement
- Add `LavaWrappedError` + `NewLavaError()` for wrapping errors with classification that supports `errors.Is`
Blocked on protocol upgrade: sdkerrors carry ABCI codes used in the gRPC wire format between consumer and provider. Changing them requires coordinated upgrade across all network participants.
- Replace `sdkerrors.Register` error variables with `LavaError`-based equivalents
- Update all `errors.Is(err, SomeOldError)` callsites to use `LavaError`-based checks
- Delete old error packages / re-exports after all consumers are migrated
- Verify no remaining imports of old error definitions
1. User Error boundary: `USER_*` errors include cases detected both pre-forwarding by Lava and returned by the node (e.g., `-32602 Invalid params`). Classification is by nature of error, not where it's caught.
2. Error code visibility: Lava error codes are internal only, visible in logs and metrics. They live within the Lava protocol layer (Smart Router: between router and endpoint; Decentralized: between consumer and provider). Users and nodes never see Lava codes. External responses use standard protocol codes (JSON-RPC, gRPC, HTTP).
3. Chain-specific codes: Tiered approach (Option A hybrid). Distinct Lava codes exist for chain-specific errors where retryability differs from the generic pattern (Solana -32009 vs -32007, Bitcoin warmup, Starknet class errors, etc.). All other errors use generic semantic codes with chain detail in log attributes.
4. `x/` module errors: Left as-is; governed by Cosmos SDK conventions, only relevant on-chain.
5. `LavaError` is a classification struct that also implements `error`. It implements `Error()`, `Is()`, and `ABCICode()`. It is metadata about an error (used for logging, metrics, and retry decisions), but it can also participate in `errors.Is` chains. To attach classification to a real error (so both the original message and the classification travel together), use `LavaWrappedError` via `NewLavaError(classified, originalErr.Error())`. Callers use `errors.As(err, &LavaWrappedError{})` to extract the `*LavaError` from a wrapped error. (Updated in Phase 7: the original design had `LavaError` as pure metadata; reversed to enable retry/health decisions via `errors.Is`.)
6. Transport-scoped generic matching: `ClassifyError` accepts a `TransportType` parameter. Generic (Tier 1) matchers are partitioned by transport (JSON-RPC, REST, gRPC) so that EVM/JSON-RPC chains never evaluate gRPC matchers and vice versa.
7. Unsupported methods use `SubCategoryUnsupportedMethod`. Codes 2001, 2008, 2009, and 2010 have `SubCategory: SubCategoryUnsupportedMethod`. This replaces the current pattern-matching approach (`IsUnsupportedMethodError`, `IsUnsupportedMethodMessage`) with a subcategory check via `LavaError.SubCategory.IsUnsupportedMethod()`. The special behavior (zero CU, cached response, no provider scoring) is derived from the subcategory. The retry short-circuit itself runs off the registry's `Retryable` flag; see Decision 10. (Note: 2002 `NODE_METHOD_NOT_SUPPORTED` was removed from this list during Phase 7 review. It represents a method that exists but is disabled on this node, which is a retryable condition on a different provider. It has `Retryable: true` and no subcategory.)
8. Two-level error grouping: Category + SubCategory. Category is `Internal` (errors Lava introduces, i.e. the protocol layer) vs `External` (errors the user would get regardless of Lava: node, chain, user input). SubCategory provides finer classification within each category (e.g., UnsupportedMethod, Connection, Session, ChainExecution, ChainState, UserInput). SubCategories to be finalized before Phase 1 implementation.
9. Transparent hop: original errors pass through unchanged. The router/consumer is a transparent hop: the user always receives the original error from the node, unmodified. `LavaError` classification is metadata for internal use only (logging, metrics, endpoint health). Unknown/unmatched errors default to `CategoryExternal` because they are node pass-throughs. (Clarification added in Phase 7: `handleAndClassify` wraps classified errors in `LavaWrappedError` on the Go error return path. This is internal plumbing for retry/health decisions and never reaches the user. The actual node response body travels separately and is always returned unmodified to the user. The "transparent hop" principle applies to the response body, not the internal Go error return.)
10. Retryable is the primary retry signal, not SubCategory. The consumer and smart-router retry state machines short-circuit on `LavaError.Retryable=false` via `RelayResult.IsNonRetryable`, populated at classification time on both paths (consumer: `rpcconsumer_server.go`; smart-router: `direct_rpc_relay.go` / `rpcsmartrouter_server.go`). This covers every terminal classification (`CHAIN_EXECUTION_REVERTED`, `CHAIN_OUT_OF_GAS`, `CHAIN_DOUBLE_SPEND`, `CHAIN_INVALID_SIGNATURE`, all of 3000-range), not only unsupported methods. SubCategory continues to govern adjacent policy: `SubCategoryUnsupportedMethod` triggers the zero-CU carve-out and caching, and `SubCategoryRateLimit` drives backoff without marking the endpoint unhealthy. Layer D user-input errors are non-retryable but charge normal CU, since the provider does real work because responses are not cached. `relay_processor.HasNonRetryableUserFacingErrors` keys off the `IsNonRetryable` flag rather than re-classifying, so adding a new non-retryable error type requires no state-machine changes.
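The retry short-circuit described in the last decision can be sketched roughly as below. Everything here is a simplified assumption for illustration: `classify` is a toy stand-in for `ClassifyError` with three substring matchers, `nodeError` is a hypothetical error type used to simulate node responses, and `relayWithRetry` is not the real state machine.

```go
package main

import (
	"fmt"
	"strings"
)

// nodeError is a hypothetical error type used to simulate provider failures.
type nodeError string

func (e nodeError) Error() string { return string(e) }

// LavaError is a trimmed, assumed version of the classification struct.
type LavaError struct {
	Name      string
	Retryable bool
}

// RelayResult records one attempt; IsNonRetryable is set at classification time.
type RelayResult struct {
	Provider       string
	Err            error
	IsNonRetryable bool
}

// classify is a toy stand-in for ClassifyError (matchers are illustrative).
func classify(err error) LavaError {
	msg := strings.ToLower(err.Error())
	switch {
	case strings.Contains(msg, "execution reverted"):
		return LavaError{Name: "CHAIN_EXECUTION_REVERTED", Retryable: false}
	case strings.Contains(msg, "connection refused"):
		return LavaError{Name: "PROTOCOL_CONNECTION_REFUSED", Retryable: true}
	default:
		return LavaError{Name: "UNKNOWN_ERROR", Retryable: true}
	}
}

// hasNonRetryableError only reads the flag; it never re-classifies.
func hasNonRetryableError(results []RelayResult) bool {
	for _, r := range results {
		if r.IsNonRetryable {
			return true
		}
	}
	return false
}

// relayWithRetry tries providers in order and stops on the first success
// or on the first result flagged as non-retryable.
func relayWithRetry(providers []string, send func(provider string) error) []RelayResult {
	var results []RelayResult
	for _, p := range providers {
		err := send(p)
		res := RelayResult{Provider: p, Err: err}
		if err != nil {
			res.IsNonRetryable = !classify(err).Retryable
		}
		results = append(results, res)
		if err == nil || hasNonRetryableError(results) {
			break
		}
	}
	return results
}

func main() {
	providers := []string{"provider-a", "provider-b", "provider-c"}

	reverted := relayWithRetry(providers, func(string) error {
		return nodeError("execution reverted: insufficient balance")
	})
	fmt.Println("attempts after revert:", len(reverted))

	calls := 0
	flaky := relayWithRetry(providers, func(string) error {
		calls++
		if calls < 3 {
			return nodeError("dial tcp: connection refused")
		}
		return nil
	})
	fmt.Println("attempts with refused connections:", len(flaky))
}
```

Because the loop only inspects `IsNonRetryable`, adding another non-retryable classification to `classify` changes the retry behavior without touching `relayWithRetry`.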
| Chain Family | Unique Error System? | Tier 2 Codes Needed? | Details |
|---|---|---|---|
| EVM (ETH, Arbitrum, Optimism, Base, Polygon, Avalanche, Blast, Sonic) | No — all use standard -32000 range | No (message-based differentiation only) | L2 sequencer errors differ by message, not code |
| Polygon zkEVM | Partial: `out of counters` is unique | Yes (1 code) | Prover constraint, non-retryable, no equivalent elsewhere |
| Solana | Yes — codes -32001 to -32011 | Yes (4 codes in Layer C, 1 in Layer B) | -32009 vs -32007 have different retryability |
| Bitcoin/UTXO | Yes — codes -1 to -111 | Yes (5 codes) | Warmup, initial download, verify errors, UTXO-specific |
| Starknet | Yes — 25+ codes (1-63) | Yes (4 codes) | Class system entirely unique, compilation errors |
| NEAR | Yes — hierarchical error types | Yes (2 codes in Layer C, 1 in Layer B) | Shard/chunk unavailability retryable on diff provider |
| XRP/Ripple | Yes — tec/tef/tel/tem/ter system | Yes (3 codes) | Most complex; prefix determines retryability |
| TON | Partial — TVM exit codes + custom HTTP | Yes (3 codes) | Lite-server timeout retryable, message expired retryable |
| Aptos | Partial — custom REST error format | No (generic codes sufficient) | Error format differs but semantics map to generic codes |
| Stellar | Yes — typed error URIs + result_codes | No (generic codes sufficient) | REST-based, HTTP codes sufficient for retry decisions |
| Cosmos SDK | Standard — tx_response.code | No (generic codes sufficient) | Standard patterns, well-covered by generic tier |
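One way to picture the tiering in this table is a lookup that consults chain-specific (Tier 2) matchers first and falls back to generic ones. The sketch below is illustrative only: the chain-family keys, the Lava code names `CHAIN_ZKEVM_OUT_OF_COUNTERS` and `NODE_NEAR_SHARD_UNAVAILABLE`, and the `unavailable_shard` substring are hypothetical. Only the retryability of the two Tier 2 entries is taken from the table.

```go
package main

import (
	"fmt"
	"strings"
)

// LavaError is a trimmed, assumed version of the classification struct.
type LavaError struct {
	Name      string
	Retryable bool
}

// matcher maps a lowercase substring of the node message to a classification.
type matcher struct {
	substring string
	result    LavaError
}

// tier2 holds chain-specific matchers, keyed by a hypothetical family name.
// Families that need no Tier 2 codes (EVM, Cosmos SDK, ...) have no entry.
var tier2 = map[string][]matcher{
	"polygon-zkevm": {
		{substring: "out of counters", result: LavaError{Name: "CHAIN_ZKEVM_OUT_OF_COUNTERS", Retryable: false}},
	},
	"near": {
		{substring: "unavailable_shard", result: LavaError{Name: "NODE_NEAR_SHARD_UNAVAILABLE", Retryable: true}},
	},
}

// tier1 holds generic matchers shared by every chain family.
var tier1 = []matcher{
	{substring: "connection refused", result: LavaError{Name: "PROTOCOL_CONNECTION_REFUSED", Retryable: true}},
}

// classify checks the family's Tier 2 matchers, then Tier 1, then falls
// back to UNKNOWN_ERROR, which the registry marks as retryable.
func classify(chainFamily, message string) LavaError {
	msg := strings.ToLower(message)
	for _, m := range tier2[chainFamily] {
		if strings.Contains(msg, m.substring) {
			return m.result
		}
	}
	for _, m := range tier1 {
		if strings.Contains(msg, m.substring) {
			return m.result
		}
	}
	return LavaError{Name: "UNKNOWN_ERROR", Retryable: true}
}

func main() {
	fmt.Println(classify("polygon-zkevm", "Out of counters at step 5"))
	fmt.Println(classify("ethereum", "out of counters"))
	fmt.Println(classify("near", "dial tcp: connection refused"))
}
```

The same message classifies differently per family: the zkEVM matcher never runs for other EVM chains, which keeps chain-specific codes from leaking into the generic tier.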