ENSApi: DI init crashes instead of waiting when ENSIndexer metadata isn't published yet #2272

@shrugs

Summary

ENSApi exits permanently (process.exit(1)) at startup if it initializes before ENSIndexer has published its ensnode.metadata, instead of waiting and retrying. This is a startup-ordering race: in any deployment where ENSApi can come up before the indexer's first metadata write (fresh DB, co-located stacks, restarts), the API hard-fails.

Symptom

ERROR: Error initializing DI container
  DI container initialization failed: could not connect to ENS Root Chain RPC
  due to relation "ensnode.metadata" does not exist
 ELIFECYCLE  Command failed with exit code 1.

ENSApi exits and does not recover on its own; only a restart after the indexer has written metadata succeeds.

Root cause

  • apps/ensapi/src/index.ts runs di.init().catch(() => process.exit(1)), with no retry.
  • di.init() (apps/ensapi/src/di.ts) eagerly reads the indexer's published ensnode.metadata via stackInfoCache.read() / indexingStatusCache.read() and the root-chain RPC config derived from it. When ensnode.metadata doesn't exist yet, these throw and init rejects.

So a transient, expected startup-ordering condition (indexer hasn't published metadata yet) is treated as a fatal error.

Proposed fix

Treat "indexer metadata not published yet" as a retryable condition during DI init:

  • Add a classifier isIndexerMetadataNotReadyError(err) matching the missing-relation / empty-metadata case.
  • In di.init(), wrap the metadata-dependent initialization (the three cache reads plus the root-chain RPC config/getBlockNumber) in a bounded retry-with-backoff loop. On a "not ready" error, log and retry with backoff. On any other error, fail fast. Give up after a configurable timeout (set via env, default ~10 min) so that a genuine misconfiguration still surfaces. Ensure the caches re-query on retry rather than caching the failure.
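
The two bullets above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual ENSApi code: only the name isIndexerMetadataNotReadyError comes from this issue, while retryUntilMetadataReady, RetryOptions, and the option names are hypothetical. The classifier matches only the missing-relation message shown in the Symptom log; the empty-metadata case would need its own branch once its error shape is known.

```typescript
// Hypothetical sketch. Only isIndexerMetadataNotReadyError is named in the issue;
// every other identifier here is illustrative.

/** Returns true when the error (or any wrapped cause) says ensnode.metadata is missing. */
function isIndexerMetadataNotReadyError(err: unknown): boolean {
  // Walk the `cause` chain, since DI init may wrap the underlying database error.
  let current: unknown = err;
  for (let depth = 0; current != null && depth < 10; depth++) {
    const { message, cause } = current as { message?: unknown; cause?: unknown };
    if (
      typeof message === "string" &&
      /relation "ensnode\.metadata" does not exist/.test(message)
    ) {
      return true;
    }
    current = cause;
  }
  return false;
}

interface RetryOptions {
  timeoutMs: number; // overall budget before giving up
  initialDelayMs: number; // first backoff delay
  maxDelayMs: number; // cap for the exponential backoff
}

/** Retries `task` with exponential backoff, but only for "metadata not ready" errors. */
async function retryUntilMetadataReady<T>(
  task: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  const deadline = Date.now() + opts.timeoutMs;
  let delayMs = opts.initialDelayMs;

  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Any other failure (bad config, RPC unreachable) stays fail-fast.
      if (!isIndexerMetadataNotReadyError(err)) throw err;

      const remainingMs = deadline - Date.now();
      if (remainingMs <= 0) {
        throw new Error(
          `Indexer metadata still not published after ${opts.timeoutMs}ms (${attempt} attempts)`,
        );
      }

      const waitMs = Math.min(delayMs, remainingMs);
      console.warn(`Indexer metadata not ready (attempt ${attempt}); retrying in ${waitMs}ms`);
      await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
      delayMs = Math.min(delayMs * 2, opts.maxDelayMs);
    }
  }
}
```

Under these assumptions, di.init() would pass its metadata-dependent initialization as `task`, with `timeoutMs` read from the environment and defaulting to roughly 10 minutes. Matching on the message text is deliberately narrow, so that an unrelated missing relation is not silently retried.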

This makes ENSApi tolerant of being started before/alongside the indexer, which is the common case for fresh deployments and co-located dev/checkpoint stacks.
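
The requirement that caches re-query on retry rather than caching the failure can be illustrated with a small read-through cache that memoizes only successful loads. This is a hypothetical sketch: ReadThroughCache is not the actual implementation behind stackInfoCache or indexingStatusCache, whose internals are not shown in this issue.

```typescript
// Hypothetical sketch of a cache that never memoizes a failed load.
class ReadThroughCache<T> {
  private readonly load: () => Promise<T>;
  private pending: Promise<T> | undefined;

  constructor(load: () => Promise<T>) {
    this.load = load;
  }

  read(): Promise<T> {
    if (this.pending === undefined) {
      const attempt = this.load();
      this.pending = attempt;
      // If this load rejects, forget it so the next read() queries again.
      attempt.catch(() => {
        if (this.pending === attempt) this.pending = undefined;
      });
    }
    return this.pending;
  }
}
```

With a cache shaped like this, a retry loop around read() sees a fresh query on every attempt, while concurrent callers during a single attempt still share one in-flight load.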

Notes

  • Keep genuine config/connectivity failures fail-fast — only the metadata-not-ready class should retry.
  • Found while running co-located indexer+ENSApi checkpoint stacks, where ENSApi reliably loses the race against the indexer's first metadata write and dies.

Part of #1360 (tracking).

Labels: ensapi (ENSApi related)