CacheClient exhausts Lambda sockets causing timeouts (DEADLINE_EXCEEDED) #1575

@anttiviljami

Note: Opening this issue to document the bug & resolution. We are a Momento customer, and this has already been looked at by technical support 🙏

We are experiencing rare, intermittent Lambda timeouts after enabling caching with Momento in production.

Lambda specs:

  ApiHandlerFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: lambda/ApiHandlerFunction/dist
      Handler: api-handler.handler
      Runtime: nodejs22.x
      Architectures:
        - arm64
      Timeout: 60
      MemorySize: 2048
      Events:
        Api:
          Type: HttpApi

These timeouts are always accompanied by multiple DEADLINE_EXCEEDED warnings and error logs within the same Lambda request. (Note that the client deadline is set far below 60 seconds.)

WARN (Momento: grpc-interceptor): Deadline Exceeded! Received status: 4 Deadline exceeded after 59.775s,name resolution: 0.162s,metadata filters: 0.064s,LB pick: 0.092s,remote_addr=x.x.x.x:443 and grpc connection status: READY
WARN (Momento: grpc-interceptor): Deadline Exceeded! Received status: 4 Deadline exceeded after 59.561s,name resolution: 0.169s,metadata filters: 0.047s,LB pick: 0.091s,remote_addr=x.x.x.x:443 and grpc connection status: READY
{"error": "The client's configured timeout was exceeded; you may need to use a Configuration with more lenient timeouts: 4 DEADLINE_EXCEEDED: Deadline exceeded after 5.286s,name resolution: 0.014s,metadata filters: 0.007s,LB pick: 0.020s,remote_addr=x.x.x.x:443"}
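The elapsed times in these messages can be extracted programmatically and compared against the configured client deadline. The following is a minimal sketch based only on the message format shown above; `parseDeadlineExceeded` and `DeadlineTimings` are hypothetical names, not part of the Momento SDK, and the format may differ between gRPC library versions.

```typescript
// Hypothetical helper for log analysis; not part of the Momento SDK.
interface DeadlineTimings {
  totalSeconds: number; // value after "Deadline exceeded after"
  phases: Record<string, number>; // e.g. "name resolution" -> 0.162
}

function parseDeadlineExceeded(line: string): DeadlineTimings | null {
  // Case-sensitive on purpose: matches the gRPC status detail,
  // not the "Deadline Exceeded!" prefix added by the interceptor.
  const total = /Deadline exceeded after ([\d.]+)s/.exec(line);
  if (total === null) {
    return null;
  }

  // Collect the comma-separated "<phase>: <seconds>s" entries.
  const phases: Record<string, number> = {};
  const phasePattern = /,\s*([A-Za-z ]+): ([\d.]+)s/g;
  let match: RegExpExecArray | null;
  while ((match = phasePattern.exec(line)) !== null) {
    phases[match[1].trim()] = Number(match[2]);
  }

  return { totalSeconds: Number(total[1]), phases };
}
```

Applied to the first warning above, `totalSeconds` is close to the 60 second Lambda timeout even though the client deadline is much lower, while the reported phases (name resolution, metadata filters, LB pick) each account for well under a second.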

Additionally, we are seeing socket acquisition warnings from the AWS SDK while caching is enabled, hinting that the CacheClient is leaking sockets.

@smithy/node-http-handler:WARN - socket usage at capacity=50 and 500 additional requests are enqueued. See https://docs.aws.amazon.com/sdk-for-javascript/v3/developer-guide/node-configuring-maxsockets.html or increase socketAcquisitionWarningTimeout=(millis) in the NodeHttpHandler config.

The issue occurs with the default Lambda configuration:

const cacheClient = await CacheClient.create({
  configuration: Configurations.Lambda.latest(),
  credentialProvider: CredentialProvider.fromEnvVar('MOMENTO_API_KEY'),
  defaultTtlSeconds: 60,
});

We have tried several CacheClient configurations that limit the number of connections and concurrent calls, but none has eliminated the problem so far. The latest attempt was:

Configurations.Lambda.latest()
    // set max concurrent requests to 60 (recommendation from momento)
    .withTransportStrategy(
      Configurations.Lambda.latest().getTransportStrategy().withMaxConcurrentRequests(config.MOMENTO_MAX_CONCURRENCY),
    )
    // limit number of connections to 6 (recommendation from momento)
    .withNumConnections(config.MOMENTO_CONNECTIONS || 6);
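For readers unfamiliar with what a cap on concurrent requests does, the idea can be sketched as a small promise-based limiter: calls beyond the cap wait in a queue until a slot frees up. This is an illustration only, assuming a simple FIFO queue; `ConcurrencyLimiter` is a hypothetical class and does not show how the Momento SDK implements `withMaxConcurrentRequests` internally.

```typescript
// Hypothetical illustration of a max-concurrency cap; not Momento SDK code.
class ConcurrencyLimiter {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      // At capacity: park until a finishing task hands over its slot.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // transfer the slot directly to the oldest waiter
      } else {
        this.active--; // nobody waiting: free the slot
      }
    }
  }
}
```

A caller would wrap each cache operation, for example `limiter.run(() => someCacheCall())`, where `someCacheCall` stands in for any promise-returning function. Note that a cap like this bounds in-flight calls but does not by itself shorten how long a queued call waits.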
