TokenManager may turn token request failures into secondary runtime errors #182

@lzqlzzq

Description

While using the SDK, we observed connections failing with EPIPE, after which the event loop crashed.

After looking into the token flow, it seems TokenManager does not handle transient request failures robustly.

In both getCustomTenantAccessToken() and getMarketTenantAccessToken(), request errors are caught only for logging, but execution continues as if a valid response object were returned. When the request fails and the result is undefined, destructuring fields such as tenant_access_token, expire, or app_access_token may raise a secondary runtime error like Cannot destructure property ... of undefined.

This makes the original transport failure harder to diagnose.

It also looks like token refresh is not deduplicated across concurrent callers. On cache miss, multiple requests may enter the same refresh path independently instead of sharing a single in-flight refresh, which may lead to a thundering herd problem.
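A minimal sketch of what sharing a single in-flight refresh could look like. The names `inFlight` and `singleFlight` are hypothetical and not part of the SDK; `refresh` stands in for whatever function actually requests the token.

```typescript
// Hypothetical helper, not SDK code: one pending refresh Promise per token key.
const inFlight = new Map<string, Promise<string>>();

function singleFlight(key: string, refresh: () => Promise<string>): Promise<string> {
  const existing = inFlight.get(key);
  if (existing) {
    // A refresh for this key is already running; share its result.
    return existing;
  }
  // Clear the entry on success and on failure so a later call can retry.
  const pending = refresh().then(
    (token) => {
      inFlight.delete(key);
      return token;
    },
    (err) => {
      inFlight.delete(key);
      throw err;
    },
  );
  inFlight.set(key, pending);
  return pending;
}
```

With this shape, callers that hit a cache miss at the same time await the same Promise, and a failed refresh is not retained, so the next caller starts a fresh attempt.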

Root Cause

The immediate root cause is that the return value of the token request may become undefined after .catch(...), but it is destructured without an undefined check.

More specifically:

  1. Request failures are caught and logged inside .catch(...), but not rethrown.
  2. The awaited result may therefore be undefined.
  3. The code immediately destructures fields such as tenant_access_token, expire, and app_access_token from that value without checking it first.

As a result, ordinary transport failures such as EPIPE, timeouts, or connection resets may surface as a secondary runtime error such as Cannot destructure property ... of undefined.
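The pattern described above can be reproduced in isolation. This is a sketch, not the SDK's actual code: `requestToken`, `getTokenUnsafe`, and the `logged` array (a stand-in for the SDK logger) are hypothetical names, and the request is hard-coded to fail like a broken socket.

```typescript
type TokenResponse = { tenant_access_token: string; expire: number };

// Stand-in for the SDK logger (assumption, for illustration only).
const logged: string[] = [];

// Hypothetical request function that always fails with a transport error.
async function requestToken(): Promise<TokenResponse> {
  const err = new Error("write EPIPE") as Error & { code?: string };
  err.code = "EPIPE";
  throw err;
}

// Mirrors the reported pattern: the error is logged but not rethrown.
async function getTokenUnsafe(): Promise<string> {
  const res = await requestToken().catch((e: Error) => {
    logged.push(e.message);
    // No rethrow here, so the awaited value is undefined.
  });
  // The cast imitates code that assumes a response object is always present.
  const { tenant_access_token } = res as TokenResponse;
  return tenant_access_token;
}
```

Calling `getTokenUnsafe()` rejects with a TypeError raised by the destructuring line, while the original EPIPE error only appears in the log.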

Impact

This can lead to several issues:

  • transient network failures become harder to debug because the visible error may be a secondary destructuring exception rather than the original request error
  • callers may receive undefined or encounter unexpected runtime exceptions instead of a clear token acquisition failure
  • invalid or incomplete token data may be written into cache if the response is not validated before caching
  • concurrent requests may increase refresh pressure during cache miss windows

Suggested Fix

  1. Validate the response object before destructuring required fields such as tenant_access_token, expire, and app_access_token.
  2. Only write token data into cache after the response has been verified to be present and well-formed.
  3. Introduce single-flight deduplication for token refresh, so concurrent callers can await the same in-flight Promise for a given token key.
  4. Return clearer token-fetch errors so upstream code can distinguish transport failures from invalid response payloads.
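Points 1, 2, and 4 above could be combined roughly as follows. This is a sketch under assumptions: `TokenFetchError`, `parseTenantToken`, and `fetchTenantToken` are hypothetical names, the `request` callback stands in for the SDK's HTTP call, and a plain `Map` stands in for the SDK cache.

```typescript
interface TenantToken {
  tenant_access_token: string;
  expire: number;
}

// Hypothetical error type that lets callers tell failure kinds apart.
class TokenFetchError extends Error {
  constructor(
    public readonly kind: "transport" | "invalid_response",
    message: string,
    public readonly original?: unknown,
  ) {
    super(message);
    this.name = "TokenFetchError";
  }
}

// Validate the payload before any field is used.
function parseTenantToken(res: unknown): TenantToken {
  if (typeof res !== "object" || res === null) {
    throw new TokenFetchError("invalid_response", "token response is missing");
  }
  const { tenant_access_token, expire } = res as Record<string, unknown>;
  if (
    typeof tenant_access_token !== "string" ||
    tenant_access_token === "" ||
    typeof expire !== "number" ||
    !(expire > 0)
  ) {
    throw new TokenFetchError("invalid_response", "token response is malformed");
  }
  return { tenant_access_token, expire };
}

async function fetchTenantToken(
  request: () => Promise<unknown>, // assumed stand-in for the HTTP call
  cache: Map<string, TenantToken>, // assumed stand-in for the SDK cache
  key: string,
): Promise<string> {
  let res: unknown;
  try {
    res = await request();
  } catch (e) {
    // Keep the original transport error attached instead of swallowing it.
    throw new TokenFetchError("transport", "token request failed", e);
  }
  const token = parseTenantToken(res);
  cache.set(key, token); // written only after validation succeeds
  return token.tenant_access_token;
}
```

Upstream code can then branch on `kind` to decide whether a retry makes sense (transport) or whether the payload itself is the problem (invalid_response).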
