HybridCache: add expiry to cached tag invalidation timestamps to improve distributed consistency without a backplane #7411

@LuisM000

Summary

Add a TTL to the in-memory cache of tag invalidation timestamps (_tagInvalidationTimes) so that entries are periodically re-read from L2, improving consistency in distributed deployments without requiring a full backplane.

Background

DefaultHybridCache stores tag invalidation timestamps in a ConcurrentDictionary<string, Task<long>> (_tagInvalidationTimes). Once a tag is seen, its entry is never refreshed:

private void PrefetchTagWithBackendCache(string tag)
{
    if (!_tagInvalidationTimes.TryGetValue(tag, out _))
    {
        _ = _tagInvalidationTimes.TryAdd(tag, SafeReadTagInvalidationAsync(tag));
        // only added once, never updated
    }
}

In a multi-instance deployment (e.g. multiple pods in Kubernetes), if Instance B calls RemoveByTagAsync("my-tag"), Instance A never learns about it: once A has cached _tagInvalidationTimes["my-tag"], the value stays at whatever was read at that time (0 if the tag had never been invalidated). Even after A's L1 entry expires and A re-reads the data from L2, the tag check still compares against the stale timestamp and returns "valid".
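
To make the scenario concrete, here is a deliberately simplified model of two instances sharing one L2 store. It is not the DefaultHybridCache code: SharedL2, Instance, and the rule "an entry is valid if it was created after the tag's last invalidation" are assumptions made for illustration, and timestamps are plain integers.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical stand-in for the shared IDistributedCache backend (L2).
public sealed class SharedL2
{
    public ConcurrentDictionary<string, long> TagInvalidations { get; } = new();
}

// Hypothetical stand-in for one application instance with its own in-memory tag cache.
public sealed class Instance
{
    private readonly SharedL2 _l2;
    private readonly Dictionary<string, long> _tagInvalidationTimes = new();

    public Instance(SharedL2 l2) => _l2 = l2;

    // Records the invalidation in L2 and in this instance's local tag cache.
    public void RemoveByTag(string tag, long now)
    {
        _l2.TagInvalidations[tag] = now;
        _tagInvalidationTimes[tag] = now;
    }

    // Assumed rule: an entry is valid if it was created after the tag's last invalidation.
    public bool IsEntryValid(string tag, long entryCreatedAt)
    {
        if (!_tagInvalidationTimes.TryGetValue(tag, out var invalidatedAt))
        {
            // First sighting of the tag: read L2 once and cache the result.
            invalidatedAt = _l2.TagInvalidations.TryGetValue(tag, out var t) ? t : 0;
            _tagInvalidationTimes[tag] = invalidatedAt; // only added once, never updated
        }
        return entryCreatedAt > invalidatedAt;
    }
}
```

In this model, if instance A checks "my-tag" before instance B calls RemoveByTag, A keeps its cached 0 and continues to treat older entries as valid, while B and any instance started afterwards reject them.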

Proposed fix

Add a configurable TTL to each entry in _tagInvalidationTimes. When the TTL expires, the next access re-reads the invalidation timestamp from L2. This way:

  • Instances periodically re-sync their tag state from L2
  • No backplane or pub/sub mechanism is required
  • The TTL controls the tradeoff between consistency and L2 read cost

A reasonable default could match LocalCacheExpiration, or be independently configurable via HybridCacheOptions.
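
A minimal sketch of what the TTL could look like, written as a standalone class rather than a patch to DefaultHybridCache. The class name TagInvalidationCache, the readFromL2 delegate (standing in for SafeReadTagInvalidationAsync), and the injectable clock are all hypothetical; no such option exists on HybridCacheOptions today.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical sketch: an in-memory tag timestamp cache whose entries expire after a TTL.
public sealed class TagInvalidationCache
{
    // Each entry keeps the pending or completed L2 read and the local time it was started.
    private readonly ConcurrentDictionary<string, (Task<long> Timestamp, DateTimeOffset FetchedAt)> _entries = new();
    private readonly Func<string, Task<long>> _readFromL2;
    private readonly TimeSpan _ttl;
    private readonly Func<DateTimeOffset> _now;

    public TagInvalidationCache(Func<string, Task<long>> readFromL2, TimeSpan ttl, Func<DateTimeOffset> now = null)
    {
        _readFromL2 = readFromL2;
        _ttl = ttl;
        _now = now ?? (() => DateTimeOffset.UtcNow);
    }

    public Task<long> GetInvalidationTimeAsync(string tag)
    {
        var now = _now();
        if (_entries.TryGetValue(tag, out var existing) && now - existing.FetchedAt < _ttl)
        {
            return existing.Timestamp; // still within the TTL: no L2 read
        }

        // Missing or expired: start a new L2 read and replace the entry.
        // Concurrent callers may each start a read here; the last write wins,
        // which costs extra L2 reads but does not affect correctness.
        var fresh = (Timestamp: _readFromL2(tag), FetchedAt: now);
        _entries[tag] = fresh;
        return fresh.Timestamp;
    }
}
```

With this shape, an invalidation written to L2 by another instance becomes visible locally at most one TTL after the last local read of that tag. A shorter TTL narrows the staleness window and increases L2 reads proportionally.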

Why this matters

A full backplane (as proposed in the original distributed invalidation design) is the complete solution, but it requires additional infrastructure. A TTL-based approach is a low-cost intermediate improvement that makes tag invalidation eventually consistent in distributed scenarios using only the existing IDistributedCache backend.

Impact

Without this fix, RemoveByTagAsync in a multi-instance deployment only reliably invalidates entries on the instance that called it. Other instances that have already cached the tag's timestamp continue serving stale data indefinitely, which is surprising given the API's intent.
