
On caching strategy & backing store #1

@bdach

This issue is going to touch on a few disparate topics which to me are interlinked enough to warrant being included in one issue.

Usage of redis

The caching primitive the service is using is IDistributedCache backed by redis:

    builder.Services.AddStackExchangeRedisCache(options =>
    {
        options.Configuration = AppSettings.RedisHost;
    });

While we do have a production redis instance, there are significant reservations about using it for the purpose of this service. @peppy and @ThePooN can probably elaborate on that topic, but the gist of it from my understanding is that the instance is not very reliable and has in fact fallen over several times, primarily due to running out of storage.

I understand @peppy has some ideas for alternatives to redis but I'll leave it to him to elaborate on that.

The cache period

Every replay that the service caches is kept for a full day:

    new DistributedCacheEntryOptions
    {
        AbsoluteExpirationRelativeToNow = TimeSpan.FromDays(1),
    });

which seems way over the top IMO, but I guess it makes some degree of sense given that consumers of the “firehose” score API will want to be using this. This will require cache hit rate monitoring. It may even be a good idea to extract the cache duration to an envvar or something else that can be adjusted without having to apply source changes.
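A minimal sketch of what extracting the duration to an envvar could look like. The variable name `REPLAY_CACHE_DURATION_HOURS`, the `CacheSettings` class, and the one-year upper bound are all hypothetical and not part of the service; the fallback of one day mirrors the current hardcoded value.

```csharp
using System;
using System.Globalization;

public static class CacheSettings
{
    // Hypothetical variable name; the service does not define this today.
    public const string DurationVariable = "REPLAY_CACHE_DURATION_HOURS";

    // Matches the currently hardcoded TimeSpan.FromDays(1).
    public static readonly TimeSpan DefaultDuration = TimeSpan.FromDays(1);

    // Interprets the raw value as a number of hours. Missing, malformed,
    // non-positive, or absurdly large values fall back to the default.
    public static TimeSpan ParseDuration(string? raw)
    {
        if (double.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out double hours)
            && hours > 0
            && hours <= 24 * 365) // assumed sanity bound of one year
        {
            return TimeSpan.FromHours(hours);
        }

        return DefaultDuration;
    }

    // Reads the duration from the process environment.
    public static TimeSpan FromEnvironment() =>
        ParseDuration(Environment.GetEnvironmentVariable(DurationVariable));
}
```

The result of `CacheSettings.FromEnvironment()` could then be assigned to `AbsoluteExpirationRelativeToNow` in place of the literal, so the period can be tuned per deployment once hit rate numbers are available.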

A back-of-napkin estimate of the memory required to hold a day of replays for lazer alone (if I’m not misreading ddog metrics):

266,800 replays/24 h × 50 KB/replay (semi-educated guess) = 13,340,000 KB ≈ 13.34 GB

so my bet is that 1 day is going to be too much, especially if this is to encompass stable as well (extrapolating from user numbers, you could be looking at about 5x as much storage).
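The estimate above can be parameterised to see how the footprint scales with the cache period. This is only a sketch: the `CacheSizeEstimate` helper is hypothetical, it assumes a steady upload rate, it uses decimal units (1 GB = 1,000,000 KB), and the 50 KB replay size remains a guess.

```csharp
using System;

public static class CacheSizeEstimate
{
    // Approximate size of everything held in cache at steady state,
    // assuming uploads arrive at a constant rate and every entry lives
    // for exactly cacheDuration. Decimal units: 1 GB = 1,000,000 KB.
    public static double GigabytesHeld(
        double replaysPerDay,
        double kilobytesPerReplay,
        TimeSpan cacheDuration)
    {
        double replaysHeld = replaysPerDay * cacheDuration.TotalDays;
        return replaysHeld * kilobytesPerReplay / 1_000_000.0;
    }
}
```

With the figures above, `CacheSizeEstimate.GigabytesHeld(266_800, 50, TimeSpan.FromDays(1))` comes out at roughly 13.34 GB, and shortening the period to six hours brings it down to roughly 3.3 GB. Multiplying the replay rate by the rough 5x factor for stable gives about 66.7 GB for a full day.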

Replays only enter cache via the upload operation

The download operation will never put anything in cache itself, it will only query it. If the cache is missed, any replays fetched from S3 will not be stored into the cache, not even for a short time. The only thing that puts a replay in the cache is uploading the replay.

I could see this making sense because I don’t think very many people download replays, but maybe someone with cf analytics access is able to disprove this.
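For comparison, here is a sketch of the alternative, where a download that misses the cache stores the fetched replay for a short time. It is not how the service behaves today. The `ReplayDownload` class, the `fetchFromStorage` delegate (a stand-in for the S3 fetch), and the 10-minute period are all assumptions made for illustration; only the `IDistributedCache` calls are real API.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;

public static class ReplayDownload
{
    // Assumed value, purely illustrative.
    private static readonly TimeSpan MissTtl = TimeSpan.FromMinutes(10);

    public static async Task<byte[]?> GetReplayAsync(
        IDistributedCache cache,
        string key,
        // Hypothetical stand-in for the S3 fetch; returns null if not found.
        Func<string, CancellationToken, Task<byte[]?>> fetchFromStorage,
        CancellationToken token = default)
    {
        // Cache hit: same as the current download path.
        byte[]? cached = await cache.GetAsync(key, token);
        if (cached != null)
            return cached;

        // Cache miss: fetch from the backing store...
        byte[]? replay = await fetchFromStorage(key, token);

        // ...and, unlike the current behaviour, keep it around briefly.
        if (replay != null)
        {
            await cache.SetAsync(key, replay, new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = MissTtl,
            }, token);
        }

        return replay;
    }
}
```

Whether this is worth the extra cache writes depends on how often the same replay is downloaded repeatedly, which is exactly what download analytics would need to show.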
