
Valkey support for memory locking (?) #669

Open
@ranshid

Description


The problem/use-case that the feature addresses

Valkey users are mostly interested in having their data in memory. Currently, when server memory utilization peaks, the OS may swap memory pages out to disk, introducing increased latency. That in turn can lead client applications to open even more connections and retry, potentially consuming even more memory. OS memory thrashing also causes some maintenance operations (defrag, lazy evictions, client evictions, etc.) to run at a lower rate, contributing to more swap utilization and overall system health issues. When the system is thrashing, other crucial server-side operations (replication keepalives, pings, and cluster-bus notifications) lag as well, which usually ends with an unresponsive server being taken down or failed over. I would generally state that at AWS we have rarely seen cases where the system was able to stabilize automatically after the OS starts thrashing.

Description of the feature
The suggestion is to provide a configuration which will mlockall the process pages in memory. I think we can start with a startup-only configuration and potentially turn it into a dynamic one later (so we can also switch between mlockall/munlockall at runtime).
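A minimal sketch of what the handler for such a config could look like. The function names are illustrative, not existing Valkey APIs; only the mlockall(2)/munlockall(2) calls are the actual mechanism:

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/mman.h>

/* Hypothetical handler for a startup config that locks all process
 * pages in RAM. MCL_CURRENT locks pages already mapped, MCL_FUTURE
 * also locks pages mapped later (heap growth, new allocations). */
static int lockAllMemory(void) {
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        /* Typical failures: RLIMIT_MEMLOCK too low, or missing
         * CAP_IPC_LOCK when running unprivileged. */
        fprintf(stderr, "mlockall failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}

/* If the config is later made dynamic, disabling it maps to munlockall(). */
static int unlockAllMemory(void) {
    return munlockall() == 0 ? 0 : -1;
}
```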

What alternatives are there?
- Disable swap - The best way to avoid system swapping is to simply disable swap. On Linux this can be done with `swapoff -a`. This would, however, affect the entire system and might lead to other services crashing when memory pressure is high.
- Tune system swappiness - AFAIK most Linux distros ship with the swappiness level set to 60, which means the system will aggressively swap pages of inactive processes. While in most cases I have seen it suggested to set the swappiness level to 1 in order to prevent unplanned paging to swap, the right value mostly depends on the system Valkey is running on (see the sketch after this list).
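As a complement to manual tuning, the server could warn about aggressive swappiness at startup, in the spirit of the existing vm.overcommit_memory startup warning. A sketch; this check and its threshold are assumptions, not an existing Valkey check:

```c
#include <stdio.h>

/* Hypothetical startup check: read vm.swappiness and warn if the
 * kernel is likely to swap our pages aggressively. */
static void checkSwappiness(void) {
    FILE *fp = fopen("/proc/sys/vm/swappiness", "r");
    int swappiness = -1;
    if (fp == NULL) return; /* not Linux, or /proc unavailable */
    if (fscanf(fp, "%d", &swappiness) == 1 && swappiness > 1) {
        fprintf(stderr,
                "WARNING vm.swappiness is %d. Consider 'sysctl vm.swappiness=1' "
                "to reduce unplanned swapping of Valkey pages.\n",
                swappiness);
    }
    fclose(fp);
}
```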

What risks are there locking all process pages?
From my past experience, running a process with mlockall while system swap is still enabled can lead to many cases of unplanned crashes. Unless the process virtual memory is set up to be bounded to the real amount of memory the process is expected to utilize, page allocations will still succeed and a later access to the allocated virtual pages can segfault. These cases are harder to analyze, and they probably require tuning other system configs (like swappiness) to make sure there will always be some RAM available to back pages for the locked process.
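To illustrate the "bounded virtual memory" mitigation: capping the address space makes oversized allocations fail cleanly with ENOMEM instead of the process faulting later. A sketch under the assumption that the limit would be derived from maxmemory plus overhead; the function name is illustrative:

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/resource.h>

/* Hypothetical mitigation: bound the process address space so that
 * allocations beyond the expected footprint fail with ENOMEM upfront,
 * rather than succeeding and faulting on a later access. */
static int boundVirtualMemory(rlim_t max_bytes) {
    struct rlimit rl = { .rlim_cur = max_bytes, .rlim_max = max_bytes };
    if (setrlimit(RLIMIT_AS, &rl) != 0) {
        fprintf(stderr, "setrlimit(RLIMIT_AS) failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```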

Are there any external examples providing this config?
I looked at the Elasticsearch documentation and found they offer 3 alternatives to avoid swap (similar to the ones I mentioned earlier), but they also rely on the JVM's bounded virtual memory configuration, which limits heap size utilization. I think this configuration falls under the "expert" level of configurations, so we should consider whether we want to expose users to such options.

Should we also mlock bgsave pages in RAM?
Memory locks are not inherited across fork(2), so by default the BGSAVE child will run without lock protection. I think that at the first stage we should NOT make BGSAVE lock pages in memory, given its ephemeral nature.
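For reference, a sketch of the fork point showing where the child would have to re-lock if we ever wanted it to (per the proposal above, it deliberately would not):

```c
#include <stdio.h>
#include <unistd.h>

/* Sketch of the fork point of a BGSAVE-like flow. Memory locks do not
 * survive fork(2), so the child below runs unlocked even if the parent
 * called mlockall(MCL_CURRENT | MCL_FUTURE). */
static void bgsaveForkSketch(void) {
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: would need its own mlockall() call to be locked;
         * deliberately not done here, given its ephemeral nature. */
        /* ... write the RDB snapshot ... */
        _exit(0);
    } else if (pid < 0) {
        perror("fork");
    }
    /* Parent: continues serving clients with its pages still locked. */
}
```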
