
Valkey support for memory locking (?) #669

Open
@ranshid

Description


The problem/use-case that the feature addresses

Valkey users are mostly interested in having their data in memory. Currently, when server memory utilization peaks, the OS may swap memory pages out to disk, introducing increased latency. That in turn can lead client applications to open even more connections and retry, potentially consuming even more memory. OS memory thrashing also causes some maintenance operations (defrag, lazy evictions, client evictions, etc.) to run at a lower rate, contributing to more swap utilization and overall system health issues. When the system is thrashing, other crucial server-side operations (replication keepalives, pings, and cluster-bus notifications) lag as well, which usually ends with an unresponsive server being taken down or failed over. I would generally state that at AWS we have rarely seen cases where the system was able to stabilize automatically after the OS starts thrashing.

Description of the feature
The suggestion is to provide a configuration which will mlockall the process pages in memory. I think we can start with a startup-only configuration and potentially turn it into a dynamic one later (so we can also switch between mlockall/munlockall at runtime).
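A minimal sketch of what the handler for such a config could look like. The function names are illustrative, not existing Valkey APIs; only the mlockall(2)/munlockall(2) calls are the actual mechanism:

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/mman.h>

/* Hypothetical handler for a startup config that locks all process
 * pages in RAM. MCL_CURRENT locks pages already mapped, MCL_FUTURE
 * also locks pages mapped later (heap growth, new allocations). */
static int lockAllMemory(void) {
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        /* Typical failures: RLIMIT_MEMLOCK too low, or missing
         * CAP_IPC_LOCK when running unprivileged. */
        fprintf(stderr, "mlockall failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}

/* If the config is later made dynamic, disabling it maps to munlockall(). */
static int unlockAllMemory(void) {
    return munlockall() == 0 ? 0 : -1;
}
```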

What alternatives are there?
- Disable swap - The best way to avoid system swapping is to simply disable swap. On Linux this can be done with `swapoff -a`. This would, however, affect the entire system and might lead to other services crashing when memory pressure is high.
- Tune system swappiness - AFAIK most Linux distros ship with the swappiness level set to 60, which means the system will aggressively swap pages of inactive processes. While in most cases I have seen it suggested to set the swappiness level to 1 in order to prevent unplanned paging to swap, the right value mostly depends on the system Valkey is running on (see the sketch after this list).
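As a complement to manual tuning, the server could warn about aggressive swappiness at startup, in the spirit of the existing vm.overcommit_memory startup warning. A sketch; this check and its threshold are assumptions, not an existing Valkey check:

```c
#include <stdio.h>

/* Hypothetical startup check: read vm.swappiness and warn if the
 * kernel is likely to swap our pages aggressively. */
static void checkSwappiness(void) {
    FILE *fp = fopen("/proc/sys/vm/swappiness", "r");
    int swappiness = -1;
    if (fp == NULL) return; /* not Linux, or /proc unavailable */
    if (fscanf(fp, "%d", &swappiness) == 1 && swappiness > 1) {
        fprintf(stderr,
                "WARNING vm.swappiness is %d. Consider 'sysctl vm.swappiness=1' "
                "to reduce unplanned swapping of Valkey pages.\n",
                swappiness);
    }
    fclose(fp);
}
```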

What risks are there locking all process pages?
From my past experience, running a process with mlockall while system swap is still enabled can lead to many cases of unplanned crashes. Unless the process virtual memory is set up to be bounded to the real amount of memory the process is expected to utilize, page allocations will still succeed and a later access to the allocated virtual pages can segfault. These cases are harder to analyze, and they probably require tuning other system configs (like swappiness) to make sure there will always be some RAM available to back pages for the locked process.
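To illustrate the "bounded virtual memory" mitigation: capping the address space makes oversized allocations fail cleanly with ENOMEM instead of the process faulting later. A sketch under the assumption that the limit would be derived from maxmemory plus overhead; the function name is illustrative:

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/resource.h>

/* Hypothetical mitigation: bound the process address space so that
 * allocations beyond the expected footprint fail with ENOMEM upfront,
 * rather than succeeding and faulting on a later access. */
static int boundVirtualMemory(rlim_t max_bytes) {
    struct rlimit rl = { .rlim_cur = max_bytes, .rlim_max = max_bytes };
    if (setrlimit(RLIMIT_AS, &rl) != 0) {
        fprintf(stderr, "setrlimit(RLIMIT_AS) failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```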

Are there any external examples providing this config?
I looked at the Elasticsearch documentation and found they offer 3 alternatives to avoid swap (similar to the ones I mentioned earlier), but they also rely on the JVM's bounded virtual memory configuration, which limits heap size utilization. I think this configuration falls under the "expert" level of configurations, so we should consider whether we want to expose users to such options.

Should we also mlock bgsave pages in RAM?
Memory locks are not inherited across fork(2), so by default the BGSAVE child will run without lock protection. I think that at the first stage we should NOT make BGSAVE lock pages in memory, given its ephemeral nature.
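For reference, a sketch of the fork point showing where the child would have to re-lock if we ever wanted it to (per the proposal above, it deliberately would not):

```c
#include <stdio.h>
#include <unistd.h>

/* Sketch of the fork point of a BGSAVE-like flow. Memory locks do not
 * survive fork(2), so the child below runs unlocked even if the parent
 * called mlockall(MCL_CURRENT | MCL_FUTURE). */
static void bgsaveForkSketch(void) {
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: would need its own mlockall() call to be locked;
         * deliberately not done here, given its ephemeral nature. */
        /* ... write the RDB snapshot ... */
        _exit(0);
    } else if (pid < 0) {
        perror("fork");
    }
    /* Parent: continues serving clients with its pages still locked. */
}
```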
