
Memory leak while reading /sys/devices/system/cpu/online inside an Incus container #677

Open

Description

@DeyanSG

Hello,

We are using Incus + lxcfs in our setup and we have run into an issue with the memory consumption of the lxcfs process when reading aggressively from /sys/devices/system/cpu/online.

Versions
We’ve tested with different versions of both lxcfs and libfuse3 and the issue seems to be present even with the latest stable versions:

  • lxcfs 6.0.3
  • libfuse3: both the latest (3.16.2) and the CentOS 9 default (3.10.2)

Setup
We are running an Incus container on a node with 56 CPU cores; the issue is reproducible even with a single container. In our setup the container's CPU usage is restricted with limits.cpu.allowance: 1200ms/60ms (this is not strictly relevant, but the effect shows up much faster if the container is allowed more CPU).
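
For reference, a limit like this can be applied with something along these lines (the container name c1 is only a placeholder):
incus config set c1 limits.cpu.allowance 1200ms/60ms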

Reproducer
To reproduce the issue, compile the following C code, which starts a number of threads inside the container, each of which repeatedly opens, reads from, and closes a file:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <signal.h>
#include <stdbool.h>

//compile with:
//gcc -pthread -o fuse-stress-poc fuse-stress-poc.c

#define DEFAULT_FILE_PATH "/sys/devices/system/cpu/online"

// Set to 0 by the SIGINT handler so the worker threads stop.
volatile sig_atomic_t run = 1;

void handle_sigint(int sig) {
    (void)sig;
    run = 0;
}

// Worker thread: open, read and close the target file in a tight loop until stopped.
void* stress_work(void* arg) {
    const char* file_path = (const char*)arg;
    int fd;
    char buffer[256];
    ssize_t bytes_read;

    while (run) {
        fd = open(file_path, O_RDONLY);
        if (fd != -1) {
            bytes_read = read(fd, buffer, sizeof(buffer) - 1);
            close(fd);
        }
    }
    return NULL;
}

int main(int argc, char* argv[]) {
    if (argc != 3) {
        fprintf(stderr, "Usage: %s <num_threads> <file_path>\n", argv[0]);
        return EXIT_FAILURE;
    }

    int num_threads = atoi(argv[1]);
    if (num_threads <= 0) {
        fprintf(stderr, "Number of threads must be positive.\n");
        return EXIT_FAILURE;
    }

    const char* file_path = argv[2];
    if (access(file_path, R_OK) != 0) {
        fprintf(stderr, "File '%s' does not exist or is not accessible.\n", file_path);
        return EXIT_FAILURE;
    }

    signal(SIGINT, handle_sigint);

    pthread_t* threads = malloc(num_threads * sizeof(pthread_t));
    if (threads == NULL) {
        perror("malloc");
        return EXIT_FAILURE;
    }

    for (int i = 0; i < num_threads; i++) {
        pthread_create(&threads[i], NULL, stress_work, (void*)file_path);
    }

    for (int i = 0; i < num_threads; i++) {
        pthread_join(threads[i], NULL);
    }

    free(threads);

    return 0;
}

Run it with the following command in a container:
./fuse-stress-poc 400 /sys/devices/system/cpu/online

Monitor the RSS memory usage of lxcfs. We see it go over 1GB in about a minute. If we then stop/kill the stress process inside the container, the RSS stays around the same value instead of dropping back to about 2MB.
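
One quick way to watch this (assuming a single lxcfs process on the host; any process monitor works just as well):
watch -n 1 'grep VmRSS /proc/$(pidof lxcfs)/status'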

So far we’ve tried the following:

  • Reading /proc/uptime and /proc/cpuinfo to check whether the leak also shows up with those files, but we could not reproduce the issue there; RSS stays low (around 2MB) while reading them.
  • We’ve attempted to bisect which commit introduces this behavior and, somewhat unexpectedly, it appears to be the one enabling direct_io: c2b4b50 (a generic sketch of what that option does follows below).
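
For context, the following is a minimal, generic libfuse3 sketch of what enabling direct_io in an open handler means; it is not the actual lxcfs code, and the file name and contents are made up. With the flag set, the kernel bypasses the page cache for that file, so every read() from the container is forwarded to the FUSE daemon.

#define FUSE_USE_VERSION 35
#include <fuse3/fuse.h>
#include <string.h>
#include <errno.h>
#include <sys/stat.h>

// Fake file contents served at /online (illustrative only).
static const char *content = "0-55\n";

static int ex_getattr(const char *path, struct stat *st, struct fuse_file_info *fi) {
    (void)fi;
    memset(st, 0, sizeof(*st));
    if (strcmp(path, "/") == 0) {
        st->st_mode = S_IFDIR | 0755;
        st->st_nlink = 2;
    } else if (strcmp(path, "/online") == 0) {
        st->st_mode = S_IFREG | 0444;
        st->st_nlink = 1;
        st->st_size = (off_t)strlen(content);
    } else {
        return -ENOENT;
    }
    return 0;
}

static int ex_open(const char *path, struct fuse_file_info *fi) {
    if (strcmp(path, "/online") != 0)
        return -ENOENT;
    // This is the flag in question: bypass the page cache so every read()
    // is handled by the FUSE daemon instead of being served from cache.
    fi->direct_io = 1;
    return 0;
}

static int ex_read(const char *path, char *buf, size_t size, off_t off,
                   struct fuse_file_info *fi) {
    (void)path; (void)fi;
    size_t len = strlen(content);
    if ((size_t)off >= len)
        return 0;
    if (size > len - (size_t)off)
        size = len - (size_t)off;
    memcpy(buf, content + off, size);
    return (int)size;
}

static const struct fuse_operations ex_ops = {
    .getattr = ex_getattr,
    .open    = ex_open,
    .read    = ex_read,
};

int main(int argc, char *argv[]) {
    return fuse_main(argc, argv, &ex_ops, NULL);
}

It can be compiled with something like gcc direct-io-sketch.c $(pkg-config fuse3 --cflags --libs) -o direct-io-sketch (the file name is ours, not part of lxcfs).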

We would appreciate your help in verifying whether this issue is reproducible on your end, so we can work together on identifying and implementing a fix.

We stumbled upon this while investigating other issues related to hanging lxcfs file operations, which is why we wrote this stress test in the first place. Although we were unable to reproduce the hang, we found what appears to be a memory leak.

Apologies for any confusion caused by opening, resolving, and then creating a new issue; I accidentally clicked the wrong option while typing.

Regards,
Deyan
