@mathildemerle

From the previous PR on branch med3.2: #1043
This PR is a draft.

This is a work in progress on RAM monitoring.

While testing medInria3.2.x and MUSICardio3, I saw a recurring problem: some toolboxes/processes use so much RAM that a computer with 16 GB of RAM is not enough.

This is a major problem because it happens a lot, and users have no clue about what just happened.

I talked about it with @Florent2305 and we did some research into solutions. It turns out that @paulineMig also worked on this subject because she had the same crashes in the application.

Recap of the research

  • Using some VTK/ITK filters could help reduce memory usage.
  • Memory leaks should be tracked down, but this is complicated when they are located in external projects, for instance. @paulineMig worked on memory leaks inside the core of medInria, especially with Qt.
  • We could try-catch errors from external projects (ITK/VTK/RPI, etc.). I worked on that part, but the application is terminated by SIGKILL, which cannot be caught from inside the application, so we can neither avoid the crash nor display an error message before crashing.
  • The last idea is to monitor RAM usage while a process runs in the application. If RAM usage reaches a defined percentage, we cancel the process and display an error message to the user. This would be a pleasant tool for users, and also for us, since we end up looking for errors in our code where there are none. This PR is about that part.
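
To illustrate the third point above, here is a minimal POSIX-only sketch (it is not part of this PR) showing that the OS refuses to install a handler for SIGKILL, whereas it accepts one for a signal such as SIGTERM. The names `probeHandler` and `canInstallHandler` are hypothetical and only used for this illustration.

```cpp
#include <cassert>
#include <signal.h>

// No-op handler used only to probe whether a signal can be caught.
extern "C" void probeHandler(int) {}

// Hypothetical helper: returns true if a handler can be installed for `signo`.
bool canInstallHandler(int signo)
{
    struct sigaction probe {};
    probe.sa_handler = probeHandler;
    sigemptyset(&probe.sa_mask);

    struct sigaction previous {};
    if (sigaction(signo, &probe, &previous) != 0)
        return false; // SIGKILL and SIGSTOP are rejected (errno is EINVAL)

    // Put the original disposition back so the probe has no lasting effect.
    const int restored = sigaction(signo, &previous, nullptr);
    assert(restored == 0);
    (void)restored;
    return true;
}
```

Calling `canInstallHandler(SIGKILL)` returns false while `canInstallHandler(SIGTERM)` returns true, which is why a try-catch or a signal handler cannot help once the process is being killed.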

RAM monitoring idea

  • I tried to monitor RAM usage with plain C++ (complicated and OS-dependent) and with Qt (which has no monitoring tool).
  • I found vtksys::SystemInformation, which is the basis of this PR.

With vtksys we can do for instance:

    vtksys::SystemInformation sys_info;
    sys_info.RunOSCheck();
    sys_info.RunCPUCheck();
    sys_info.RunMemoryCheck();
    std::cout<<"### GetVendorString "<<sys_info.GetVendorString()<<std::endl;
    std::cout<<"### GetVendorID "<<sys_info.GetVendorID()<<std::endl;
    std::cout<<"### GetProcessorCacheSize "<<sys_info.GetProcessorCacheSize()<<std::endl;
    std::cout<<"### GetLogicalProcessorsPerPhysical "<<sys_info.GetLogicalProcessorsPerPhysical()<<std::endl;
    //
    std::cout<<"### GetHostname "<<sys_info.GetHostname()<<std::endl;
    std::cout<<"### GetOSDescription "<<sys_info.GetOSDescription()<<std::endl;
    std::cout<<"### GetCPUDescription "<<sys_info.GetCPUDescription()<<std::endl;
    std::cout<<"### GetOSPlatform "<<sys_info.GetOSPlatform()<<std::endl;
    std::cout<<"### Is64Bits "<<sys_info.Is64Bits()<<std::endl;
    // Retrieve memory information in MiB.
    std::cout<<"### GetTotalVirtualMemory "<<sys_info.GetTotalVirtualMemory()<<std::endl;
    std::cout<<"### GetAvailableVirtualMemory "<<sys_info.GetAvailableVirtualMemory()<<std::endl;
    std::cout<<"### GetTotalPhysicalMemory "<<sys_info.GetTotalPhysicalMemory()<<std::endl;
    std::cout<<"### GetAvailablePhysicalMemory "<<sys_info.GetAvailablePhysicalMemory()<<std::endl;
    // Retrieve amount of physical memory installed on the system in KiB units.
    std::cout<<"### GetHostMemoryTotal "<<sys_info.GetHostMemoryTotal()<<std::endl;
    // Get total system RAM in units of KiB available collectively to all
    // processes in a process group. An example of a process group
    // are the processes comprising an MPI program which is running in
    // parallel. The amount of memory reported may differ from the host
    // total if a host-wide resource limit is applied. Such resource limits
    // are reported to us via an application-specified environment variable.
    std::cout<<"### GetHostMemoryAvailable "<<sys_info.GetHostMemoryAvailable()<<std::endl;
    // Get total system RAM in units of KiB available to this process.
    // This may differ from the host available if a per-process resource
    // limit is applied. Per-process memory limits are applied on Unix
    // systems via the rlimit API. Resource limits that are not imposed via
    // the rlimit API may be reported to us via an application-specified
    // environment variable.
    std::cout<<"### GetProcMemoryAvailable "<<sys_info.GetProcMemoryAvailable()<<std::endl;
    // Get the system RAM used by all processes on the host, in units of KiB.
    std::cout<<"### GetHostMemoryUsed "<<sys_info.GetHostMemoryUsed()<<std::endl;
    // Get system RAM used by this process id in units of KiB.
    std::cout<<"### GetProcMemoryUsed "<<sys_info.GetProcMemoryUsed()<<std::endl;
    // Return the load average of the machine or -0.0 if it cannot be determined.
    std::cout<<"### GetLoadAverage "<<sys_info.GetLoadAverage()<<std::endl;

with

#include <vtksys/SystemInformation.hxx>
// and vtksys in CMakeLists.txt

and get these results:

### GetVendorString GenuineIntel
### GetVendorID Intel Corporation
### GetProcessorCacheSize 9216
### GetLogicalProcessorsPerPhysical 2
### GetHostname IHULUX002
### GetOSDescription Linux 5.15.0-50-generic #56~20.04.1-Ubuntu SMP Tue Sep 27 15:51:29 UTC 2022
### GetCPUDescription 6 core Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
### GetOSPlatform x86_64
### Is64Bits 1
### GetTotalVirtualMemory 1906
### GetAvailableVirtualMemory 5
### GetTotalPhysicalMemory 15639
### GetAvailablePhysicalMemory 9602
### GetHostMemoryTotal 16015208
### GetHostMemoryAvailable 16015208
### GetProcMemoryAvailable 16015208
### GetHostMemoryUsed 6460292
### GetProcMemoryUsed 227256
### GetLoadAverage 1.37
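
To make these numbers easier to read, here is a small sketch (not part of this PR) that converts the KiB values to MiB and computes usage percentages from the output above. The helper names `kibToMib` and `usagePercent` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// The GetHost*/GetProc* values above are in KiB, GetTotalPhysicalMemory is in MiB.
constexpr std::uint64_t kibToMib(std::uint64_t kib) { return kib / 1024; }

// Percentage of `totalKiB` represented by `usedKiB`.
inline double usagePercent(std::uint64_t usedKiB, std::uint64_t totalKiB)
{
    assert(totalKiB > 0);
    return 100.0 * static_cast<double>(usedKiB) / static_cast<double>(totalKiB);
}

// GetHostMemoryTotal (16015208 KiB) is consistent with GetTotalPhysicalMemory (15639 MiB).
static_assert(kibToMib(16015208) == 15639, "KiB and MiB totals should agree");
```

With the values above, `usagePercent(6460292, 16015208)` gives roughly 40% for the whole host, and `usagePercent(227256, 16015208)` gives roughly 1.4% for this process.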

I developed a prototype of memory monitoring for medRunnableProcess/medJobItemL processes. It works well, but for now there are 2 problems:

  • vtksys comes from VTK, which is not reachable from inside the core of medInria for now.
  • The cancel process feature in medInria has been broken for years, and the button was removed, so it needs to be fixed.

We can discuss this topic here.
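
As a basis for discussion, the monitoring idea (sample memory while a process runs, cancel above a defined percentage) could be sketched as below. This is not the prototype's code: `MemorySample`, `exceedsLimit` and `watchMemory` are hypothetical names, and the sampling and cancel functions are passed in as callbacks. One possible benefit of that choice, under these assumptions, is that the loop itself does not depend on VTK; a caller with access to vtksys could fill a sample from GetHostMemoryUsed() and GetHostMemoryTotal().

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// One memory reading, in KiB (the unit of GetHostMemoryUsed/GetHostMemoryTotal).
struct MemorySample
{
    std::uint64_t usedKiB;
    std::uint64_t totalKiB;
};

// True when used memory is at or above `percentLimit` percent of the total.
inline bool exceedsLimit(const MemorySample& s, unsigned percentLimit)
{
    assert(percentLimit > 0 && percentLimit <= 100);
    if (s.totalKiB == 0)
        return false; // unknown total: do not trigger a cancel
    return s.usedKiB * 100 >= s.totalKiB * percentLimit;
}

// Hypothetical watchdog: polls `sample` until `finished` is set or the limit
// is reached. Returns true if it called `requestCancel`.
inline bool watchMemory(const std::function<MemorySample()>& sample,
                        const std::function<void()>& requestCancel,
                        const std::atomic<bool>& finished,
                        unsigned percentLimit,
                        std::chrono::milliseconds period)
{
    while (!finished.load())
    {
        if (exceedsLimit(sample(), percentLimit))
        {
            requestCancel(); // e.g. cancel the job and show a message to the user
            return true;
        }
        std::this_thread::sleep_for(period);
    }
    return false;
}
```

In this sketch the watchdog would run in its own thread next to the process, and `requestCancel` only does something useful once the cancel feature mentioned above is fixed.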

Ⓜ️
