Using the Real-Time Version of Edge Microvisor Toolkit #585
stevenhoenisch
started this conversation in
General
The principal versions of Edge Microvisor Toolkit showcase the capabilities of Intel platforms for edge AI workloads through yet-to-be-upstreamed Linux patches from Intel that optimize performance and other capabilities for Intel processors.
The versions include a real-time developer node built on the Preempt_RT Linux kernel for predictable performance. It gives you a reference Linux operating system that demonstrates how Intel processors can help operating system vendors and other technology partners optimize their platforms for running edge and AI solutions, including those that require real-time performance.
Benefits of the Preempt_RT Patch for the Linux Kernel
The real-time version of the Edge Microvisor Toolkit developer node includes the Preempt_RT patch for Linux Kernel 6.12 to improve real-time performance with the following capabilities:
Reduced latency: The real-time (RT) patch makes most of the kernel fully preemptible so that high-priority tasks can interrupt lower-priority tasks quickly, which significantly lowers worst-case latency.
Deterministic scheduling: By threading interrupt handlers and converting spinlocks to preemptible mutexes, the RT kernel provides more predictable and deterministic behavior, which is crucial for time-sensitive applications that must meet strict deadlines.
Improved interrupt handling: With the RT kernel, most interrupts are handled by kernel threads. As a result, the scheduler can manage them more effectively, ensuring that critical real-time tasks are not unduly delayed by interrupt processing.
Better synchronization primitives: The patch refines locking mechanisms, reducing the time when critical sections cannot be interrupted. The result improves overall responsiveness so the system can handle real-time workloads with minimal jitter.
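One rough way to see the latency and jitter effects described above is to measure how late a periodic task wakes up. The Python sketch below is an illustration only and is not part of the toolkit: the function names are hypothetical, the PREEMPT_RT check is a heuristic based on the kernel version string, and interpreter overhead adds noise of its own, so a dedicated tool such as cyclictest from the rt-tests suite is the better choice for real measurements.

```python
import os
import time


def looks_like_rt_kernel(version=None):
    """Heuristic: Preempt_RT kernels usually report PREEMPT_RT in `uname -v`."""
    if version is None:
        version = os.uname().version
    return "PREEMPT_RT" in version


def measure_wakeup_lateness(interval_us=1000, loops=200, rt_priority=None):
    """Sleep for interval_us repeatedly and report how late each wake-up was."""
    if rt_priority is not None:
        try:
            # SCHED_FIFO needs root or CAP_SYS_NICE; without it we keep the
            # default scheduling policy and simply measure that instead.
            os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(rt_priority))
        except (AttributeError, OSError):
            pass

    interval_ns = interval_us * 1000
    lateness_us = []
    for _ in range(loops):
        start = time.monotonic_ns()
        time.sleep(interval_us / 1_000_000)
        elapsed_ns = time.monotonic_ns() - start
        # Lateness is the time slept beyond the requested interval.
        lateness_us.append((elapsed_ns - interval_ns) / 1000)

    return {
        "min_us": min(lateness_us),
        "avg_us": sum(lateness_us) / len(lateness_us),
        "max_us": max(lateness_us),
    }


if __name__ == "__main__":
    print("PREEMPT_RT kernel:", looks_like_rt_kernel())
    print(measure_wakeup_lateness())
```

Comparing max_us between a standard kernel and the RT kernel while the system is under load gives a first impression of worst-case behavior. Passing rt_priority (for example, 80) as root runs the loop under SCHED_FIFO, which is closer to how an RT task would be scheduled.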
Tuning RT Performance and Power Consumption
The real-time version of the microvisor includes several tools to analyze and tune performance and power management:
The Turbostat tool from Intel® diagnoses performance issues and optimizes power consumption on systems with Intel® processors by using hardware performance counters to display real-time data for each processor core, such as data about per-core frequency, C-state (idle state) residency, performance states (P-states), and power usage. These metrics reveal how Intel® CPUs manage power under varying workloads so you can benchmark and tune systems to maximize performance and efficiency while minimizing energy usage.
The Linux perf tool is a powerful, integrated performance analysis suite that taps directly into the Linux Performance Events subsystem to provide comprehensive metrics and in-depth reports of overall system performance. perf can measure a wide range of performance events, including CPU cycles, instructions, cache misses, and branch mispredictions. The granular data is invaluable for identifying performance bottlenecks in kernel and user-space applications. The tool's multiple modes of operation let you capture a quick summary of performance counters over different periods. You can visualize real-time performance data by using the top command (perf top).
The Linux cpupower tool controls CPU power management on Linux to optimize CPU behavior to meet the demands of your workload. By querying and setting up CPU frequency scaling, the tool lets you manage the trade-offs between performance and power usage with three capabilities. First, frequency management lets you view current CPU frequencies and adjust settings using various governors, such as performance, powersave, and ondemand. Second, power-saving adjustments help tune the system's energy usage by changing parameters like frequency limits and turbo boost. Third, dynamic control provides commands such as cpupower frequency-info to display frequency data and cpupower frequency-set to adjust CPU frequency settings.
Customizing the RT Kernel Command Line for Performance
The kernel command line for the RT kernel can be customized for a workload's requirements. At present, idle is the only configured command-line argument that affects real-time performance. It is set to idle=poll to force the CPU to actively poll for work when idle, rather than entering low-power idle states. In RT systems, this setting can reduce latency by ensuring that the CPU is always ready to handle high-priority tasks, at the cost of higher power consumption.
To configure kernel command-line arguments before you generate a build, add them to the ExtraCommandLine parameter in your image config file. For an example in the context of an entire image config file, see image-rt-json.
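As a rough illustration of this step, the Python sketch below joins a list of kernel arguments into the single string that ExtraCommandLine takes and, after booting the image, reports which expected arguments are absent from the running kernel's command line. Only ExtraCommandLine and idle=poll come from the text above. The enclosing "KernelCommandLine" key and all function names are assumptions made for illustration, so compare the printed fragment with image-rt-json before relying on it.

```python
import json


def extra_command_line(args):
    """Join kernel arguments into one space-separated string."""
    return " ".join(args)


def image_config_fragment(args):
    """Build a config fragment holding the ExtraCommandLine value.

    Assumption: ExtraCommandLine is nested under a "KernelCommandLine"
    object. Verify the real layout against image-rt-json.
    """
    return {"KernelCommandLine": {"ExtraCommandLine": extra_command_line(args)}}


def missing_args(expected, cmdline_text):
    """Return the expected arguments that do not appear in cmdline_text."""
    present = set(cmdline_text.split())
    return [arg for arg in expected if arg not in present]


if __name__ == "__main__":
    wanted = ["idle=poll"]
    print(json.dumps(image_config_fragment(wanted), indent=2))

    # On a booted system, /proc/cmdline holds the active kernel command line.
    try:
        with open("/proc/cmdline") as handle:
            print("missing:", missing_args(wanted, handle.read()))
    except OSError:
        print("/proc/cmdline is not available on this system")
```

Checking /proc/cmdline after the first boot is a quick way to confirm that the arguments set at build time actually reached the kernel.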
Other kernel command-line arguments that can help tune real-time performance include the following:
isolcpus=<list> isolates specific CPU cores from the general scheduler, preventing non-RT tasks from being scheduled on those cores. This setting ensures that designated cores are available solely for RT tasks.
nohz_full=<list> enables full tickless (nohz) mode on specified cores, reducing periodic timer interrupts that could introduce latency on cores dedicated to RT workloads.
rcu_nocbs=<list> offloads RCU (Read-Copy-Update) callbacks from the specified CPUs, reducing interference on cores that need to be as responsive as possible.
threadirqs forces interrupts to be handled by dedicated threads rather than in interrupt context, which can improve the predictability and granularity of scheduling RT tasks.
nosmt disables simultaneous multithreading (Hyper-Threading). This argument can prevent contention between sibling threads that share the same physical core, leading to more predictable performance.
numa_balancing=disable disables automatic NUMA balancing (the equivalent sysctl setting is kernel.numa_balancing=0). While NUMA awareness is important, automatic migration of processes can introduce latency. Disabling it helps maintain predictable memory locality.
intel_idle.max_cstate=0 limits deep idle states on Intel® CPUs, reducing wake-up latencies that can affect RT performance. You can use this setting to help validate the performance of edge AI workloads on Intel processors.
Download the Latest Release
Edge Microvisor Toolkit is part of Open Edge Platform, a set of open-source edge AI solutions on GitHub that also includes Edge Manageability Framework, Edge AI Libraries, and Edge AI Suites. Vendors, developers, and technology partners can take part in the GitHub community for these solutions by, for instance, trying out releases or using them to benchmark application performance on Intel silicon. See the demo videos on YouTube.
Here's how to find out more or obtain the latest releases of these projects:
Edge Microvisor Toolkit
Edge Manageability Framework
Edge AI Libraries
Edge AI Suites