AVIC gets disabled due to PIT in reinject mode #498

@karuboniru

I noticed that libkrun currently initializes the KVM PIT with the default configuration, which leaves reinject mode enabled. On AMD platforms, this configuration triggers APICV_INHIBIT_REASON_PIT_REINJ, which effectively inhibits AVIC support.

https://github.com/torvalds/linux/blob/7f98ab9da046865d57c102fd3ca9669a29845f67/arch/x86/include/asm/kvm_host.h#L1311-L1315

Disabling AVIC forces interrupts to go through the software path (VM-EXIT -> Host -> Inject), resulting in performance overhead compared to the hardware-accelerated path.

Observed Behavior

$ sudo perf kvm --host stat live  -p 324830
Analyze events for pid(s) 324830, all VCPUs:

                                 VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time 

                                     msr          9    50.00%    50.01%      5.32us 511996.84us  56893.84us ( +-  99.99% )
                                     npf          6    33.33%     0.00%      5.67us     10.90us      7.57us ( +-  13.33% )
                                   vintr          2    11.11%    49.97%      4.19us 511624.82us 255814.51us ( +- 100.00% )
                               interrupt          1     5.56%     0.01%     91.51us     91.51us     91.51us ( +-   0.00% )

Total Samples:18, Total events handled time:1023810.53us.

$ sudo perf trace -e kvm:kvm_apicv_inhibit_changed -p 324830
     0.000 fc_vcpu 9/324842 kvm:kvm_apicv_inhibit_changed(reason: 8, set: 1, inhibits: 768)
     0.016 fc_vcpu 9/324842 kvm:kvm_apicv_inhibit_changed(reason: 8, inhibits: 512)
   511.876 fc_vcpu 9/324842 kvm:kvm_apicv_inhibit_changed(reason: 8, set: 1, inhibits: 768)
   511.890 fc_vcpu 9/324842 kvm:kvm_apicv_inhibit_changed(reason: 8, inhibits: 512)
  1023.989 fc_vcpu 9/324842 kvm:kvm_apicv_inhibit_changed(reason: 8, set: 1, inhibits: 768)
  1024.003 fc_vcpu 9/324842 kvm:kvm_apicv_inhibit_changed(reason: 8, inhibits: 512)
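The `inhibits` field in this trace reads as a bitmask: when reason 8 is set the value is 768, and when it is cleared the value drops to 512, a difference of 256 = 1 << 8. A minimal sketch for decoding such a mask follows, assuming each inhibit reason occupies the bit whose index equals its reason number (which is what the trace suggests). The helper name `set_bits` is hypothetical, and mapping a bit index to a reason name depends on the kernel's enum for the running version, so it is left out here.

```rust
/// Return the indices of the bits set in an APICv inhibit mask,
/// lowest bit first. (Hypothetical helper, for reading the trace only.)
fn set_bits(mask: u64) -> Vec<u32> {
    (0..u64::BITS)
        .filter(|&bit| mask & (1u64 << bit) != 0)
        .collect()
}
```

Applied to the values above, `set_bits(768)` yields bits 8 and 9 and `set_bits(512)` yields only bit 9. In other words, bit 8 toggles roughly every 512 ms while bit 9 stays set for the whole trace, so AVIC remains inhibited throughout.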

Expected Behavior (with Fix)

Analyze events for pid(s) 330840, all VCPUs:

                                 VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time 

                                     msr         12    66.67%     0.03%      4.53us     80.25us     11.59us ( +-  53.88% )
               avic_unaccelerated_access          4    22.22%     0.00%      2.41us      3.09us      2.75us ( +-   7.01% )
                                     hlt          1     5.56%    99.96% 511802.51us 511802.51us 511802.51us ( +-   0.00% )
                               interrupt          1     5.56%     0.02%     78.85us     78.85us     78.85us ( +-   0.00% )

Total Samples:18, Total events handled time:512031.38us.

Proof of Concept

I can't speak Rust, but I created a PoC to verify this fix.

It appears that the upstream kvm-ioctls crate does not currently provide a wrapper for KVM_REINJECT_CONTROL. Therefore, I used unsafe code with a raw ioctl call (magic number 44657 which is 0xae71 / KVM_REINJECT_CONTROL) to test the hypothesis.
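For reference, the magic number can be derived rather than hard-coded. The sketch below assumes the generic Linux ioctl encoding used on x86 (direction in bits 30-31, size in bits 16-29, type in bits 8-15, number in bits 0-7) and the kernel's definition of `KVM_REINJECT_CONTROL` as `_IO(KVMIO, 0x71)` with `KVMIO` = 0xAE. The function name `io` is a hypothetical stand-in for the C `_IO` macro.

```rust
/// KVM's ioctl "type" byte (KVMIO in the kernel headers).
const KVMIO: u32 = 0xAE;

/// Stand-in for the C `_IO(type, nr)` macro: no data direction and
/// zero size, so only the type and number fields are populated.
const fn io(ty: u32, nr: u32) -> u32 {
    (ty << 8) | nr
}

/// Request number for KVM_REINJECT_CONTROL, i.e. _IO(KVMIO, 0x71).
const KVM_REINJECT_CONTROL: u32 = io(KVMIO, 0x71);
```

This evaluates to 0xAE71, which is 44657 in decimal and matches the constant used in the PoC.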

diff --git a/src/devices/src/legacy/kvmioapic.rs b/src/devices/src/legacy/kvmioapic.rs
index 57b63e7..54473b2 100644
--- a/src/devices/src/legacy/kvmioapic.rs
+++ b/src/devices/src/legacy/kvmioapic.rs
@@ -7,7 +7,7 @@ use crate::bus::BusDevice;
 use crate::legacy::irqchip::IrqChipT;
 use crate::Error as DeviceError;

-use kvm_bindings::{kvm_pit_config, KVM_PIT_SPEAKER_DUMMY};
+use kvm_bindings::{KVM_PIT_SPEAKER_DUMMY, kvm_pit_config, kvm_reinject_control};
 use kvm_ioctls::{Error, VmFd};
 use utils::eventfd::EventFd;

@@ -23,6 +23,18 @@ impl KvmIoapic {
             ..Default::default()
         };
         vm.create_pit2(pit_config)?;
+        let control = kvm_reinject_control {
+            pit_reinject: 0,
+            ..Default::default()
+        };
+
+        use vmm_sys_util::ioctl::ioctl_with_ref;
+        let result = unsafe {
+          ioctl_with_ref(vm, 44657, &control)
+        };
+        if result < 0 {
+          warn!("Failed to set reinject control for PIT: {}", result);
+        }

         Ok(Self {})
     }

Proposal

Since kvm-ioctls lacks the necessary wrapper, I am unsure of the best path forward for libkrun. Should we:

  • Call ioctl from libkrun?
  • Make a PR to kvm-ioctls to wrap KVM_REINJECT_CONTROL?
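To make either option concrete, here is a rough sketch of what a small safe wrapper could look like. It is not the kvm-ioctls API: `set_pit_reinject` and `KvmReinjectControl` are hypothetical names, the struct mirrors the kernel's `struct kvm_reinject_control` (one `pit_reinject` byte plus 31 reserved bytes), and the `ioctl` declaration assumes glibc's `unsigned long` request type on Linux. The `unsafe extern` syntax needs Rust 1.82 or newer.

```rust
use std::io;
use std::os::raw::{c_int, c_ulong};
use std::os::unix::io::RawFd;

/// Mirrors the kernel's `struct kvm_reinject_control` (32 bytes).
#[repr(C)]
#[derive(Default)]
struct KvmReinjectControl {
    pit_reinject: u8,
    reserved: [u8; 31],
}

unsafe extern "C" {
    // libc's ioctl(2); the request type is assumed to be `unsigned long`.
    fn ioctl(fd: c_int, request: c_ulong, ...) -> c_int;
}

/// _IO(KVMIO, 0x71) with KVMIO = 0xAE.
const KVM_REINJECT_CONTROL: c_ulong = 0xAE71;

/// Enable or disable PIT reinjection on a KVM VM file descriptor.
/// Returns the OS error (errno) if the ioctl fails.
fn set_pit_reinject(vm_fd: RawFd, enable: bool) -> io::Result<()> {
    let control = KvmReinjectControl {
        pit_reinject: enable as u8,
        ..Default::default()
    };
    // SAFETY: `control` is a valid, fully initialized struct that outlives
    // the call, and the kernel only reads from the pointer.
    let ret = unsafe {
        ioctl(vm_fd, KVM_REINJECT_CONTROL, &control as *const KvmReinjectControl)
    };
    if ret < 0 {
        Err(io::Error::last_os_error())
    } else {
        Ok(())
    }
}
```

Inside libkrun this would be called with the VM's raw fd right after `create_pit2`, passing `false` to turn reinjection off. Returning `io::Result` surfaces the errno, which is more useful in a warning than the bare return value of -1.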
