Open
Commits
26 commits
dfadfbc
Revert "NVIDIA: SAUCE: vfio/nvgrace-egm: Prevent double-unregister of…
ankita-nv Jan 21, 2026
006b422
Revert "NVIDIA: SAUCE: vfio/nvgrace-gpu: Avoid resmem pfn unregistrat…
ankita-nv Jan 21, 2026
bd0febb
Revert "NVIDIA: SAUCE: KVM: arm64: Allow exec fault on memory mapped …
ankita-nv Jan 21, 2026
9eb8008
Revert "NVIDIA: SAUCE: arm64: configs: Replace VFIO_CONTAINER with IO…
ankita-nv Jan 21, 2026
230f470
Revert "NVIDIA: SAUCE: WAR: Expose PCI PASID capability to userspace"
ankita-nv Jan 21, 2026
ca576fe
Revert "NVIDIA: SAUCE: vfio/nvgrace-egm: Register EGM for runtime ECC…
ankita-nv Jan 21, 2026
24a28c0
Revert "NVIDIA: SAUCE: vfio/nvgrace-gpu: register device memory for p…
ankita-nv Jan 21, 2026
0230499
Revert "NVIDIA: SAUCE: mm: Change ghes code to allow poison of non-st…
ankita-nv Jan 21, 2026
b98ce8c
Revert "NVIDIA: SAUCE: mm: Add poison error check in fixup_user_fault…
ankita-nv Jan 21, 2026
d5be6ba
Revert "NVIDIA: SAUCE: mm: correctly identify pfn without struct pages"
ankita-nv Jan 21, 2026
11cdd90
Revert "NVIDIA: SAUCE: mm: handle poisoning of pfn without struct pages"
ankita-nv Jan 21, 2026
aef2b13
mm: change ghes code to allow poison of non-struct pfn
ankita-nv Nov 2, 2025
71a9249
mm: handle poisoning of pfn without struct pages
ankita-nv Nov 2, 2025
fce1248
KVM: arm64: VM exit to userspace to handle SEA
Oct 13, 2025
8c45374
KVM: selftests: Test for KVM_EXIT_ARM_SEA
Oct 13, 2025
e4b0fd7
Documentation: kvm: new UAPI for handling SEA
Oct 13, 2025
d3c4269
vfio: refactor vfio_pci_mmap_huge_fault function
ankita-nv Nov 27, 2025
d677ad1
vfio/nvgrace-gpu: Add support for huge pfnmap
ankita-nv Nov 27, 2025
5b64a26
vfio: use vfio_pci_core_setup_barmap to map bar in mmap
ankita-nv Nov 27, 2025
c5d0232
vfio/nvgrace-gpu: split the code to wait for GPU ready
ankita-nv Nov 27, 2025
6375555
vfio/nvgrace-gpu: Inform devmem unmapped after reset
ankita-nv Nov 27, 2025
c21ab86
vfio/nvgrace-gpu: wait for the GPU mem to be ready
ankita-nv Nov 27, 2025
d0fb1ee
mm: fixup pfnmap memory failure handling to use pgoff
ankita-nv Dec 11, 2025
e424c81
mm: add stubs for PFNMAP memory failure registration functions
ankita-nv Jan 15, 2026
2ffcbc8
vfio/nvgrace-gpu: register device memory for poison handling
ankita-nv Jan 15, 2026
72033a0
NVIDIA: SAUCE: vfio/nvgrace-egm: register EGM PFNMAP range with memor…
ankita-nv Jan 18, 2026
47 changes: 47 additions & 0 deletions Documentation/virt/kvm/api.rst
@@ -7245,6 +7245,41 @@ exit, even without calls to ``KVM_ENABLE_CAP`` or similar. In this case,
it will enter with output fields already valid; in the common case, the
``unknown.ret`` field of the union will be ``TDVMCALL_STATUS_SUBFUNC_UNSUPPORTED``.
Userspace need not do anything if it does not wish to support a TDVMCALL.

::

/* KVM_EXIT_ARM_SEA */
struct {
#define KVM_EXIT_ARM_SEA_FLAG_GPA_VALID (1ULL << 0)
__u64 flags;
__u64 esr;
__u64 gva;
__u64 gpa;
} arm_sea;

Used on arm64 systems. When the VM capability ``KVM_CAP_ARM_SEA_TO_USER`` is
enabled, KVM exits to userspace if a guest access causes a synchronous
external abort (SEA) and the host APEI fails to handle the SEA.

``esr`` is set to a sanitized value of ESR_EL2 from the exception taken to KVM,
consisting of the following fields:

- ``ESR_EL2.EC``
- ``ESR_EL2.IL``
- ``ESR_EL2.FnV``
- ``ESR_EL2.EA``
- ``ESR_EL2.CM``
- ``ESR_EL2.WNR``
- ``ESR_EL2.FSC``
- ``ESR_EL2.SET`` (when FEAT_RAS is implemented for the VM)

``gva`` is set to the value of FAR_EL2 from the exception taken to KVM when
``ESR_EL2.FnV == 0``. Otherwise, the value of ``gva`` is unknown.

``gpa`` is set to the faulting IPA from the exception taken to KVM when
the ``KVM_EXIT_ARM_SEA_FLAG_GPA_VALID`` flag is set. Otherwise, the value of
``gpa`` is unknown.
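
The validity rules above can be sketched from the userspace side as follows. This is a minimal illustration, not the VMM code from this series: `struct arm_sea_exit` is a hypothetical local mirror of the documented `arm_sea` layout (real code would read `run->arm_sea` from `<linux/kvm.h>`), and the `SEA_*` macro and helper names are assumptions introduced here. The FnV bit position (bit 10) follows the architectural ESR_ELx layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the arm_sea member of struct kvm_run. */
struct arm_sea_exit {
	uint64_t flags;
	uint64_t esr;
	uint64_t gva;
	uint64_t gpa;
};

/* Mirrors KVM_EXIT_ARM_SEA_FLAG_GPA_VALID from the documentation. */
#define SEA_FLAG_GPA_VALID	(1ULL << 0)
/* ESR_ELx.FnV: set when the FAR (and therefore gva) is not valid. */
#define SEA_ESR_FNV		(1ULL << 10)

/* gpa may only be consumed when the GPA_VALID flag is set. */
static bool sea_gpa_is_valid(const struct arm_sea_exit *sea)
{
	return (sea->flags & SEA_FLAG_GPA_VALID) != 0;
}

/* gva may only be consumed when ESR_EL2.FnV == 0. */
static bool sea_gva_is_valid(const struct arm_sea_exit *sea)
{
	return (sea->esr & SEA_ESR_FNV) == 0;
}
```

A VMM would typically call both helpers before using either address, for example when deciding which guest page to retire or whether it can populate FAR_ELx while emulating the abort to the guest.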

::

/* Fix the size of the union. */
@@ -8662,6 +8697,18 @@ This capability indicate to the userspace whether a PFNMAP memory region
can be safely mapped as cacheable. This relies on the presence of
force write back (FWB) feature support on the hardware.

7.45 KVM_CAP_ARM_SEA_TO_USER
----------------------------

:Architecture: arm64
:Target: VM
:Parameters: none
:Returns: 0 on success, -EINVAL if unsupported.

When this capability is enabled, KVM may exit to userspace for SEAs taken to
EL2 resulting from a guest access. See ``KVM_EXIT_ARM_SEA`` for more
information.
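
A minimal sketch of how a VMM might enable a parameterless VM capability such as this one, assuming `vm_fd` is a VM file descriptor obtained from `KVM_CREATE_VM`. The helper name `enable_vm_cap` is hypothetical; `KVM_ENABLE_CAP` and `struct kvm_enable_cap` are the existing KVM UAPI. The capability constant itself is only available with headers that include this series, so it is passed in as a plain number here.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Enable a VM-scoped capability that takes no arguments.
 * Returns 0 on success or a negative errno value on failure.
 */
static int enable_vm_cap(int vm_fd, unsigned int cap)
{
	struct kvm_enable_cap args;

	/* flags and args[] must be zero for a parameterless capability. */
	memset(&args, 0, sizeof(args));
	args.cap = cap;

	if (ioctl(vm_fd, KVM_ENABLE_CAP, &args) < 0)
		return -errno;

	return 0;
}
```

With updated headers the call would look like `enable_vm_cap(vm_fd, KVM_CAP_ARM_SEA_TO_USER)`, usually after confirming support with `KVM_CHECK_EXTENSION` on the same capability.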

8. Other capabilities.
======================

1 change: 1 addition & 0 deletions MAINTAINERS
@@ -11394,6 +11394,7 @@ M: Miaohe Lin <linmiaohe@huawei.com>
R: Naoya Horiguchi <nao.horiguchi@gmail.com>
L: linux-mm@kvack.org
S: Maintained
F: include/linux/memory-failure.h
F: mm/hwpoison-inject.c
F: mm/memory-failure.c

2 changes: 0 additions & 2 deletions arch/arm64/configs/defconfig
@@ -1813,9 +1813,7 @@ CONFIG_MEMTEST=y
CONFIG_NVGRACE_GPU_VFIO_PCI=m
CONFIG_NVGRACE_EGM=m
CONFIG_VFIO_DEVICE_CDEV=y
# CONFIG_VFIO_CONTAINER is not set
CONFIG_FAULT_INJECTION=y
CONFIG_IOMMUFD_DRIVER=y
CONFIG_IOMMUFD=y
CONFIG_IOMMUFD_TEST=y
CONFIG_IOMMUFD_VFIO_CONTAINER=y
2 changes: 2 additions & 0 deletions arch/arm64/include/asm/kvm_host.h
@@ -349,6 +349,8 @@ struct kvm_arch {
#define KVM_ARCH_FLAG_GUEST_HAS_SVE 9
/* MIDR_EL1, REVIDR_EL1, and AIDR_EL1 are writable from userspace */
#define KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS 10
/* Unhandled SEAs are taken to userspace */
#define KVM_ARCH_FLAG_EXIT_SEA 11
unsigned long flags;

/* VM-wide vCPU feature set */
5 changes: 5 additions & 0 deletions arch/arm64/kvm/arm.c
@@ -133,6 +133,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
}
mutex_unlock(&kvm->lock);
break;
case KVM_CAP_ARM_SEA_TO_USER:
r = 0;
set_bit(KVM_ARCH_FLAG_EXIT_SEA, &kvm->arch.flags);
break;
default:
break;
}
@@ -322,6 +326,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_IRQFD_RESAMPLE:
case KVM_CAP_COUNTER_OFFSET:
case KVM_CAP_ARM_WRITABLE_IMP_ID_REGS:
case KVM_CAP_ARM_SEA_TO_USER:
r = 1;
break;
case KVM_CAP_SET_GUEST_DEBUG2:
73 changes: 68 additions & 5 deletions arch/arm64/kvm/mmu.c
@@ -1493,7 +1493,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
bool s2_force_noncacheable = false, vfio_allow_any_uc = false;
unsigned long mmu_seq;
phys_addr_t ipa = fault_ipa;
unsigned long mt;
struct kvm *kvm = vcpu->kvm;
struct vm_area_struct *vma;
short vma_shift;
@@ -1613,8 +1612,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
vma_pagesize = min(vma_pagesize, (long)max_map_size);
}

mt = FIELD_GET(PTE_ATTRINDX_MASK, pgprot_val(vma->vm_page_prot));

/*
* Both the canonical IPA and fault IPA must be hugepage-aligned to
* ensure we find the right PFN and lay down the mapping in the right
@@ -1698,7 +1695,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
writable = false;
}

if (exec_fault && s2_force_noncacheable && mt != MT_NORMAL)
if (exec_fault && s2_force_noncacheable)
ret = -ENOEXEC;

if (ret) {
@@ -1819,8 +1816,48 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
read_unlock(&vcpu->kvm->mmu_lock);
}

/*
* Returns true if the SEA should be handled locally within KVM, i.e. if the
* abort is caused by a kernel memory allocation (e.g. stage-2 table memory).
*/
static bool host_owns_sea(struct kvm_vcpu *vcpu, u64 esr)
{
/*
* Without FEAT_RAS HCR_EL2.TEA is RES0, meaning any external abort
* taken from a guest EL to EL2 is due to a host-imposed access (e.g.
* stage-2 PTW).
*/
if (!cpus_have_final_cap(ARM64_HAS_RAS_EXTN))
return true;

/* KVM owns the VNCR when the vCPU isn't in a nested context. */
if (is_hyp_ctxt(vcpu) && !kvm_vcpu_trap_is_iabt(vcpu) && (esr & ESR_ELx_VNCR))
return true;

/*
* Determining if an external abort during a table walk happened at
* stage-2 is only possible when S1PTW is set. Otherwise, since KVM
* sets HCR_EL2.TEA, SEAs due to a stage-1 walk (i.e. accessing the
* PA of the stage-1 descriptor) can reach here and are reported
* with a TTW ESR value.
*/
return (esr_fsc_is_sea_ttw(esr) && (esr & ESR_ELx_S1PTW));
}
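
The final check in `host_owns_sea()` can be illustrated in isolation. The sketch below is a standalone approximation rather than the kernel helper: `fsc_is_sea_ttw()` stands in for `esr_fsc_is_sea_ttw()` and assumes the architectural FSC encodings 0x13 to 0x17 for an external abort on a translation table walk (levels -1 to 3), and the macro names are hypothetical. S1PTW is bit 7 of the data/instruction abort ISS.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ESR_FSC_MASK	0x3FULL		/* ESR_ELx.FSC, bits [5:0] */
#define SKETCH_ESR_S1PTW	(1ULL << 7)	/* stage-2 fault on a stage-1 walk */

/* Assumed encodings: 0x13 (level -1) through 0x17 (level 3). */
static bool fsc_is_sea_ttw(uint64_t esr)
{
	uint64_t fsc = esr & SKETCH_ESR_FSC_MASK;

	return fsc >= 0x13 && fsc <= 0x17;
}

/*
 * A table-walk SEA is attributed to the host's stage-2 tables only when
 * S1PTW is also set; without it the abort may come from the guest's own
 * stage-1 descriptors.
 */
static bool sea_on_stage2_walk(uint64_t esr)
{
	return fsc_is_sea_ttw(esr) && (esr & SKETCH_ESR_S1PTW);
}
```

A plain synchronous external abort (FSC 0x10) never matches, so with FEAT_RAS present and no VNCR involvement it remains eligible for the userspace exit.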

int kvm_handle_guest_sea(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
struct kvm_run *run = vcpu->run;
u64 esr = kvm_vcpu_get_esr(vcpu);
u64 esr_mask = ESR_ELx_EC_MASK |
ESR_ELx_IL |
ESR_ELx_FnV |
ESR_ELx_EA |
ESR_ELx_CM |
ESR_ELx_WNR |
ESR_ELx_FSC;
u64 ipa;

/*
* Give APEI the opportunity to claim the abort before handling it
* within KVM. apei_claim_sea() expects to be called with IRQs enabled.
@@ -1829,7 +1866,33 @@ int kvm_handle_guest_sea(struct kvm_vcpu *vcpu)
if (apei_claim_sea(NULL) == 0)
return 1;

return kvm_inject_serror(vcpu);
if (host_owns_sea(vcpu, esr) ||
!test_bit(KVM_ARCH_FLAG_EXIT_SEA, &vcpu->kvm->arch.flags))
return kvm_inject_serror(vcpu);

/* ESR_ELx.SET is RES0 when FEAT_RAS isn't implemented. */
if (kvm_has_ras(kvm))
esr_mask |= ESR_ELx_SET_MASK;

/*
* Exit to userspace, and provide faulting guest virtual and physical
* addresses in case userspace wants to emulate SEA to guest by
* writing to FAR_ELx and HPFAR_ELx registers.
*/
memset(&run->arm_sea, 0, sizeof(run->arm_sea));
run->exit_reason = KVM_EXIT_ARM_SEA;
run->arm_sea.esr = esr & esr_mask;

if (!(esr & ESR_ELx_FnV))
run->arm_sea.gva = kvm_vcpu_get_hfar(vcpu);

ipa = kvm_vcpu_get_fault_ipa(vcpu);
if (ipa != INVALID_GPA) {
run->arm_sea.flags |= KVM_EXIT_ARM_SEA_FLAG_GPA_VALID;
run->arm_sea.gpa = ipa;
}

return 0;
}
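
The ESR sanitization performed before the exit can be reproduced as a small pure function, which may help when checking what userspace is allowed to see. This is a sketch with locally defined, hypothetical `SK_ESR_*` macros that follow the architectural ESR_ELx bit positions; the kernel code uses its own `ESR_ELx_*` definitions, and `has_ras` stands in for `kvm_has_ras(kvm)`.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_ESR_EC_MASK	(0x3FULL << 26)	/* exception class, bits [31:26] */
#define SK_ESR_IL	(1ULL << 25)	/* instruction length */
#define SK_ESR_SET_MASK	(3ULL << 11)	/* synchronous error type, bits [12:11] */
#define SK_ESR_FNV	(1ULL << 10)	/* FAR not valid */
#define SK_ESR_EA	(1ULL << 9)	/* external abort type */
#define SK_ESR_CM	(1ULL << 8)	/* cache maintenance */
#define SK_ESR_WNR	(1ULL << 6)	/* write not read */
#define SK_ESR_FSC	0x3FULL		/* fault status code, bits [5:0] */

/* Keep only the fields that the KVM_EXIT_ARM_SEA documentation lists. */
static uint64_t sanitize_sea_esr(uint64_t esr, bool has_ras)
{
	uint64_t mask = SK_ESR_EC_MASK | SK_ESR_IL | SK_ESR_FNV |
			SK_ESR_EA | SK_ESR_CM | SK_ESR_WNR | SK_ESR_FSC;

	/* SET is RES0 without FEAT_RAS, so expose it only when RAS is present. */
	if (has_ras)
		mask |= SK_ESR_SET_MASK;

	return esr & mask;
}
```

Bits outside the mask, such as S1PTW (bit 7) or the syndrome-valid fields, are always cleared, so userspace cannot rely on them being present in `arm_sea.esr`.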

/**