-
Notifications
You must be signed in to change notification settings - Fork 54
[linux-nvidia-6.18] Backport support for FEAT_LS64 #293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
nvmochs
wants to merge
11
commits into
NVIDIA:linux-nvidia-6.18
Choose a base branch
from
nvmochs:618_feat_ls64
base: linux-nvidia-6.18
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…evice-nGnRE
Add CONFIG_ARM64_WORKAROUND_NC_TO_NGNRE configuration option that
enables conversion of MT_NORMAL_NC (Normal Non-Cacheable) memory
attribute to Device-nGnRE memory type in MAIR_EL1 for hardware that
requires stricter memory ordering or has issues with Non-Cacheable
memory mappings.
Key changes:
1. New memory type MT_NORMAL_NC_DMA (Attr5):
- Introduced specifically for DMA coherent memory mappings
- Configured with the same Normal Non-Cacheable attribute (0x44)
as MT_NORMAL_NC (Attr2) by default
- pgprot_dmacoherent uses MT_NORMAL_NC_DMA when workaround is
enabled, MT_NORMAL_NC otherwise
2. MAIR_EL1 conversion via alternatives framework:
- arch/arm64/mm/proc.S uses ARM64 alternatives to patch MAIR_EL1
during early boot
- Converts MT_NORMAL_NC (Attr2) from 0x44 to 0x04 (Device-nGnRE)
using efficient bfi instruction
- MT_NORMAL_NC_DMA (Attr5) keeps the same attribute value as
MT_NORMAL_NC originally had
- Zero performance overhead when workaround is disabled
3. Boot-time configuration:
- Enabled via kernel command line: mair_el1_nc_to_ngnre=1
- Boot CPU fixup in enable_nc_to_ngnre() applies conversion before
alternatives are patched
- Secondary CPUs automatically use patched alternatives in
__cpu_setup
- Runtime changes not supported as alternatives cannot be
re-patched after boot
4. Errata framework integration:
- Registered in arm64_errata[] array as ARM64_WORKAROUND_NC_TO_NGNRE
- Capability type: ARM64_CPUCAP_BOOT_CPU_FEATURE
- Uses cpucap_is_possible() for build-time capability checking
The workaround preserves pgprot_dmacoherent behavior while allowing
MT_NORMAL_NC to be converted to Device memory type for other mappings
that may be affected by hardware issues.
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Acked-by: Carol L. Soto <csoto@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
When APEI fails to handle a stage-2 synchronous external abort (SEA),
today KVM injects an asynchronous SError to the VCPU then resumes it,
which usually results in unpleasant guest kernel panic.
One major situation of guest SEA is when vCPU consumes recoverable
uncorrected memory error (UER). Although SError and guest kernel panic
effectively stops the propagation of corrupted memory, guest may
re-use the corrupted memory if auto-rebooted; in worse case, guest
boot may run into poisoned memory. So there is room to recover from
an UER in a more graceful manner.
Alternatively KVM can redirect the synchronous SEA event to VMM to
- Reduce blast radius if possible. VMM can inject a SEA to VCPU via
KVM's existing KVM_SET_VCPU_EVENTS API. If the memory poison
consumption or fault is not from guest kernel, blast radius can be
limited to the triggering thread in guest userspace, so VM can
keep running.
- Allow VMM to protect from future memory poison consumption by
unmapping the page from stage-2, or to interrupt guest of the
poisoned page so guest kernel can unmap it from stage-1 page table.
- Allow VMM to track SEA events that VM customers care about, to restart
VM when certain number of distinct poison events have happened,
to provide observability to customers in log management UI.
Introduce an userspace-visible feature to enable VMM handle SEA:
- KVM_CAP_ARM_SEA_TO_USER. As the alternative fallback behavior
when host APEI fails to claim a SEA, userspace can opt in this new
capability to let KVM exit to userspace during SEA if it is not
owned by host.
- KVM_EXIT_ARM_SEA. A new exit reason is introduced for this.
KVM fills kvm_run.arm_sea with as much as possible information about
the SEA, enabling VMM to emulate SEA to guest by itself.
- Sanitized ESR_EL2. The general rule is to keep only the bits
useful for userspace and relevant to guest memory.
- Flags indicating if faulting guest physical address is valid.
- Faulting guest physical and virtual addresses if valid.
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
Co-developed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://msgid.link/20251013185903.1372553-2-jiaqiyan@google.com
Signed-off-by: Oliver Upton <oupton@kernel.org>
(cherry picked from commit ad9c62b)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Test how KVM handles guest SEA when APEI is unable to claim it, and KVM_CAP_ARM_SEA_TO_USER is enabled. The behavior is triggered by consuming recoverable memory error (UER) injected via EINJ. The test asserts two major things: 1. KVM returns to userspace with KVM_EXIT_ARM_SEA exit reason, and has provided expected fault information, e.g. esr, flags, gva, gpa. 2. Userspace is able to handle KVM_EXIT_ARM_SEA by injecting SEA to guest and KVM injects expected SEA into the VCPU. Tested on a data center server running Siryn AmpereOne processor that has RAS support. Several things to notice before attempting to run this selftest: - The test relies on EINJ support in both firmware and kernel to inject UER. Otherwise the test will be skipped. - The under-test platform's APEI should be unable to claim the SEA. Otherwise the test will be skipped. - Some platform doesn't support notrigger in EINJ, which may cause APEI and GHES to offline the memory before guest can consume injected UER, and making test unable to trigger SEA. Signed-off-by: Jiaqi Yan <jiaqiyan@google.com> Link: https://msgid.link/20251013185903.1372553-3-jiaqiyan@google.com Signed-off-by: Oliver Upton <oupton@kernel.org> (cherry picked from commit feee9ef) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Document the new userspace-visible features and APIs for handling synchronous external abort (SEA) - KVM_CAP_ARM_SEA_TO_USER: How userspace enables the new feature. - KVM_EXIT_ARM_SEA: exit userspace gets when it needs to handle SEA and what userspace gets while taking the SEA. Signed-off-by: Jiaqi Yan <jiaqiyan@google.com> Link: https://msgid.link/20251013185903.1372553-4-jiaqiyan@google.com [ oliver: make documentation concise, remove implementation detail ] Signed-off-by: Oliver Upton <oupton@kernel.org> (cherry picked from commit 4debb5e) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
The main use of {LD,ST}64B* is to talk to a device, which is hopefully
directly assigned to the guest and requires no additional handling.
However, this does not preclude a VMM from exposing a virtual device
to the guest, and to allow 64 byte accesses as part of the programming
interface. A direct consequence of this is that we need to be able
to forward such access to userspace.
Given that such a contraption is very unlikely to ever exist, we choose
to offer a limited service: userspace gets (as part of a new exit reason)
the ESR, the IPA, and that's it. It is fully expected to handle the full
semantics of the instructions, deal with ACCDATA, the return values and
increment PC. Much fun.
A canonical implementation can also simply inject an abort and be done
with it. Frankly, don't try to do anything else unless you have time
to waste.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit f174a9ffcd48d78a45d560c02ce4071ded036b53 linux-next)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Add a bit of documentation for KVM_EXIT_ARM_LDST64B so that userspace knows what to expect. Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit 902eebac8fa3bad1c369f48f2eaf859755ad9e6d linux-next) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
…emory If FEAT_LS64WB not supported, FEAT_LS64* instructions only support to access Device/Uncacheable memory, otherwise a data abort for unsupported Exclusive or atomic access (0x35, UAoEF) is generated per spec. It's implementation defined whether the target exception level is routed and is possible to implemented as route to EL2 on a VHE VM according to DDI0487L.b Section C3.2.6 Single-copy atomic 64-byte load/store. If it's implemented as generate the DABT to the final enabled stage (stage-2), inject the UAoEF back to the guest after checking the memslot is valid. Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit 2937aeec9dc5d25a02c1415a56d88ee4cc17ad83 linux-next) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Instructions introduced by FEAT_{LS64, LS64_V} is controlled by
HCRX_EL2.{EnALS, EnASR}. Configure all of these to allow usage
at EL0/1.
This doesn't mean these instructions are always available in
EL0/1 if provided. The hypervisor still have the control at
runtime.
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit dea58da4b6fede082d9f38ce069090fd6d43f4e2 linux-next)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Using FEAT_{LS64, LS64_V} instructions in a guest is also controlled
by HCRX_EL2.{EnALS, EnASR}. Enable it if guest has related feature.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 151b92c92a45704216c37d6238efbffd84aac538 linux-next)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Armv8.7 introduces single-copy atomic 64-byte loads and stores
instructions and its variants named under FEAT_{LS64, LS64_V}.
These features are identified by ID_AA64ISAR1_EL1.LS64 and the
use of such instructions in userspace (EL0) can be trapped.
As st64bv (FEAT_LS64_V) and st64bv0 (FEAT_LS64_ACCDATA) can not be tell
apart, FEAT_LS64 and FEAT_LS64_ACCDATA which will be supported in later
patch will be exported to userspace, FEAT_LS64_V will be enabled only
in kernel.
In order to support the use of corresponding instructions in userspace:
- Make ID_AA64ISAR1_EL1.LS64 visbile to userspace
- Add identifying and enabling in the cpufeature list
- Expose these support of these features to userspace through HWCAP3
and cpuinfo
ld64b/st64b (FEAT_LS64) and st64bv (FEAT_LS64_V) is intended for
special memory (device memory) so requires support by the CPU, system
and target memory location (device that support these instructions).
The HWCAP3_LS64, implies the support of CPU and system (since no
identification method from system, so SoC vendors should advertise
support in the CPU if system also support them).
Otherwise for ld64b/st64b the atomicity may not be guaranteed or a
DABT will be generated, so users (probably userspace driver developer)
should make sure the target memory (device) also have the support.
For st64bv 0xffffffffffffffff will be returned as status result for
unsupported memory so user should check it.
Document the restrictions along with HWCAP3_LS64.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
(backported from commit 58ce78667a641f93afa0c152c700a1673383d323 linux-next)
[mochs: Minor context cleanup due to lack of "arm64: Detect FEAT_XNX"]
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Add tests for FEAT_LS64. Issue related instructions if feature presents, no SIGILL should be received. When such instructions operate on Device memory or non-cacheable memory, we may received a SIGBUS during the test (w/o FEAT_LS64WB). Just ignore it since we only tested whether the instruction itself can be issued as expected on platforms declaring the support of such features. Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit 57a96356bb6942e16283138d0a42baad29169ed8 linux-next) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Collaborator
|
|
clsotog
approved these changes
Jan 27, 2026
Collaborator
clsotog
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acked-by: Carol L Soto <csoto@nvidia.com>
3b9023e to
d602686
Compare
nirmoy
approved these changes
Jan 27, 2026
Collaborator
nirmoy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acked-by: Nirmoy Das<nirmoyd@nvidia.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This series was recently merged to linux-next and is targeting v6.20.
It adds support for Armv8.7 FEAT_{LS64, LS64_V}:
Lore: https://lore.kernel.org/all/20260119022928.149358-1-wangzhou1@hisilicon.com/
linux-next:
Also included in this PR is the "VMM can handle guest SEA via KVM_EXIT_ARM_SEA” series from v6.19 to ease picking and because this series will also need to be backported in the future to support nvgrace ECC.
Lore: https://lore.kernel.org/kvmarm/20251013185903.1372553-1-jiaqiyan@google.com/
v6.19:
Tested the following: