Description
Describe the bug
The setup is a Windows Server 2022 virtual machine with a physically connected 6TB drive and a 6TB virtual drive residing on a ZFS HDD partition.
While copying data from the physical drive to the virtual drive, a BSOD occurs somewhere between 100 and 200 GB transferred.
It happens when the target drive is attached via VirtIO SCSI, VirtIO SCSI Single, or VirtIO Block. When the target drive is attached via SATA, the data is copied without problems.
In addition, running Optimize-Volume -Retrim on that SCSI-attached disk requires about 80GB+ of available RAM, whereas trimming the same drive attached via SATA causes no problems.
To Reproduce
Steps to reproduce the behaviour:
Copy a relatively large amount of data from one disk to another.
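For anyone who wants to script the copy step instead of using Explorer, here is a minimal sketch, assuming Python 3 is installed in the guest. The function names and the drive letters in the comment are hypothetical and not part of the original setup. It copies a directory tree in chunks and then re-reads each target file to compare SHA-256 digests. Note that the re-read may be served from the guest page cache rather than the disk, so a clean result does not prove that the data on disk is intact.

```python
import hashlib
import os
from pathlib import Path

CHUNK = 4 * 1024 * 1024  # read/write in 4 MiB blocks


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            h.update(block)
    return h.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then re-read dst and compare digests."""
    h = hashlib.sha256()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for block in iter(lambda: fin.read(CHUNK), b""):
            h.update(block)
            fout.write(block)
        fout.flush()
        os.fsync(fout.fileno())  # ask the OS to push the data to the device
    return h.hexdigest() == sha256_of(dst)


def copy_tree(src_root, dst_root):
    """Copy every file under src_root; return the files that failed verification."""
    mismatched = []
    for src in Path(src_root).rglob("*"):
        if src.is_file():
            dst = Path(dst_root) / src.relative_to(src_root)
            dst.parent.mkdir(parents=True, exist_ok=True)
            if not copy_and_verify(src, dst):
                mismatched.append(str(src))
    return mismatched


# Hypothetical usage (drive letters are placeholders):
#   bad = copy_tree(r"D:\data", r"E:\data")
#   print("mismatched files:", bad)
```

In this report the guest crashes during the copy, so the script mainly serves as a repeatable load generator; the digest check is only an extra signal in case a run completes.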
Host:
proxmox-ve: 8.1.0 (running kernel: 6.5.11-7-pve)
pve-manager: 8.1.3 (running version: 8.1.3/b46aac3b42da5d15)
proxmox-kernel-helper: 8.1.0
pve-kernel-5.15: 7.4-6
pve-kernel-5.13: 7.1-9
proxmox-kernel-6.5: 6.5.11-7
proxmox-kernel-6.5.11-7-pve-signed: 6.5.11-7
proxmox-kernel-6.2.16-20-pve: 6.2.16-20
proxmox-kernel-6.2: 6.2.16-20
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2.16-18-pve: 6.2.16-18
proxmox-kernel-6.2.16-15-pve: 6.2.16-15
proxmox-kernel-6.2.16-12-pve: 6.2.16-12
pve-kernel-5.15.116-1-pve: 5.15.116-1
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph: 17.2.7-pve1
ceph-fuse: 17.2.7-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx7
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.0
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.5
libpve-rs-perl: 0.8.7
libpve-storage-perl: 8.0.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve4
novnc-pve: 1.4.0-3
openvswitch-switch: 3.1.0-2
proxmox-backup-client: 3.1.2-1
proxmox-backup-file-restore: 3.1.2-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.2
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.3
proxmox-widget-toolkit: 4.1.3
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.3
pve-edk2-firmware: 4.2023.08-2
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.1.4
pve-qemu-kvm: 8.1.2-4
pve-xtermjs: 5.3.0-2
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.2-pve1
VM:
- Windows version is Windows Server 2022 21H2 Build 20348.2159
- Driver versions that have the problem: 0.1.215 through 0.1.240
Additional context
The usual minidump analysis:
BLACKBOXWINLOGON: 1
CUSTOMER_CRASH_COUNT: 1
PROCESS_NAME: svchost.exe
PAGE_HASH_ERRORS_DETECTED: 1
STACK_TEXT:
ffffef0f`78de57e8 fffff805`3f7a74e1 : 00000000`0000001a 00000000`0000003f 00000000`00006e81 00000000`00006e81 : nt!KeBugCheckEx
ffffef0f`78de57f0 fffff805`3f68afc1 : ffffab8f`5d5960c0 ffffffff`ffffffff ffffef0f`78de5a10 ffffef0f`78de5b40 : nt!MiValidatePagefilePageHash+0x241
ffffef0f`78de58d0 fffff805`3f4ba915 : ffffef0f`00000000 ffffef0f`78de5a00 ffffef0f`78de5a28 ffffcee7`00000000 : nt!MiWaitForInPageComplete+0x1d0091
ffffef0f`78de59d0 fffff805`3f4a9b6d : 00000000`c0033333 00000000`00000001 00007ffb`73063228 00000000`00000000 : nt!MiIssueHardFault+0x1d5
ffffef0f`78de5a80 fffff805`3f630d41 : ffffab8f`5ec73080 ffffab8f`5e5b9080 000001b4`8b3d8730 ffffab8f`00000000 : nt!MmAccessFault+0x35d
ffffef0f`78de5c20 00007ffb`73050be7 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiPageFault+0x341
00000082`c3dfece0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ffb`73050be7
SYMBOL_NAME: PAGE_HASH_ERRORS_INPAGE
MODULE_NAME: Unknown_Module
IMAGE_NAME: Unknown_Image
STACK_COMMAND: .cxr; .ecxr ; kb
FAILURE_BUCKET_ID: PAGE_HASH_ERRORS_0x1a_3f
OS_VERSION: 10.0.20348.859
BUILDLAB_STR: fe_release_svc_prod2
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {6a2d4548-0eec-578d-e8f1-9e2239aa9a00}
Followup: MachineOwner
---------
*** Memory manager detected 1 instance(s) of corrupted pagefile page(s) while performing in-page operations.
What I tried in order to solve it:
- Switching between VirtIO SCSI Single and VirtIO SCSI
- Machine types i440fx and q35
- Juggling the Trim/IO thread/io_uring/native threads/SSD emulation/Discard/Caching options
- Switching ballooning on and off
- Switching NUMA on and off