
Massive Memory Leaks while Enumerating and Updating Files in a ZFS Volume + BSOD when running KSTAT #542

@VykosX

Description


System information

Type                  Version/Name
Distribution Name     Microsoft Windows 10 Pro
Distribution Version  10.0.19045 Build 19045
Kernel Version        22H2
Architecture          x64
OpenZFS Version       OpenZFSOnWindows-debug-2.3.1rc14

Describe the problem you're observing

BSOD on Windows 10 Pro 22H2 when running kstat, and massive memory leaks observed during regular file operations on a ZFS drive, totaling hundreds of gigabytes of wasted RAM.


This is a follow-up to an issue I was experiencing over a year ago, where the ZFS on Windows driver allocates inordinate amounts of RAM during standard operations like enumerating, comparing, and updating files and never releases any of it, easily swallowing all the available memory in the system.

I've since built a new PC, a dual Xeon machine with 256GB of RAM, and decided now would be a great time to try to resync my drives, which have slowly drifted apart for over a year. I use the ZFS volume as a backup drive, and because of this issue I hadn't managed to update it for quite some time, since my previous PC with a paltry 32GB of RAM and a 16GB pagefile would die completely from simply trying to enumerate the contents of the ZFS drive... Unfortunately, after over a year of development, it appears this issue still remains a core problem in the driver and has not been addressed.

So I decided to open another bug report in the hopes that this issue can be fixed and I can finally use my ZFS volume without it swallowing every GB of memory on my machine, as it makes my 256GB system feel like it has a measly 16GB left for anything else I want to run.

I've observed massive memory increases in Task Manager while enumerating files in the ZFS volume, and especially when overwriting or comparing existing files between the ZFS volume and the original NTFS volume the files were replicated from. This is especially concerning because the used RAM cannot be reclaimed unless I fully reboot the system, does not appear anywhere in the process list of Task Manager, and is never released even after the file operations are finished. Simply running WizTree to enumerate the ZFS disk, or FreeFileSync with the binary comparison option enabled on a sufficient number of large files, is enough to reproduce the issue on my machine.
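In case it helps with triage, one generic way to watch the kernel-side growth while reproducing it (this is just a sketch using the built-in Windows performance counters, not something from my original testing; attributing the allocations to the ZFS driver specifically would still need kstat or poolmon) is:

    rem Sample kernel pool usage every 5 seconds while the enumeration/comparison runs
    powershell -Command "Get-Counter -Counter '\Memory\Pool Nonpaged Bytes','\Memory\Pool Paged Bytes' -SampleInterval 5 -Continuous"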

For reference, I originally created the zpool with the following command, using a single brand-new Seagate ST18000VE002-3BS101 18TB CMR drive:

    zpool.exe create -O casesensitivity=insensitive -O normalization=formD -O compression=zstd -O atime=off -o ashift=12 Data PHYSICALDRIVE1
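Assuming the Windows port exposes the same property interface as upstream OpenZFS (which it appears to, as far as I can tell), these are the commands I'd use to double-check the pool and dataset settings in case any of them matter here:

    rem Confirm the pool-level ashift
    zpool get ashift Data
    rem Confirm the dataset properties set at creation time
    zfs get compression,casesensitivity,normalization,atime Data
    rem See how much ZSTD is actually saving
    zfs get compressratio,used,logicalused Data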

The whole reason I wanted to try ZFS out was to take advantage of the transparent ZSTD compression so I could make the most of the usable space on my drives, since NTFS compression in Windows is very poor and I have lots of small files mixed in with large ones.

I also don't know if this is relevant, but the new zfs-tray app included with recent versions of the driver does not find the Data pool; it is, however, able to find a different pool that I created later with the same command. Perhaps that's because the Data pool was created on an older version of ZFS on Windows? Regardless, the pool otherwise works correctly: I can import it manually via zpool import Data on the command line and export it whenever I need to; only the GUI application fails to detect it.
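For completeness, the manual workaround I use while the tray app can't see the pool is just the standard import/export flow on the command line (same syntax as upstream OpenZFS, as far as I know):

    rem List pools available for import; Data shows up here even though zfs-tray misses it
    zpool import
    rem Import the pool manually, use the volume, then export it when done
    zpool import Data
    zpool export Data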

I was also hoping to attach a kstat dump of the system with more than half of my 256GB in use by the ZFS driver, but as soon as I ran kstat I was immediately hit with a SYSTEM_SERVICE_EXCEPTION BSOD. I thought perhaps I might somehow be overflowing the reporting capacity of the program in that instance, so I tried again after a fresh reboot, and sure enough, merely running the kstat tool on my PC immediately triggers a BSOD. This did not happen with the previous driver version I was using a year ago (zfs-windows-2.2.2-rc1).

So in lieu of the kstat output, I'm attaching a couple of Task Manager screenshots showing the memory usage for the System and the User instead, so you can see the massive discrepancy. It's also worth noting that, for whatever reason, Task Manager reports both of my pools' mount points (G: and I:) under Disk 0, even though Disk 0 only holds the separate pool with its own mount point; the Data pool is actually on Disk 1, and Task Manager does not associate the G: mount point with it. I don't know how much that matters in practice, as everything else appears to work.

I humbly ask the development team to please have a look at this situation because it makes the file system borderline unusable for any serious daily usage if you intend to fill a hard drive with standard media files like videos and games.

Thank you very much and happy holidays everyone!


[Task Manager screenshots: System vs. User memory usage]
