Skip to content

remove thread unsafe debug code causing FreeBSD double free panic#18140

Merged
behlendorf merged 1 commit intoopenzfs:masterfrom
alek-p:freebsd_double_free_panic
Jan 21, 2026
Merged

remove thread unsafe debug code causing FreeBSD double free panic#18140
behlendorf merged 1 commit intoopenzfs:masterfrom
alek-p:freebsd_double_free_panic

Conversation

@alek-p
Copy link
Contributor

@alek-p alek-p commented Jan 16, 2026

Motivation and Context

While attempting to teach scrub to do a "thorough" scrub where it decompresses and decrypts scrubbed blocks #17630, I encountered a reproducible panic. It turns out that in FreeBSD, a ZFS crypto function zio_do_crypt_data() includes thread-unsafe code in an error path, which can lead to a double-free panic.

Description

If ZFS attempts to decrypt data and decryption fails, the code enters an error-handling path. In module/os/freebsd/zfs/zio_crypt.c, this error path contains thread-unsafe debug logic that attempts to save the failed decryption buffer to a global variable (failed_decrypt_buf) without any locking.

With multiple threads encountering errors concurrently, this can lead to a race condition where multiple threads attempt to free the same global pointer, resulting in a duplicate-free panic with the following stack trace:

panic: Duplicate free of 0xfffff800bf043800 from zone 0xfffffe002da24000(malloc-512) slab 0xfffff800bf336c58(4)
  cpuid = 7
  time = 1767985732
  KDB: stack backtrace:
  db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00febf9880
  vpanic() at vpanic+0x136/frame 0xfffffe00febf99b0
  panic() at panic+0x43/frame 0xfffffe00febf9a10
  uma_dbg_free() at uma_dbg_free+0x103/frame 0xfffffe00febf9a30
  uma_zfree_arg() at uma_zfree_arg+0x95/frame 0xfffffe00febf9a80
  free() at free+0xa5/frame 0xfffffe00febf9ac0
  zio_do_crypt_data() at zio_do_crypt_data+0xccc/frame 0xfffffe00febf9c30
  spa_do_crypt_abd() at spa_do_crypt_abd+0x19e/frame 0xfffffe00febf9cd0
  zio_decrypt() at zio_decrypt+0x298/frame 0xfffffe00febf9da0
  zio_done() at zio_done+0x97e/frame 0xfffffe00febf9e10
  zio_execute() at zio_execute+0x78/frame 0xfffffe00febf9e40
  taskqueue_run_locked() at taskqueue_run_locked+0x1c2/frame 0xfffffe00febf9ec0
  taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe00febf9ef0
  fork_exit() at fork_exit+0x82/frame 0xfffffe00febf9f30
  fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00febf9f30
  --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
  KDB: enter: panic
  [ thread pid 0 tid 100304 ]
  Stopped at      kdb_enter+0x33: movq    $0,0x121b132(%rip)

This patch removes the thread-unsafe debug logic to prevent this panic.

How Has This Been Tested?

I've tested this by running the reproducing workload again after the patch was implemented, and the panic was successfully avoided.
I've also run the zfs-tests suite with good results.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuuti,l and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Copilot AI review requested due to automatic review settings January 16, 2026 18:07
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes a critical thread-safety bug in FreeBSD's ZFS crypto implementation that could lead to double-free panics. The bug was caused by thread-unsafe debug code in the error path of zio_do_crypt_data() that attempted to save failed decryption buffers to global variables without proper synchronization.

Changes:

  • Removed duplicate global variable declarations (failed_decrypt_buf and failed_decrypt_size/faile_decrypt_size)
  • Removed thread-unsafe debug code in the error handling path that caused race conditions leading to double-free panics

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@asomers asomers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could probably be fixed with an atomic pointer swap. But removing it looks good to me too. I don't think it's needed.

@alek-p alek-p added Status: Code Review Needed Ready for review and testing Type: Defect Incorrect behavior (e.g. crash, hang) Status: Understood The root cause of the issue is known Component: Encryption "native encryption" feature labels Jan 20, 2026
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jan 20, 2026
@behlendorf behlendorf merged commit cd895f0 into openzfs:master Jan 21, 2026
30 of 32 checks passed
@alek-p alek-p deleted the freebsd_double_free_panic branch January 21, 2026 18:19
amotin pushed a commit to amotin/zfs that referenced this pull request Jan 29, 2026
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Closes openzfs#18140
mcmilk pushed a commit to mcmilk/zfs that referenced this pull request Jan 31, 2026
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Closes openzfs#18140
amotin pushed a commit to amotin/zfs that referenced this pull request Feb 3, 2026
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Closes openzfs#18140
amotin pushed a commit to amotin/zfs that referenced this pull request Feb 3, 2026
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Closes openzfs#18140
lundman pushed a commit to openzfsonosx/openzfs-fork that referenced this pull request Feb 5, 2026
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Closes openzfs#18140
tonyhutter pushed a commit that referenced this pull request Feb 5, 2026
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Closes #18140
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Feb 12, 2026
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Closes openzfs#18140
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Feb 23, 2026
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Closes openzfs#18140
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Feb 23, 2026
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Closes openzfs#18140
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Component: Encryption "native encryption" feature Status: Accepted Ready to integrate (reviewed, tested) Status: Understood The root cause of the issue is known Type: Defect Incorrect behavior (e.g. crash, hang)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants