System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Ubuntu (Delphix Engine appliance) |
| Distribution Version | 24.04 LTS base |
| Kernel Version | 6.17.0 (x86_64) |
| Architecture | x86_64 |
| OpenZFS Version | master at or after 0ecf5e3f6; verified A/B against master at 545d66204d (pre-0ecf5e3f6) and 4655bdd8ab (post-0ecf5e3f6) |
The bug is independent of distribution and kernel — it is in lib/libzfs/libzfs_mnttab.c and reproduces against any libzfs built from a tree that contains commit 0ecf5e3f6.
This regression is currently on master only. No released or staging branch (zfs-2.2.x, zfs-2.3.x, zfs-2.4.x) contains 0ecf5e3f6 yet, so reverting before the next release branch is cut keeps downstream impact at zero.
Describe the problem you're observing
After our latest merge from upstream master, we started seeing ZFS_PROP_MOUNTED return 0 for filesystems that are plainly mounted, which trips a "filesystems not mounted" check in our snapshot path and aborts the workflow.
After some investigation, I believe the cause is 0ecf5e3f6 ("libzfs/mnttab: always enable the cache", PR #18296), which silently turned libzfs_mnttab_cache(hdl, B_FALSE) into a no-op. The function is still part of the public libzfs.h API:
```c
_LIBZFS_H void libzfs_mnttab_cache(libzfs_handle_t *, boolean_t);
```
But after 0ecf5e3f6 its body is:
```c
void
libzfs_mnttab_cache(libzfs_handle_t *hdl, boolean_t enable)
{
    /* This is a no-op to preserve ABI backward compatibility. */
    (void) hdl, (void) enable;
}
```
The ABI is preserved, but the behavior the function used to provide — disabling the per-handle mnttab cache so that libzfs_mnttab_find consults /etc/mtab directly on every call — isn't there anymore. For consumers that hold more than one libzfs_handle_t in a process and rely on the cache being disabled for cross-handle correctness, this leads to wrong answers from ZFS_PROP_MOUNTED for filesystems that are mounted.
Looking at the commit, two pieces of the cache-disabled path got removed together:
- The cache-disabled fast path in libzfs_mnttab_find that did fopen(MNTTAB) + getmntany() per call (and defensively libzfs_mnttab_fini'd any stray AVL state).
- The if (avl_numnodes != 0) guard in libzfs_mnttab_add that kept the AVL empty when the cache was disabled.
Without those two pieces, the AVL is unconditionally populated by libzfs_mnttab_add. Once the AVL has any entries, libzfs_mnttab_find skips the /etc/mtab re-read (it only re-reads when the AVL is empty). So a handle that has done one zfs_mount will then return ENOENT from libzfs_mnttab_find for any other dataset that another handle (or out-of-process actor) has mounted — even though that dataset is plainly in /etc/mtab.
For us this shows up as ZFS_PROP_MOUNTED returning 0 for filesystems that are mounted, which aborts our snapshot workflow with a fatal "filesystems not mounted" check failure.
Describe how to reproduce the problem
The single-file C program below uses only the public libzfs.h API. No real mounts and no root permissions are needed; just supply the name of any currently-mounted ZFS dataset.
```c
/*
 * libzfs_mnttab_cache_repro.c
 *
 * Demonstrates that libzfs_mnttab_cache(hdl, B_FALSE) no longer disables
 * the per-handle mnttab cache after openzfs/zfs commit 0ecf5e3f6, and
 * that zfs_prop_get_int(ZFS_PROP_MOUNTED) consequently returns 0 for
 * filesystems that are mounted.
 *
 * Build:
 *     cc -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -o repro \
 *         libzfs_mnttab_cache_repro.c $(pkg-config --cflags --libs libzfs)
 *
 * (The two -D flags are needed because libspl's sys/stat.h references
 * struct stat64; libzfs.pc.in does not currently set them itself.)
 *
 * Run (as a user that can read /etc/mtab):
 *     ./repro <name-of-any-currently-mounted-zfs-dataset>
 */

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/mnttab.h>

#include <libzfs.h>

int
main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr,
            "usage: %s <currently-mounted-zfs-dataset>\n", argv[0]);
        return (2);
    }
    const char *real_ds = argv[1];

    libzfs_handle_t *hdl = libzfs_init();
    if (hdl == NULL) {
        fprintf(stderr, "libzfs_init failed\n");
        return (1);
    }

    /* Ask libzfs to disable the per-handle mnttab cache. */
    libzfs_mnttab_cache(hdl, B_FALSE);

    /*
     * Stand-in for what zfs_mount() does internally on every successful
     * mount: lib/libzfs/libzfs_mount.c:582 calls libzfs_mnttab_add(hdl, ...)
     * after do_mount(). In a real consumer, this happens implicitly; we
     * call it directly here so the reproducer doesn't need root or a
     * mountable dataset.
     *
     * Pre-0ecf5e3f6: this was a no-op because the AVL was empty and the
     * old code guarded the insert with `if (avl_numnodes != 0)`.
     * Post-0ecf5e3f6: the guard is gone and this unconditionally
     * populates the AVL.
     */
    libzfs_mnttab_add(hdl, "fake/dataset", "/fake/mountpoint", "rw");

    /*
     * Now query ZFS_PROP_MOUNTED on a real, currently-mounted dataset.
     * This is the standard libzfs API a consumer uses to check mount
     * state. Internally it calls libzfs_mnttab_find().
     *
     * Pre-0ecf5e3f6: cache disabled, find() fopens /etc/mtab, returns 1.
     * Post-0ecf5e3f6: AVL has the fake entry from above (non-empty), so
     * find() skips the /etc/mtab refresh, doesn't see real_ds in its
     * AVL, returns ENOENT internally, and ZFS_PROP_MOUNTED evaluates to 0.
     */
    zfs_handle_t *zhp = zfs_open(hdl, real_ds, ZFS_TYPE_FILESYSTEM);
    if (zhp == NULL) {
        fprintf(stderr, "zfs_open(%s) failed\n", real_ds);
        libzfs_fini(hdl);
        return (1);
    }
    uint64_t mounted = zfs_prop_get_int(zhp, ZFS_PROP_MOUNTED);
    zfs_close(zhp);

    int rc;
    if (mounted) {
        printf("OK: ZFS_PROP_MOUNTED reports %s as mounted\n", real_ds);
        rc = 0;
    } else {
        printf("BUG: ZFS_PROP_MOUNTED reports %s as NOT mounted\n",
            real_ds);
        printf("     but %s IS mounted (see /etc/mtab and `zfs get mounted`).\n",
            real_ds);
        printf("     libzfs_mnttab_cache(hdl, B_FALSE) did not actually disable the cache.\n");
        rc = 1;
    }
    libzfs_fini(hdl);
    return (rc);
}
```
Expected output before 0ecf5e3f6:

```
OK: ZFS_PROP_MOUNTED reports <dataset> as mounted
```

Expected output after 0ecf5e3f6:

```
BUG: ZFS_PROP_MOUNTED reports <dataset> as NOT mounted
     but <dataset> IS mounted (see /etc/mtab and `zfs get mounted`).
     libzfs_mnttab_cache(hdl, B_FALSE) did not actually disable the cache.
```
Why this matters in a multi-handle consumer
The reproducer above is single-handle for minimality, and it directly calls libzfs_mnttab_add to stand in for what zfs_mount() does internally. In our actual usage we don't call libzfs_mnttab_add directly — we hold multiple libzfs_handle_ts and use them across threads, and zfs_mount() does the add for us under the hood. The same contract violation falls out naturally:
- Handle A calls zfs_mount(zhp_x, ...) to mount dataset X. After do_mount() succeeds, zfs_mount_at calls libzfs_mnttab_add(hdl_a, ...) at lib/libzfs/libzfs_mount.c:582. Handle A's AVL now has X.
- Handle B calls zfs_mount(zhp_y, ...) to mount dataset Y. Same path; handle B's AVL has Y.
- Some time later, handle A queries zfs_prop_get_int(zhp_y, ZFS_PROP_MOUNTED) for dataset Y.
- Pre-0ecf5e3f6: A's cache is disabled, so libzfs_mnttab_find consults /etc/mtab directly, finds Y, returns mounted = 1.
- Post-0ecf5e3f6: A's AVL has X (non-empty), so libzfs_mnttab_find skips the /etc/mtab refresh, doesn't see Y in its AVL, returns ENOENT, and ZFS_PROP_MOUNTED evaluates to 0 — even though Y is plainly mounted.
That's the failure mode we're hitting. Mounts via one handle silently invalidate ZFS_PROP_MOUNTED queries on every other handle in the process: no warning, no logged error, just wrong answers.
Include any warning/errors/backtraces from the system logs
Verified A/B on two otherwise-identical systems (same OS, same compiler, same reproducer build) differing only in the version of openzfs/zfs master they were built from:
Pre-0ecf5e3f6 (built from master at upstream 545d66204d, 2025-09-17 — before 0ecf5e3f6):

```
$ ./repro rpool/ROOT/<bootenv>/root
OK: ZFS_PROP_MOUNTED reports rpool/ROOT/<bootenv>/root as mounted
$ echo $?
0
```
Post-0ecf5e3f6 (built from master at upstream 4655bdd8ab, 2026-03-17 — after 0ecf5e3f6):

```
$ ./repro rpool/ROOT/<bootenv>/root
BUG: ZFS_PROP_MOUNTED reports rpool/ROOT/<bootenv>/root as NOT mounted
     but rpool/ROOT/<bootenv>/root IS mounted (see /etc/mtab and `zfs get mounted`).
     libzfs_mnttab_cache(hdl, B_FALSE) did not actually disable the cache.
$ echo $?
1
```
Three independent signals (/etc/mtab, zfs get mounted, zfs_prop_get_int(ZFS_PROP_MOUNTED)) all agree that the dataset is mounted on both systems; only the post-0ecf5e3f6 ZFS_PROP_MOUNTED disagrees, and only because the per-handle cache the consumer asked to disable is actually still on.
In the originally-affected production code path, this ZFS_PROP_MOUNTED == 0 trips a "filesystems are not mounted" assertion that aborts the workflow.
Background — how we use libzfs
We're a long-running, multi-threaded userland (Delphix Engine) that links libzfs directly. Nothing exotic at the libzfs boundary:
- We call libzfs_init() to allocate a libzfs_handle_t. Handles are pooled; the pool grows lazily with concurrent demand and typically reaches 5–20 handles on a busy process. Handles are long-lived and reused.
- Immediately after each libzfs_init() we call libzfs_mnttab_cache(hdl, B_FALSE).
- Each handle is then used for ordinary operations: zfs_open, zfs_mount, zfs_prop_get_int(ZFS_PROP_MOUNTED), etc.
We disable the cache for exactly the case the cache can't handle correctly: mounts performed via one handle invisibly populate that handle's AVL, but other handles' AVLs don't see them. With the cache disabled, libzfs_mnttab_find is supposed to consult /etc/mtab — the actual source of truth, shared across handles and processes — on every call. That's the invariant that makes multi-handle usage correct.
Disabling the cache is the very first thing our getHandle() does after libzfs_init(), and it has been since 2011, when we moved from forking the zfs(8) CLI to linking libzfs directly.
The simplest fix from our perspective would be to restore the prior cache-disabled mode so consumers can still opt out via libzfs_mnttab_cache(hdl, B_FALSE) — i.e., libzfs_mnttab_find consulting /etc/mtab directly when the cache is disabled, and libzfs_mnttab_add not populating the AVL in that mode. The field renames in 0ecf5e3f6 (libzfs_mnttab_cache AVL → zh_mnttab, libzfs_mnttab_update → mnttab_update, etc.) don't really matter; an implementer can keep the new names.
The commit message for 0ecf5e3f6 says "the zfs command always enables it anyway, and right now there's multiple places that do mount work that don't go through the cache anyway". That's true for the CLI, but it doesn't cover library consumers that hold multiple handles and explicitly opt out. We've been doing this since 2011, so I'd be surprised if we're the only ones: anything that links libzfs from a long-lived multi-threaded process and uses multiple handles concurrently would hit the same thing. Other consumers may simply not have picked up the change yet.
Happy to send a patch if it'd help.
References
- 0ecf5e3f6 "libzfs/mnttab: always enable the cache" (PR #18296)
- include/libzfs.h line 235: _LIBZFS_H void libzfs_mnttab_cache(libzfs_handle_t *, boolean_t);