Add UID-based dataset zoning for rootless container support #18167

Open
li-nkSN wants to merge 1 commit into openzfs:master from li-nkSN:feature/zoned-uid

li-nkSN commented Jan 30, 2026

This implements zoned_uid - a ZFS property that grants visibility of a dataset to any user namespace owned by a specific UID.

Usage: zfs set zoned_uid=1000 pool/dataset

This solves the chicken-and-egg problem with rootless Podman + ZFS:

  • Current: zfs zone requires an existing namespace PID
  • Problem: Podman creates a new namespace on each invocation
  • Solution: Zone to UID, any namespace created by that UID sees datasets

Kernel changes:

  • zone_dataset_attach_uid() / zone_dataset_detach_uid() functions
  • Modified zone_dataset_visible() to check UID-based zoning
  • New ZFS_IOC_USERNS_ATTACH_UID/DETACH_UID ioctls
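
The UID-based visibility rule described above can be illustrated with a small userland sketch. This is not the PR's kernel code: every `demo_` name is hypothetical, the namespace is reduced to its owner UID, and treating descendants and snapshots of a zoned dataset as visible is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for a user namespace; only the owner UID matters. */
typedef struct {
	uid_t owner_uid;
} demo_userns_t;

/* Hypothetical record: dataset `name` has zoned_uid set to `zoned_uid`. */
typedef struct {
	const char *name;
	uid_t zoned_uid;
} demo_zoned_dataset_t;

/* True if `name` is `root` itself, a child ("root/...") or a snapshot ("root@..."). */
static bool
demo_in_subtree(const char *root, const char *name)
{
	size_t len = strlen(root);

	if (strncmp(root, name, len) != 0)
		return (false);
	return (name[len] == '\0' || name[len] == '/' || name[len] == '@');
}

/*
 * A dataset is visible when some zoned entry covers it and that entry's
 * zoned_uid equals the UID owning the caller's namespace. No namespace PID
 * is involved, so a namespace created later by the same UID also matches.
 */
static bool
demo_dataset_visible(const demo_userns_t *ns,
    const demo_zoned_dataset_t *zoned, size_t nzoned, const char *name)
{
	assert(ns != NULL && name != NULL);

	for (size_t i = 0; i < nzoned; i++) {
		if (zoned[i].zoned_uid == ns->owner_uid &&
		    demo_in_subtree(zoned[i].name, name))
			return (true);
	}
	return (false);
}
```

Because the lookup is keyed on the owner UID, each new Podman invocation by the same user would pass the same check without any re-attachment step.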

Userspace changes:

  • libzfs calls attach/detach ioctls when setting zoned_uid property
  • Pool import restores UID-based zoning from stored property values

Current limitations (WIP):

  • Read-only access: zfs list, mount, read work
  • Write operations (create, snapshot, destroy) still require root

li-nkSN (Author) commented Jan 31, 2026

As an update, I am testing write operations (create, snapshot, destroy, etc.) locally...

li-nkSN (Author) commented Feb 1, 2026

Add write delegation support for zoned_uid

  This extends zoned_uid from read-only visibility to full write
  delegation, allowing authorized user namespaces to create, destroy,
  snapshot, clone, rename, and modify properties on delegated datasets.

  This completes the zoned_uid feature for rootless container support:
  - Podman can now use native ZFS storage without root privileges
  - Container layers are created/destroyed via ZFS clone/destroy
  - All operations authorized by matching namespace owner to zoned_uid

  Kernel changes:
  - zone_dataset_admin_check() authorizes write operations in SPL
  - Callback registration allows SPL to look up zoned_uid property
  - Security policy integration in zfs_secpolicy_{setprop,destroy,
    rename,snapshot,create_clone}()
  - Sandbox protection: delegation root cannot be destroyed or escaped
  - Fixed inglobalzone() to use current_user_ns() correctly

  Userspace changes:
  - libzfs check_parents() defers to kernel when zoned_uid is set

  Security model:
  - Caller must be in user namespace owned by zoned_uid value
  - CAP_SYS_ADMIN required within the user namespace
  - Operations confined to delegation subtree (rename cannot escape)
  - Delegation root itself is protected from destruction
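
  The security model above can be sketched as a single decision function. This is a userland illustration, not the PR's zone_dataset_admin_check(): all `demo_` names are hypothetical, the caller's namespace is reduced to an owner UID plus a capability flag, only one delegation is modelled, and denying a rename of the delegation root itself is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

typedef enum {
	DEMO_OP_CREATE, DEMO_OP_SNAPSHOT, DEMO_OP_DESTROY,
	DEMO_OP_RENAME, DEMO_OP_SETPROP
} demo_op_t;

typedef enum { DEMO_NOT_APPLICABLE, DEMO_ALLOWED, DEMO_DENIED } demo_result_t;

typedef struct {
	uid_t ns_owner_uid;	/* UID that owns the caller's user namespace */
	bool has_sys_admin;	/* CAP_SYS_ADMIN held inside that namespace */
} demo_caller_t;

typedef struct {
	const char *root;	/* delegation root dataset */
	bool zoned_uid_set;	/* false: no UID delegation on this subtree */
	uid_t zoned_uid;
} demo_delegation_t;

/* True if `name` lies strictly below `root` (child dataset or snapshot). */
static bool
demo_is_descendant(const char *root, const char *name)
{
	size_t len = strlen(root);

	return (strncmp(root, name, len) == 0 &&
	    (name[len] == '/' || name[len] == '@'));
}

static demo_result_t
demo_admin_check(const demo_caller_t *c, const demo_delegation_t *d,
    demo_op_t op, const char *target, const char *rename_to)
{
	assert(c != NULL && d != NULL && target != NULL);

	/* No delegation: let the ordinary permission checks decide. */
	if (!d->zoned_uid_set)
		return (DEMO_NOT_APPLICABLE);
	/* Namespace owner must match zoned_uid and hold CAP_SYS_ADMIN. */
	if (c->ns_owner_uid != d->zoned_uid || !c->has_sys_admin)
		return (DEMO_DENIED);

	bool is_root = (strcmp(d->root, target) == 0);

	/* Operations are confined to the delegation subtree. */
	if (!is_root && !demo_is_descendant(d->root, target))
		return (DEMO_DENIED);

	switch (op) {
	case DEMO_OP_DESTROY:
		/* Children may be destroyed; the root is protected. */
		return (is_root ? DEMO_DENIED : DEMO_ALLOWED);
	case DEMO_OP_RENAME:
		/* Source and destination must both stay below the root. */
		if (is_root || rename_to == NULL ||
		    !demo_is_descendant(d->root, rename_to))
			return (DEMO_DENIED);
		return (DEMO_ALLOWED);
	default:
		return (DEMO_ALLOWED);
	}
}
```

  Returning a three-valued result rather than a boolean lets a caller distinguish "denied by the delegation" from "delegation does not apply", so existing checks can still run in the latter case.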

  Tests added:
  - zoned_uid_006_pos: create child datasets
  - zoned_uid_007_pos: create snapshots
  - zoned_uid_008_pos: destroy (child ok, root denied)
  - zoned_uid_009_pos: rename within subtree
  - zoned_uid_010_pos: set properties
  - zoned_uid_011_neg: wrong UID denied

li-nkSN force-pushed the feature/zoned-uid branch 6 times, most recently from 9cfc9b9 to 37ff4ca, February 1, 2026 06:10
li-nkSN marked this pull request as ready for review February 1, 2026 06:38
github-actions bot added the "Status: Code Review Needed" (Ready for review and testing) label and removed the "Status: Work in Progress" (Not yet ready for general review) label Feb 1, 2026
li-nkSN referenced this pull request in oci-playground/freebsd-podman-testing Feb 4, 2026

li-nkSN (Author) commented Feb 7, 2026

Hi everyone. My impression (perhaps incorrect) is that the maintainers would like me to fix this for FreeBSD. Because I like to test changes locally, I have reached out to the BSD community in the post referenced above. If FreeBSD testers are available, I would feel more confident about BSD support.

li-nkSN (Author) commented Feb 8, 2026

Looks like FreeBSD will need a similar jails implementation for rootless container support. I think they might find this approach helpful as a reference.

FreeBSD compatibility:
- include/os/freebsd/spl/sys/zone.h gains FreeBSD stubs:
  - zone_uid_op_t enum (ZONE_OP_CREATE, SNAPSHOT, CLONE, DESTROY,
    RENAME, SETPROP)
  - zone_admin_result_t enum (NOT_APPLICABLE, ALLOWED, DENIED)
  - zone_dataset_admin_check(): static inline, always returns
    ZONE_ADMIN_NOT_APPLICABLE
  - zone_get_zoned_uid_fn_t callback typedef
  - zone_register_zoned_uid_callback(): static inline no-op
  - zone_unregister_zoned_uid_callback(): static inline no-op
- On FreeBSD, every zone_dataset_admin_check() call returns
  ZONE_ADMIN_NOT_APPLICABLE, causing all security policy functions
  to fall through to existing jail-based permission checks
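
A rough sketch of what such stubs and the fall-through might look like follows. The enum and function names come from the list above, with the abbreviated enumerators expanded by assumption (for example ZONE_OP_SNAPSHOT). The parameter lists, the callback signature, and the demo_secpolicy() caller are hypothetical, since the comment does not show them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

typedef enum zone_uid_op {
	ZONE_OP_CREATE, ZONE_OP_SNAPSHOT, ZONE_OP_CLONE,
	ZONE_OP_DESTROY, ZONE_OP_RENAME, ZONE_OP_SETPROP
} zone_uid_op_t;

typedef enum zone_admin_result {
	ZONE_ADMIN_NOT_APPLICABLE, ZONE_ADMIN_ALLOWED, ZONE_ADMIN_DENIED
} zone_admin_result_t;

/* Assumed callback shape: map a dataset name to its zoned_uid. */
typedef uid_t (*zone_get_zoned_uid_fn_t)(const char *dataset);

/* Stub: UID delegation never applies on this platform. */
static inline zone_admin_result_t
zone_dataset_admin_check(const char *dataset, zone_uid_op_t op)
{
	(void) dataset;
	(void) op;
	return (ZONE_ADMIN_NOT_APPLICABLE);
}

/* Stubs: nothing to register when the feature is absent. */
static inline void
zone_register_zoned_uid_callback(zone_get_zoned_uid_fn_t fn)
{
	(void) fn;
}

static inline void
zone_unregister_zoned_uid_callback(void)
{
}

/*
 * Hypothetical caller: `fallback_result` stands in for the outcome of the
 * existing jail-based permission check.
 */
static int
demo_secpolicy(const char *dataset, zone_uid_op_t op, int fallback_result)
{
	assert(dataset != NULL);

	switch (zone_dataset_admin_check(dataset, op)) {
	case ZONE_ADMIN_ALLOWED:
		return (0);
	case ZONE_ADMIN_DENIED:
		return (EPERM);
	default:
		/* NOT_APPLICABLE: defer to the pre-existing check. */
		return (fallback_result);
	}
}
```

With the stub in place, the caller's result is always whatever the existing check returns, so behaviour on FreeBSD is unchanged.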

li-nkSN (Author) commented Feb 8, 2026

@robn @behlendorf would someone review?

behlendorf (Contributor) left a comment

Thanks for tackling this functionality, this is looking good and it would be very nice to have! I made a first pass over the PR and posted a few questions inline. Thanks for aligning the implementation so closely with the existing namespace delegation support and including additional test cases.

@wca @0mp can you comment on this, since you authored the namespace delegation support for Linux?

li-nkSN added a commit to li-nkSN/zfs that referenced this pull request Feb 15, 2026
This implements zoned_uid - a ZFS property that delegates dataset
visibility and administration to user namespaces owned by a specific
UID, enabling rootless Podman/Docker with native ZFS storage.

Usage: zfs set zoned_uid=1000 pool/dataset

Problem solved:
- zfs zone requires an existing namespace PID
- Podman creates a new namespace on each container start
- Solution: delegate to UID, any namespace owned by that UID is
  authorized

Delegated operations:
- Visibility: zfs list, get, mount (read-only access)
- Create: child datasets and clones
- Snapshot: create snapshots
- Destroy: children only (delegation root protected)
- Rename: within delegation subtree only
- Properties: set on delegated datasets

Security model:
- Namespace owner UID must match zoned_uid value
- CAP_SYS_ADMIN required within the user namespace
- Delegation root cannot be destroyed or escaped via rename

Kernel changes:
- zone_dataset_attach_uid()/detach_uid() in SPL
- zone_dataset_admin_check() for write authorization
- Callback registration for zoned_uid property lookup
- Security policy hooks in zfs_secpolicy_*() functions
- Fixed inglobalzone() to use current_user_ns()
- zfs_prop_set_special() handles attach/detach as property
  side-effects, eliminating the need for dedicated ioctls
- spa_import_os() restores zoned_uid delegations kernel-side
  on pool import via dmu_objset_find() walk

Userspace changes:
- check_parents() defers to kernel when zoned_uid set

FreeBSD compatibility:
- include/os/freebsd/spl/sys/zone.h — Added FreeBSD stubs:
  - zone_uid_op_t enum (ZONE_OP_CREATE, SNAPSHOT, CLONE, DESTROY,
    RENAME, SETPROP)
  - zone_admin_result_t enum (NOT_APPLICABLE, ALLOWED, DENIED)
  - zone_dataset_admin_check() — static inline, always returns
    ZONE_ADMIN_NOT_APPLICABLE
  - zone_get_zoned_uid_fn_t callback typedef
  - zone_register_zoned_uid_callback() — static inline no-op
  - zone_unregister_zoned_uid_callback() — static inline no-op
- On FreeBSD, every zone_dataset_admin_check() call returns
  ZONE_ADMIN_NOT_APPLICABLE, causing all security policy functions
  to fall through to existing jail-based permission checks

Addressed review feedback from PR openzfs#18167:
- Removed dedicated ZFS_IOC_USERNS_ATTACH_UID/DETACH_UID ioctls;
  attach/detach is now handled kernel-side as a property side-effect
  in zfs_prop_set_special()
- Moved pool import delegation restoration from userspace
  (zpool_restore_zoned) to kernel-side in spa_import_os()
- Removed unnecessary suppression file additions
- Reverted ABI files to upstream (will regenerate from CI)
- Added test scripts to tests/zfs-tests/tests/Makefile.am

Tests: zoned_uid_001 through zoned_uid_011

Signed-off-by: Colin K. Williams <colin@li-nk.org>
li-nkSN added a commit to li-nkSN/zfs that referenced this pull request Feb 15, 2026
This implements zoned_uid - a ZFS property that delegates dataset
visibility and administration to user namespaces owned by a specific
UID, enabling rootless Podman/Docker with native ZFS storage.

Usage: zfs set zoned_uid=1000 pool/dataset

Problem solved:
- zfs zone requires an existing namespace PID
- Podman creates a new namespace on each container start
- Solution: delegate to UID, any namespace owned by that UID is
  authorized

Delegated operations:
- Visibility: zfs list, get, mount (read-only access)
- Create: child datasets and clones
- Snapshot: create snapshots
- Destroy: children only (delegation root protected)
- Rename: within delegation subtree only
- Properties: set on delegated datasets

Security model:
- Namespace owner UID must match zoned_uid value
- CAP_SYS_ADMIN required within the user namespace
- Delegation root cannot be destroyed or escaped via rename

Kernel changes:
- zone_dataset_attach_uid()/detach_uid() in SPL
- zone_dataset_admin_check() for write authorization
- Callback registration for zoned_uid property lookup
- Security policy hooks in zfs_secpolicy_*() functions
- Fixed inglobalzone() to use current_user_ns()
- zfs_prop_set_special() handles attach/detach as property
  side-effects, eliminating the need for dedicated ioctls
- spa_import_os() restores zoned_uid delegations kernel-side
  on pool import via dmu_objset_find() walk

Userspace changes:
- check_parents() defers to kernel when zoned_uid set

FreeBSD compatibility:
- include/os/freebsd/spl/sys/zone.h — Added FreeBSD stubs:
  - zone_uid_op_t enum (ZONE_OP_CREATE, SNAPSHOT, CLONE, DESTROY,
    RENAME, SETPROP)
  - zone_admin_result_t enum (NOT_APPLICABLE, ALLOWED, DENIED)
  - zone_dataset_admin_check() — static inline, always returns
    ZONE_ADMIN_NOT_APPLICABLE
  - zone_dataset_attach_uid() — static inline, returns ENXIO
  - zone_dataset_detach_uid() — static inline, returns ENXIO
  - zone_get_zoned_uid_fn_t callback typedef
  - zone_register_zoned_uid_callback() — static inline no-op
  - zone_unregister_zoned_uid_callback() — static inline no-op
- On FreeBSD, every zone_dataset_admin_check() call returns
  ZONE_ADMIN_NOT_APPLICABLE, causing all security policy functions
  to fall through to existing jail-based permission checks
- Setting zoned_uid on FreeBSD returns ENXIO since user namespace
  delegation requires Linux user namespaces

Addressed review feedback from PR openzfs#18167:
- Removed dedicated ZFS_IOC_USERNS_ATTACH_UID/DETACH_UID ioctls;
  attach/detach is now handled kernel-side as a property side-effect
  in zfs_prop_set_special()
- Moved pool import delegation restoration from userspace
  (zpool_restore_zoned) to kernel-side in spa_import_os()
- Removed unnecessary suppression file additions
- Reverted ABI files to upstream (will regenerate from CI)
- Added test scripts to tests/zfs-tests/tests/Makefile.am

Tests: zoned_uid_001 through zoned_uid_011

Signed-off-by: Colin K. Williams <colin@li-nk.org>
li-nkSN added a commit to li-nkSN/zfs that referenced this pull request Feb 16, 2026
li-nkSN (Author) commented Feb 25, 2026

Looks like this is failing the Linux built-in build on Fedora 42 (see Test Summary page):

    CC      fs/zfs/os/linux/spl/spl-zone.o
    CC      lib/raid6/recov_avx2.o
  fs/zfs/os/linux/spl/spl-zone.c: In function ‘zone_dataset_visible’:
  fs/zfs/os/linux/spl/spl-zone.c:627:30: error: unused variable ‘zuds’ [-Werror=unused-variable]
    627 |         zone_uid_datasets_t *zuds;
        |                              ^~~~
  fs/zfs/os/linux/spl/spl-zone.c: At top level:
  fs/zfs/os/linux/spl/spl-zone.c:439:1: error: ‘zone_dataset_is_zoned_uid_root’ defined but not used [-Werror=unused-function]
    439 | zone_dataset_is_zoned_uid_root(const char *dataset, uid_t zoned_uid)
        | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  fs/zfs/os/linux/spl/spl-zone.c:153:1: error: ‘zone_uid_datasets_lookup’ defined but not used [-Werror=unused-function]
    153 | zone_uid_datasets_lookup(kuid_t owner)
        | ^~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors
  make[4]: *** [scripts/Makefile.build:287: fs/zfs/os/linux/spl/spl-zone.o] Error 1
  make[3]: *** [scripts/Makefile.build:544: fs/zfs] Error 2
  make[2]: *** [scripts/Makefile.build:544: fs] Error 2
  make[2]: *** Waiting for unfinished jobs....

Will and Mateusz co-authored the existing Linux namespace delegation support, see commit 4ed5e25, so I wanted to make sure they had the opportunity to at least comment.

I agree, we can absolutely integrate this as a Linux-only feature initially. I didn't mean to suggest it was blocked on an equivalent FreeBSD implementation. The FreeBSD folks are welcome to add support when they have the time and interest.

However, what is a blocker is addressing the Linux build failures Tony mentioned in the previous comment. I should be able to get you what is hopefully a last round of review feedback this week.

Hi @tonyhutter and @behlendorf, thanks for your feedback. Regarding the Fedora 42 error: I believe this is a regression introduced after removing the unused scripts and rebasing the branch onto master, though I don't believe the script removal itself caused it. I will investigate and try to address this issue today.

I will review the original namespace commit that was shared. I look forward to hearing from @wca and @0mp if they are available. Thank you.

Kind Regards,

Colin Williams

li-nkSN added a commit to li-nkSN/zfs that referenced this pull request Feb 26, 2026
This implements zoned_uid - a ZFS property that delegates dataset
visibility and administration to user namespaces owned by a specific
UID, enabling rootless Podman/Docker with native ZFS storage.

Usage: zfs set zoned_uid=1000 pool/dataset

Problem solved:
- zfs zone requires an existing namespace PID
- Podman creates a new namespace on each container start
- Solution: delegate to UID, any namespace owned by that UID is
  authorized

Delegated operations:
- Visibility: zfs list, get, mount (read-only access)
- Create: child datasets and clones
- Snapshot: create snapshots
- Destroy: children only (delegation root protected)
- Rename: within delegation subtree only
- Properties: set on delegated datasets

Security model:
- Namespace owner UID must match zoned_uid value
- CAP_SYS_ADMIN required within the user namespace
- Delegation root cannot be destroyed or escaped via rename

Kernel changes:
- zone_dataset_attach_uid()/detach_uid() in SPL
- zone_dataset_admin_check() for write authorization
- Callback registration for zoned_uid property lookup
- Security policy hooks in zfs_secpolicy_*() functions
- Fixed inglobalzone() to use current_user_ns()
- zfs_prop_set_special() handles attach/detach as property
  side-effects, eliminating the need for dedicated ioctls
- spa_import_os() restores zoned_uid delegations kernel-side
  on pool import via dmu_objset_find() walk

Userspace changes:
- check_parents() defers to kernel when zoned_uid set

FreeBSD compatibility:
- include/os/freebsd/spl/sys/zone.h — Added FreeBSD stubs:
  - zone_uid_op_t enum (ZONE_OP_CREATE, SNAPSHOT, CLONE, DESTROY,
    RENAME, SETPROP)
  - zone_admin_result_t enum (NOT_APPLICABLE, ALLOWED, DENIED)
  - zone_dataset_admin_check() — static inline, always returns
    ZONE_ADMIN_NOT_APPLICABLE
  - zone_dataset_attach_uid() — static inline, returns ENXIO
  - zone_dataset_detach_uid() — static inline, returns ENXIO
  - zone_get_zoned_uid_fn_t callback typedef
  - zone_register_zoned_uid_callback() — static inline no-op
  - zone_unregister_zoned_uid_callback() — static inline no-op
- On FreeBSD, every zone_dataset_admin_check() call returns
  ZONE_ADMIN_NOT_APPLICABLE, causing all security policy functions
  to fall through to existing jail-based permission checks
- Setting zoned_uid on FreeBSD returns ENXIO since user namespace
  delegation requires Linux user namespaces

Addressed review feedback from PR openzfs#18167:
- Removed dedicated ZFS_IOC_USERNS_ATTACH_UID/DETACH_UID ioctls;
  attach/detach is now handled kernel-side as a property side-effect
  in zfs_prop_set_special()
- Moved pool import delegation restoration from userspace
  (zpool_restore_zoned) to kernel-side in spa_import_os()
- Removed unnecessary suppression file additions
- Reverted ABI files to upstream (will regenerate from CI)
- Added test scripts to tests/zfs-tests/tests/Makefile.am

Fix CONFIG_USER_NS=n build failure and improve error reporting:

Upstream CI commit 640a217 ("CI: Test & fix Linux ZFS built-in
build", Tony Hutter) added a tinyconfig built-in kernel build test
to Fedora runners, which compiles with CONFIG_USER_NS disabled,
exposing unguarded static functions and variables that cause fatal
-Werror=unused-function/-Werror=unused-variable errors.

- Fixed #ifdef CONFIG_USER_NS guards for zone_uid_datasets_lookup(),
  zone_dataset_is_zoned_uid_root(), and the zuds variable in
  zone_dataset_visible()
- Added ZFS_ERR_NO_USER_NS_SUPPORT error code so users get a clear
  message ("kernel was built without user namespace support") instead
  of a generic "I/O error" when CONFIG_USER_NS is disabled
- Translate ENXIO from zone_dataset_attach_uid()/detach_uid() in
  zfs_prop_set_special() to ZFS_ERR_NO_USER_NS_SUPPORT
- Also fixes a pre-existing bug in the upstream
  zfs_ioc_userns_attach()/zfs_ioc_userns_detach() where ENXIO from
  zone_dataset_attach()/detach() was not translated, producing the
  same confusing "I/O error" on kernels without CONFIG_USER_NS
- Synced pyzfs constants with zfs.h (added missing
  ZFS_ERR_ASHIFT_MISMATCH, ZFS_ERR_STREAM_LARGE_MICROZAP,
  ZFS_ERR_TOO_MANY_SITOUTS, and the new
  ZFS_ERR_NO_USER_NS_SUPPORT)

Tests: zoned_uid_001 through zoned_uid_011

Signed-off-by: Colin K. Williams <colin@li-nk.org>
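
The two fixes described in the commit message above (guarding helpers that are only used when user namespaces are configured, and translating ENXIO into a dedicated error) follow a common pattern, sketched here in userland C. DEMO_CONFIG_USER_NS, every `demo_` function, and the numeric value of DEMO_ERR_NO_USER_NS_SUPPORT are hypothetical stand-ins, not the actual code or constants from the PR.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical value; the real ZFS_ERR_NO_USER_NS_SUPPORT is defined in zfs.h. */
#define	DEMO_ERR_NO_USER_NS_SUPPORT	2000

#ifdef DEMO_CONFIG_USER_NS
/*
 * Only the guarded branch below calls this helper. Leaving it outside the
 * #ifdef would trigger -Werror=unused-function when the option is off.
 */
static int
demo_uid_datasets_lookup(unsigned int uid)
{
	return (uid == 0 ? EINVAL : 0);
}
#endif

static int
demo_attach_uid(const char *dataset, unsigned int uid)
{
	assert(dataset != NULL);
#ifdef DEMO_CONFIG_USER_NS
	return (demo_uid_datasets_lookup(uid));
#else
	/* Built without user namespace support: report it as ENXIO. */
	(void) uid;
	return (ENXIO);
#endif
}

/*
 * Property-setting path: map the low-level ENXIO to a specific error so the
 * user sees a clear message instead of a generic I/O error.
 */
static int
demo_prop_set_zoned_uid(const char *dataset, unsigned int uid)
{
	int err = demo_attach_uid(dataset, uid);

	if (err == ENXIO)
		err = DEMO_ERR_NO_USER_NS_SUPPORT;
	return (err);
}
```

Compiling this file with and without -DDEMO_CONFIG_USER_NS under -Wall -Werror exercises both configurations, which mirrors what the tinyconfig CI build checks.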
tonyhutter (Contributor) commented

@li-nkSN I've been trying to run these tests locally from my Fedora 42 VM, but I keep getting the same failures:

$ ./scripts/zfs-tests.sh -x -T zoned_uid
[2026-02-26T10:01:06.526039] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/setup (run as root) [00:00] [PASS]
[2026-02-26T10:01:06.660852] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_001_pos (run as root) [00:00] [PASS]
[2026-02-26T10:01:07.111071] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_002_pos (run as root) [00:00] [PASS]
[2026-02-26T10:01:07.325407] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_003_pos (run as root) [00:00] [PASS]
[2026-02-26T10:01:07.508003] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_004_pos (run as root) [00:00] [PASS]
[2026-02-26T10:01:07.634722] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_005_neg (run as root) [00:00] [PASS]
[2026-02-26T10:01:07.789891] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_006_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:07.957293] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_007_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:08.240231] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_008_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:08.596570] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_009_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:08.796259] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_010_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:09.077836] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_011_neg (run as root) [00:00] [PASS]
[2026-02-26T10:01:09.295776] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_012_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:09.490968] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_013_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:09.714637] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/cleanup (run as root) [00:00] [PASS]

Here's the first failure from zoned_uid_006_pos:

[2026-02-26T10:01:07.789891] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_006_pos (run as root) [00:00] [FAIL]
10:01:07.64 ASSERTION: Authorized user namespace can create child datasets       
10:01:07.67 SUCCESS: zfs create testpool/testfs/deleg_root                       
10:01:07.68 SUCCESS: set_zoned_uid testpool/testfs/deleg_root 956                
10:01:07.69 NOTE: Delegation root created with zoned_uid=956                     
10:01:07.69 NOTE: Attempting to create child dataset from user namespace...      
10:01:07.71 NOTE: Create output: unshare: failed to execute zfs: No such file or directory
10:01:07.71 Failed to create child dataset from user namespace (status=127)      

To reproduce from ZFS source dir:

./autogen.sh && ./configure --enable-debug && make
sudo ./scripts/zfs.sh -r
sudo ./scripts/zfs-helpers.sh -i
./scripts/zfs-tests.sh -x -T zoned_uid

Any ideas?

li-nkSN added a commit to li-nkSN/zfs that referenced this pull request Feb 27, 2026
This implements zoned_uid - a ZFS property that delegates dataset
visibility and administration to user namespaces owned by a specific
UID, enabling rootless Podman/Docker with native ZFS storage.

Usage: zfs set zoned_uid=1000 pool/dataset

Problem solved:
- zfs zone requires an existing namespace PID
- Podman creates a new namespace on each container start
- Solution: delegate to UID, any namespace owned by that UID is
  authorized

Delegated operations:
- Visibility: zfs list, get, mount (read-only access)
- Create: child datasets and clones
- Snapshot: create snapshots
- Destroy: children only (delegation root protected)
- Rename: within delegation subtree only
- Properties: set on delegated datasets

Security model:
- Namespace owner UID must match zoned_uid value
- CAP_SYS_ADMIN required within the user namespace
- Delegation root cannot be destroyed or escaped via rename

Kernel changes:
- zone_dataset_attach_uid()/detach_uid() in SPL
- zone_dataset_admin_check() for write authorization
- Callback registration for zoned_uid property lookup
- Security policy hooks in zfs_secpolicy_*() functions
- Fixed inglobalzone() to use current_user_ns()
- zfs_prop_set_special() handles attach/detach as property
  side-effects, eliminating the need for dedicated ioctls
- spa_import_os() restores zoned_uid delegations kernel-side
  on pool import via dmu_objset_find() walk

Userspace changes:
- check_parents() defers to kernel when zoned_uid set

FreeBSD compatibility:
- include/os/freebsd/spl/sys/zone.h — Added FreeBSD stubs:
  - zone_uid_op_t enum (ZONE_OP_CREATE, SNAPSHOT, CLONE, DESTROY,
    RENAME, SETPROP)
  - zone_admin_result_t enum (NOT_APPLICABLE, ALLOWED, DENIED)
  - zone_dataset_admin_check() — static inline, always returns
    ZONE_ADMIN_NOT_APPLICABLE
  - zone_dataset_attach_uid() — static inline, returns ENXIO
  - zone_dataset_detach_uid() — static inline, returns ENXIO
  - zone_get_zoned_uid_fn_t callback typedef
  - zone_register_zoned_uid_callback() — static inline no-op
  - zone_unregister_zoned_uid_callback() — static inline no-op
- On FreeBSD, every zone_dataset_admin_check() call returns
  ZONE_ADMIN_NOT_APPLICABLE, causing all security policy functions
  to fall through to existing jail-based permission checks
- Setting zoned_uid on FreeBSD returns ENXIO since user namespace
  delegation requires Linux user namespaces

Addressed review feedback from PR openzfs#18167:
- Removed dedicated ZFS_IOC_USERNS_ATTACH_UID/DETACH_UID ioctls;
  attach/detach is now handled kernel-side as a property side-effect
  in zfs_prop_set_special()
- Moved pool import delegation restoration from userspace
  (zpool_restore_zoned) to kernel-side in spa_import_os()
- Removed unnecessary suppression file additions
- Reverted ABI files to upstream (will regenerate from CI)
- Added test scripts to tests/zfs-tests/tests/Makefile.am

Fix CONFIG_USER_NS=n build failure and improve error reporting:

Upstream CI commit 640a217 ("CI: Test & fix Linux ZFS built-in
build", Tony Hutter) added a tinyconfig built-in kernel build test
to Fedora runners, which compiles with CONFIG_USER_NS disabled,
exposing unguarded static functions and variables that cause fatal
-Werror=unused-function/-Werror=unused-variable errors.

- Fixed #ifdef CONFIG_USER_NS guards for zone_uid_datasets_lookup(),
  zone_dataset_is_zoned_uid_root(), and the zuds variable in
  zone_dataset_visible()
- Added ZFS_ERR_NO_USER_NS_SUPPORT error code so users get a clear
  message ("kernel was built without user namespace support") instead
  of a generic "I/O error" when CONFIG_USER_NS is disabled
- Translate ENXIO from zone_dataset_attach_uid()/detach_uid() in
  zfs_prop_set_special() to ZFS_ERR_NO_USER_NS_SUPPORT
- Also fixes a pre-existing bug in the upstream
  zfs_ioc_userns_attach()/zfs_ioc_userns_detach() where ENXIO from
  zone_dataset_attach()/detach() was not translated, producing the
  same confusing "I/O error" on kernels without CONFIG_USER_NS
- Synced pyzfs constants with zfs.h (added missing
  ZFS_ERR_ASHIFT_MISMATCH, ZFS_ERR_STREAM_LARGE_MICROZAP,
  ZFS_ERR_TOO_MANY_SITOUTS, and the new
  ZFS_ERR_NO_USER_NS_SUPPORT)

Tests: zoned_uid_001 through zoned_uid_011

Signed-off-by: Colin K. Williams <colin@li-nk.org>
@li-nkSN
Author

li-nkSN commented Feb 27, 2026

Looks like this is failing the Linux built-in build on Fedora 42 (see Test Summary page):

    CC      fs/zfs/os/linux/spl/spl-zone.o
    CC      lib/raid6/recov_avx2.o
  fs/zfs/os/linux/spl/spl-zone.c: In function ‘zone_dataset_visible’:
  fs/zfs/os/linux/spl/spl-zone.c:627:30: error: unused variable ‘zuds’ [-Werror=unused-variable]
    627 |         zone_uid_datasets_t *zuds;
        |                              ^~~~
  fs/zfs/os/linux/spl/spl-zone.c: At top level:
  fs/zfs/os/linux/spl/spl-zone.c:439:1: error: ‘zone_dataset_is_zoned_uid_root’ defined but not used [-Werror=unused-function]
    439 | zone_dataset_is_zoned_uid_root(const char *dataset, uid_t zoned_uid)
        | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  fs/zfs/os/linux/spl/spl-zone.c:153:1: error: ‘zone_uid_datasets_lookup’ defined but not used [-Werror=unused-function]
    153 | zone_uid_datasets_lookup(kuid_t owner)
        | ^~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors
  make[4]: *** [scripts/Makefile.build:287: fs/zfs/os/linux/spl/spl-zone.o] Error 1
  make[3]: *** [scripts/Makefile.build:544: fs/zfs] Error 2
  make[2]: *** [scripts/Makefile.build:544: fs] Error 2
  make[2]: *** Waiting for unfinished jobs....

Will and Mateusz co-authored the existing Linux namespace delegation support, see commit 4ed5e25, so I wanted to make sure they had the opportunity to at least comment.
I agree, we can absolutely integrate this as a Linux-only feature initially. I didn't mean to suggest it was blocked on an equivalent FreeBSD implementation. The FreeBSD folks are welcome to add support when they have the time and interest.
However, what is a blocker is addressing the build failures on Linux that Tony mentioned in the previous comment. I should hopefully be able to get you a last round of review feedback this week.

Hi @tonyhutter and @behlendorf, thanks for your feedback. Regarding the Fedora 42 error: I believe this is a regression introduced after removing the unused scripts and rebasing the branch onto master, though I don't believe the script removal itself caused it. I will investigate and try to address this issue today.

I will review the original namespace commit you shared. I look forward to hearing from @wca and @0mp if they are available. Thank you.

Kind Regards,

Colin Williams

Hi @behlendorf and @tonyhutter

I investigated; see the latest commit message for details. The tinyconfig kernel build sets CONFIG_USER_NS=n, which exposed the unguarded code. That in turn suggested adding better error messages.

If I understand correctly, this feature is now passing in all of the CI environments.

@li-nkSN I've been trying to run these tests locally from my Fedora 42 VM, but I keep getting the same failures:

$ ./scripts/zfs-tests.sh -x -T zoned_uid
[2026-02-26T10:01:06.526039] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/setup (run as root) [00:00] [PASS]
[2026-02-26T10:01:06.660852] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_001_pos (run as root) [00:00] [PASS]
[2026-02-26T10:01:07.111071] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_002_pos (run as root) [00:00] [PASS]
[2026-02-26T10:01:07.325407] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_003_pos (run as root) [00:00] [PASS]
[2026-02-26T10:01:07.508003] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_004_pos (run as root) [00:00] [PASS]
[2026-02-26T10:01:07.634722] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_005_neg (run as root) [00:00] [PASS]
[2026-02-26T10:01:07.789891] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_006_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:07.957293] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_007_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:08.240231] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_008_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:08.596570] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_009_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:08.796259] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_010_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:09.077836] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_011_neg (run as root) [00:00] [PASS]
[2026-02-26T10:01:09.295776] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_012_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:09.490968] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_013_pos (run as root) [00:00] [FAIL]
[2026-02-26T10:01:09.714637] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/cleanup (run as root) [00:00] [PASS]

Here's the first failure from zoned_uid_006_pos:

[2026-02-26T10:01:07.789891] Test (Linux): /var/tmp/zfs/tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_006_pos (run as root) [00:00] [FAIL]
10:01:07.64 ASSERTION: Authorized user namespace can create child datasets       
10:01:07.67 SUCCESS: zfs create testpool/testfs/deleg_root                       
10:01:07.68 SUCCESS: set_zoned_uid testpool/testfs/deleg_root 956                
10:01:07.69 NOTE: Delegation root created with zoned_uid=956                     
10:01:07.69 NOTE: Attempting to create child dataset from user namespace...      
10:01:07.71 NOTE: Create output: unshare: failed to execute zfs: No such file or directory
10:01:07.71 Failed to create child dataset from user namespace (status=127)      

To reproduce from ZFS source dir:

./autogen.sh && ./configure --enable-debug && make
sudo ./scripts/zfs.sh -r
sudo ./scripts/zfs-helpers.sh -i
./scripts/zfs-tests.sh -x -T zoned_uid

Any ideas?

@tonyhutter

"failed to execute zfs: No such file or directory" is the smoking gun. This is likely a test setup / PATH issue, because zfs is not installed system-wide.

Try sudo make install before running the tests.

li-nkSN added a commit to li-nkSN/zfs that referenced this pull request Feb 27, 2026
@tonyhutter
Contributor

Try sudo make install before running the tests.

An install is not required to run ZTS locally. This fixed the issue for me (run within zfs source directory):

$ sed -i 's/sudo -u /user_run /g' tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_*

Can you update your PR with the sed change above?

li-nkSN added a commit to li-nkSN/zfs that referenced this pull request Feb 28, 2026
@li-nkSN
Author

li-nkSN commented Feb 28, 2026

Try sudo make install before running the tests.

An install is not required to run ZTS locally. This fixed the issue for me (run within zfs source directory):

$ sed -i 's/sudo -u /user_run /g' tests/zfs-tests/tests/functional/zoned_uid/zoned_uid_*

Can you update your PR with the sed change above?

@tonyhutter I was able to reproduce your issue, but found that the sed change did not resolve it across multiple runs. I then created this helper, which resolved the issue: https://github.com/openzfs/zfs/pull/18167/changes#diff-abda02af92e31d80534ac3bf82322ab6165637a4a1c8ad18efb8bd5f27c46bc4R76
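
For readers following along, the failure mode in this exchange (a child process that cannot find a binary by bare name and exits with status 127) can be reproduced without ZFS. The sketch below is hypothetical and is not the helper from the PR: `zoned_demo_tool` and the temporary directory are stand-ins, and it assumes `sh`, `env`, and `mktemp` are available under /usr/bin or /bin. It shows that resolving the absolute path before the environment changes sidesteps the lookup.

```sh
#!/bin/sh
# Hypothetical stand-in for a binary that lives outside the default PATH,
# the way an uninstalled ./zfs build does.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "tool ran"\n' > "$demo_dir/zoned_demo_tool"
chmod +x "$demo_dir/zoned_demo_tool"
PATH="$demo_dir:$PATH"

# Resolve the absolute path while the caller's PATH is still in effect.
tool_path=$(command -v zoned_demo_tool)

# Child with a minimal PATH: the bare name cannot be found.
bare_status=0
env -i PATH=/usr/bin:/bin sh -c 'zoned_demo_tool' >/dev/null 2>&1 ||
    bare_status=$?

# Same minimal environment, but invoked by absolute path.
abs_output=$(env -i PATH=/usr/bin:/bin "$tool_path")
abs_status=$?

echo "bare name exit status: $bare_status"
echo "absolute path exit status: $abs_status ($abs_output)"
rm -rf "$demo_dir"
```

The same idea should carry over when the environment change comes from a tool such as sudo (which may reset PATH depending on its configuration) rather than from `env -i`.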

li-nkSN added a commit to li-nkSN/zfs that referenced this pull request Feb 28, 2026
@tonyhutter
Contributor

@li-nkSN thanks, that fixed the local ZTS tests for me. I will take another look at this PR when I get back into the office.

@li-nkSN
Author

li-nkSN commented Mar 3, 2026

@li-nkSN thanks, that fixed the local ZTS tests for me. I will take another look at this PR when I get back into the office.

@tonyhutter @behlendorf I should be upfront about this: I initially developed this feature on the CachyOS ZFS 2.4.0-based source tree on my server, and wrote the tests locally against that same server, rather than starting from upstream ZFS master.

When I started, it wasn't clear to me what ZFS patches the CachyOS build carried. Since my other project sources live on the same platform, I couldn't risk disabling the filesystem.

So after development, I upstreamed the support into this ZFS master project. Had I been better informed about the differences between CachyOS ZFS and its supposed ZFS patches, I might have tried another approach at the start, because patching between the versions takes effort.

I believe this contributed to some confusion regarding the testing process. My original tests were "ported" to upstream, so the CI system was perhaps the first attempt at integrating and porting the tests between the versions. However, I do believe that this led to further testing of this code across different environments.

Anyhow, I apologize if it has caused any confusion regarding local testing. I recently realized I could run the source on my Ubuntu laptop, and developed the latest test fix based on that. I did also awhile back reach out to CachyOS regarding the CachyOS ZFS patches. I have yet to hear back regarding that. But I am looking forward to this feature making its way "down the tubes", so to speak. And perhaps I will hear back regarding my informal inquiries.

Finally, I have been running my patched versions on my server for quite some time now. Let me know if I can be of any further assistance.

@darkbasic

I did also awhile back reach out to CachyOS regarding the CachyOS ZFS patches.

@li-nkSN try asking on Discord. They have been very friendly with me and they reviewed and merged my patches in a matter of days so if you have any questions I'm sure they will answer them.

@li-nkSN
Author

li-nkSN commented Mar 4, 2026

I did also awhile back reach out to CachyOS regarding the CachyOS ZFS patches.

@li-nkSN try asking on Discord. They have been very friendly with me and they reviewed and merged my patches in a matter of days so if you have any questions I'm sure they will answer them.

@darkbasic I messaged you there. IMO the feature is well developed, as shown above, and ready for review / merge. But I will ask again for guidance on running ZFS master on CachyOS.

@tonyhutter
Contributor

Some first pass comments:

  1. You mention in the PR description:

Current limitations (WIP):

  • Write operations (create, snapshot, destroy) still require root

Can you talk about what you mean by that? I ask, because I'm able to create datasets as a user when I used the combined zoned_uid=<my_UID> + unshare --map-root-user ... Is that what you mean by "still require root" (--map-root-user) for create?

  2. I'm unable to create sub-datasets with zoned_uid=<my_UID>. Example:
# Create pool
hutter@fedora42:~/zfs$ sudo ./zpool create tank ./file

# Verify we can't create datasets as a normal user
hutter@fedora42:~/zfs$ ./zfs create tank/ds1
cannot create 'tank/ds1': permission denied

# Setup zoned_uid prop
hutter@fedora42:~/zfs$ id
uid=1000(hutter) gid=1000(hutter) groups=1000(hutter),10(wheel)

hutter@fedora42:~/zfs$ sudo ./zfs set zoned_uid=1000 tank

# Enter namespace
hutter@fedora42:~$ unshare --map-root-user bash
root@fedora42:~# 

# Success - I made a new dataset via namespaces
root@fedora42:~/zfs$ ./zfs create tank/ds1
cannot mount '/tank/ds1': failed to create mountpoint: Permission denied
filesystem successfully created, but not mounted

root@fedora42:~/zfs# ./zfs list
NAME       USED  AVAIL  REFER  MOUNTPOINT
tank       150K  39.9M    24K  /tank
tank/ds1    24K  39.9M    24K  /tank/ds1

# Why can't I create a sub-dataset?
root@fedora42:~/zfs# ./zfs create tank/ds1/ds2
cannot create 'tank/ds1/ds2': permission denied

  3. When I try to create a dataset without permission, I correctly get the error:
hutter@fedora42:~/zfs$ ./zfs create tank/ds3
cannot create 'tank/ds3': permission denied

However, if you try to zfs create in a namespace, and the UID doesn't match zoned_uid:

hutter@fedora42:~/zfs$ unshare --map-root-user ./zfs create tank/ds3
cannot create 'tank/ds3': no such pool 'tank'

This is somewhat of a misleading message, as the pool named 'tank' does exist. Is it possible to return a "permission denied" here as well?

@@ -0,0 +1,75 @@
# SPDX-License-Identifier: CDDL-1.0
Contributor

Is this file (libtest_supplement.shlib) meant to be checked in? I didn't see where it gets included.

function get_zoned_uid
{
typeset dataset=$1
zfs get -H -p -o value zoned_uid $dataset
Contributor

- zfs get -H -p -o value zoned_uid $dataset
+ get_prop zoned_uid $dataset

typeset uid=$1
shift
typeset zfs_cmd
zfs_cmd=$(which zfs)
Contributor

In the off-chance that zfs is in a directory with spaces in its name:

- zfs_cmd=$(which zfs)
+ zfs_cmd="$(which zfs)"
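
Along the same lines, the quoting also matters wherever the variable is later expanded. A small sketch under stated assumptions: `demo_tool` and the `my tools` directory are hypothetical stand-ins for a `zfs` binary installed under a path containing a space.

```sh
#!/bin/sh
# Hypothetical tool installed in a directory whose name contains a space.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/my tools"
printf '#!/bin/sh\necho "ok"\n' > "$demo_dir/my tools/demo_tool"
chmod +x "$demo_dir/my tools/demo_tool"
PATH="$demo_dir/my tools:$PATH"

tool_cmd="$(command -v demo_tool)"

# Quoted expansion: the path stays a single word and the tool runs.
quoted_output=$("$tool_cmd")

# Unquoted expansion: the path is split at the space, so the shell
# tries to run a file that does not exist.
unquoted_status=0
$tool_cmd >/dev/null 2>&1 || unquoted_status=$?

echo "quoted: $quoted_output"
echo "unquoted exit status: $unquoted_status"
rm -rf "$demo_dir"
```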

if [[ "$actual_uq" != "50M" ]]; then
log_fail "Userquota not set correctly: expected 50M, got $actual_uq"
fi
log_note "Userquota set successfully to 50M"
Contributor

For tests where you are looking to see if a numerical value is mostly within an expected range, I would write it like this (untested):

typeset actual_uq=$(get_prop userquota@0 $TESTPOOL/$TESTFS/deleg_root/child)
if ! within_percent "$actual_uq" $((50 * 1048576)) 99 ; then
	log_fail "Userquota not set correctly: expected ~50M, got $actual_uq"
fi

(see functions in tests/zfs-tests/include/math.shlib)
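
To illustrate the kind of tolerance check being suggested, here is a rough stand-in written in plain integer shell arithmetic. It is not the ZTS implementation: the name `values_within_percent` and its exact semantics (smaller value divided by larger value, as a percentage, must be at least the threshold) are assumptions for this sketch, so the real behavior should be checked in math.shlib.

```sh
#!/bin/sh
# Hypothetical stand-in, not the function from math.shlib.
# Succeeds when (smaller / larger) * 100 >= percent.
values_within_percent() {
	vwp_a=$1
	vwp_b=$2
	vwp_pct=$3

	# Make vwp_a the larger of the two values.
	if [ "$vwp_a" -lt "$vwp_b" ]; then
		vwp_tmp=$vwp_a
		vwp_a=$vwp_b
		vwp_b=$vwp_tmp
	fi

	# Avoid dividing by zero.
	[ "$vwp_a" -gt 0 ] || return 1

	[ $(( $vwp_b * 100 / $vwp_a )) -ge "$vwp_pct" ]
}

# Example: compare a reported quota in bytes against 50M.
expected=$((50 * 1048576))
if values_within_percent 52428800 "$expected" 99; then
	echo "quota is within tolerance"
else
	echo "quota is outside tolerance"
fi
```

Comparing parsable byte counts this way avoids depending on how a human-readable value such as "50M" happens to be rounded or formatted.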

@li-nkSN
Author

li-nkSN commented Mar 5, 2026

Some first pass comments:

1. You mention in the PR description:

Current limitations (WIP):

  • Write operations (create, snapshot, destroy) still require root

Can you talk about what you mean by that? I ask, because I'm able to create datasets as a user when I used the combined zoned_uid=<my_UID> + unshare --map-root-user ... Is that what you mean by "still require root" (--map-root-user) for create?

That comment is from Jan 30 / 31, when I opened the PR as a draft and it was still a WIP. All functionality was completed months ago.

2. I'm unable to create sub-datasets with `zoned_uid=<my_UID>`.  Example:
# Create pool
hutter@fedora42:~/zfs$ sudo ./zpool create tank ./file

# Verify we can't create datasets as a normal user
hutter@fedora42:~/zfs$ ./zfs create tank/ds1
cannot create 'tank/ds1': permission denied

# Setup zoned_uid prop
hutter@fedora42:~/zfs$ id
uid=1000(hutter) gid=1000(hutter) groups=1000(hutter),10(wheel)

hutter@fedora42:~/zfs$ sudo ./zfs set zoned_uid=1000 tank

# Enter namespace
hutter@fedora42:~$ unshare --map-root-user bash
root@fedora42:~# 

# Success - I made a new dataset via namespaces
root@fedora42:~/zfs$ ./zfs create tank/ds1
cannot mount '/tank/ds1': failed to create mountpoint: Permission denied
filesystem successfully created, but not mounted

root@fedora42:~/zfs# ./zfs list
NAME       USED  AVAIL  REFER  MOUNTPOINT
tank       150K  39.9M    24K  /tank
tank/ds1    24K  39.9M    24K  /tank/ds1

# Why can't I create a sub-dataset?
root@fedora42:~/zfs# ./zfs create tank/ds1/ds2
cannot create 'tank/ds1/ds2': permission denied

Resolved by registering zoned_uid as PROP_INHERIT instead of PROP_DEFAULT. I had not tested sub-datasets because it is not a feature I use. Added test 014 to cover this.
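Because the property now inherits, one way to see which dataset a delegation was actually set on is to look at the property's source. The sketch below is a userspace illustration only, not the kernel's setpoint check: it assumes the usual `zfs get -H -o value,source` output, where the source column reads `local` or `inherited from <dataset>`, and the function name delegation_root is hypothetical.

```shell
# Hypothetical helper: given a dataset name and the "source" column from
#   zfs get -H -o value,source zoned_uid <dataset>
# print the dataset on which zoned_uid was explicitly set. Returns
# non-zero if the source is neither "local" nor "inherited from ...".
delegation_root() {
	dataset=$1 src=$2
	case "$src" in
	local)
		# Set directly on this dataset
		printf '%s\n' "$dataset" ;;
	"inherited from "*)
		# Strip the prefix to get the ancestor that holds the setting
		printf '%s\n' "${src#inherited from }" ;;
	*)
		# default, "-", or anything else: no delegation root found
		return 1 ;;
	esac
}

# Example with sample source strings (no pool required)
delegation_root tank local
delegation_root tank/ds1 "inherited from tank"
```

In the transcript above, zoned_uid was set on tank, so tank/ds1 would be expected to report a source of "inherited from tank" once the property inherits.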

3. When I try to create a dataset without permission, I correctly get the error:
hutter@fedora42:~/zfs$ ./zfs create tank/ds3
cannot create 'tank/ds3': permission denied

However, if you try to zfs create in a namespace, and the UID doesn't match zoned_uid:

hutter@fedora42:~/zfs$ unshare --map-root-user ./zfs create tank/ds3
cannot create 'tank/ds3': no such pool 'tank'

This is somewhat of a misleading message, as the pool named 'tank' does exist. Is it possible to return a "permission denied" here as well?

This is existing behavior, not specific to zoned_uid; the script below demonstrates it. The point is to avoid revealing whether a resource exists when the caller lacks permission to access it.

#!/bin/ksh -p
#
# Manual demo: Shows how the existing ZFS user namespace visibility
# logic handles unauthorized access. Not included in the ZFS test suite
# and not specific to zoned_uid — this behavior is inherited from the
# existing namespace-based "zfs zone" delegation code path.
#
# Run: sudo env PATH="$PATH" ksh userns_error_msg_demo.ksh
#
# Demonstrates that unauthorized namespaces see "no such pool" instead
# of "permission denied", preventing information leakage about which
# pools/datasets exist on the system.
#

# Resolve absolute paths — caller must ensure zfs/zpool are in PATH
# (e.g., run with: sudo env PATH="$PATH" ksh userns_error_msg_demo.ksh)
ZFS=$(which zfs 2>/dev/null)
ZPOOL=$(which zpool 2>/dev/null)

if [[ -z "$ZFS" || -z "$ZPOOL" ]]; then
	echo "ERROR: zfs/zpool not found in PATH" >&2
	echo "Try: sudo env PATH=\"\$PATH\" ksh $0" >&2
	exit 1
fi

POOL="testpool_errmsg"
FILE="/var/tmp/errmsg-vdev"

# sudo -u needs real accounts, so create two temporary users
# and look up their UIDs at run time.
AUTH_USER="ztest_auth"
UNAUTH_USER="ztest_unauth"
AUTH_UID=""
UNAUTH_UID=""

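function create_test_users
{
	useradd -M -N -s /usr/sbin/nologin "$AUTH_USER" 2>/dev/null
	useradd -M -N -s /usr/sbin/nologin "$UNAUTH_USER" 2>/dev/null
	AUTH_UID=$(id -u "$AUTH_USER" 2>/dev/null)
	UNAUTH_UID=$(id -u "$UNAUTH_USER" 2>/dev/null)
	# Bail out early if either account is missing; an empty UID would
	# otherwise be passed to "zfs set zoned_uid=".
	if [[ -z "$AUTH_UID" || -z "$UNAUTH_UID" ]]; then
		echo "ERROR: could not create test users" >&2
		exit 1
	fi
}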

function remove_test_users
{
	userdel "$AUTH_USER" 2>/dev/null
	userdel "$UNAUTH_USER" 2>/dev/null
}

function cleanup
{
	$ZFS destroy -rf $POOL/delegated 2>/dev/null
	$ZPOOL destroy $POOL 2>/dev/null
	rm -f $FILE
	remove_test_users
}

trap cleanup EXIT

echo "=== Error message test for unauthorized namespace access ==="

# Create temporary users
create_test_users
echo "Using auth UID=$AUTH_UID ($AUTH_USER), unauth UID=$UNAUTH_UID ($UNAUTH_USER)"

# Setup
truncate -s 256M $FILE
$ZPOOL create $POOL $FILE
$ZFS create $POOL/delegated
$ZFS set zoned_uid=$AUTH_UID $POOL/delegated

echo ""
echo "--- Pool and dataset created ---"
$ZFS list -r $POOL
echo "zoned_uid on $POOL/delegated: $($ZFS get -H -o value zoned_uid $POOL/delegated)"

echo ""
echo "--- Test 1: Matching UID ($AUTH_UID) - should succeed ---"
result1=$(sudo -u "$AUTH_USER" unshare --user --mount --map-root-user \
    "$ZFS" create $POOL/delegated/authorized_child 2>&1)
status1=$?
echo "  zfs create $POOL/delegated/authorized_child"
echo "  status=$status1 output: ${result1:-success}"

echo ""
echo "--- Test 2: Wrong UID ($UNAUTH_UID) creating under delegated dataset ---"
result2=$(sudo -u "$UNAUTH_USER" unshare --user --mount --map-root-user \
    "$ZFS" create $POOL/delegated/unauthorized_child 2>&1)
status2=$?
echo "  zfs create $POOL/delegated/unauthorized_child"
echo "  status=$status2 output: $result2"

echo ""
echo "--- Test 3: Wrong UID ($UNAUTH_UID) listing delegated dataset ---"
result3=$(sudo -u "$UNAUTH_USER" unshare --user --mount --map-root-user \
    "$ZFS" list $POOL/delegated 2>&1)
status3=$?
echo "  zfs list $POOL/delegated"
echo "  status=$status3 output: $result3"

echo ""
echo "--- Test 4: Wrong UID ($UNAUTH_UID) listing pool ---"
result4=$(sudo -u "$UNAUTH_USER" unshare --user --mount --map-root-user \
    "$ZFS" list $POOL 2>&1)
status4=$?
echo "  zfs list $POOL"
echo "  status=$status4 output: $result4"

echo ""
echo "--- Test 5: Wrong UID ($UNAUTH_UID) creating on nonexistent pool ---"
result5=$(sudo -u "$UNAUTH_USER" unshare --user --mount --map-root-user \
    "$ZFS" create nosuchpool/ds1 2>&1)
status5=$?
echo "  zfs create nosuchpool/ds1"
echo "  status=$status5 output: $result5"

echo ""
echo "--- Summary ---"
echo "Test 2 vs Test 5 shows whether 'wrong UID on real pool' is"
echo "distinguishable from 'nonexistent pool'. If both return the"
echo "same error, no information is leaked."
echo ""
echo "Done."
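The comparison described in the summary can also be done mechanically. The sketch below is an illustration under stated assumptions: error_class is a hypothetical helper, and the two sample messages are stand-ins modeled on the output quoted earlier in this thread rather than captured from a run. It strips the quoted names from each message so that only the error wording is compared.

```shell
# Hypothetical helper: replace every single-quoted name in an error
# message with a placeholder so messages can be compared by wording only.
error_class() {
	printf '%s\n' "$1" | sed "s/'[^']*'/'X'/g"
}

# Sample messages (assumed; modeled on the output quoted in this thread)
wrong_uid="cannot create 'tank/ds3': no such pool 'tank'"
no_pool="cannot create 'nosuchpool/ds1': no such pool 'nosuchpool'"

if [ "$(error_class "$wrong_uid")" = "$(error_class "$no_pool")" ]; then
	echo "indistinguishable: no information leaked"
else
	echo "distinguishable: existence of the pool is revealed"
fi
```

In the demo script, the same comparison could be applied to $result2 and $result5.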

li-nkSN added a commit to li-nkSN/zfs that referenced this pull request Mar 5, 2026
This implements zoned_uid - a ZFS property that delegates dataset
visibility and administration to user namespaces owned by a specific
UID, enabling rootless Podman/Docker with native ZFS storage.

Usage: zfs set zoned_uid=1000 pool/dataset

Problem solved:
- zfs zone requires an existing namespace PID
- Podman creates a new namespace on each container start
- Solution: delegate to UID, any namespace owned by that UID is
  authorized

Delegated operations:
- Visibility: zfs list, get, mount (read-only access)
- Create: child datasets and clones
- Snapshot: create snapshots
- Destroy: children only (delegation root protected)
- Rename: within delegation subtree only
- Properties: set on delegated datasets

Security model:
- Namespace owner UID must match zoned_uid value
- CAP_SYS_ADMIN required within the user namespace
- Delegation root cannot be destroyed or escaped via rename

Kernel changes:
- zone_dataset_attach_uid()/detach_uid() in SPL
- zone_dataset_admin_check() for write authorization
- Callback registration for zoned_uid property lookup
- Security policy hooks in zfs_secpolicy_*() functions
- Fixed inglobalzone() to use current_user_ns()
- zfs_prop_set_special() handles attach/detach as property
  side-effects, eliminating the need for dedicated ioctls
- spa_import_os() restores zoned_uid delegations kernel-side
  on pool import via dmu_objset_find() walk
- zoned_uid registered as PROP_INHERIT so child datasets
  inherit the delegation, enabling sub-dataset creation
- zfs_get_zoned_uid() uses dsl_prop_get setpoint to identify
  the true delegation root, correctly distinguishing inherited
  values from locally-set ones for destroy/rename policy checks

Userspace changes:
- check_parents() defers to kernel when zoned_uid set

FreeBSD compatibility:
- include/os/freebsd/spl/sys/zone.h — Added FreeBSD stubs:
  - zone_uid_op_t enum (ZONE_OP_CREATE, SNAPSHOT, CLONE, DESTROY,
    RENAME, SETPROP)
  - zone_admin_result_t enum (NOT_APPLICABLE, ALLOWED, DENIED)
  - zone_dataset_admin_check() — static inline, always returns
    ZONE_ADMIN_NOT_APPLICABLE
  - zone_dataset_attach_uid() — static inline, returns ENXIO
  - zone_dataset_detach_uid() — static inline, returns ENXIO
  - zone_get_zoned_uid_fn_t callback typedef
  - zone_register_zoned_uid_callback() — static inline no-op
  - zone_unregister_zoned_uid_callback() — static inline no-op
- On FreeBSD, every zone_dataset_admin_check() call returns
  ZONE_ADMIN_NOT_APPLICABLE, causing all security policy functions
  to fall through to existing jail-based permission checks
- Setting zoned_uid on FreeBSD returns ENXIO since user namespace
  delegation requires Linux user namespaces

Addressed review feedback from PR openzfs#18167:
- Removed dedicated ZFS_IOC_USERNS_ATTACH_UID/DETACH_UID ioctls;
  attach/detach is now handled kernel-side as a property side-effect
  in zfs_prop_set_special()
- Moved pool import delegation restoration from userspace
  (zpool_restore_zoned) to kernel-side in spa_import_os()
- Removed unnecessary suppression file additions
- Reverted ABI files to upstream (will regenerate from CI)
- Added test scripts to tests/zfs-tests/tests/Makefile.am

Fix CONFIG_USER_NS=n build failure and improve error reporting:

Upstream CI commit 640a217 ("CI: Test & fix Linux ZFS built-in
build", Tony Hutter) added a tinyconfig built-in kernel build test
to Fedora runners, which compiles with CONFIG_USER_NS disabled,
exposing unguarded static functions and variables that cause fatal
-Werror=unused-function/-Werror=unused-variable errors.

- Fixed #ifdef CONFIG_USER_NS guards for zone_uid_datasets_lookup(),
  zone_dataset_is_zoned_uid_root(), and the zuds variable in
  zone_dataset_visible()
- Added ZFS_ERR_NO_USER_NS_SUPPORT error code so users get a clear
  message ("kernel was built without user namespace support") instead
  of a generic "I/O error" when CONFIG_USER_NS is disabled
- Translate ENXIO from zone_dataset_attach_uid()/detach_uid() in
  zfs_prop_set_special() to ZFS_ERR_NO_USER_NS_SUPPORT
- Also fixes a pre-existing bug in the upstream
  zfs_ioc_userns_attach()/zfs_ioc_userns_detach() where ENXIO from
  zone_dataset_attach()/detach() was not translated, producing the
  same confusing "I/O error" on kernels without CONFIG_USER_NS
- Synced pyzfs constants with zfs.h (added missing
  ZFS_ERR_ASHIFT_MISMATCH, ZFS_ERR_STREAM_LARGE_MICROZAP,
  ZFS_ERR_TOO_MANY_SITOUTS, and the new
  ZFS_ERR_NO_USER_NS_SUPPORT)

Test improvements:
- run_in_userns helper resolves absolute zfs path to handle
  environments where PATH does not include zfs (source builds)
- Test 004 updated: zoned_uid now inherits (PROP_INHERIT), test
  verifies inheritance and override behavior
- Test 013 uses within_percent with parseable byte output (-Hp)
  for robust quota value comparison across environments
- Test 014 added: verifies grandchild dataset creation from user
  namespace, confirming inherited zoned_uid delegation works
- Shellcheck SC2155 fixes across all test scripts

Tests: zoned_uid_001 through zoned_uid_014

Signed-off-by: Colin K. Williams <colin@li-nk.org>
li-nkSN force-pushed the feature/zoned-uid branch from e3c0da7 to cbcd64e on March 5, 2026 08:50
@li-nkSN
Author

li-nkSN commented Mar 5, 2026

@tonyhutter I have responded to all of your comments above. I added your suggested changes in be7a78d. If you are satisfied, would you click resolve, or would you like me to do that?

li-nkSN force-pushed the feature/zoned-uid branch from cbcd64e to be7a78d on March 5, 2026 08:59

Labels

Status: Code Review Needed Ready for review and testing
