Report space metrics per allocation class #18238

Open
ryan-moeller wants to merge 2 commits into openzfs:master from KlaraSystems:csr-dalloc

@ryan-moeller

@ryan-moeller ryan-moeller commented Feb 19, 2026

Motivation and Context

The existing zpool properties accounting pool space (size, allocated,
fragmentation, expandsize, free, capacity) are based on the normal
metaslab class or are cumulative properties of several classes combined.

It would be useful to have visibility into per-class metaslab allocator stats.

Sponsored by Klara, Inc.

Description

Add properties reporting the existing space accounting metrics for each
metaslab class individually.

Also add pool-wide USABLE and USED properties reporting deflated size and allocated space, respectively.

Update ZTS to recognize the new properties and validate reported values.

While here I noticed an incorrect format description for the vdev capacity
property. This is fixed in a separate commit.

I also found that "fragmentation" was missing from the list of parsable
pool properties in ZTS, so I've added that while reformatting
zpool_get_parsable.cfg to add the new properties.

How Has This Been Tested?

ZTS test added and full test suite passed. The test tries to exercise every
statistic of the metaslab allocation classes.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

@github-actions github-actions bot added the Status: Work in Progress Not yet ready for general review label Feb 19, 2026
@ryan-moeller
Author

@amotin Here is my addition of the metrics, on top of your commit from #18222

I have not added properties exposing the new dalloc stat yet. Let me know what you would like added or changed, if you have some idea already.

@ryan-moeller
Author

  • Ignore cstyle bugs

@behlendorf
Contributor

@ryan-moeller it's good to see you back. I've just merged #18222 so you can rebase this and drop that commit from your stack.

@ryan-moeller
Author

Thanks!

  • Rebased

@ryan-moeller
Author

ryan-moeller commented Feb 19, 2026

For additional context, here is an early test script and some output demonstrating the new properties:

test.sh

Snippets of new zpool properties after exercising each allocation class
## Setup
vfs.zfs.embedded_slog_min_ms: 64 -> 8

## Normal Class
normal_size                         11G
normal_capacity                     0%
normal_free                         11.0G
normal_allocated                    344K
normal_available                    7.33G
normal_expandsize                   -
normal_fragmentation                0%
special_size                        0
special_capacity                    0%
special_free                        0
special_allocated                   0
special_available                   0
special_expandsize                  -
special_fragmentation               -
dedup_size                          0
dedup_capacity                      0%
dedup_free                          0
dedup_allocated                     0
dedup_available                     0
dedup_expandsize                    -
dedup_fragmentation                 -
log_size                            0
log_capacity                        0%
log_free                            0
log_allocated                       0
log_available                       0
log_expandsize                      -
log_fragmentation                   -
embedded_log_size                   512M
embedded_log_capacity               0%
embedded_log_free                   512M
embedded_log_allocated              0
embedded_log_available              341M
embedded_log_expandsize             -
embedded_log_fragmentation          0%
special_embedded_log_size           0
special_embedded_log_capacity       0%
special_embedded_log_free           0
special_embedded_log_allocated      0
special_embedded_log_available      0
special_embedded_log_expandsize     -
special_embedded_log_fragmentation  -

## Embedded Log Class
normal_size                         11G
normal_capacity                     8%
normal_free                         10.1G
normal_allocated                    923M
normal_available                    7.33G
normal_expandsize                   -
normal_fragmentation                0%
special_size                        0
special_capacity                    0%
special_free                        0
special_allocated                   0
special_available                   0
special_expandsize                  -
special_fragmentation               -
dedup_size                          0
dedup_capacity                      0%
dedup_free                          0
dedup_allocated                     0
dedup_available                     0
dedup_expandsize                    -
dedup_fragmentation                 -
log_size                            0
log_capacity                        0%
log_free                            0
log_allocated                       0
log_available                       0
log_expandsize                      -
log_fragmentation                   -
embedded_log_size                   512M
embedded_log_capacity               2%
embedded_log_free                   498M
embedded_log_allocated              14.2M
embedded_log_available              341M
embedded_log_expandsize             -
embedded_log_fragmentation          0%
special_embedded_log_size           0
special_embedded_log_capacity       0%
special_embedded_log_free           0
special_embedded_log_allocated      0
special_embedded_log_available      0
special_embedded_log_expandsize     -
special_embedded_log_fragmentation  -

## Special Class
normal_size                         11G
normal_capacity                     8%
normal_free                         10.1G
normal_allocated                    939M
normal_available                    7.33G
normal_expandsize                   -
normal_fragmentation                0%
special_size                        2.50G
special_capacity                    16%
special_free                        2.08G
special_allocated                   434M
special_available                   1.67G
special_expandsize                  -
special_fragmentation               2%
dedup_size                          0
dedup_capacity                      0%
dedup_free                          0
dedup_allocated                     0
dedup_available                     0
dedup_expandsize                    -
dedup_fragmentation                 -
log_size                            0
log_capacity                        0%
log_free                            0
log_allocated                       0
log_available                       0
log_expandsize                      -
log_fragmentation                   -
embedded_log_size                   512M
embedded_log_capacity               0%
embedded_log_free                   512M
embedded_log_allocated              6K
embedded_log_available              341M
embedded_log_expandsize             -
embedded_log_fragmentation          0%
special_embedded_log_size           256M
special_embedded_log_capacity       0%
special_embedded_log_free           256M
special_embedded_log_allocated      0
special_embedded_log_available      170M
special_embedded_log_expandsize     -
special_embedded_log_fragmentation  0%

## Special Embedded Log Class
normal_size                         11G
normal_capacity                     8%
normal_free                         10.1G
normal_allocated                    938M
normal_available                    7.33G
normal_expandsize                   -
normal_fragmentation                0%
special_size                        2.50G
special_capacity                    34%
special_free                        1.64G
special_allocated                   886M
special_available                   1.67G
special_expandsize                  -
special_fragmentation               3%
dedup_size                          0
dedup_capacity                      0%
dedup_free                          0
dedup_allocated                     0
dedup_available                     0
dedup_expandsize                    -
dedup_fragmentation                 -
log_size                            0
log_capacity                        0%
log_free                            0
log_allocated                       0
log_available                       0
log_expandsize                      -
log_fragmentation                   -
embedded_log_size                   512M
embedded_log_capacity               0%
embedded_log_free                   512M
embedded_log_allocated              6K
embedded_log_available              341M
embedded_log_expandsize             -
embedded_log_fragmentation          0%
special_embedded_log_size           256M
special_embedded_log_capacity       5%
special_embedded_log_free           243M
special_embedded_log_allocated      13.0M
special_embedded_log_available      170M
special_embedded_log_expandsize     -
special_embedded_log_fragmentation  0%

## Log Class
normal_size                         11G
normal_capacity                     11%
normal_free                         9.70G
normal_allocated                    1.30G
normal_available                    7.33G
normal_expandsize                   -
normal_fragmentation                0%
special_size                        2.50G
special_capacity                    36%
special_free                        1.58G
special_allocated                   941M
special_available                   1.67G
special_expandsize                  -
special_fragmentation               3%
dedup_size                          0
dedup_capacity                      0%
dedup_free                          0
dedup_allocated                     0
dedup_available                     0
dedup_expandsize                    -
dedup_fragmentation                 -
log_size                            240M
log_capacity                        100%
log_free                            0
log_allocated                       240M
log_available                       240M
log_expandsize                      -
log_fragmentation                   17%
embedded_log_size                   512M
embedded_log_capacity               0%
embedded_log_free                   512M
embedded_log_allocated              6K
embedded_log_available              341M
embedded_log_expandsize             -
embedded_log_fragmentation          0%
special_embedded_log_size           256M
special_embedded_log_capacity       18%
special_embedded_log_free           208M
special_embedded_log_allocated      47.5M
special_embedded_log_available      170M
special_embedded_log_expandsize     -
special_embedded_log_fragmentation  0%

## Dedup Class
normal_size                         11G
normal_capacity                     12%
normal_free                         9.63G
normal_allocated                    1.37G
normal_available                    7.33G
normal_expandsize                   -
normal_fragmentation                0%
special_size                        2.50G
special_capacity                    37%
special_free                        1.57G
special_allocated                   949M
special_available                   1.67G
special_expandsize                  -
special_fragmentation               3%
dedup_size                          1.38G
dedup_capacity                      0%
dedup_free                          1.37G
dedup_allocated                     6K
dedup_available                     938M
dedup_expandsize                    -
dedup_fragmentation                 1%
log_size                            240M
log_capacity                        0%
log_free                            240M
log_allocated                       0
log_available                       240M
log_expandsize                      -
log_fragmentation                   17%
embedded_log_size                   512M
embedded_log_capacity               0%
embedded_log_free                   512M
embedded_log_allocated              6K
embedded_log_available              341M
embedded_log_expandsize             -
embedded_log_fragmentation          0%
special_embedded_log_size           256M
special_embedded_log_capacity       0%
special_embedded_log_free           256M
special_embedded_log_allocated      108K
special_embedded_log_available      170M
special_embedded_log_expandsize     -
special_embedded_log_fragmentation  0% 
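
As a rough cross-check of the output above, the capacity values look consistent with allocated divided by size, truncated to a whole percentage. The sketch below is not the PR's code: `parse_size` and `capacity_percent` are hypothetical helpers, and the truncation rule is an assumption inferred from these samples. The displayed sizes are rounded, so results near a percentage boundary could differ from what zpool reports.

```python
import math

# Binary unit suffixes as they appear in the sample output.
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Convert a human-readable value such as '2.50G' or '0' to bytes."""
    text = text.strip()
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def capacity_percent(allocated, size):
    """Allocated share of size as a truncated whole percentage (assumed rule)."""
    if size == 0:
        return 0  # an empty class is shown as 0% in the samples
    return math.floor(100 * allocated / size)


# (class, size, allocated, reported capacity) copied from the output above.
samples = [
    ("special", "2.50G", "434M", 16),
    ("special", "2.50G", "886M", 34),
    ("log", "240M", "240M", 100),
    ("special_embedded_log", "256M", "47.5M", 18),
    ("dedup", "0", "0", 0),
]

for name, size, allocated, reported in samples:
    computed = capacity_percent(parse_size(allocated), parse_size(size))
    print(f"{name}: computed {computed}%, reported {reported}%")
```

For the rows listed here, the computed and reported percentages agree.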

@amotin
Member

amotin commented Feb 20, 2026

I've created another PR (#18245) adding properties about dedup, so depending on which PR goes first, we'll both need to update our ABI files in the respective order to make the checkstyle CI happy.

@ryan-moeller
Author

  • Added missing *_size properties to zpoolprops.7
  • Clarified description of allocation class fragmentation -> free space fragmentation
  • Check space under vdev config lock when calculating fragmentation
  • Properly report AVAIL space as the difference between dspace and dalloc (i.e. deflated free space not deflated size)
  • Added USABLE and USED pool-wide and per-class properties
  • Updated documentation and tests accordingly
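
The AVAIL fix in the list above can be illustrated with a small sketch. It is not the PR's code: `dspace` and `dalloc` follow the names used in the comment, the two function names are hypothetical, and the byte counts are made up for illustration.

```python
GIB = 1 << 30


def avail_before_fix(dspace, dalloc):
    """Behavior described as wrong: the deflated size is reported as AVAIL."""
    return dspace


def avail_after_fix(dspace, dalloc):
    """Behavior described as correct: deflated size minus deflated allocated space."""
    return dspace - dalloc


# Hypothetical byte counts, not taken from a real pool.
dspace = 8 * GIB
dalloc = 2 * GIB

print(avail_before_fix(dspace, dalloc) // GIB)  # 8
print(avail_after_fix(dspace, dalloc) // GIB)   # 6
```

The earlier sample output appears consistent with the old behavior: normal_available stays at 7.33G while normal_allocated grows from 344K to 1.37G.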

@ryan-moeller
Author

  • Add usable/used properties to zpool_get_parsable.cfg

@ryan-moeller
Author

I'll let #18245 land first, then I can update ABI files and remove the draft status from this PR.

@ryan-moeller
Author

I also decided to add the pool-wide AVAIL property as usable - used for completeness, which I will include in the next push. It should not be too confusing in the context of also having per-class metrics to explain where the space is actually available.

@amotin
Member

amotin commented Feb 25, 2026

@ryan-moeller You may rebase.

@ryan-moeller
Author

ryan-moeller commented Feb 25, 2026

  • Added "fragmentation" to zpool_get_parsable.cfg
  • Rebased

I'll grab the updated ABI file from the CI, then push again and remove the draft status.

@ryan-moeller ryan-moeller marked this pull request as ready for review February 25, 2026 15:43
@github-actions github-actions bot added Status: Code Review Needed Ready for review and testing and removed Status: Work in Progress Not yet ready for general review labels Feb 25, 2026
@ryan-moeller
Author

  • Added the ABI changes

(There are a bunch of unrelated changes in the ABI that I've left out.)

@ixhamza
Member

ixhamza commented Feb 25, 2026

Also seeing zpool_get_003_pos and zpool_get_005_pos failing on dedup_used as it's shown as dedupused:
zpool_get_003_pos:

SUCCESS: eval zpool get dedup_allocated testpool > /var/tmp/values.32423
SUCCESS: grep -q dedup_allocated /var/tmp/values.32423
NOTE: Checking for dedup_available property
SUCCESS: eval zpool get dedup_available testpool > /var/tmp/values.32423
SUCCESS: grep -q dedup_available /var/tmp/values.32423
NOTE: Checking for dedup_usable property
SUCCESS: eval zpool get dedup_usable testpool > /var/tmp/values.32423
SUCCESS: grep -q dedup_usable /var/tmp/values.32423
NOTE: Checking for dedup_used property
SUCCESS: eval zpool get dedup_used testpool > /var/tmp/values.32423
ERROR: grep -q dedup_used /var/tmp/values.32423 exited 1
NOTE: Performing test-fail callback (/usr/local/share/zfs/zfs-tests/callbacks/zfs_dmesg.ksh)

zpool_get_005_pos:

NOTE: Checking for parsable dedup_available property
SUCCESS: eval zpool get -p dedup_available testpool >/tmp/value.33764
SUCCESS: grep -q dedup_available /tmp/value.33764
SUCCESS: test -n 0
NOTE: Checking for parsable dedup_usable property
SUCCESS: eval zpool get -p dedup_usable testpool >/tmp/value.33764
SUCCESS: grep -q dedup_usable /tmp/value.33764
SUCCESS: test -n 0
NOTE: Checking for parsable dedup_used property
SUCCESS: eval zpool get -p dedup_used testpool >/tmp/value.33764
ERROR: grep -q dedup_used /tmp/value.33764 exited 1
NOTE: Performing test-fail callback (/usr/local/share/zfs/zfs-tests/callbacks/zfs_dmesg.ksh)

@ryan-moeller
Author

That's what I'm currently looking into. Apparently "dedupused", which was added in this latest rebase, conflicts with "dedup_used" in the property lookup logic in libzfs? That's a bit of a problem, as both names were chosen to align with existing names. Maybe I could name the properties like "dedup_class_used" instead? The names are long enough already, though, so I'll look at whether there is another way to solve this.

@behlendorf
Contributor

> There are a bunch of unrelated changes in the ABI that I've left out.

If you pulled the ABI files from the CI there shouldn't be any unrelated changes. What kind of changes are you seeing here?

@ryan-moeller
Author

> If you pulled the ABI files from the CI there shouldn't be any unrelated changes. What kind of changes are you seeing here?

+dump_nvlist, -posix_memalign, and the functions moved in #18133

@amotin
Member

amotin commented Feb 26, 2026

> Maybe I could name the properties like "dedup_class_used" instead? Though they're long enough names already, I'll look at whether there is another way to solve this.

I wonder whether we could trade abbreviating "embedded_log" (the longest part) to "elog" for the addition of "_class". I am not sure we can or should avoid the latter in the long run, while the former should make the lengths more uniform.

@ryan-moeller
Author

  • Use <class>_class_<metric> long property names while keeping short column names
  • Shorten embedded_log_class_<metric> to elog_class_<metric> and special_embedded_log_class_<metric> to selog_class_<metric>
  • Fix test to create dedup vdev as a mirror as intended and validate expandsize accordingly
  • Correct copy-paste error in manpage list of property names
  • Correct copy-pasted noraidz vs nonraidz typo in test

@amotin
Member

amotin commented Feb 26, 2026

> • special_embedded_log_class_<metric> to selog_class_<metric>

I was thinking about special_elog_class_<metric>, but whatever people say. I'm fine either way.

@amotin
Member

amotin commented Feb 26, 2026

> • Use <class>_class_<metric> long property names while keeping short column names

I suppose it still leaves a collision on DEDUP_USED, since zpool get allows using column names:

# zpool get dedupused
NAME    PROPERTY   VALUE       SOURCE
optane  dedupused  0           -
# zpool get dedup_used
NAME    PROPERTY   VALUE       SOURCE
optane  dedupused  0           -

@ryan-moeller
Author

Ah that was it, the dedupused column is named DEDUP_USED. So, I will alter the new column names as well. And I had a feeling you might have meant special_elog_. I like it, so I'll make that change, too.
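
The collision discussed above can be modeled with a toy lookup table. This is only a sketch under the assumption, taken from the thread, that a property can be requested by its name or by its column name in lowercase; it is not the libzfs lookup code, and `lookup` and `ambiguous_aliases` are hypothetical helpers.

```python
def lookup(table, query):
    """Return the first property whose name or lowercased column matches."""
    for name, column in table:
        if query == name or query == column.lower():
            return name
    return None


def ambiguous_aliases(table):
    """Map each alias claimed by more than one property to those properties."""
    seen = {}
    for name, column in table:
        for alias in {name, column.lower()}:
            seen.setdefault(alias, []).append(name)
    return {alias: names for alias, names in seen.items() if len(names) > 1}


# (property name, column name) pairs mentioned in the thread.
existing = [("dedupused", "DEDUP_USED")]

# Long property name while keeping the short column name.
short_column = existing + [("dedup_class_used", "DEDUP_USED")]
print(lookup(short_column, "dedup_used"))  # dedupused
print(ambiguous_aliases(short_column))     # {'dedup_used': ['dedupused', 'dedup_class_used']}

# Column renamed to follow <CLASS>_CLASS_<METRIC>.
long_column = existing + [("dedup_class_used", "DEDUP_CLASS_USED")]
print(lookup(long_column, "dedup_class_used"))  # dedup_class_used
print(ambiguous_aliases(long_column))           # {}
```

In this model, giving each new column a distinct name is what removes the ambiguity; renaming only the property leaves `dedup_used` resolving to the older entry.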

@ryan-moeller
Author

ryan-moeller commented Feb 26, 2026

  • Use the name special_elog_class_<metric>
  • Name columns <CLASS>_CLASS_<METRIC> (keeping short SELOG_CLASS_<METRIC>)

@amotin
Member

amotin commented Feb 27, 2026

Any other comments, primarily on property names? Otherwise this looks good to me.

@behlendorf
Contributor

What do you think about moving the class_ to the front as the prefix so they're naturally grouped together when alphabetically sorted? This would visually align the output more with what was done for feature flags (feature@) and for user properties, where a common convention is to use the same <prefix:> for related user properties. Overall it feels more consistent to me.

class_normal_size
class_normal_capacity
class_normal_free
class_normal_allocated
class_normal_available
class_normal_usable
class_normal_used
class_normal_expandsize
class_normal_fragmentation
class_special_size
class_special_capacity
class_special_free
class_special_allocated
class_special_available
class_special_usable
class_special_used
class_special_expandsize
class_special_fragmentation
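
The grouping effect of the prefix can be checked with a short sketch. The class and metric names come from this thread, but the full prefixed names for the embedded log classes (`class_elog_...`, `class_special_elog_...`) are an assumption, and `is_contiguous` is a hypothetical helper rather than anything in ZFS.

```python
classes = ["normal", "special", "dedup", "log", "elog", "special_elog"]
metrics = ["size", "capacity", "free", "allocated", "available",
           "usable", "used", "expandsize", "fragmentation"]

# A few existing pool properties named in this thread.
pool_props = ["size", "allocated", "fragmentation", "expandsize",
              "free", "capacity", "dedupused"]

infix = pool_props + [f"{c}_class_{m}" for c in classes for m in metrics]
prefix = pool_props + [f"class_{c}_{m}" for c in classes for m in metrics]


def is_contiguous(names, is_member):
    """True if the matching names form one unbroken run once sorted."""
    flags = [is_member(name) for name in sorted(names)]
    if True not in flags:
        return True
    first = flags.index(True)
    last = len(flags) - 1 - flags[::-1].index(True)
    return all(flags[first:last + 1])


print(is_contiguous(infix, lambda n: "_class_" in n))        # False
print(is_contiguous(prefix, lambda n: n.startswith("class_")))  # True
```

With the infix form, names such as dedupused and expandsize sort in between the per-class properties; with the prefix form, every per-class property sorts into a single block.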

Ryan Moeller added 2 commits March 2, 2026 08:35

Capacity is reported as a percentage not a size.

Sponsored-by: Klara, Inc.
Signed-off-by: Ryan Moeller <ryan.moeller@klarasystems.com>

The existing zpool properties accounting pool space (size, allocated,
fragmentation, expandsize, free, capacity) are based on the normal
metaslab class or are cumulative properties of several classes combined.

Add properties reporting the space accounting metrics for each metaslab
class individually.

Also introduce pool-wide AVAIL, USABLE, and USED properties reporting
values corresponding to FREE, SIZE, and ALLOC deflated for raidz.

Update ZTS to recognize the new properties and validate reported values.

While in zpool_get_parsable.cfg, add "fragmentation" to the list of
parsable properties.

Sponsored-by: Klara, Inc.
Signed-off-by: Ryan Moeller <ryan.moeller@klarasystems.com>

@ryan-moeller
Author

  • Rebased
  • Prefix class_ and CLASS_ instead of infix in property and column names

Contributor

@behlendorf behlendorf left a comment

It's a lot of new pool properties, but I agree it'll be nice to have the visibility into the allocation classes. Extending zpool get to accept keywords like default instead of just all may be a nice way to limit the output to the most commonly used properties. That of course would be for a different PR.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Mar 2, 2026