Skip to content

More consistent use of TREE_* macros in AVL comparators#18259

Open
robn wants to merge 1 commit intoopenzfs:masterfrom
robn:avl-comparator-tidy
Open

More consistent use of TREE_* macros in AVL comparators#18259
robn wants to merge 1 commit intoopenzfs:masterfrom
robn:avl-comparator-tidy

Conversation

@robn
Copy link
Member

@robn robn commented Feb 24, 2026

[Sponsors: TrueNAS]

Motivation and Context

Just some drive-by code uplift.

Description

Where is it appropriate and obvious, use TREE_CMP(), TREE_ISIGN() and TREE_PCMP() instead of direct comparisons. It can make the code a lot smaller, less error prone, and easier to read.

Commit should be fairly self-explanatory, however there are three functions that have at least caused me to raise an eyebrow:

  • vdev_queue_to_compare() has a curious construction at the end, where it will always consider two IOs with the same timestamp and offset to be the same if the first one is not queued. At least, I think that is the intent. The same construction is present in zfs_refcount_compare(), but there it considered two references with same var and size to be the same during a search. the difference is, vdev_queue_to_compare() uses a bitwise OR |, while zfs_refcount_compare() uses a conditional OR ||. I don't love this shorthand either way; I'd prefer it written out, but I think || is (marginally) more correct for the intent. I didn't want to just change anything, though, in case its load-bearing in ways I'm not seeing.

  • spa_error_entry_compare() uses memcmp() to compare two bookmarks, which isn't wrong, I guess, but compare to zio_bookmark_compare() or the bottom half of recent_events_compare(), where we spell it out. Those read a lot better, I think.

  • zbookmark_compare() is not directly an AVL comparator, but is wrapped by scan_prefetch_queue_compare(), redact_node_compare_start() and redact_node_compare_end(), which are. It has a set of comparisons near the end that are begging to be converted to TREE_CMP(), however note this subtlety:

        /* Now that we have a canonical representation, do the comparison. */
        if (zb1obj != zb2obj)
                return (zb1obj < zb2obj ? -1 : 1);
        else if (zb1L0 != zb2L0)
                return (zb1L0 < zb2L0 ? -1 : 1);
        else if (zb1level != zb2level)
                return (zb1level > zb2level ? -1 : 1);

The zbNlevel comparison at the end actually has opposite direction to the other two. I can't obviously tell if this is intentional or not.

How Has This Been Tested?

ZTS run completed on Linux.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@amotin
Copy link
Member

amotin commented Feb 24, 2026

IIRC the trick behind vdev_queue_to_compare() and zfs_refcount_compare(), and generally introduction of TREE_CMP() was to make the comparisons branch-less to avoid pipeline resets due to mispredicted branching. Would be good to compare the generated code. Otherwise seems fine.

@robn robn force-pushed the avl-comparator-tidy branch from 523d01e to 82a2479 Compare February 27, 2026 01:22
@robn
Copy link
Member Author

robn commented Feb 27, 2026

Here's source-annotated disassembly for vdev_queue_to_compare() without/with this change for kernel and userspace: https://gist.github.com/robn/eab9604c562c8db157be44c384740be2

GCC 14.2, production builds (which is why I didn't check zfs_refcount_compare(): only exists on debug builds).

tl;dr: literally the same assembly.

Screenshot_2026-02-27_12-27-40 Screenshot_2026-02-27_12-28-02

@robn
Copy link
Member Author

robn commented Feb 27, 2026

Same for spa_mapping_key_compare(): https://gist.github.com/robn/7e35f9b8c9e6aebb61054abc8a68e1f5

diff --git module/zfs/dsl_crypt.c module/zfs/dsl_crypt.c
index f519b937ed..9cb1536642 100644
--- module/zfs/dsl_crypt.c
+++ module/zfs/dsl_crypt.c
@@ -306,12 +301,7 @@ spa_key_mapping_compare(const void *a, const void *b)
 {
 	const dsl_key_mapping_t *kma = a;
 	const dsl_key_mapping_t *kmb = b;
-
-	if (kma->km_dsobj < kmb->km_dsobj)
-		return (-1);
-	if (kma->km_dsobj > kmb->km_dsobj)
-		return (1);
-	return (0);
+	return (TREE_CMP(kma->km_dsobj, kmb->km_dsobj));
 }
 
 static int

This one is much nicer: no branches at all.

Screenshot_2026-02-27_12-37-27 Screenshot_2026-02-27_12-37-52

@amotin
Copy link
Member

amotin commented Feb 27, 2026

Here's source-annotated disassembly for vdev_queue_to_compare() without/with this change for kernel and userspace: https://gist.github.com/robn/eab9604c562c8db157be44c384740be2

GCC 14.2, production builds (which is why I didn't check zfs_refcount_compare(): only exists on debug builds).

tl;dr: literally the same assembly.

With Clang 19.1.7 I do see only one branching with current code, which should always be true and predicted as such, though I haven't benchmarked it to see the difference:

Dump of assembler code for function vdev_queue_to_compare:
   0xffffffff805b89e0 <+0>:     push   %rbp
   0xffffffff805b89e1 <+1>:     mov    %rsp,%rbp
   0xffffffff805b89e4 <+4>:     mov    0x270(%rdi),%rcx
   0xffffffff805b89eb <+11>:    mov    0x278(%rdi),%rax
   0xffffffff805b89f2 <+18>:    sar    $0x1d,%rax
   0xffffffff805b89f6 <+22>:    mov    0x278(%rsi),%rdx
   0xffffffff805b89fd <+29>:    sar    $0x1d,%rdx
   0xffffffff805b8a01 <+33>:    xor    %r8d,%r8d
   0xffffffff805b8a04 <+36>:    xor    %r9d,%r9d
   0xffffffff805b8a07 <+39>:    cmp    %rdx,%rax
   0xffffffff805b8a0a <+42>:    setg   %r8b
   0xffffffff805b8a0e <+46>:    setl   %r9b
   0xffffffff805b8a12 <+50>:    sub    %r9d,%r8d
   0xffffffff805b8a15 <+53>:    xor    %eax,%eax
   0xffffffff805b8a17 <+55>:    cmp    0x270(%rsi),%rcx
   0xffffffff805b8a1e <+62>:    seta   %al
   0xffffffff805b8a21 <+65>:    sbb    $0x0,%eax
   0xffffffff805b8a24 <+68>:    test   %r8d,%r8d
   0xffffffff805b8a27 <+71>:    cmovne %r8d,%eax
   0xffffffff805b8a2b <+75>:    xor    %ecx,%ecx
   0xffffffff805b8a2d <+77>:    cmpl   $0x0,0x238(%rdi)
   0xffffffff805b8a34 <+84>:    sete   %cl
   0xffffffff805b8a37 <+87>:    or     %eax,%ecx
   0xffffffff805b8a39 <+89>:    je     0xffffffff805b8a3d <vdev_queue_to_compare+93>
   0xffffffff805b8a3b <+91>:    pop    %rbp
   0xffffffff805b8a3c <+92>:    ret
   0xffffffff805b8a3d <+93>:    xor    %eax,%eax
   0xffffffff805b8a3f <+95>:    cmp    %rsi,%rdi
   0xffffffff805b8a42 <+98>:    seta   %al
   0xffffffff805b8a45 <+101>:   sbb    $0x0,%eax
   0xffffffff805b8a48 <+104>:   pop    %rbp
   0xffffffff805b8a49 <+105>:   ret

@amotin
Copy link
Member

amotin commented Feb 27, 2026

The same construction is present in zfs_refcount_compare(), but there it considered two references with same var and size to be the same during a search. the difference is, vdev_queue_to_compare() uses a bitwise OR |, while zfs_refcount_compare() uses a conditional OR ||. I don't love this shorthand either way; I'd prefer it written out, but I think || is (marginally) more correct for the intent.

I see for || Clang produces second branching, while for | it does not. I agree that || is semantically cleaner, but those compare functions can be executed a lot, so it may matter in some cases. Even though this is not in production builds, having debug builds usable is also important.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite a nice bit of cleanup.

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Feb 28, 2026
Where is it appropriate and obvious, use TREE_CMP(), TREE_ISIGN() and
TREE_PCMP() instead or direct comparisons. It can make the code a lot
smaller, less error prone, and easier to read.

Sponsored-by: TrueNAS
Signed-off-by: Rob Norris <rob.norris@truenas.com>
@robn robn force-pushed the avl-comparator-tidy branch from 82a2479 to 707ae0d Compare March 1, 2026 22:59
@robn
Copy link
Member Author

robn commented Mar 1, 2026

Last push rebases to master, and changes that || in zfs_refcount_compare to |.

Again on GCC, if (cmp || r1->ref_search) produces:

	test   %eax,%eax
	jne    <zfs_refcount_compare+0x49>
	mov    0x28(%rdi),%edx
	test   %edx,%edx
	jne    <zfs_refcount_compare+0x49>

While if (cmp | r1->ref_search) is:

	mov    %eax,%edx
	or     0x28(%rdi),%edx
	jne    <zfs_refcount_compare+0x45>

So, trading a second branch opportunity for a load, though for reference_t that's on the same cacheline.

I haven't measured it at all, and its not quite the point of this PR, but I'm fine with from a consistency point of view. I've been pondering whether it would be good to be able to provide a different comparator for search vs insertion, but I'll need to play more to see if that makes any sense, certainly not for this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Code Review Needed Ready for review and testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants