adt: split interval tree by right endpoint on matched left endpoints #19768

Merged
48 changes: 38 additions & 10 deletions pkg/adt/interval_tree.go
@@ -472,8 +472,16 @@
 	x := ivt.root
 	for x != ivt.sentinel {
 		y = x
-		if z.iv.Ivl.Begin.Compare(x.iv.Ivl.Begin) < 0 {
+		// Split on left endpoint. If left endpoints match, instead split on right endpoint.
Member

@serathius serathius May 17, 2025

Cormen, "Introduction to Algorithms", Chapter 14, Exercise 14.3-5 is, I think, relevant:

14.3-5
Suggest modifications to the interval-tree procedures to support the new operation INTERVAL-SEARCH-EXACTLY(T,i), where T is an interval tree and i is an interval. The operation should return a pointer to a node x in T such that x.int.low = i.low and x.int.high = i.high, or T.nil if T contains no such node. All operations, including INTERVAL-SEARCH-EXACTLY, should run in O(lg n) time on an n-node interval tree.

Can you update the function docstring?
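To make the exercise concrete, here is a minimal sketch of the composite ordering it implies: compare on the low endpoint first and break ties on the high endpoint. The `interval` type and `compareIntervals` function are hypothetical stand-ins for illustration, not the etcd `adt` API.

```go
package main

import "fmt"

// interval is a hypothetical stand-in for an interval with int64 endpoints.
type interval struct {
	begin, end int64
}

// compareIntervals orders intervals by begin, breaking ties by end.
// It returns -1, 0, or 1. Zero means an exact match on both endpoints.
func compareIntervals(a, b interval) int {
	switch {
	case a.begin < b.begin:
		return -1
	case a.begin > b.begin:
		return 1
	case a.end < b.end:
		return -1
	case a.end > b.end:
		return 1
	default:
		return 0
	}
}

func main() {
	fmt.Println(compareIntervals(interval{2, 3}, interval{2, 7})) // same begin, smaller end
	fmt.Println(compareIntervals(interval{2, 7}, interval{2, 7})) // exact match
	fmt.Println(compareIntervals(interval{3, 4}, interval{2, 9})) // larger begin
}
```

Because this comparison is a total order, a search that follows it only ever needs to descend into one subtree, which is what gives the O(lg n) bound the exercise asks for.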

Member

cc @redwrasse @ahrtr
Can we address this in followup?

Member

Sorry, I missed this comment. The comment for the Insert method isn't accurate anymore, so we need to update it. We also need to add a comment for the find method. @redwrasse

// TODO: make this consistent with textbook implementation
//
// "Introduction to Algorithms" (Cormen et al, 3rd ed.), chapter 13.3, p315
//
//   RB-INSERT(T, z)
//
//   y = T.nil
//   x = T.root
//
//   while x ≠ T.nil
//       y = x
//       if z.key < x.key
//           x = x.left
//       else
//           x = x.right
//
//   z.p = y
//
//   if y == T.nil
//       T.root = z
//   else if z.key < y.key
//       y.left = z
//   else
//       y.right = z
//
//   z.left = T.nil
//   z.right = T.nil
//   z.color = RED
//
//   RB-INSERT-FIXUP(T, z)
// Insert adds a node with the given interval into the tree.

Contributor Author

@serathius @ahrtr I'll open an MR for updating the comments.

Contributor Author

Created a pull request with docstring updates: #20015

+		beginCompare := z.iv.Ivl.Begin.Compare(x.iv.Ivl.Begin)
+		if beginCompare < 0 {
 			x = x.left
+		} else if beginCompare == 0 {
+			if z.iv.Ivl.End.Compare(x.iv.Ivl.End) < 0 {
+				x = x.left
+			} else {
+				x = x.right
+			}
Comment on lines +479 to +484
Member

You changed the behaviour, so this is a breaking change.

Previously, Insert only checked the interval's Begin: an intervalTree inserts a new node to the left if newNode.Ivl.Begin < x.Ivl.Begin, and otherwise inserts it to the right.

Now it checks not only Begin but also End, and inserts the new node to the left if Begin matches and newNode's End is less.

This PR's purpose is to optimize the exact search (find), and I don't see a reason to update Insert. So please consider reverting the change. Please also see the comment on the find method.

Contributor Author

@ahrtr thanks very much for reviewing.

So now I'm confused. The goal I had with this MR / issue is indeed to speed up Find to logarithmic time (as, for example, the Cormen 14.3-5 exercise addresses), instead of the existing visitor implementation. To do this in, say, a textbook implementation, I thought both the Find and Insert operations need to be updated to further split on the right endpoint if the left endpoints match?

Member

to further split on right endpoint if the left endpoints are matched

As mentioned above,

  • You can optimize find whether or not you update Insert. Please let me know if that isn't true.
  • Changing Insert changes the behaviour, so I suggest not changing it.

Contributor Author

I think you're proposing: keep the existing interval tree structure of splitting left if less than left endpoint, else split right. Update the Find implementation to follow this split logic, hence an optimization in the sense that Find will never be checking both left and right subtrees.

With this logic I see the existing etcd interval tree tests fail. The issue is I think this approach doesn't preserve red-black tree rotation invariance.

(code changes for below snippets on this branch 8412021)

Here is updated find, using the proposed logic of split left if less than left endpoint, else split right:

// find the exact node for a given interval
func (ivt *intervalTree) find(ivl Interval) *intervalNode {
	x := ivt.root
	// Search until hit sentinel or exact match.
	for x != ivt.sentinel {
		beginCompare := ivl.Begin.Compare(x.iv.Ivl.Begin)
		endCompare := ivl.End.Compare(x.iv.Ivl.End)
		if beginCompare == 0 && endCompare == 0 {
			// Found a match.
			return x
		}
		// Split on left endpoint: if less than, go left, else go right.
		if beginCompare < 0 {
			x = x.left
		} else {
			x = x.right
		}
	}
	return x
}

and a corresponding unit test (which fails) I think illustrating the point about rotations:

func TestProposedFindExample(t *testing.T) {

	// Won't work, because it doesn't satisfy the tree rotation invariance required of red-black trees. E.g. if [2,7] is written, we can't assume that all subsequently written [2,*] entries will remain in the right subtree of [2,7]; they won't, because tree rotations are possible.

	// OTOH, with this PR's approach (secondary split on the right endpoint), all subsequent [2, x] writes with x > 7 land to the right of (after) [2,7], and those with x < 7 land to the left (before). The ordering is preserved under tree rotation.

	ivt := NewIntervalTree()

	lEndp := int64(2)
	rEndps := []int64{7, 3, 9}
	val := 123

	for _, re := range rEndps {
		ivt.Insert(NewInt64Interval(lEndp, re), val)
	}

	// What we would expect from 'insert left if less than left endpoint, else insert right' without tree rotations
	// (we can generate this tree by commenting out the `ivt.insertFixup(z)` line in the `Insert` op):
	// Insert [2, 7), then Insert [2, 3), then Insert [2, 9) becomes:
	//
	//   [2, 7)
	//        \
	//        [2, 3)
	//             \
	//             [2, 9)
	//
	// Instead, due to rotations (rb-fixup), we get:
	//
	//        [2, 3)
	//       /      \
	//   [2, 7)    [2, 9)
	//
	// Find fails because it assumes the former tree structure, e.g. it thinks it can always search right in the above example:
	for _, re := range rEndps {
		ivl := NewInt64Interval(lEndp, re)
		assert.NotNil(t, ivt.Find(ivl))
		assert.Equal(t, ivl, ivt.Find(ivl).Ivl)
	}
}
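The failure can also be reproduced without the etcd package. The following is a minimal standalone sketch, assuming a plain binary tree with nil leaves instead of the sentinel node; the `ivl`, `node`, and `findBeginOnly` names are hypothetical. It hard-codes the post-rotation shape from the test comment and runs the begin-only descent over it.

```go
package main

import "fmt"

// ivl and node are hypothetical simplifications of the etcd types.
type ivl struct{ begin, end int64 }

type node struct {
	iv          ivl
	left, right *node
}

// findBeginOnly descends using only the left endpoint:
// go left if target.begin is smaller, otherwise go right.
func findBeginOnly(root *node, target ivl) *node {
	for x := root; x != nil; {
		if x.iv == target {
			return x
		}
		if target.begin < x.iv.begin {
			x = x.left
		} else {
			x = x.right
		}
	}
	return nil
}

func main() {
	// Post-rotation shape: [2,3) at the root, [2,7) on the left, [2,9) on the right.
	root := &node{
		iv:    ivl{2, 3},
		left:  &node{iv: ivl{2, 7}},
		right: &node{iv: ivl{2, 9}},
	}
	for _, end := range []int64{7, 3, 9} {
		target := ivl{2, end}
		fmt.Printf("find %v: found=%v\n", target, findBeginOnly(root, target) != nil)
	}
}
```

In this sketch [2,3) and [2,9) are found, but [2,7) is not: at the root the left endpoints are equal, so the search goes right and never visits the left child.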

Member

I see. I was thinking find should work as long as it follows the same logic (search path) as Insert, but actually the paths might not match due to insertFixup.

The question for now is: will insertFixup cause the same problem for this PR?

In this PR, you updated both find and Insert, and they follow the same logic (search path): splitting on both the left (Begin) and right (End) endpoints. Is it possible that insertFixup also causes find to fail?

Contributor Author

I'm not an expert on these algorithms, but my working assumption has been that the total ordering introduced by secondary split on right endpoints allows satisfying the tree rotation invariance needed of red-black tree structure. I think this is the textbook approach (eg. the Cormen exercise referenced earlier.)

Any thoughts on how to further test/guarantee correctness? During development I relied on TestIntervalTreeRandom, which is parameterized by the number of nodes, maxv, set to 128. Increasing that by an order of magnitude and rerunning, I didn't encounter any test failures.
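One way to probe the total-ordering assumption outside the etcd test suite is a small randomized check. The sketch below is an assumption-laden illustration, not the etcd implementation: it uses an unbalanced tree with nil leaves, inserts by (begin, end) order, applies random rotations as a stand-in for red-black fixups, and then counts how many inserted intervals an exact search fails to locate. All names (`less`, `shuffle`, `missing`) are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

type ivl struct{ begin, end int64 }

type node struct {
	iv          ivl
	left, right *node
}

// less orders intervals by begin, then by end.
func less(a, b ivl) bool {
	if a.begin != b.begin {
		return a.begin < b.begin
	}
	return a.end < b.end
}

// insert places iv using the composite order; no rebalancing is done here.
func insert(n *node, iv ivl) *node {
	if n == nil {
		return &node{iv: iv}
	}
	if less(iv, n.iv) {
		n.left = insert(n.left, iv)
	} else {
		n.right = insert(n.right, iv)
	}
	return n
}

// shuffle applies random rotations bottom-up, standing in for rb fixups.
func shuffle(n *node, r *rand.Rand) *node {
	if n == nil {
		return nil
	}
	n.left = shuffle(n.left, r)
	n.right = shuffle(n.right, r)
	switch r.Intn(3) {
	case 0:
		if n.right != nil { // rotate left
			b := n.right
			n.right, b.left = b.left, n
			return b
		}
	case 1:
		if n.left != nil { // rotate right
			b := n.left
			n.left, b.right = b.right, n
			return b
		}
	}
	return n
}

// find follows the same composite order as insert.
func find(n *node, iv ivl) *node {
	for n != nil && n.iv != iv {
		if less(iv, n.iv) {
			n = n.left
		} else {
			n = n.right
		}
	}
	return n
}

// missing counts inserted intervals that find cannot locate after rotations.
func missing(seed int64, count int) int {
	r := rand.New(rand.NewSource(seed))
	var root *node
	var all []ivl
	for i := 0; i < count; i++ {
		b := r.Int63n(16)
		iv := ivl{b, b + 1 + r.Int63n(16)}
		all = append(all, iv)
		root = insert(root, iv)
	}
	root = shuffle(root, r)
	lost := 0
	for _, iv := range all {
		if find(root, iv) == nil {
			lost++
		}
	}
	return lost
}

func main() {
	fmt.Println("missing after rotations:", missing(1, 500))
}
```

Since rotations leave the in-order sequence unchanged, the count should stay at zero for any seed. A nonzero count would indicate that the descent rule and the ordering used at insert time disagree.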

Member

Any thoughts for how to further test/guarantee correctness?

Good question.

  • I think both rotateLeft and rotateRight always preserve the invariant: x.left < x < x.right

rotateLeft (a):

    a                        b
     \                      / \
      b        --->        a   d
     / \                    \
    c   d                    c

rotateRight (a):

        a                    b
       /                    / \
      b        --->        c   a
     / \                      /
    c   d                    d

  • I ran go test -run TestIntervalTreeRandom -v -count 200 -failfast multiple times, and it always passed.
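The rotation argument can be checked mechanically with a small sketch. It assumes a simplified node without parent pointers, colors, or a sentinel, so it is not the etcd rotateLeft/rotateRight, only the same pointer moves. It compares the in-order traversal before and after a rotation.

```go
package main

import "fmt"

// node is a hypothetical minimal tree node (no parent pointer, no color).
type node struct {
	key         string
	left, right *node
}

// rotateLeft lifts a's right child b above a and returns the new subtree root.
func rotateLeft(a *node) *node {
	b := a.right
	a.right = b.left // b's left subtree becomes a's right subtree
	b.left = a
	return b
}

// rotateRight is the mirror image: it lifts a's left child above a.
func rotateRight(a *node) *node {
	b := a.left
	a.left = b.right // b's right subtree becomes a's left subtree
	b.right = a
	return b
}

// inorder appends the keys of n's subtree to out in left-root-right order.
func inorder(n *node, out []string) []string {
	if n == nil {
		return out
	}
	out = inorder(n.left, out)
	out = append(out, n.key)
	return inorder(n.right, out)
}

func main() {
	// The rotateLeft(a) picture: a has right child b, and b has children c and d.
	a := &node{key: "a", right: &node{key: "b", left: &node{key: "c"}, right: &node{key: "d"}}}
	before := inorder(a, nil)
	root := rotateLeft(a)
	after := inorder(root, nil)
	fmt.Println("before:", before, "after:", after, "new root:", root.key)
}
```

The traversal order is the same on both sides of the rotation, so any search that follows the same ordering as the insert still reaches every node afterwards.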

Member

As mentioned in #19768 (comment), it changes the behaviour, but I think we are good as long as we don't break the IntervalTree API

Contributor Author

Sounds good, thanks for reviewing and investigating @ahrtr!

 		} else {
 			x = x.right
 		}
@@ -483,8 +491,15 @@
 	if y == ivt.sentinel {
 		ivt.root = z
 	} else {
-		if z.iv.Ivl.Begin.Compare(y.iv.Ivl.Begin) < 0 {
+		beginCompare := z.iv.Ivl.Begin.Compare(y.iv.Ivl.Begin)
+		if beginCompare < 0 {
 			y.left = z
+		} else if beginCompare == 0 {
+			if z.iv.Ivl.End.Compare(y.iv.Ivl.End) < 0 {
+				y.left = z
+			} else {
+				y.right = z
+			}
 		} else {
 			y.right = z
 		}
@@ -701,16 +716,29 @@

 // find the exact node for a given interval
 func (ivt *intervalTree) find(ivl Interval) *intervalNode {
-	ret := ivt.sentinel
-	f := func(n *intervalNode) bool {
-		if n.iv.Ivl != ivl {
-			return true
+	x := ivt.root
+	// Search until hit sentinel or exact match.
+	for x != ivt.sentinel {
+		beginCompare := ivl.Begin.Compare(x.iv.Ivl.Begin)
+		endCompare := ivl.End.Compare(x.iv.Ivl.End)
+		if beginCompare == 0 && endCompare == 0 {
+			return x
 		}
+		// Split on left endpoint. If left endpoints match,
+		// instead split on right endpoints.
+		if beginCompare < 0 {
+			x = x.left
+		} else if beginCompare == 0 {
+			if endCompare < 0 {
+				x = x.left

Codecov / codecov/patch warning on pkg/adt/interval_tree.go#L733: added line #L733 was not covered by tests.
+			} else {
+				x = x.right
+			}
+		} else {
+			x = x.right
Comment on lines +729 to +738
Member

Please revert the change to the Insert method.

Suggested change
-		if beginCompare < 0 {
-			x = x.left
-		} else if beginCompare == 0 {
-			if endCompare < 0 {
-				x = x.left
-			} else {
-				x = x.right
-			}
-		} else {
-			x = x.right
+		if beginCompare < 0 {
+			x = x.left
+		} else {
+			x = x.right

+		}
-		ret = n
-		return false
 	}
-	ivt.root.visit(&ivl, ivt.sentinel, f)
-	return ret
+	return x
 }

 // Find gets the IntervalValue for the node matching the given interval