Skip to content

adt: split interval tree on right endpoints for matched left endpoints #19769

Open
@redwrasse

Description

@redwrasse

What would you like to be added?

Perhaps it was intentional in the current interval tree implementation, but the currentIntervalTree.Find operation for exact interval matching appears to rely on a visitor function that visits all overlapping intervals.

It's instead possible and I think a standard approach for exact interval matching to, if left endpoints match, split on right endpoints. This additional ordering structure is added to the insert and find operations. In particular, the additional ordering structure should speed up the IntervalTree.find() operation, which is used by IntervalTree.Find() and IntervalTree.Delete().

I've attached an MR with the code changes to support this updated interval tree structure: #19768

Additionally I've attached below code snippets and outputs for non-rigorous (Go benchmarks on before/after performance of Find() and Insert() on randomly built interval trees with multiple occurrences of given left/right endpoint values, which is the regime where I think a performance difference may appear. Run on my laptop, there appears to be no change to the Insert() op performance, but a roughly 10x performance increase in Find(), on such interval trees.

// Benchmark of the Insert op on random interval trees of up to 128 nodes with endpoint values in 0..9
func BenchmarkIntervalTreeRandomInsert(b *testing.B) {
	ivs := make(map[xy]struct{})
	ivt := NewIntervalTree()
	maxv := 128

	xmax := 10
	ymax := 10

	for b.Loop() {
		// Setup random tree.
		for i := rand.Intn(maxv) + 1; i != 0; i-- {
			x, y := int64(rand.Intn(xmax)), int64(rand.Intn(ymax))
			if x > y {
				t := x
				x = y
				y = t
			} else if x == y {
				y++
			}
			iv := xy{x, y}
			if _, ok := ivs[iv]; ok {
				// don't double insert
				continue
			}
			ivt.Insert(NewInt64Interval(x, y), 123)
			ivs[iv] = struct{}{}
		}
	}

}


BenchmarkIntervalTreeRandomInsert on etcd main branch, on my Apple M1 Pro.

goos: darwin
goarch: arm64
pkg: go.etcd.io/etcd/pkg/v3/adt
cpu: Apple M1 Pro
BenchmarkIntervalTreeRandomInsert
BenchmarkIntervalTreeRandomInsert-10    	  511443	      2335 ns/op
PASS


BenchmarkIntervalTreeRandomInsert with updated code, on my Apple M1 Pro.

goos: darwin
goarch: arm64
pkg: go.etcd.io/etcd/pkg/v3/adt
cpu: Apple M1 Pro
BenchmarkIntervalTreeRandomInsert
BenchmarkIntervalTreeRandomInsert-10    	  542929	      2200 ns/op
PASS

// Benchmark of the Find op on random interval trees of up to 128 nodes with endpoint values in 0..9
func BenchmarkIntervalTreeRandomFind(b *testing.B) {
	ivs := make(map[xy]struct{})
	ivt := NewIntervalTree()
	maxv := 128

	xmax := 10
	ymax := 10

	// Setup random tree.
	for i := rand.Intn(maxv) + 1; i != 0; i-- {
		x, y := int64(rand.Intn(xmax)), int64(rand.Intn(ymax))
		if x > y {
			t := x
			x = y
			y = t
		} else if x == y {
			y++
		}
		iv := xy{x, y}
		if _, ok := ivs[iv]; ok {
			// don't double insert
			continue
		}
		ivt.Insert(NewInt64Interval(x, y), 123)
		ivs[iv] = struct{}{}
	}

	// Benchmark Find() operation time.
	for b.Loop() {
		for ab := range ivs {
			_ = ivt.Find(NewInt64Interval(ab.x, ab.y))
		}
	}
}
BenchmarkIntervalTreeRandomFind on etcd main branch, on my Apple M1 Pro.

goos: darwin
goarch: arm64
pkg: go.etcd.io/etcd/pkg/v3/adt
cpu: Apple M1 Pro
BenchmarkIntervalTreeRandomFind
BenchmarkIntervalTreeRandomFind-10    	   66139	     18044 ns/op
PASS


BenchmarkIntervalTreeRandomFind with updated code, on my Apple M1 Pro.

goos: darwin
goarch: arm64
pkg: go.etcd.io/etcd/pkg/v3/adt
cpu: Apple M1 Pro
BenchmarkIntervalTreeRandomFind
BenchmarkIntervalTreeRandomFind-10    	  747651	      1602 ns/op
PASS

Why is this needed?

Improve exact interval matching performance of the interval tree.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions