Description
What would you like to be added?
Perhaps it was intentional in the current interval tree implementation, but the currentIntervalTree.Find
operation for exact interval matching appears to rely on a visitor function that visits all overlapping intervals.
It's instead possible and I think a standard approach for exact interval matching to, if left endpoints match, split on right endpoints. This additional ordering structure is added to the insert and find operations. In particular, the additional ordering structure should speed up the IntervalTree.find()
operation, which is used by IntervalTree.Find()
and IntervalTree.Delete()
.
I've attached an MR with the code changes to support this updated interval tree structure: #19768
Additionally I've attached below code snippets and outputs for non-rigorous (Go benchmarks on before/after performance of Find()
and Insert()
on randomly built interval trees with multiple occurrences of given left/right endpoint values, which is the regime where I think a performance difference may appear. Run on my laptop, there appears to be no change to the Insert()
op performance, but a roughly 10x performance increase in Find()
, on such interval trees.
// Benchmark of the Insert op on random interval trees of up to 128 nodes with endpoint values in 0..9
func BenchmarkIntervalTreeRandomInsert(b *testing.B) {
ivs := make(map[xy]struct{})
ivt := NewIntervalTree()
maxv := 128
xmax := 10
ymax := 10
for b.Loop() {
// Setup random tree.
for i := rand.Intn(maxv) + 1; i != 0; i-- {
x, y := int64(rand.Intn(xmax)), int64(rand.Intn(ymax))
if x > y {
t := x
x = y
y = t
} else if x == y {
y++
}
iv := xy{x, y}
if _, ok := ivs[iv]; ok {
// don't double insert
continue
}
ivt.Insert(NewInt64Interval(x, y), 123)
ivs[iv] = struct{}{}
}
}
}
BenchmarkIntervalTreeRandomInsert on etcd main branch, on my Apple M1 Pro.
goos: darwin
goarch: arm64
pkg: go.etcd.io/etcd/pkg/v3/adt
cpu: Apple M1 Pro
BenchmarkIntervalTreeRandomInsert
BenchmarkIntervalTreeRandomInsert-10 511443 2335 ns/op
PASS
BenchmarkIntervalTreeRandomInsert with updated code, on my Apple M1 Pro.
goos: darwin
goarch: arm64
pkg: go.etcd.io/etcd/pkg/v3/adt
cpu: Apple M1 Pro
BenchmarkIntervalTreeRandomInsert
BenchmarkIntervalTreeRandomInsert-10 542929 2200 ns/op
PASS
// Benchmark of the Find op on random interval trees of up to 128 nodes with endpoint values in 0..9
func BenchmarkIntervalTreeRandomFind(b *testing.B) {
ivs := make(map[xy]struct{})
ivt := NewIntervalTree()
maxv := 128
xmax := 10
ymax := 10
// Setup random tree.
for i := rand.Intn(maxv) + 1; i != 0; i-- {
x, y := int64(rand.Intn(xmax)), int64(rand.Intn(ymax))
if x > y {
t := x
x = y
y = t
} else if x == y {
y++
}
iv := xy{x, y}
if _, ok := ivs[iv]; ok {
// don't double insert
continue
}
ivt.Insert(NewInt64Interval(x, y), 123)
ivs[iv] = struct{}{}
}
// Benchmark Find() operation time.
for b.Loop() {
for ab := range ivs {
_ = ivt.Find(NewInt64Interval(ab.x, ab.y))
}
}
}
BenchmarkIntervalTreeRandomFind on etcd main branch, on my Apple M1 Pro.
goos: darwin
goarch: arm64
pkg: go.etcd.io/etcd/pkg/v3/adt
cpu: Apple M1 Pro
BenchmarkIntervalTreeRandomFind
BenchmarkIntervalTreeRandomFind-10 66139 18044 ns/op
PASS
BenchmarkIntervalTreeRandomFind with updated code, on my Apple M1 Pro.
goos: darwin
goarch: arm64
pkg: go.etcd.io/etcd/pkg/v3/adt
cpu: Apple M1 Pro
BenchmarkIntervalTreeRandomFind
BenchmarkIntervalTreeRandomFind-10 747651 1602 ns/op
PASS
Why is this needed?
Improve exact interval matching performance of the interval tree.