Commit cd6411c

Merge pull request #383 from liulanzheng/main
update doc for lsmt lookup
2 parents 5472a0e + 9f44633 commit cd6411c

4 files changed, with 24 additions and 2 deletions
README.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ Overlaybd is based on [PhotonLibOS](https://github.com/alibaba/PhotonLibOS), whi

 Overlaybd has 2 core component:
 * **Overlaybd**
-is a block-device based image format, provideing a merged view of a sequence of block-based layers as a virtual block device.
+is a block-device based image format, provideing a merged view of a sequence of block-based layers as a virtual block device. The LBA lookup algorithm employs a linearized B+ tree and AVX-512 to optimize performance, significantly accelerating search speed up to 10X. [Lookup Performance](https://github.com/containerd/overlaybd/blob/main/docs/lsmt_lookup.md)

 * **Zfile**
 is a compression file format which support seekalbe online decompression.

docs/README.md

Lines changed: 2 additions & 0 deletions
@@ -28,6 +28,8 @@ Now this service contains an implementation of overlaybd based on [TCMU](https:/

 This service is based on [PhotonLibOS](https://github.com/alibaba/PhotonLibOS), which is a high-efficiency LibOS framework.

+The LBA lookup algorithm employs a linearized B+ tree and AVX-512 to optimize performance, significantly accelerating search speed up to 10X. [Lookup Performance](https://github.com/containerd/overlaybd/blob/main/docs/lsmt_lookup.md)
+
 ## Accelerated container image

 [GitHub](https://github.com/containerd/accelerated-container-image)
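The sentence added above mentions a "linearized B+ tree". The `src/overlaybd/lsmt/index.cpp` hunk header in this commit shows `LEVEL_START_ID[MAX_LEVEL] = {0, 8, 80, 728, 6560,` (truncated there). That visible prefix is consistent with storing the levels back to back in one array, with 8 keys and 9 children per node, so that level k starts at key index 9^k - 1. This is an inference from those five values, not something the commit states. The sketch below checks it. The names `kKeysPerNode`, `kFanout` and `level_start` are hypothetical, and the `constexpr` loop needs C++14 or later.

```cpp
#include <cstdint>

// Assumption: 8 keys per node, hence 9 children per node.
constexpr uint32_t kKeysPerNode = 8;
constexpr uint32_t kFanout = kKeysPerNode + 1;

// Index of the first key of `level` when all levels are laid out
// root-first in one flat array (hypothetical helper, not from the commit).
constexpr uint64_t level_start(uint32_t level) {
    uint64_t start = 0;  // keys stored in all shallower levels
    uint64_t nodes = 1;  // number of nodes on the current level
    for (uint32_t l = 0; l < level; l++) {
        start += nodes * kKeysPerNode;
        nodes *= kFanout;
    }
    return start;
}

// The values visible in the hunk header.
static_assert(level_start(0) == 0, "level 0");
static_assert(level_start(1) == 8, "level 1");
static_assert(level_start(2) == 80, "level 2");
static_assert(level_start(3) == 728, "level 3");
static_assert(level_start(4) == 6560, "level 4");
```

With such a layout, descending one level is index arithmetic on a contiguous array rather than pointer chasing, which is one plausible reading of the cache-efficiency claim.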

docs/lsmt_lookup.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# Lookup Algorithm in LSMT
+
+## Description
+
+LBA lookup in LSMT can be abstracted as a segment search problem, searching within a sorted set of non-overlapping intervals. Previously, we used binary search via std::lower_bound. Now, we've adopted a linearized B+ tree combined with AVX-512, which better exploits CPU cache efficiency and delivers over a 10X speedup in lookup performance. Even in environments without AVX-512 support, using a loop optimized with bitmask still yields significant performance gains.
+
+
+## Performance
+
+| segment count | b+tree + avx512 | b+tree + loop + bitmask | lower bound |
+|---------------|-----------------|---------------|-------------|
+| 1k            | 220 M/s         | 42.2 M/s      | 18.3 M/s    |
+| 10k           | 160 M/s         | 30.7 M/s      | 12.8 M/s    |
+| 100k          | 108 M/s         | 21.8 M/s      | 8.6 M/s     |
+| 1M            | 57.4 M/s        | 15.2 M/s      | 5.6 M/s     |
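For context on the "lower bound" column, here is a minimal sketch of the baseline the description refers to: a binary search over sorted, non-overlapping segments using `std::lower_bound`. The `Segment` struct and `find_segment` function are hypothetical illustrations, not the types or functions used in overlaybd.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical segment type: covers the half-open range [offset, offset + length).
struct Segment {
    uint64_t offset;
    uint64_t length;
    uint64_t end() const { return offset + length; }
};

// Returns the index of the segment containing `lba`, or -1 if `lba` falls in a gap.
// Precondition: `segs` is sorted by offset and the segments do not overlap,
// so their end positions are increasing as well.
inline std::ptrdiff_t find_segment(const std::vector<Segment>& segs, uint64_t lba) {
    // First segment whose end lies strictly beyond lba.
    auto it = std::lower_bound(
        segs.begin(), segs.end(), lba,
        [](const Segment& s, uint64_t x) { return s.end() <= x; });
    if (it == segs.end() || it->offset > lba) {
        return -1;  // past the last segment, or inside a hole
    }
    return it - segs.begin();
}
```

For example, with segments `{0, 8}`, `{16, 4}` and `{20, 12}`, an LBA of 19 resolves to index 1 and an LBA of 8 falls in a gap. Each probe of the binary search touches a different cache line once the array is large, which is the access pattern the B+ tree variant is described as improving on.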

src/overlaybd/lsmt/index.cpp

Lines changed: 6 additions & 1 deletion
@@ -74,7 +74,12 @@ static constexpr uint32_t LEVEL_START_ID[MAX_LEVEL] = {0, 8, 80, 728, 6560,

 struct DefaultInnerSearch {
     static uint32_t inner_search(const uint64_t *base, uint64_t x) {
-        return std::upper_bound(base, base + 8, x) - base;
+        uint8_t mask = 0;
+#pragma GCC unroll 20
+        for (uint32_t i = 0; i < ORDER; i++) {
+            mask |= ( (base[i] <= x) << i );
+        }
+        return __builtin_popcount(mask);
     }
 };

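The replacement relies on the keys in a node being sorted: the number of keys `<= x` then equals the index that `std::upper_bound` returns, so a branch-free compare-and-count loop can stand in for the binary search. The sketch below puts the two side by side. It assumes 8 keys per node (suggested by the removed `base + 8` and the 8-bit mask; the value of `ORDER` is not shown in the hunk), uses the hypothetical names `kOrder`, `inner_search_bitmask` and `inner_search_reference`, and depends on the GCC/Clang builtin `__builtin_popcount`.

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: 8 keys per node; the commit's ORDER constant is not visible in the hunk.
constexpr uint32_t kOrder = 8;

// Branch-free variant: set bit i when key i is <= x, then count the set bits.
inline uint32_t inner_search_bitmask(const uint64_t* base, uint64_t x) {
    uint32_t mask = 0;
    for (uint32_t i = 0; i < kOrder; i++) {
        mask |= static_cast<uint32_t>(base[i] <= x) << i;
    }
    return static_cast<uint32_t>(__builtin_popcount(mask));
}

// Reference variant, equivalent to the removed line for a sorted node.
inline uint32_t inner_search_reference(const uint64_t* base, uint64_t x) {
    return static_cast<uint32_t>(std::upper_bound(base, base + kOrder, x) - base);
}
```

With keys `{10, 20, 30, 40, 50, 60, 70, 80}`, both functions return 3 for `x = 35` and 8 for `x = 80`. Note that a `uint8_t` mask, as in the committed code, only has room for 8 bits, so the committed version is correct only if `ORDER` does not exceed 8.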
