Commit 6301dc6

wip
1 parent fa504ed commit 6301dc6

2 files changed: 71 additions, 8 deletions

README.md

Lines changed: 6 additions & 6 deletions
@@ -297,7 +297,7 @@ The worst-case in-memory size of an LSM-tree is $`O(n)`$.
 `confWriteBufferAlloc`. Regardless of write buffer allocation
 strategy, the size of the write buffer may never exceed 4GiB.

-`AllocNumEntries maxEntries`
+`AllocNumEntries maxEntries`
 The maximum size of the write buffer is the maximum number of entries
 multiplied by the average size of a key–operation pair.
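The sizing rule in this hunk can be sketched as simple arithmetic. This is a minimal illustration, not the library's implementation: the names `fourGiB`, `writeBufferMaxBytes`, and `withinCeiling` are hypothetical, and the average pair size in bytes is an input the caller would have to estimate.

```haskell
-- Hypothetical helpers for illustration; not part of the lsm-tree API.

-- | The 4GiB ceiling on the write buffer size, in bytes.
fourGiB :: Integer
fourGiB = 4 * 1024 ^ (3 :: Int)

-- | Estimated maximum write buffer size in bytes: the maximum number of
-- entries multiplied by the (assumed) average size of a key-operation pair.
writeBufferMaxBytes :: Integer -> Integer -> Integer
writeBufferMaxBytes maxEntries avgPairBytes = maxEntries * avgPairBytes

-- | Whether the estimate stays within the 4GiB ceiling.
withinCeiling :: Integer -> Integer -> Bool
withinCeiling maxEntries avgPairBytes =
  writeBufferMaxBytes maxEntries avgPairBytes <= fourGiB
```

For example, 20000 entries at an assumed 1024 bytes per pair give an estimate of 20480000 bytes, which is well under the ceiling.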

@@ -309,11 +309,11 @@ The worst-case in-memory size of an LSM-tree is $`O(n)`$.
 filter allocation strategy, which is determined by the `TableConfig`
 parameter `confBloomFilterAlloc`.

-`AllocFixed bitsPerPhysicalEntry`
+`AllocFixed bitsPerPhysicalEntry`
 The number of bits per physical entry is specified as
 `bitsPerPhysicalEntry`.

-`AllocRequestFPR requestedFPR`
+`AllocRequestFPR requestedFPR`
 The number of bits per physical entry is determined by the requested
 false-positive rate, which is specified as `requestedFPR`.
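To give a feel for how a requested false-positive rate translates into bits per entry, the following sketch uses the textbook formula for an optimally configured Bloom filter, m/n = -ln p / (ln 2)^2. The README does not state which formula the library uses, so this is an assumption, and `bitsPerEntryForFPR` is a hypothetical name.

```haskell
-- Hypothetical helper for illustration; not part of the lsm-tree API.

-- | Textbook bits per entry for a Bloom filter with false-positive rate p,
-- assuming an optimal number of hash functions: -ln p / (ln 2)^2.
-- Only meaningful for 0 < p < 1.
bitsPerEntryForFPR :: Double -> Double
bitsPerEntryForFPR p = negate (log p) / (log 2 * log 2)
```

Under this formula, a requested rate of 0.01 corresponds to roughly 9.6 bits per entry, and each tenfold reduction in the rate costs roughly 4.8 more bits per entry.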

@@ -336,12 +336,12 @@ The worst-case in-memory size of an LSM-tree is $`O(n)`$.
 described in reference to the size of the database in [*memory
 pages*](https://en.wikipedia.org/wiki/Page_%28computer_memory%29).

-`OrdinaryIndex`
+`OrdinaryIndex`
 An ordinary index stores the maximum serialised key for each memory
 page. The total in-memory size of all indexes is proportional to the
 average size of one serialised key per memory page.

-`CompactIndex`
+`CompactIndex`
 A compact index stores the 64 most significant bits of the minimum
 serialised key for each memory page, as well as 1 bit per memory page
 to resolve clashes, 1 bit per memory page to mark overflow pages, and
@@ -361,7 +361,7 @@ is the in-memory write buffer and all following levels are sequences of
 on-disk runs. Each level has a maximum size. The maximum size of the
 write buffer is determined by the configuration parameter
 `confWriteBufferAlloc`. The maximum size of every other level $`l`$ is
-$`l \times T \times B`$. The constant $`B`$ refers to the write buffer
+$`l \cdot T \cdot B`$. The constant $`B`$ refers to the write buffer
 size and the constant $`T`$ refers to the size ratio. (See
 [Performance](#performance).)

lsm-tree.cabal

Lines changed: 65 additions & 2 deletions
@@ -185,15 +185,78 @@ description:
 The total size of an LSM-tree must not exceed \(2^{41}\) physical entries.
 Violation of this condition /is/ checked and will throw a 'TableTooLargeError'.

-=== Fine-tuning #fine_tuning#
+=== Fine-tuning Table Layout #fine_tuning#
+
+The configuration parameters @confMergePolicy@, @confMergeSchedule@, @confSizeRatio@, and @confWriteBufferAlloc@ affect the way in which the table organises its data.
+To understand what effect these parameters have, one must have a basic understanding of how an LSM-tree stores its data.
+An LSM-tree stores key–operation pairs, which pair a key with an operation such as an @Insert@ with a value or a @Delete@.
+These key–operation pairs are organised into /runs/, which are sequences of key–operation pairs sorted by their key.
+Runs are organised into /levels/, which are unordered sequences of runs.
+Levels are organised hierarchically.
+Level 0 is kept in memory, and is referred to as the /write buffer/.
+All subsequent levels are stored on disk, with each run stored in its own file.
+The following shows an example LSM-tree layout, with each run as a boxed sequence of keys and each level as a row.

+\[
+\begin{array}{l:l}
+\text{Level}
+&
+\text{Data}
+\\
+0
+&
+\fbox{\(\texttt{4}\,\_\)}
+\\
+1
+&
+\fbox{\(\texttt{1}\,\texttt{3}\)}
+\quad
+\fbox{\(\texttt{2}\,\texttt{7}\)}
+\\
+2
+&
+\fbox{\(\texttt{0}\,\texttt{2}\,\texttt{3}\,\texttt{4}\,\texttt{5}\,\texttt{6}\,\texttt{8}\,\texttt{9}\)}
+\end{array}
+\]
+
+The data in an LSM-tree is /partially sorted/: only the key–operation pairs within each run are sorted and deduplicated.
+As a rule of thumb, keeping more of the data sorted means lookup operations are faster but update operations are slower.
+
+The configuration parameters @confMergePolicy@, @confSizeRatio@, and @confWriteBufferAlloc@ directly affect the table layout.
+Let \(B\) refer to the value of @confWriteBufferAlloc@.
+Let \(T\) refer to the value of @confSizeRatio@.
+The write buffer can contain at most \(B\) entries.
+The size ratio \(T\) determines the ratio between the maximum number of entries in adjacent levels.
+For instance, if \(B = 2\) and \(T = 2\), then
+
+\[
+\begin{array}{l:l}
+\text{Level} & \text{Maximum Size}
+\\
+0 & B \cdot T^0 = 2
+\\
+1 & B \cdot T^1 = 4
+\\
+2 & B \cdot T^2 = 8
+\\
+\ell & B \cdot T^\ell
+\end{array}
+\]
+
+The @confSizeRatio@ determines the /size ratio/ between subsequent levels.
+The write buffer can contain at most \(B\) entries.
+This limit is determined by the @TableConfig@ parameter .
+Level 1 can contain at
+
+
+In the example below, each run is shown as a boxed sequence of keys, e.g., \(\fbox{\(\texttt{1}\,\texttt{3}\)}\), without showing the operation.
 An LSM-tree stores its data in a partially-sorted structure.
 The key–operation pairs are stored in /runs/, which are sorted sequences of key–operation pairs.
 The runs are organised in /levels/.
 The 0th level is the in-memory write buffer and all following levels are sequences of on-disk runs.
 Each level has a maximum size.
 The maximum size of the write buffer is determined by the configuration parameter @confWriteBufferAlloc@.
-The maximum size of every other level \(l\) is \(l \times T \times B\).
+The maximum size of every other level \(l\) is \(l \cdot T \cdot B\).
 The constant \(B\) refers to the write buffer size and the constant \(T\) refers to the size ratio.
 (See [Performance](#performance).)
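The level-size table added in this hunk can be reproduced with a few lines of Haskell. This is a minimal sketch that follows the \(B \cdot T^\ell\) form from the added table, not the \(l \cdot T \cdot B\) form in the retained context lines. The names `levelMaxEntries` and `levelCapacities` are hypothetical, and \(B\) is taken to be a plain entry count.

```haskell
-- Hypothetical helpers for illustration; not part of the lsm-tree API.

-- | Maximum number of entries in level l, given a write buffer capacity b
-- (in entries) and a size ratio t, following B * T^l from the added table.
-- The level number must be non-negative.
levelMaxEntries :: Integer -> Integer -> Int -> Integer
levelMaxEntries b t l = b * t ^ l

-- | Maximum sizes of levels 0 through n, in order.
levelCapacities :: Integer -> Integer -> Int -> [Integer]
levelCapacities b t n = [levelMaxEntries b t l | l <- [0 .. n]]
```

With \(B = 2\) and \(T = 2\), `levelCapacities 2 2 2` evaluates to `[2,4,8]`, matching the three numbered rows of the table.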
