Add metadata benchmarks #1055
File names must be truly random because we use them to traverse the fs tree when creating the DB, so RandString does not use our seeded random source. We still use a fixed size, so there shouldn't be any variance between runs.
If we used the seeded random source, all our filenames would be the same, and the only thing differentiating them would be depth level (e.g. file vs file/file vs file/file/file).
When looping through the TOC we maintain a map of node ID to metadata entry. Each metadata entry has a map of the node's children, keyed by child name. When we add children to a node's metadata entry, a new child overwrites any existing child that shares the same name. This never happens in practice, since you cannot have multiple child nodes/files with the same name under a single parent node/directory. With identical filenames, however, each parent would effectively keep only 1 child, so our metadata/nodes bucket would not be fully populated.
To avoid this, we use rand so we get an actual pseudo-random string. We still use a fixed length of 10 for the filename to ensure there isn't any variance in bbolt write performance between benchmark runs. (bbolt doesn't care about the content of a KV pair, since keys and values are just interpreted as byte slices; the length, however, does matter, since it controls how nodes/pages are split before writing to disk.)
Note: Since we want to measure write performance to disk, we have to write to a non-tmpfs location.
Add benchmark functions that measure sequential and concurrent writes to the underlying metadata DB.

Signed-off-by: Yasin Turan <turyasin@amazon.com>
Issue #, if available:
Description of changes:
Add benchmark tests that measure metadata DB insertion performance. Add a helper function that generates a random TAR file (TOC) with a given number of files/entries.
Testing performed:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.