-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Ingest sst files rather than their keyvalue content. #12079
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Made this a draft. The imported SST files are not 'visible' to the db. |
@@ -58,7 +58,7 @@ function start_fdb_cluster { | |||
"${local_build_dir}" \ | |||
--knobs "${knobs}" \ | |||
--stateless_count 1 --replication_count 1 --logs_count 1 \ | |||
--storage_count "${ss_count}" --storage_type ssd \ | |||
--storage_count "${ss_count}" --storage_type ssd-rocksdb-v1 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to randomly choose between sqlite and rocksdb? for testing coverage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dunno. rocksdb is the standard, not sqllite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want bulkload feature to stably work for both sqlite and rocksdb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me make a new PR for this that allows setting of storage engine to use in ctest.
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
options.snapshot_consistency = false; | ||
options.allow_global_seqno = false; | ||
options.allow_blocking_flush = false; | ||
options.verify_checksums_before_ingest = true; // Verify checksums before ingestion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to setup anything particularly for the file checksum when creating the sst files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a rocksdb thing -- checksums on blocks inside the SST. Nothing for us to do on our part. The flag says whether to check before making the data live. Restore sets it to true. sharded rocks has it as a command-line option.
} | ||
|
||
// Verify the ingestion was successful | ||
for (const auto& file : sstFiles) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is verifying the sst files have been downloaded rather than ingestion succeeded. Do I understand correctly? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verifying the files have been moved into rocksdb. We set this flag options.move_files = true; // Move files instead of copying
... and then do the rocksdb rocksdb::Status status = db->IngestExternalFile(sstFiles, options);
This verify is checking rocksdb moved the files...that they are no longer in their old location because rocksdb moved them.
@@ -3870,6 +3870,77 @@ struct ShardedRocksDBKeyValueStore : IKeyValueStore { | |||
Future<Void> refreshRocksDBBackgroundWorkHolder; | |||
Future<Void> cleanUpJob; | |||
Future<Void> counterLogger; | |||
|
|||
Future<Void> ingestSSTFiles(std::string bulkLoadLocalDir, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May consider to include the shardedrocksdb support in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm excluding sharded rocksdb from this PR because of your other suggestion later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
tests/fast/BulkDumping.toml
Outdated
@@ -1,6 +1,4 @@ | |||
[configuration] | |||
storageEngineExcludeTypes = [5] #FIXME: remove after allowing to do bulkloading with fetchKey and shardedrocksdb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, we still want to exclude sharded rocksdb because the ss metadata change for the sharded rocksdb engine is still working in progress. Currently, we do not support sharded rocksdb for bulk loading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Let me fix.
state bool sstIngestionPerformed = false; | ||
// If SST ingestion is enabled and we have SST files, try to ingest them directly | ||
if (SERVER_KNOBS->BULK_LOAD_USE_SST_INGEST && bulkLoadTaskState.getLoadType() == BulkLoadType::SST && | ||
data->storage.getKeyValueStore()->supportsSstIngestion()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this supportsSstIngestion()
!
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
fdbserver/storageserver.actor.cpp
Outdated
.detail("Phase", "File download"); | ||
hold = tryGetRangeForBulkLoad(results, keys, localBulkLoadFileSets); | ||
rangeEnd = keys.end; | ||
state bool sstIngestionPerformed = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you put state sstIngestionPerformed at the beginning of the fetchKey? Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-clang-arm on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-clang-arm on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr-clang-arm on Linux CentOS 7
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr-clang-arm on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7 ... Result: ❌ FAILED Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1 : && /usr/bin/cc -O3 -DNDEBUG -gz -static-libstdc++ -static-libgcc bindings/c/CMakeFiles/fdb_c_performance_test.dir/test/performance_test.c.o -o bin/fdb_c_performance_test -Wl,-rpath,/codebuild/output/src188334276/src/github.com/apple/foundationdb/build_output/lib lib/libfdb_c.so && : /usr/bin/ld: lib/libfdb_c.so: undefined reference to `std::ios_base_library_init()'
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr-clang-arm on Linux CentOS 7
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr-clang-arm on Linux CentOS 7
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr on Linux CentOS 7
|
Up the log number count so tests pass joshua 500k run.
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr-clang-arm on Linux CentOS 7
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr on Linux CentOS 7
|
Here are 500k Joshua jobs w/ BulkDumping and then BulkLoading:
` ` I did a load, dump, load and the ingest jobs take just over 1% less time (small potatoes!) Here are two bulk loads w/o ingest enabled:
Here is w/ ingest enabled:
Outstanding is making the ingest async but will do in a follow-on PR. |
data->storage.getKeyValueStore()->clear(keys); | ||
// Now wait on the durableVersion to be updated so clear has been committed. | ||
wait(data->durableVersion.whenAtLeast(data->storageVersion() + 1)); | ||
// TODO: this is a blocking call. We should make this as async call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this comment
// Now wait on the durableVersion to be updated so clear has been committed. | ||
wait(data->durableVersion.whenAtLeast(data->storageVersion() + 1)); | ||
// TODO: this is a blocking call. We should make this as async call. | ||
// TODO: Should we compact the range before ingestion? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so
fdbclient/ServerKnobs.cpp
fdbclient/include/fdbclient/ServerKnobs.h
Add BULK_LOAD_USE_SST_INGEST knob. Its enabled by default.
fdbclient/include/fdbclient/IKeyValueStore.actor.h
Add ingestSSTFiles and supportSStIngestion.
fdbserver/KeyValueStoreRocksDB.actor.cpp
fdbserver/KeyValueStoreShardedRocksDB.actor.cpp
Implement ingestSSTFiles and supportSStIngestion
fdbserver/storageserver.actor.cpp
If BULK_LOAD_USE_SST_INGEST and BulkLoadType::SST, ingest sst file rather than read keyvalues.
fdbclient/tests/fdb_cluster_fixture.sh
Use rocksdb instead of sqllite in tests.