Skip to content

Ingest sst files rather than their keyvalue content. #12079

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

saintstack
Copy link
Contributor

@saintstack saintstack commented Apr 7, 2025

  • fdbclient/ServerKnobs.cpp

  • fdbclient/include/fdbclient/ServerKnobs.h
    Add BULK_LOAD_USE_SST_INGEST knob. Its enabled by default.

  • fdbclient/include/fdbclient/IKeyValueStore.actor.h
    Add ingestSSTFiles and supportSStIngestion.

  • fdbserver/KeyValueStoreRocksDB.actor.cpp

  • fdbserver/KeyValueStoreShardedRocksDB.actor.cpp
    Implement ingestSSTFiles and supportSStIngestion

  • fdbserver/storageserver.actor.cpp
    If BULK_LOAD_USE_SST_INGEST and BulkLoadType::SST, ingest sst file rather than read keyvalues.

  • fdbclient/tests/fdb_cluster_fixture.sh
    Use rocksdb instead of sqllite in tests.

@saintstack saintstack marked this pull request as draft April 7, 2025 22:32
@saintstack
Copy link
Contributor Author

Made this a draft. The imported SST files are not 'visible' to the db.

@saintstack saintstack requested review from kakaiu April 7, 2025 22:33
@@ -58,7 +58,7 @@ function start_fdb_cluster {
"${local_build_dir}" \
--knobs "${knobs}" \
--stateless_count 1 --replication_count 1 --logs_count 1 \
--storage_count "${ss_count}" --storage_type ssd \
--storage_count "${ss_count}" --storage_type ssd-rocksdb-v1 \
Copy link
Member

@kakaiu kakaiu Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to randomly choose between sqlite and rocksdb? for testing coverage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dunno. rocksdb is the standard, not sqllite.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want bulkload feature to stably work for both sqlite and rocksdb

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me make a new PR for this that allows setting of storage engine to use in ctest.

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: 809adcc
  • Duration 0:18:18
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

options.snapshot_consistency = false;
options.allow_global_seqno = false;
options.allow_blocking_flush = false;
options.verify_checksums_before_ingest = true; // Verify checksums before ingestion
Copy link
Member

@kakaiu kakaiu Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to setup anything particularly for the file checksum when creating the sst files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a rocksdb thing -- checksums on blocks inside the SST. Nothing for us to do on our part. The flag says whether to check before making the data live. Restore sets it to true. sharded rocks has it as a command-line option.

}

// Verify the ingestion was successful
for (const auto& file : sstFiles) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is verifying the sst files have been downloaded rather than ingestion succeeded. Do I understand correctly? Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verifying the files have been moved into rocksdb. We set this flag options.move_files = true; // Move files instead of copying ... and then do the rocksdb rocksdb::Status status = db->IngestExternalFile(sstFiles, options); This verify is checking rocksdb moved the files...that they are no longer in their old location because rocksdb moved them.

@@ -3870,6 +3870,77 @@ struct ShardedRocksDBKeyValueStore : IKeyValueStore {
Future<Void> refreshRocksDBBackgroundWorkHolder;
Future<Void> cleanUpJob;
Future<Void> counterLogger;

Future<Void> ingestSSTFiles(std::string bulkLoadLocalDir,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May consider to include the shardedrocksdb support in a separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm excluding sharded rocksdb from this PR because of your other suggestion later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@@ -1,6 +1,4 @@
[configuration]
storageEngineExcludeTypes = [5] #FIXME: remove after allowing to do bulkloading with fetchKey and shardedrocksdb
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, we still want to exclude sharded rocksdb because the ss metadata change for the sharded rocksdb engine is still working in progress. Currently, we do not support sharded rocksdb for bulk loading.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Let me fix.

state bool sstIngestionPerformed = false;
// If SST ingestion is enabled and we have SST files, try to ingest them directly
if (SERVER_KNOBS->BULK_LOAD_USE_SST_INGEST && bulkLoadTaskState.getLoadType() == BulkLoadType::SST &&
data->storage.getKeyValueStore()->supportsSstIngestion()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this supportsSstIngestion()!

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 809adcc
  • Duration 0:37:55
  • Result: ❌ FAILED
  • Error: Error while executing command: ctest -j ${NPROC} --no-compress-output -T test --output-on-failure. Reason: exit status 8
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 809adcc
  • Duration 0:39:15
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

.detail("Phase", "File download");
hold = tryGetRangeForBulkLoad(results, keys, localBulkLoadFileSets);
rangeEnd = keys.end;
state bool sstIngestionPerformed = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put state sstIngestionPerformed at the beginning of the fetchKey? Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 809adcc
  • Duration 0:47:07
  • Result: ❌ FAILED
  • Error: Error while executing command: ctest -j ${NPROC} --no-compress-output -T test --output-on-failure. Reason: exit status 8
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: 809adcc
  • Duration 0:49:11
  • Result: ❌ FAILED
  • Error: Error while executing command: ctest -j ${NPROC} --no-compress-output -T test --output-on-failure. Reason: exit status 8
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: 809adcc
  • Duration 0:57:50
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 809adcc
  • Duration 1:05:08
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: ca98e4e
  • Duration 0:22:29
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: ca98e4e
  • Duration 0:35:10
  • Result: ❌ FAILED
  • Error: Error while executing command: ctest -j ${NPROC} --no-compress-output -T test --output-on-failure. Reason: exit status 8
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: ca98e4e
  • Duration 0:39:08
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: ca98e4e
  • Duration 0:47:11
  • Result: ❌ FAILED
  • Error: Error while executing command: ctest -j ${NPROC} --no-compress-output -T test --output-on-failure. Reason: exit status 8
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: ca98e4e
  • Duration 0:47:51
  • Result: ❌ FAILED
  • Error: Error while executing command: ctest -j ${NPROC} --no-compress-output -T test --output-on-failure. Reason: exit status 8
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: ca98e4e
  • Duration 1:03:08
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: ca98e4e
  • Duration 1:06:03
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: aa349c9
  • Duration 0:22:20
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: aa349c9
  • Duration 0:23:14
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: aa349c9
  • Duration 0:28:57
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: aa349c9
  • Duration 0:29:54
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: aa349c9
  • Duration 0:29:57
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: b01100a
  • Duration 0:48:31
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Result of foundationdb-pr-cluster-tests on Linux CentOS 7
...
Result: ❌ FAILED
Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1

: && /usr/bin/cc -O3 -DNDEBUG -gz -static-libstdc++ -static-libgcc bindings/c/CMakeFiles/fdb_c_performance_test.dir/test/performance_test.c.o -o bin/fdb_c_performance_test  -Wl,-rpath,/codebuild/output/src188334276/src/github.com/apple/foundationdb/build_output/lib  lib/libfdb_c.so && :
/usr/bin/ld: lib/libfdb_c.so: undefined reference to `std::ios_base_library_init()'
@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: b01100a
  • Duration 1:06:16
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: ccaefec
  • Duration 0:11:40
  • Result: ❌ FAILED
  • Error: Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /usr/local/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: ccaefec
  • Duration 0:25:14
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: ccaefec
  • Duration 0:36:34
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: ccaefec
  • Duration 0:37:23
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: ccaefec
  • Duration 0:49:55
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: ccaefec
  • Duration 1:07:04
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: ccaefec
  • Duration 1:07:39
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: fbef2af
  • Duration 0:12:21
  • Result: ❌ FAILED
  • Error: Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /usr/local/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: fbef2af
  • Duration 0:23:32
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: fbef2af
  • Duration 0:37:13
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: fbef2af
  • Duration 0:37:45
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: fbef2af
  • Duration 0:51:11
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: fbef2af
  • Duration 1:04:25
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: fbef2af
  • Duration 1:06:21
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Up the log number count so tests pass joshua 500k run.
@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 68fcae0
  • Duration 0:12:17
  • Result: ❌ FAILED
  • Error: Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /usr/local/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: 68fcae0
  • Duration 0:24:44
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: 68fcae0
  • Duration 0:36:24
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 68fcae0
  • Duration 0:39:03
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: 68fcae0
  • Duration 0:49:38
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 68fcae0
  • Duration 1:04:21
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 68fcae0
  • Duration 1:06:40
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@saintstack
Copy link
Contributor Author

Here are 500k Joshua jobs w/ BulkDumping and then BulkLoading:

 j start --tarball  /root/BulkDumping.tar.gz  --max-run 500000
  20250420-033250-stack-be9928cc336589cc             compressed=True data_size=41160278 duration=14821806 ended=500000 fail_fast=10 max_runs=500000 pass=500000 priority=100 remaining=0 runtime=2:15:43 sanity=False started=500000 stopped=20250420-054833 submitted=20250420-033250 timeout=5400 username=stack

`
20250420-224822-stack-9d09ef187b985791 compressed=True data_size=41160042 duration=7969925 ended=500000 fail_fast=10 max_runs=500000 pass=500000 priority=100 remaining=0 runtime=2:33:58 sanity=False started=500000 stopped=20250421-012220 submitted=20250420-224822 timeout=5400 username=stack

`

I did a load, dump, load and the ingest jobs take just over 1% less time (small potatoes!)

Here are two bulk loads w/o ingest enabled:

Job 554e3d77004491f9ae913466a6a09a7a submitted at 1744797947.089033 for range { begin=  end=\xff }. The job has 131490 tasks. The job ran for 465.223679 mins and exited with status Complete.
Job 996978645a55f766aab287c9fb7670a9 submitted at 1744765417.841615 for range { begin=  end=\xff }. The job has 287955 tasks. The job ran for 449.324693 mins and exited with status Complete.

Here is w/ ingest enabled:


Job 1f492e488499f6759528cf7f7e9ac3ad submitted at 1745028163.639070 for range { begin=  end=\xff }. The job has 131280 tasks. The job ran for 459.522028 mins and exited with status Complete.
Job 996978645a55f766aab287c9fb7670a9 submitted at 1744996049.799243 for range { begin=  end=\xff }. The job has 287955 tasks. The job ran for 444.571992 mins and exited with status Complete.

Outstanding is making the ingest async but will do in a follow-on PR.

@saintstack saintstack mentioned this pull request Apr 21, 2025
5 tasks
data->storage.getKeyValueStore()->clear(keys);
// Now wait on the durableVersion to be updated so clear has been committed.
wait(data->durableVersion.whenAtLeast(data->storageVersion() + 1));
// TODO: this is a blocking call. We should make this as async call.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this comment

// Now wait on the durableVersion to be updated so clear has been committed.
wait(data->durableVersion.whenAtLeast(data->storageVersion() + 1));
// TODO: this is a blocking call. We should make this as async call.
// TODO: Should we compact the range before ingestion?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants