Symbol list and storage lock improvements #2359
Conversation
def compact_symbol_list_worker(real_s3_storage_factory, lib_name, run_time):
    # Decrease the lock wait times to make lock failures more likely
What do you think about attaching a FailureSimulator to the real_s3_storage and introducing slowdowns instead of having to decrease the timeout? I'm 80% sure that with such a low WaitMs almost none of the storage locks will succeed, because they'll take more than 1.5ms.
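To make the suggestion concrete, here is a minimal sketch of injecting write slowdowns from the Python test side. It deliberately does not use ArcticDB's StorageFailureSimulator bindings (which may expose this more directly); it just wraps the library's write call with a random delay, and with_random_write_delay and its parameters are hypothetical test helpers.

import random
import time

def with_random_write_delay(lib, min_s=0.001, max_s=0.05):
    # Hypothetical helper: wrap lib.write with a random sleep so storage writes
    # look slow, instead of shrinking the lock's WaitMs to force lock failures.
    original_write = lib.write

    def slow_write(*args, **kwargs):
        time.sleep(random.uniform(min_s, max_s))  # simulated storage latency
        return original_write(*args, **kwargs)

    lib.write = slow_write
    return lib

Attaching the slowdown at the storage layer via the FailureSimulator would have the same effect but would also slow down the lock's own ref-key writes, which is what makes lock contention more likely without touching the timeout.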
    while time.time() - start_time < run_time:
        id = cnt * step_symbol_id + first_symbol_id
        lib.write(f"sym_{id}", df)
        cnt += 1
Should we introduce some (maybe optional) sleeps? Doing writes and compactions in a very hot loop feels like a very high-stress test. I think that without sleeps the different threads might be less interleaved than we expect.
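For illustration, the writer loop above could take an optional jitter parameter. The sleep_range parameter and its values are made up here; only the loop body comes from the diff.

import random
import time

def write_symbols(lib, df, first_symbol_id, step_symbol_id, start_time, run_time, sleep_range=None):
    # Same hot loop as in the diff, with an optional randomized pause between
    # writes so writer and compactor processes interleave more realistically.
    cnt = 0
    while time.time() - start_time < run_time:
        id = cnt * step_symbol_id + first_symbol_id
        lib.write(f"sym_{id}", df)
        cnt += 1
        if sleep_range is not None:
            time.sleep(random.uniform(*sleep_range))  # e.g. sleep_range=(0.001, 0.01)
    return cnt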
I'll just remove the loop and use the FailureSimulator with some probability, so each process writes/compacts only once; that is more deterministic.
    while not results_queue.empty():
        first_id, cnt = results_queue.get()
        expected_symbol_list.update([f"sym_{first_id + i*num_writers}" for i in range(cnt)])
It would be nice to verify the state of the symbol list entries after all threads are finished, e.g. that there is just one compacted symbol list entry. That would mean no racing compactions happened and at least one compaction succeeded (which I think is not the case right now, because of the previous comment about a too low WaitMs).
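A sketch of what that post-run check could look like, assuming the internal library tool is reachable via lib._nvs.library_tool(), that KeyType is importable from arcticdb_ext.storage, and that compacted symbol list entries use the "__symbols__" stream id. All of these are assumptions about ArcticDB internals rather than a documented API.

from arcticdb_ext.storage import KeyType  # assumption: KeyType enum lives here

def assert_single_compacted_symbol_list_entry(lib):
    # Assumption: library_tool() and find_keys() exist on the underlying version
    # store, and the compacted entry is the SYMBOL_LIST key with id "__symbols__".
    lib_tool = lib._nvs.library_tool()
    symbol_list_keys = lib_tool.find_keys(KeyType.SYMBOL_LIST)
    compacted = [k for k in symbol_list_keys if str(k.id) == "__symbols__"]
    assert len(compacted) == 1, f"Expected exactly one compacted entry, found {len(compacted)}"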
-do_wait:
-    while (ref_key_exists(store)) {
+    while (!try_acquire_lock(store)) {
We have introduced a new failure mode which can be caused by slow writes. We've seen writes getting very slow for prolonged intervals of time on VAST. I'm worried that the default timeout_ms = std::nullopt can cause us to loop indefinitely without any indication as to what's going on.
I think we should do one of the following:
- Not allow a nullopt timeout
- Introduce logging inside the loop saying "Failed to acquire lock, will attempt again in X ms"
- Promote the debug logging due to a long write to at least an info log
ARCTICDB_DEBUG(log::lock(), "Waited for {} ms, thread id: {}", lock_sleep_ms, std::this_thread::get_id());
auto read_ts = read_timestamp(store);
auto duration = ClockType::coarse_nanos_since_epoch() - start;
auto duration_in_ms = duration / ONE_MILLISECOND;
Nit: I think you'll need a [[maybe_unused]] here, because the debug log macro can be compiled out by the preprocessor, which hides the variable's usage.
@@ -184,6 +194,7 @@ class StorageLock {
    timestamp create_ref_key(const std::shared_ptr<Store>& store) {
        auto ts = ClockType::nanos_since_epoch();
        StorageFailureSimulator::instance()->go(FailureType::WRITE_LOCK);
I don't think that having a separate failure type for the lock is useful. I think that:
- Allowing slowdowns for other write operations is useful
- Allowing regular write exception simulations could also be useful for the storage lock
- Calling FailureSimulator->go on a higher level than the other calls can be confusing
I don't feel too strongly about this, so I'm happy to leave it as is if it is a lot of work to refactor.