fix(nemesis): increase CREATE_MV and CREATE_INDEX soft timeouts to 5 hours #15177
yarongilor wants to merge 1 commit into
✅ Test Summary: PASSED
✅ Precommit: PASSED
✅ Tests: PASSED
fruch left a comment
https://scylladb.atlassian.net/browse/SCT-353 is a collection of reports from all kinds of causes and releases.
I don't see in any of them an analysis suggesting that we should wait longer. We should focus on what is slowing it down (disk/CPU/tombstones) and fix the source of the problem; waiting longer is a waste of our time and resources.
We are working on scaling those waits by some factor based on the feature, and we should align these with that as well. It shouldn't be increased on every slight issue.
…hours
Observed CREATE_MV taking ~15453s and CREATE_INDEX (wait for index build) taking ~15857s, both exceeding the previous 14400s (4h) soft timeout and triggering SoftTimeoutEvent followed by FailedResultEvent from Argus validation.
Increase the soft timeout from 14400s to 18000s (5h) for all three adaptive_timeout calls in disrupt_create_index, disrupt_add_remove_mv, and disrupt_add_drop_mv_with_node_restarts.
Fixes: https://scylladb.atlassian.net/browse/SCT-353
069cddd to ae561de
@fruch ,
It's continuing now; you are just getting notifications about the soft limit. Changing this would bury the discussion. It's clear in some of the reproducer runs that the case doesn't have enough CPU, and the MV probably doesn't have enough resources to finish in a timely fashion. Running longevity at close to 100% CPU utilization is one of many things that wouldn't work.
Observed CREATE_MV taking ~15453s and CREATE_INDEX (wait for index build) taking ~15857s, both exceeding the previous 14400s (4h) soft timeout and triggering SoftTimeoutEvent followed by FailedResultEvent from Argus validation.
Increase the soft timeout from 14400s to 18000s (5h) for all three adaptive_timeout calls in disrupt_create_index, disrupt_add_remove_mv, and disrupt_add_drop_mv_with_node_restarts.
Fixes: https://scylladb.atlassian.net/browse/SCT-353
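For reference, the arithmetic behind the change can be sketched as below. This is an illustrative sketch only: `soft_timeout`, `exceeds`, and `headroom` are hypothetical helpers written for this example and are not SCT's `adaptive_timeout` API. The durations and limits are the ones quoted above.

```python
import time
from contextlib import contextmanager

OLD_SOFT_TIMEOUT = 14400  # seconds (4h), previous limit quoted in this PR
NEW_SOFT_TIMEOUT = 18000  # seconds (5h), limit proposed in this PR

# Durations reported in the PR description, in seconds.
OBSERVED = {"CREATE_MV": 15453, "CREATE_INDEX": 15857}


def exceeds(duration, limit):
    """Return True when an operation ran longer than the soft limit."""
    return duration > limit


def headroom(duration, limit):
    """Seconds left before the soft limit would be crossed (negative if crossed)."""
    return limit - duration


@contextmanager
def soft_timeout(name, limit, events, clock=time.monotonic):
    """Hypothetical soft timeout: never interrupts the wrapped operation,
    only records an event afterwards if it took longer than `limit`."""
    start = clock()
    try:
        yield
    finally:
        elapsed = clock() - start
        if elapsed > limit:
            events.append((name, elapsed, limit))


if __name__ == "__main__":
    for op, duration in OBSERVED.items():
        print(op, "old:", headroom(duration, OLD_SOFT_TIMEOUT),
              "new:", headroom(duration, NEW_SOFT_TIMEOUT))
```

Under these numbers both operations cross the 4h limit, and the slower one (CREATE_INDEX) would finish 2143s, about 36 minutes, inside the 5h limit.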
Testing
PR pre-checks (self review)
backport labels
Reminders
sdcm/sct_config.py)
unit-test/ folder)