full-scan read timeout issues #9994
-
cc: @pehala, @temichus, @aleksbykov, @fruch, @roydahan
-
This also looks wrong: full-scan reports a successful run on a decommissioned node.
-
@dkropachev, following the resolution of scylladb/scylladb#22911, could this issue also be related to scylladb/cassandra-stress#30?
-
C-S is not using the retry mechanism of the driver (neither are scylla-bench and latte). It might be a driver issue with the retry policy being used, i.e. the exponential backoff one. Also, why are we discussing this here and not in an issue?
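For context, here is a minimal sketch (in Python, since scylla-cluster-tests is Python-based) of how a read-timeout retry policy can be attached to a driver session. The 3-retry budget and the policy behavior are assumptions for illustration only, not what any of these tools actually configure:

```python
# Minimal sketch of a driver-side retry policy (Python cassandra/scylla driver).
# The retry budget of 3 is a hypothetical value, not taken from any real tool.
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import RetryPolicy

class RetryReadTimeouts(RetryPolicy):
    """Retry read timeouts a few times at the same consistency level."""

    def on_read_timeout(self, query, consistency, required_responses,
                        received_responses, data_retrieved, retry_num):
        if retry_num < 3:  # hypothetical retry budget
            return self.RETRY, consistency
        return self.RETHROW, None

profile = ExecutionProfile(retry_policy=RetryReadTimeouts())
cluster = Cluster(execution_profiles={EXEC_PROFILE_DEFAULT: profile})
session = cluster.connect()
```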
-
This discussion is related to #9284
The problem: there are read timeouts for the full-scan thread during the rolling-restart nemesis; other tools like c-s and s-b don't experience such issues.
There could be a few directions to follow:
The following scenario was tested in order to verify that the CQL patient connection works as expected:
master...yarongilor:scylla-cluster-tests:check_qcl_connection
This test essentially passed.
So we can tentatively conclude, for example, that the "node" parameter of the CQL patient connection is unneeded and confusing.
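To illustrate why that parameter can be confusing, here is a minimal sketch of how the Python driver uses a given node only as an initial contact point and then load-balances across all live nodes (the address is hypothetical):

```python
# Sketch: the driver uses contact_points only for initial discovery; once
# connected, it talks to all live nodes, so queries can keep succeeding even
# after the original contact node is restarted or decommissioned.
from cassandra.cluster import Cluster

cluster = Cluster(contact_points=["10.0.0.1"])  # hypothetical address of node-1
session = cluster.connect()
# ... node-1 goes down or is decommissioned here ...
session.execute("SELECT key FROM system.local")  # routed to another live node
```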
The unexpected "bug" found in the test code is that I expected it to remove the three original cluster nodes (1, 2, and 3), but it removed nodes 1, 3, and 5 instead.
Argus link
As a next step, we can think of a minimal reproducer for a rolling restart nemesis + background full-scan queries.
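A rough sketch of what the background full-scan part of such a reproducer could look like, assuming a hypothetical keyspace1.standard1 table; the contact points, fetch size, and pacing are placeholders:

```python
# Background full-scan loop to run while the rolling-restart nemesis cycles
# the nodes; all addresses and names below are hypothetical placeholders.
import time

from cassandra import ReadTimeout
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(contact_points=["10.0.0.1", "10.0.0.2", "10.0.0.3"])
session = cluster.connect()

# Page through the scan instead of issuing one huge read.
stmt = SimpleStatement("SELECT * FROM keyspace1.standard1", fetch_size=1000)

while True:
    try:
        rows_read = sum(1 for _ in session.execute(stmt))  # drain all pages
        print(f"full scan ok, {rows_read} rows")
    except ReadTimeout as exc:
        print(f"read timeout during full scan: {exc}")
    time.sleep(10)  # placeholder pacing between scans
```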