Description
Installation details
Kernel Version: 5.15.0-1026-aws
Scylla version (or git commit hash): 5.2.0~dev-20221209.6075e01312a5
with build-id 0e5d044b8f9e5bdf7f53cc3c1e959fab95bf027c
Cluster size: 9 nodes (i3.2xlarge)
Scylla Nodes used in this run:
- longevity-counters-multidc-master-db-node-7785df01-9 (54.157.115.162 | 10.12.2.62) (shards: 7)
- longevity-counters-multidc-master-db-node-7785df01-8 (3.238.92.3 | 10.12.2.95) (shards: 7)
- longevity-counters-multidc-master-db-node-7785df01-7 (3.236.190.51 | 10.12.0.119) (shards: 7)
- longevity-counters-multidc-master-db-node-7785df01-6 (54.212.64.38 | 10.15.0.77) (shards: 7)
- longevity-counters-multidc-master-db-node-7785df01-5 (35.92.94.31 | 10.15.3.207) (shards: 7)
- longevity-counters-multidc-master-db-node-7785df01-4 (34.219.193.110 | 10.15.3.94) (shards: 7)
- longevity-counters-multidc-master-db-node-7785df01-3 (52.213.121.166 | 10.4.0.42) (shards: 7)
- longevity-counters-multidc-master-db-node-7785df01-2 (54.229.18.181 | 10.4.2.143) (shards: 7)
- longevity-counters-multidc-master-db-node-7785df01-1 (34.245.75.18 | 10.4.0.195) (shards: 7)
OS / Image: ami-0b85d6f35bddaff65 ami-0a1ff01b931943772 ami-08e5c2ae0089cade3 (aws: eu-west-1)
Test: longevity-counters-6h-multidc-test
Test id: 7785df01-a1fe-483a-beb7-2f63b9044b87
Test name: scylla-master/raft/longevity-counters-6h-multidc-test
Test config file(s):
Issue description
The counters test in the multi-DC scenario fails persistently after altering the table. For example, after running any of the following:
- ALTER TABLE scylla_bench.test_counters WITH bloom_filter_fp_chance = 0.45374057709882093;
- ALTER TABLE scylla_bench.test_counters WITH read_repair_chance = 0.9;
- ALTER TABLE scylla_bench.test_counters WITH comment = 'IHQS6RAYS5VQ6CQZYBYEX1GP';

After such a change, scylla-bench fails with the following error:
2022/12/09 15:26:29 error: failed to connect to "[HostInfo hostname=\"10.12.0.119\" connectAddress=\"10.12.0.119\" peer=\"<nil>\" rpc_address=\"10.12.0.119\" broadcast_address=\"10.12.0.119\" preferred_ip=\"<nil>\" connect_addr=\"10.12.0.119\" connect_addr_source=\"connect_address\" port=9042 data_centre=\"us-eastscylla_node_east\" rack=\"1a\" host_id=\"ec773dfb-ef87-4ab8-abbf-190e3e082e4c\" version=\"v3.0.8\" state=DOWN num_tokens=256]" due to error: gocql: no response to connection startup within timeout
Later the connection appears to recover, so the connectivity issue is not permanent, but it is enough to fail the test critically and end the run. A client-side sketch of where this error surfaces is included after this paragraph.
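For context, here is a minimal sketch (an assumption, not the actual scylla-bench or SCT code) of a gocql client following the same connection path. The node address, keyspace/table name, and ALTER statement are taken from the report above; the timeout values are gocql's defaults, and everything else is illustrative.

```go
// Minimal sketch (assumption: not the actual scylla-bench or SCT code) of a
// gocql client hitting the same connection path described in this report.
package main

import (
	"log"
	"time"

	"github.com/gocql/gocql"
)

func main() {
	cluster := gocql.NewCluster("10.12.0.119") // node from the error above
	cluster.Port = 9042
	cluster.ConnectTimeout = 600 * time.Millisecond // gocql default; STARTUP must complete within this window
	cluster.Timeout = 600 * time.Millisecond        // gocql default query timeout

	session, err := cluster.CreateSession()
	if err != nil {
		// "gocql: no response to connection startup within timeout" surfaces here
		// when the node accepts the TCP connection but does not finish the
		// CQL STARTUP exchange in time.
		log.Fatalf("failed to connect: %v", err)
	}
	defer session.Close()

	// Schema change equivalent to what the nemesis ran during the test.
	if err := session.Query(
		"ALTER TABLE scylla_bench.test_counters WITH comment = 'IHQS6RAYS5VQ6CQZYBYEX1GP'",
	).Exec(); err != nil {
		log.Fatalf("ALTER TABLE failed: %v", err)
	}
	log.Println("schema altered; in the test, the counter workload's reconnect attempts failed around this point")
}
```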
- Restore Monitor Stack command:
$ hydra investigate show-monitor 7785df01-a1fe-483a-beb7-2f63b9044b87
- Restore monitor on AWS instance using Jenkins job
- Show all stored logs command:
$ hydra investigate show-logs 7785df01-a1fe-483a-beb7-2f63b9044b87
Logs:
| 20221209_161654 | grafana | https://cloudius-jenkins-test.s3.amazonaws.com/7785df01-a1fe-483a-beb7-2f63b9044b87/20221209_161654/grafana-screenshot-longevity-counters-6h-multidc-test-scylla-per-server-metrics-nemesis-20221209_161803-longevity-counters-multidc-master-monitor-node-7785df01-1.png |
| 20221209_161654 | grafana | https://cloudius-jenkins-test.s3.amazonaws.com/7785df01-a1fe-483a-beb7-2f63b9044b87/20221209_161654/grafana-screenshot-overview-20221209_161654-longevity-counters-multidc-master-monitor-node-7785df01-1.png |
| 20221209_162553 | db-cluster | https://cloudius-jenkins-test.s3.amazonaws.com/7785df01-a1fe-483a-beb7-2f63b9044b87/20221209_162553/db-cluster-7785df01.tar.gz |
| 20221209_162553 | loader-set | https://cloudius-jenkins-test.s3.amazonaws.com/7785df01-a1fe-483a-beb7-2f63b9044b87/20221209_162553/loader-set-7785df01.tar.gz |
| 20221209_162553 | monitor-set | https://cloudius-jenkins-test.s3.amazonaws.com/7785df01-a1fe-483a-beb7-2f63b9044b87/20221209_162553/monitor-set-7785df01.tar.gz |
| 20221209_162553 | sct | https://cloudius-jenkins-test.s3.amazonaws.com/7785df01-a1fe-483a-beb7-2f63b9044b87/20221209_162553/sct-runner-7785df01.tar.gz |