Packages
Scylla version: 2026.1.0~dev-20251104.fc37518affc8 with build-id d0aed830ec418cca7e757a143b3b85b120c9e396
Kernel Version: 6.14.0-1016-aws
Issue description
versions:
gemini-gocql-driver v1.15.3 2025-09-06T16:49:42Z e35803084ebafd200e3f7fd74a5be5dfdb409b2d
gemini 2.1.5 2025-10-14T18:23:43Z 1bad12f14f6832dbdc2211627079d91d8c610bf3
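For reference, the connection-related flags in the gemini command quoted below (`--test-host-selection-policy=token-aware`, `--consistency=QUORUM`, `--request-timeout=3s`, `--connect-timeout=60s`) map roughly onto a gocql cluster configuration like the following minimal sketch. This is illustrative only, not gemini's actual code; the hosts are taken from the `--test-cluster` flag:

```go
package main

import (
	"log"
	"time"

	"github.com/gocql/gocql"
)

func main() {
	// Hosts from the --test-cluster flag in the gemini command below.
	cluster := gocql.NewCluster(
		"10.4.0.102", "10.4.0.86", "10.4.0.43",
		"10.4.3.77", "10.4.2.13", "10.4.2.150",
	)
	cluster.Consistency = gocql.Quorum        // --consistency=QUORUM
	cluster.Timeout = 3 * time.Second         // --request-timeout=3s
	cluster.ConnectTimeout = 60 * time.Second // --connect-timeout=60s
	// --test-host-selection-policy=token-aware
	cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(gocql.RoundRobinHostPolicy())

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatalf("create session: %v", err)
	}
	defer session.Close()
}
```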
The gemini stress tool ran on loader-1.
At some point it failed with an "invalid number of shards" panic from the gocql driver while connecting to db-node-2 (10.4.0.86):
2025-11-05 20:20:56.012: (GeminiStressEvent Severity.ERROR) period_type=end event_id=05d2fbed-cff6-42d9-b164-6c59729889d1 during_nemesis=NoCorruptRepair duration=12h23m40s: node=Node gemini-tombstones-sequence-gemini-t-loader-node-95c071d7-1 [108.130.4.141 | 10.4.2.15] (Type: m6i.2xlarge) (rack: RACK0) gemini_cmd=gemini --test-cluster="10.4.0.102,10.4.0.86,10.4.0.43,10.4.3.77,10.4.2.13,10.4.2.150" --seed=64 --schema-seed=64 --profiling-port=6060 --bind=0.0.0.0:2112 --outfile=/gemini_result_e3f7489c-5f71-481d-a5f5-126647e610b9.log --replication-strategy="{'class': 'NetworkTopologyStrategy', 'replication_factor': '3'}" --oracle-replication-strategy="{'class': 'NetworkTopologyStrategy', 'replication_factor': '1'}" --oracle-cluster="10.4.3.80" --test-statement-log-file=/gemini_test_statements_e3f7489c-5f71-481d-a5f5-126647e610b9.log --oracle-statement-log-file=/gemini_oracle_statements_e3f7489c-5f71-481d-a5f5-126647e610b9.log --level=info --request-timeout=3s --connect-timeout=60s --consistency=QUORUM --async-objects-stabilization-backoff=10ms --async-objects-stabilization-attempts=10 --dataset-size=large --oracle-host-selection-policy=token-aware --test-host-selection-policy=token-aware --drop-schema=true --cql-features=normal --materialized-views=false --use-server-timestamps=true --use-lwt=false --use-counters=false --max-tables=1 --max-columns=16 --min-columns=8 --max-partition-keys=6 --min-partition-keys=2 --max-clustering-keys=4 --min-clustering-keys=2 --partition-key-distribution=uniform --partition-key-buffer-reuse-size=128 --statement-log-file-compression=zstd --duration 24h --warmup 10m --concurrency 200 --mode mixed --max-mutation-retries-backoff 10s --max-mutation-retries 30 --token-range-slices 10000 --max-errors-to-store 1 --statement-ratios '{"mutation":{"insert":0.6,"update":0.2,"delete":0.2}}'
result=Exit code: 2
Command output: ['{"level":"info","ts":"2025-11-05T13:02:33.039352708Z","logger":"store.test_store.gocql","msg":"gocql: unable to dial control conn 10.4.2.84:9042: dial tcp 10.4.2.84:9042: connect: connection refused","cluster":"test"}', '{"level":"info","ts":"2025-11-05T13:13:40.148581036Z","logger":"store.test_store.gocql","msg":"gocql: unable to dial control conn 10.4.0.86:9042: dial tcp 10.4.0.86:9042: connect: connection refused","cluster":"test"}']
errors=['Command error: panic: scylla: 10.4.0.86:9042 invalid number of shards\n\ngoroutine 22495328 [running]:\ngithub.com/gocql/gocql.(*scyllaConnPicker).Put(0xc071554000, 0xc0685a4000)\n\t/home/runner/go/pkg/mod/github.com/scylladb/[email protected]/scylla.go:466 +0x409\ngithub.com/gocql/gocql.(*hostConnPool).connect(0xc05a8c8150)\n\t/home/runner/go/pkg/mod/github.com/scylladb/[email protected]/connectionpool.go:550 +0x287\ngithub.com/gocql/gocql.(*hostConnPool).fill(0xc05a8c8150)\n\t/home/runner/go/pkg/mod/github.com/scylladb/[email protected]/connectionpool.go:389 +0x14f\ngithub.com/gocql/gocql/debounce.(*SimpleDebouncer).Debounce(0xc06a84dcd0, 0xe3bc80?)\n\t/home/runner/go/pkg/mod/github.com/scylladb/[email protected]/debounce/simple_debouncer.go:30 +0x5f\ngithub.com/gocql/gocql.(*hostConnPool).fill_debounce(...)\n\t/home/runner/go/pkg/mod/github.com/scylladb/[email protected]/connectionpool.go:421\ncreated by github.com/gocql/gocql.(*hostConnPool).Pick in goroutine 118\n\t/home/runner/go/pkg/mod/github.com/scylladb/[email protected]/connectionpool.go:309 +0x109\n\n']
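The panic comes from the shard-aware connection picker in the scylladb gocql fork (`scylla.go:466` in the trace above). Below is a minimal, illustrative sketch of the kind of check that can produce it: a per-host pool sized from the shard count reported by the first connection, which panics when a later connection reports a different count. The types and names here are hypothetical stand-ins, not the driver's actual implementation:

```go
package main

import "fmt"

// connInfo is a hypothetical stand-in for the per-connection shard metadata a
// shard-aware driver learns from the server (total shard count, landed shard).
type connInfo struct {
	addr     string
	nrShards int // shard count reported by this connection
	shard    int // shard this connection landed on
}

// shardPicker is a hypothetical shard-aware pool with one slot per shard,
// sized from the shard count reported by the first connection to the host.
type shardPicker struct {
	addr     string
	nrShards int
	conns    []*connInfo
}

// put stores a connection in its shard slot. If a later connection reports a
// different shard count than the pool was sized for, the slot index is no
// longer meaningful, and this sketch panics with the same message seen in the
// report ("scylla: <addr> invalid number of shards").
func (p *shardPicker) put(c *connInfo) {
	if p.nrShards == 0 {
		p.nrShards = c.nrShards
		p.conns = make([]*connInfo, c.nrShards)
	}
	if c.nrShards != p.nrShards {
		panic(fmt.Sprintf("scylla: %s invalid number of shards", p.addr))
	}
	p.conns[c.shard] = c
}

func main() {
	p := &shardPicker{addr: "10.4.0.86:9042"}
	p.put(&connInfo{addr: "10.4.0.86:9042", nrShards: 2, shard: 0})
	// A connection that reports a different shard count trips the check:
	p.put(&connInfo{addr: "10.4.0.86:9042", nrShards: 1, shard: 0}) // panics
}
```

For reference, the node the driver panicked on, 10.4.0.86 (db-node-2), is listed under Installation details below with `shards: -1`.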
- This issue is a regression.
- It is unknown if this issue is a regression.
Installation details
Cluster size: 6 nodes (i4i.large)
Scylla Nodes used in this run:
- gemini-tombstones-sequence-gemini-t-oracle-db-node-95c071d7-1 (34.244.234.197 | 10.4.3.80) (shards: 30)
- gemini-tombstones-sequence-gemini-t-db-node-95c071d7-9 (108.130.105.61 | 10.4.2.227) (shards: 2)
- gemini-tombstones-sequence-gemini-t-db-node-95c071d7-8 (34.252.52.130 | 10.4.2.84) (shards: 2)
- gemini-tombstones-sequence-gemini-t-db-node-95c071d7-7 (52.31.125.232 | 10.4.2.115) (shards: 2)
- gemini-tombstones-sequence-gemini-t-db-node-95c071d7-6 (34.244.118.147 | 10.4.2.150) (shards: -1)
- gemini-tombstones-sequence-gemini-t-db-node-95c071d7-5 (176.34.151.34 | 10.4.2.13) (shards: 2)
- gemini-tombstones-sequence-gemini-t-db-node-95c071d7-4 (34.242.12.193 | 10.4.3.77) (shards: 2)
- gemini-tombstones-sequence-gemini-t-db-node-95c071d7-3 (34.249.21.149 | 10.4.0.43) (shards: 2)
- gemini-tombstones-sequence-gemini-t-db-node-95c071d7-2 (34.250.73.158 | 10.4.0.86) (shards: -1)
- gemini-tombstones-sequence-gemini-t-db-node-95c071d7-11 (108.130.127.120 | 10.4.0.169) (shards: 2)
- gemini-tombstones-sequence-gemini-t-db-node-95c071d7-10 (176.34.75.209 | 10.4.1.111) (shards: 2)
- gemini-tombstones-sequence-gemini-t-db-node-95c071d7-1 (3.250.188.132 | 10.4.0.102) (shards: 2)
OS / Image: ami-0037eef98022b60df (aws: N/A)
Test: gemini-sequence-nemesis
Test id: 95c071d7-8cb1-424a-b328-7b7922bc6c25
Test name: scylla-staging/yarongilor/gemini-sequence-nemesis
Test method: `gemini_test.GeminiTest.test_load_random_with_nemesis`
Test config file(s):
Logs and commands
- Restore Monitor Stack command:
$ hydra investigate show-monitor 95c071d7-8cb1-424a-b328-7b7922bc6c25
- Restore monitor on AWS instance using Jenkins job
- Show all stored logs command:
$ hydra investigate show-logs 95c071d7-8cb1-424a-b328-7b7922bc6c25