MgmtRepair nemeses failed with cause: get repair target: get cluster views: gocql: no response received from cassandra within timeout period #3612

Open
@temichus

Issue description

  • This issue is a regression.
  • It is unknown if this issue is a regression.

The MgmtRepair nemesis failed too quickly with the error: get repair target: get cluster views: gocql: no response received from cassandra within timeout period

The next nemesis run finished too quickly and looks like a false positive:

disrupt_mgmt_repair_cli longevity-5gb-1h-MgmtRepair-master-db-node-5681c472-1 Succeeded 2023-10-18 00:00:22 2023-10-18 00:48:42

The following runs failed with a different error:

Cause: another task is running
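
For triage, the two failure causes quoted above can be told apart by a simple substring match on the nemesis error text. The snippet below is only a minimal sketch: the function name `classify_failure` and the labels are hypothetical and are not part of SCT or Scylla Manager; the matched strings are copied from the messages in this report.

```python
import re

# Hypothetical triage labels; not defined by SCT or Scylla Manager.
TIMEOUT = "cql_timeout"
TASK_CONFLICT = "task_conflict"
UNKNOWN = "unknown"

# Substrings copied from the two failure messages quoted in this issue.
_PATTERNS = [
    (TIMEOUT, re.compile(
        r"gocql: no response received from cassandra within timeout period",
        re.IGNORECASE)),
    (TASK_CONFLICT, re.compile(r"another task is running", re.IGNORECASE)),
]


def classify_failure(message: str) -> str:
    """Return a triage label for a nemesis failure message."""
    for label, pattern in _PATTERNS:
        if pattern.search(message):
            return label
    return UNKNOWN
```

Separating the labels this way makes it easier to see whether the later "another task is running" failures are follow-on effects of an earlier run that hit the timeout, or an independent problem.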

Impact

The repair task failed.

How frequently does it reproduce?

This is an intermittent problem.

Installation details

Kernel Version: 5.15.0-1047-aws
Scylla version (or git commit hash): 5.4.0~dev-20231006.498e3ec435be with build-id 16c6112202348a8adba536b4195d48adfdf958f9

Cluster size: 3 nodes (i4i.large)

Scylla Nodes used in this run:

  • longevity-5gb-1h-MgmtRepair-master-db-node-5681c472-3 (34.220.158.221 | 10.15.2.190) (shards: 2)
  • longevity-5gb-1h-MgmtRepair-master-db-node-5681c472-2 (54.244.67.109 | 10.15.2.6) (shards: 2)
  • longevity-5gb-1h-MgmtRepair-master-db-node-5681c472-1 (54.200.196.131 | 10.15.0.87) (shards: 2)

OS / Image: ami-057fff75e186f7fa9 (aws: undefined_region)

Test: longevity-5gb-1h-MgmtRepair-aws-test
Test id: 5681c472-7f68-44fc-91bc-5ae9bd4feec1
Test name: scylla-master/nemesis/longevity-5gb-1h-MgmtRepair-aws-test
Test config file(s):

Logs and commands
  • Restore Monitor Stack command: $ hydra investigate show-monitor 5681c472-7f68-44fc-91bc-5ae9bd4feec1
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs 5681c472-7f68-44fc-91bc-5ae9bd4feec1

Logs:

Jenkins job URL
Argus
