
Need to increase wait timeout for disrupt_refuse_connection_with_block_scylla_ports_on_banned_node nemesis #10434

Open
@timtimb0t


https://argus.scylladb.com/tests/scylla-cluster-tests/8e0f83fc-cc50-48d7-ba27-f0a13c98195f

disrupt_refuse_connection_with_block_scylla_ports_on_banned_node nemesis failed with the following error:

Failure reason

Traceback (most recent call last):
  File "/home/ubuntu/scylla-cluster-tests/sdcm/wait.py", line 70, in wait_for
    res = retry(func, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/tenacity/__init__.py", line 475, in __call__
    do = self.iter(retry_state=retry_state)
  File "/usr/local/lib/python3.10/site-packages/tenacity/__init__.py", line 376, in iter
    result = action(retry_state)
  File "/usr/local/lib/python3.10/site-packages/tenacity/__init__.py", line 418, in exc_check
    raise retry_exc.reraise()
  File "/usr/local/lib/python3.10/site-packages/tenacity/__init__.py", line 186, in reraise
    raise self
tenacity.RetryError: RetryError[<Future at 0x76716ebb5180 state=finished returned bool>]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ubuntu/scylla-cluster-tests/sdcm/nemesis.py", line 5652, in wrapper
    result = method(*args[1:], **kwargs)
  File "/home/ubuntu/scylla-cluster-tests/sdcm/nemesis.py", line 5452, in disrupt_refuse_connection_with_block_scylla_ports_on_banned_node
    self._refuse_connection_from_banned_node(use_iptables=True)
  File "/home/ubuntu/scylla-cluster-tests/sdcm/nemesis.py", line 5499, in _refuse_connection_from_banned_node
    wait_for(node_operations.is_node_seen_as_down, timeout=600, throw_exc=True,
  File "/home/ubuntu/scylla-cluster-tests/sdcm/wait.py", line 86, in wait_for
    raise raising_exc from ex
sdcm.exceptions.WaitForTimeoutError: Wait for: Wait other nodes see parallel-topology-schema-changes-mu-d

However, all other nodes did report the target node as DOWN:

2025-03-15T05:42:27.625+00:00 parallel-topology-schema-changes-mu-db-node-8e0f83fc-5     !INFO | scylla[5559]:  [shard 0: gms] gossip - InetAddress 2e41a1df-584f-437a-aea4-c6863ace52bb/2a05:d018:12e3:f002:23d8:e06f:6598:530a is now DOWN, status = shutdown
2025-03-15T05:42:27.636+00:00 parallel-topology-schema-changes-mu-db-node-8e0f83fc-2     !INFO | scylla[5569]:  [shard 0: gms] gossip - InetAddress 2e41a1df-584f-437a-aea4-c6863ace52bb/2a05:d018:12e3:f002:23d8:e06f:6598:530a is now DOWN, status = shutdown
2025-03-15T05:42:27.593+00:00 parallel-topology-schema-changes-mu-db-node-8e0f83fc-6     !INFO | scylla[5559]:  [shard 0: gms] gossip - InetAddress 2e41a1df-584f-437a-aea4-c6863ace52bb/2a05:d018:12e3:f002:23d8:e06f:6598:530a is now DOWN, status = shutdown
2025-03-15T05:42:27.657+00:00 parallel-topology-schema-changes-mu-db-node-8e0f83fc-1     !INFO | scylla[5582]:  [shard 0: gms] gossip - InetAddress 2e41a1df-584f-437a-aea4-c6863ace52bb/2a05:d018:12e3:f002:23d8:e06f:6598:530a is now DOWN, status = shutdown
2025-03-15T05:42:27.523+00:00 parallel-topology-schema-changes-mu-db-node-8e0f83fc-4     !INFO | scylla[5559]:  [shard 0: gms] gossip - InetAddress 2e41a1df-584f-437a-aea4-c6863ace52bb/2a05:d018:12e3:f002:23d8:e06f:6598:530a is now DOWN, status = shutdown

SCT started to simulate node unavailability at 05:32:

< t:2025-03-15 05:32:16,048 f:node_operations.py l:16   c:sdcm.cluster_aws     p:DEBUG > Node parallel-topology-schema-changes-mu-db-node-8e0f83fc-3 [3.252.226.116 | 10.4.9.224 | 2a05:d018:12e3:f002:23d8:e06f:6598:530a] (dc name: eu-westscylla_node_west): Block connections parallel-topology-schema-changes-mu-db-node-8e0f83fc-3
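The gap between the two sets of timestamps can be checked with a few lines of Python. This is only a sketch: the timestamps are copied from the logs above, the SCT log line carries no time zone and is assumed here to be UTC, and the short node labels and the `seconds_until_down` helper are made up for the illustration.

```python
from datetime import datetime, timezone

# SCT log line "Block connections ..." (no time zone in the log; UTC is an assumption).
BLOCK_STARTED = datetime.strptime(
    "2025-03-15 05:32:16,048", "%Y-%m-%d %H:%M:%S,%f"
).replace(tzinfo=timezone.utc)

# Gossip "is now DOWN" timestamps, keyed by the reporting node's suffix.
DOWN_REPORTS = {
    "node-5": "2025-03-15T05:42:27.625+00:00",
    "node-2": "2025-03-15T05:42:27.636+00:00",
    "node-6": "2025-03-15T05:42:27.593+00:00",
    "node-1": "2025-03-15T05:42:27.657+00:00",
    "node-4": "2025-03-15T05:42:27.523+00:00",
}


def seconds_until_down(reported_at: str) -> float:
    """Seconds between the start of the block and one DOWN report."""
    seen = datetime.strptime(reported_at, "%Y-%m-%dT%H:%M:%S.%f%z")
    return (seen - BLOCK_STARTED).total_seconds()


delays = {node: seconds_until_down(ts) for node, ts in DOWN_REPORTS.items()}

for node, delay in sorted(delays.items()):
    print(f"{node}: {delay:.3f} s")
```

Under that assumption every node reports DOWN between 611 and 612 seconds after the block started, which is just past the 600-second limit. The wait itself may have started slightly later than the block, so the real margin could be a little different.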

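Assuming both logs use the same time zone, the nodes marked the target as DOWN at about 05:42:27, roughly 611 seconds after the block started at 05:32:16, so the 600-second wait expired shortly before the state change was visible. The timeout needs to be increased from 600 seconds to a higher value.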
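To show why the deadline matters here, below is a minimal, self-contained sketch of a polling wait driven by a fake clock. It is not the `sdcm.wait.wait_for` implementation: `wait_until`, `FakeClock`, and `simulate` are hypothetical names, the 611.5-second delay approximates the logs above, and 900 is an arbitrary illustrative timeout, not a proposed value.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float, step: float = 5.0,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `condition` every `step` seconds; return False once `timeout` has elapsed."""
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(step)


class FakeClock:
    """Simulated time source so the example finishes instantly."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


def simulate(timeout: float, down_after: float = 611.5) -> bool:
    """Pretend the node is seen as down `down_after` seconds after the block."""
    fake = FakeClock()
    return wait_until(lambda: fake.now >= down_after, timeout,
                      clock=fake.time, sleep=fake.sleep)


print(simulate(timeout=600))  # False: the wait gives up before the node is seen as down
print(simulate(timeout=900))  # True: a longer (illustrative) timeout catches the change
```

With these assumed numbers, the 600-second wait returns just before the condition becomes true, which matches the failure described above.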
