Scylla failed to start after some sstable files were deleted: init - Startup failed: std::runtime_error .. (error system:2, filesystem error: stat failed: No such file or directory #5431

Open
@yarongilor

Installation details

Kernel Version: 5.15.0-1019-aws
Scylla version (or git commit hash): 2022.2.0~rc2-20220919.75d087a2b75a with build-id 463f1a57b82041a6c6b6441f0cbc26c8ad93091e
Relocatable Package: http://downloads.scylladb.com/downloads/scylla-enterprise/relocatable/scylladb-2022.2/scylla-enterprise-x86_64-package-2022.2.0-rc2.0.20220919.75d087a2b75a.tar.gz
Cluster size: 5 nodes (i3.4xlarge)

Scylla Nodes used in this run:

  • longevity-mv-si-4d-2022-2-db-node-4a622274-9 (34.241.80.176 | 10.4.2.114) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-8 (34.244.72.179 | 10.4.3.19) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-7 (54.75.8.156 | 10.4.0.116) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-6 (52.50.165.228 | 10.4.3.76) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-5 (34.247.163.201 | 10.4.0.224) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-4 (3.250.153.245 | 10.4.1.64) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-3 (34.241.21.128 | 10.4.0.248) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-2 (63.32.45.197 | 10.4.3.83) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-14 (54.194.149.238 | 10.4.2.43) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-13 (54.75.69.155 | 10.4.3.195) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-12 (54.246.37.34 | 10.4.1.150) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-11 (54.154.54.123 | 10.4.1.231) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-10 (34.247.156.237 | 10.4.0.167) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-4a622274-1 (3.250.23.185 | 10.4.0.254) (shards: 14)

OS / Image: ami-00bd31f22bcf5ae1a (aws: eu-west-1)

Test: ics-longevity-mv-si-4days-test
Test id: 4a622274-af57-417f-a1ec-4cc4c89af60e
Test name: enterprise-2022.2/SCT_Enterprise_Features/ICS/ics-longevity-mv-si-4days-test
Test config file(s):

Issue description

>>>>>>>
Scenario:

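  1. Run a configuration with materialized views (MV) and secondary indexes (SI).
  2. Run the disrupt_rebuild_streaming_err nemesis, which performs the next three steps:
  3. Stop node-13.
  4. Delete some sstable files.
  5. Start node-13.
  6. On startup, node-13 logged the following errors and failed to start: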
< t:2022-10-01 12:50:15,509 f:nemesis.py      l:1005 c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.SisyphusMonkey: Set current_disruption -> RebuildStreamingErr Node longevity-mv-si-4d-2022-2-db-node-4a622274-13 [54.75.69.155 | 10.4.3.195] (seed: False)
< t:2022-10-01 12:50:15,514 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:INFO  > 2022-10-01 12:50:15.509: (DisruptionEvent Severity.NORMAL) period_type=begin event_id=f394ffa3-701c-475a-9ed0-104f96cbf862: nemesis_name=RebuildStreamingErr target_node=Node longevity-mv-si-4d-2022-2-db-node-4a622274-13 [54.75.69.155 | 10.4.3.195] (seed: False)
< t:2022-10-01 12:51:39,450 f:nemesis.py      l:986  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.SisyphusMonkey: Files /var/lib/scylla/data/mview/users_by_email-25f851803f1411ed91e552472854bf91/me-3254-* were destroyed
< t:2022-10-01 12:51:40,000 f:nemesis.py      l:986  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.SisyphusMonkey: Files /var/lib/scylla/data/mview/users_by_initials-da5d43803f6711ed800e8e21fe1dd5a1/me-3245-* were destroyed
< t:2022-10-01 12:51:40,511 f:nemesis.py      l:986  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.SisyphusMonkey: Files /var/lib/scylla/data/mview/users_by_password-27135e703f1411ed8c10fc81479e6e02/me-3247-* were destroyed
< t:2022-10-01 12:51:41,064 f:nemesis.py      l:986  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.SisyphusMonkey: Files /var/lib/scylla/data/mview/users_by_address-d98f6c303f6711ed876201c995b8b238/me-3430-* were destroyed
< t:2022-10-01 12:51:41,575 f:nemesis.py      l:986  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.SisyphusMonkey: Files /var/lib/scylla/data/sec_index/users_last_access_ind_index-244d2e613f6811ed800e8e21fe1dd5a1/me-3682-* were destroyed
< t:2022-10-01 12:52:49,420 f:db_log_reader.py l:113  c:sdcm.db_log_reader   p:DEBUG > 2022-10-01T12:52:49+00:00 longevity-mv-si-4d-2022-2-db-node-4a622274-13      !ERR | scylla[115686]:  [shard  0] database - Exception while populating keyspace 'system_schema' with column family 'view_virtual_columns' from file '/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa': std::filesystem::__cxx11::filesystem_error (error system:2, filesystem error: stat failed: No such file or directory [/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa/snapshots/sm_20220930165632UTC/me-982982-big-Summary.db])
< t:2022-10-01 12:52:49,423 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:ERROR > 2022-10-01 12:52:49.421 <2022-10-01 12:52:49.000>: (DatabaseLogEvent Severity.ERROR) period_type=one-time event_id=43db8d81-4ac4-4650-9b6b-25d32dd4c26b: type=FILESYSTEM_ERROR regex=filesystem_error line_number=2967121 node=longevity-mv-si-4d-2022-2-db-node-4a622274-13
< t:2022-10-01 12:52:49,423 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:ERROR > 2022-10-01T12:52:49+00:00 longevity-mv-si-4d-2022-2-db-node-4a622274-13      !ERR | scylla[115686]:  [shard  0] database - Exception while populating keyspace 'system_schema' with column family 'view_virtual_columns' from file '/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa': std::filesystem::__cxx11::filesystem_error (error system:2, filesystem error: stat failed: No such file or directory [/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa/snapshots/sm_20220930165632UTC/me-982982-big-Summary.db])
< t:2022-10-01 13:03:48,806 f:db_log_reader.py l:113  c:sdcm.db_log_reader   p:DEBUG > 2022-10-01T13:03:48+00:00 longevity-mv-si-4d-2022-2-db-node-4a622274-13      !ERR | scylla[115686]:  [shard  0] init - Startup failed: std::runtime_error (Exception while populating keyspace 'system_schema' with column family 'view_virtual_columns' from file '/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa': std::filesystem::__cxx11::filesystem_error (error system:2, filesystem error: stat failed: No such file or directory [/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa/snapshots/sm_20220930165632UTC/me-982982-big-Summary.db]))
< t:2022-10-01 13:03:48,808 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:ERROR > 2022-10-01T13:03:48+00:00 longevity-mv-si-4d-2022-2-db-node-4a622274-13      !ERR | scylla[115686]:  [shard  0] init - Startup failed: std::runtime_error (Exception while populating keyspace 'system_schema' with column family 'view_virtual_columns' from file '/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa': std::filesystem::__cxx11::filesystem_error (error system:2, filesystem error: stat failed: No such file or directory [/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa/snapshots/sm_20220930165632UTC/me-982982-big-Summary.db]))
< t:2022-10-01 13:05:18,081 f:db_log_reader.py l:113  c:sdcm.db_log_reader   p:DEBUG > 2022-10-01T13:05:18+00:00 longevity-mv-si-4d-2022-2-db-node-4a622274-13      !ERR | scylla[116163]:  [shard  0] database - Exception while populating keyspace 'system_schema' with column family 'aggregates' from file '/var/lib/scylla/data/system_schema/aggregates-924c55872e3a345bb10c12f37c1ba895': std::filesystem::__cxx11::filesystem_error (error system:2, filesystem error: stat failed: No such file or directory [/var/lib/scylla/data/system_schema/aggregates-924c55872e3a345bb10c12f37c1ba895/snapshots/sm_20220930165632UTC/me-1007076-big-CompressionInfo.db])
< t:2022-10-01 13:05:18,083 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:ERROR > 2022-10-01 13:05:18.082 <2022-10-01 13:05:18.000>: (DatabaseLogEvent Severity.ERROR) period_type=one-time event_id=43db8d81-4ac4-4650-9b6b-25d32dd4c26b: type=FILESYSTEM_ERROR regex=filesystem_error line_number=2996033 node=longevity-mv-si-4d-2022-2-db-node-4a622274-13
< t:2022-10-01 13:05:18,083 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:ERROR > 2022-10-01T13:05:18+00:00 longevity-mv-si-4d-2022-2-db-node-4a622274-13      !ERR | scylla[116163]:  [shard  0] database - Exception while populating keyspace 'system_schema' with column family 'aggregates' from file '/var/lib/scylla/data/system_schema/aggregates-924c55872e3a345bb10c12f37c1ba895': std::filesystem::__cxx11::filesystem_error (error system:2, filesystem error: stat failed: No such file or directory [/var/lib/scylla/data/system_schema/aggregates-924c55872e3a345bb10c12f37c1ba895/snapshots/sm_20220930165632UTC/me-1007076-big-CompressionInfo.db])
< t:2022-10-01 13:05:23,785 f:nemesis.py      l:3691 c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.SisyphusMonkey: RebuildStreamingErr Node longevity-mv-si-4d-2022-2-db-node-4a622274-13 [54.75.69.155 | 10.4.3.195] (seed: False) duration -> 908 s

2022-10-01 12:52:49.421 <2022-10-01 12:52:49.000>: (DatabaseLogEvent Severity.ERROR) period_type=one-time event_id=43db8d81-4ac4-4650-9b6b-25d32dd4c26b: type=FILESYSTEM_ERROR regex=filesystem_error line_number=2967121 node=longevity-mv-si-4d-2022-2-db-node-4a622274-13
2022-10-01T12:52:49+00:00 longevity-mv-si-4d-2022-2-db-node-4a622274-13      !ERR | scylla[115686]:  [shard  0] database - Exception while populating keyspace 'system_schema' with column family 'view_virtual_columns' from file '/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa': std::filesystem::__cxx11::filesystem_error (error system:2, filesystem error: stat failed: No such file or directory [/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa/snapshots/sm_20220930165632UTC/me-982982-big-Summary.db])

2022-10-01 13:03:48.806 <2022-10-01 13:03:48.000>: (DatabaseLogEvent Severity.ERROR) period_type=one-time event_id=afd54525-70ca-4030-9014-5e3687075929: type=RUNTIME_ERROR regex=std::runtime_error line_number=2992360 node=longevity-mv-si-4d-2022-2-db-node-4a622274-13
2022-10-01T13:03:48+00:00 longevity-mv-si-4d-2022-2-db-node-4a622274-13      !ERR | scylla[115686]:  [shard  0] init - Startup failed: std::runtime_error (Exception while populating keyspace 'system_schema' with column family 'view_virtual_columns' from file '/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa': std::filesystem::__cxx11::filesystem_error (error system:2, filesystem error: stat failed: No such file or directory [/var/lib/scylla/data/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa/snapshots/sm_20220930165632UTC/me-982982-big-Summary.db]))
2022-10-01 13:05:18.082 <2022-10-01 13:05:18.000>: (DatabaseLogEvent Severity.ERROR) period_type=one-time event_id=43db8d81-4ac4-4650-9b6b-25d32dd4c26b: type=FILESYSTEM_ERROR regex=filesystem_error line_number=2996033 node=longevity-mv-si-4d-2022-2-db-node-4a622274-13
2022-10-01T13:05:18+00:00 longevity-mv-si-4d-2022-2-db-node-4a622274-13      !ERR | scylla[116163]:  [shard  0] database - Exception while populating keyspace 'system_schema' with column family 'aggregates' from file '/var/lib/scylla/data/system_schema/aggregates-924c55872e3a345bb10c12f37c1ba895': std::filesystem::__cxx11::filesystem_error (error system:2, filesystem error: stat failed: No such file or directory [/var/lib/scylla/data/system_schema/aggregates-924c55872e3a345bb10c12f37c1ba895/snapshots/sm_20220930165632UTC/me-1007076-big-CompressionInfo.db])

<<<<<<<
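
To make the log excerpt above easier to compare, here is a minimal Python sketch that extracts the "Files ... were destroyed" globs and the "No such file or directory [...]" paths from log lines, then reports whether each missing file lives under a snapshots directory and whether it matches any glob the nemesis reported as destroyed. The function name `classify_missing_files` and both regular expressions are illustrative assumptions written against the line formats quoted in this issue; they are not part of SCT or Scylla.

```python
import re
from fnmatch import fnmatch

# Assumed format, taken from the nemesis DEBUG lines quoted in this issue.
DESTROYED_RE = re.compile(r"Files (?P<glob>\S+) were destroyed")

# Assumed format, taken from the Scylla filesystem_error lines quoted in this issue.
MISSING_RE = re.compile(
    r"populating keyspace '(?P<keyspace>[^']+)' "
    r"with column family '(?P<table>[^']+)'"
    r".*?No such file or directory \[(?P<path>[^\]]+)\]"
)


def classify_missing_files(log_lines):
    """Return one record per distinct missing file found in log_lines."""
    destroyed_globs = []
    missing = {}
    for line in log_lines:
        match = DESTROYED_RE.search(line)
        if match:
            destroyed_globs.append(match.group("glob"))
            continue
        match = MISSING_RE.search(line)
        if match:
            # The same error is logged several times; keep one entry per path.
            missing[match.group("path")] = (
                match.group("keyspace"),
                match.group("table"),
            )

    report = []
    for path, (keyspace, table) in sorted(missing.items()):
        report.append({
            "keyspace": keyspace,
            "table": table,
            "path": path,
            # True when the missing file is inside a snapshot directory.
            "in_snapshot": "/snapshots/" in path,
            # True when the path matches a glob the nemesis said it destroyed.
            "deleted_by_nemesis": any(fnmatch(path, g) for g in destroyed_globs),
        })
    return report
```

Applied to the lines quoted in this issue, the two distinct missing files (in system_schema.view_virtual_columns and system_schema.aggregates) are both under snapshots/sm_20220930165632UTC, and neither matches the mview or sec_index globs that the nemesis logged as destroyed.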

  • Restore Monitor Stack command: $ hydra investigate show-monitor 4a622274-af57-417f-a1ec-4cc4c89af60e
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs 4a622274-af57-417f-a1ec-4cc4c89af60e

Logs:

Jenkins job URL
