Meta server with authentication enabled failed and could not be started normally after dropping table #2149

Open
@empiredan

Description

There was a Pegasus cluster with 3 meta servers and 5 replica servers, with authentication enabled. A script was written to drop a large number of tables.

While the script was running, the meta server crashed. Its only output was "got signal id: 11", and dmesg showed the following:

[Tue Nov 12 15:32:39 2024]  meta.meta_stat[681978]: segfault at 40 ip 00007faa351ea839 sp 00007faa0d48abc0 error 4 in libdsn_utils.so[7faa35124000+115000]
[Tue Nov 12 15:32:39 2024] Code: 23 f9 ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 48 8d 45 cf 53 4c 8d 67 08 48 83 ec 28 <4c> 8b 7f 18 48 89 45 b8 48 8d 45 ce 4d 39 e7 48 89 45 b0 74 6c 48
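To locate the crash site, the dmesg line can be decoded by hand: the instruction pointer minus the library's load base gives the offset inside libdsn_utils.so, and the error code is a bit field describing the faulting access. The following is a minimal sketch, assuming the usual Linux x86 "segfault at" message layout with all numeric fields in hexadecimal; the helper name decode_segfault is hypothetical and not part of Pegasus.

```python
import re

# Assumed layout of the kernel's x86 "segfault at ..." message (all numbers in hex).
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: extract the fields needed to symbolize the crash."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "library": m["lib"],
        "fault_address": int(m["addr"], 16),
        # Offset of the faulting instruction relative to the library's load base.
        "offset_in_library": ip - base,
        # x86 page-fault error code bits: 0 = page present, 1 = write, 2 = user mode.
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


LINE = (
    "[Tue Nov 12 15:32:39 2024]  meta.meta_stat[681978]: segfault at 40 "
    "ip 00007faa351ea839 sp 00007faa0d48abc0 error 4 in "
    "libdsn_utils.so[7faa35124000+115000]"
)

info = decode_segfault(LINE)
print(info["library"], hex(info["offset_in_library"]), info["access"], info["mode"])
```

For the line above this yields offset 0xc6839 in libdsn_utils.so, and error 4 decodes to a user-mode read of an unmapped page. Together with the very low fault address 0x40, that is consistent with a member access through a null or nearly null pointer, although the dmesg output alone does not prove it. If a matching build of the library with symbols is available, the offset could be passed to a tool such as addr2line (for example, addr2line -e libdsn_utils.so 0xc6839) to find the source location.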

In the logs of the failed meta server (namely the primary meta server), many errors were also found:

E2024-11-12 15:32:45.711 (1731396765711870108 a67f5)   meta.meta_server4.01010000000009fa: ranger_resource_policy_manager.cpp:641:sync_policies_to_app_envs(): ERR_INVALID_PARAMETERS: set_app_envs failed.
E2024-11-12 15:32:45.713 (1731396765713890084 a67f5)   meta.meta_server4.01010000000009fa: ranger_resource_policy_manager.cpp:304:update_policies_from_ranger_service(): ERR_INVALID_PARAMETERS: Sync policies to app envs failed.
E2024-11-12 15:32:52.225 (1731396772225348205 a67f6)   meta.meta_server5.01010000000009fa: ranger_resource_policy_manager.cpp:641:sync_policies_to_app_envs(): ERR_INVALID_PARAMETERS: set_app_envs failed.
E2024-11-12 15:32:52.226 (1731396772226529887 a67f6)   meta.meta_server5.01010000000009fa: ranger_resource_policy_manager.cpp:304:update_policies_from_ranger_service(): ERR_INVALID_PARAMETERS: Sync policies to app envs failed.
E2024-11-12 15:32:58.919 (1731396778919427343 a67f3)   meta.meta_server2.01010000000009fa: ranger_resource_policy_manager.cpp:641:sync_policies_to_app_envs(): ERR_INVALID_PARAMETERS: set_app_envs failed.
E2024-11-12 15:32:58.921 (1731396778921276545 a67f3)   meta.meta_server2.01010000000009fa: ranger_resource_policy_manager.cpp:304:update_policies_from_ranger_service(): ERR_INVALID_PARAMETERS: Sync policies to app envs failed.
E2024-11-12 15:33:06.374 (1731396786374687523 a67f6)   meta.meta_server5.01010000000009fa: ranger_resource_policy_manager.cpp:641:sync_policies_to_app_envs(): ERR_INVALID_PARAMETERS: set_app_envs failed.
E2024-11-12 15:33:06.376 (1731396786376019669 a67f6)   meta.meta_server5.01010000000009fa: ranger_resource_policy_manager.cpp:304:update_policies_from_ranger_service(): ERR_INVALID_PARAMETERS: Sync policies to app envs failed.
E2024-11-12 15:33:14.775 (1731396794775332362 a67f2)   meta.meta_server1.01010000000009fa: ranger_resource_policy_manager.cpp:641:sync_policies_to_app_envs(): ERR_INVALID_PARAMETERS: set_app_envs failed.
E2024-11-12 15:33:14.777 (1731396794777299007 a67f2)   meta.meta_server1.01010000000009fa: ranger_resource_policy_manager.cpp:304:update_policies_from_ranger_service(): ERR_INVALID_PARAMETERS: Sync policies to app envs failed.
E2024-11-12 15:33:30.679 (1731396810679840313 a67f3)   meta.meta_server2.01010000000009fa: ranger_resource_policy_manager.cpp:641:sync_policies_to_app_envs(): ERR_INVALID_PARAMETERS: set_app_envs failed.
E2024-11-12 15:33:30.681 (1731396810681638580 a67f3)   meta.meta_server2.01010000000009fa: ranger_resource_policy_manager.cpp:304:update_policies_from_ranger_service(): ERR_INVALID_PARAMETERS: Sync policies to app envs failed.
E2024-11-12 15:33:37.501 (1731396817501816052 a67f7)   meta.meta_server6.01010000000009fa: ranger_resource_policy_manager.cpp:641:sync_policies_to_app_envs(): ERR_INVALID_PARAMETERS: set_app_envs failed.
E2024-11-12 15:33:37.503 (1731396817503027320 a67f7)   meta.meta_server6.01010000000009fa: ranger_resource_policy_manager.cpp:304:update_policies_from_ranger_service(): ERR_INVALID_PARAMETERS: Sync policies to app envs failed.
E2024-11-12 15:33:44.338 (1731396824338693868 a67f4)   meta.meta_server3.01010000000009fa: ranger_resource_policy_manager.cpp:641:sync_policies_to_app_envs(): ERR_INVALID_PARAMETERS: set_app_envs failed.
E2024-11-12 15:33:44.339 (1731396824339976731 a67f4)   meta.meta_server3.01010000000009fa: ranger_resource_policy_manager.cpp:304:update_policies_from_ranger_service(): ERR_INVALID_PARAMETERS: Sync policies to app envs failed.

After that, the standby meta servers also failed when they tried to take over. See the following logs:

E2024-11-12 15:34:33.624 (1731396873624300621 19c265)   meta.meta_server0.010200030000042e: server_state.cpp:689:operator()(): assertion expression: false
F2024-11-12 15:34:33.624 (1731396873624310529 19c265)   meta.meta_server0.010200030000042e: server_state.cpp:689:operator()(): invalid status(app_status::AS_DROPPING) for app(abc(1)) in remote storage

AS_DROPPING was found persisted on the remote meta storage (namely ZooKeeper) as the status of the table:

{"status":"app_status::AS_DROPPING","app_type":"pegasus","app_name":"abc","app_id":1,"partition_count":8, ...}

However, AS_DROPPING is only an intermediate state and should never be persisted to ZooKeeper.
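When many tables were dropped by the script, more than one record may be stuck in an intermediate state. The following is a minimal diagnostic sketch that scans app records already fetched from ZooKeeper and reports any whose status is not expected to be persisted. It assumes that only AS_AVAILABLE and AS_DROPPED are valid persisted statuses, which this report does not confirm; the function name, the trimmed sample records, and the second table "xyz" are hypothetical.

```python
import json

# Assumption (not confirmed by this report): only these statuses are expected
# to appear in remote storage; anything else is treated as intermediate.
PERSISTABLE_STATUSES = {"app_status::AS_AVAILABLE", "app_status::AS_DROPPED"}


def find_unexpected_statuses(records):
    """Hypothetical helper: return (app_id, app_name, status) for suspicious records."""
    bad = []
    for raw in records:
        app = json.loads(raw)
        if app["status"] not in PERSISTABLE_STATUSES:
            bad.append((app["app_id"], app["app_name"], app["status"]))
    return bad


# Sample records, trimmed to the fields shown in the report. The first mirrors
# the record for table "abc"; the second is an invented healthy table.
SAMPLE = [
    '{"status":"app_status::AS_DROPPING","app_type":"pegasus",'
    '"app_name":"abc","app_id":1,"partition_count":8}',
    '{"status":"app_status::AS_AVAILABLE","app_type":"pegasus",'
    '"app_name":"xyz","app_id":2,"partition_count":8}',
]

for app_id, name, status in find_unexpected_statuses(SAMPLE):
    print(f"app {name}({app_id}) has unexpected persisted status {status}")
```

A check like this only lists the affected tables. How such records should be repaired is a separate question that this report does not answer.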

After that, none of the meta servers could be started normally: each exited immediately after being started.

Labels: type/bug