
Calculated value for osd target memory too high for deployments with multiple OSDs per device #7435

@janhorstmann

Description

Bug Report

What happened:
The osd memory target was set to a much higher value after upgrading to Pacific, resulting in recurring out-of-memory kills of OSDs.

Cause:
Commit 225ae38ee2f74165e7d265817597fe451df3e919 changed the calculation of num_osds, which is used to derive a sensible value for the osd memory target. The new formula uses Ansible's difference filter, which, according to the docs, returns a list of unique elements.
On deployments with multiple OSDs per device, where the same device must be counted once per OSD, num_osds therefore comes out too small and the available memory per OSD is overestimated (see the sketch below).
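For illustration, here is a minimal, hypothetical playbook (device paths and variable names made up, not the actual ceph-ansible task) showing the deduplication: with two OSDs sharing /dev/nvme0n1, the difference filter counts that device only once.

```yaml
# Sketch only: demonstrates that Ansible's difference filter returns
# unique elements, so a device hosting several OSDs is counted once.
- hosts: localhost
  gather_facts: false
  vars:
    devices: ['/dev/nvme0n1', '/dev/nvme0n1', '/dev/sdb']  # three OSDs, two on the same device
    dedicated_devices: []
  tasks:
    - name: Show the deduplicated count
      ansible.builtin.debug:
        msg: "num_osds = {{ devices | difference(dedicated_devices) | length }}"  # prints 2, although 3 OSDs exist
```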

Apart from that, DB devices are now also counted into num_osds.

Workarounds:
Set a fixed value for osd memory target in ceph_conf_overrides.
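For example, the target can be pinned via ceph-ansible's ceph_conf_overrides variable in group_vars; the 4 GiB value below is purely illustrative and should be sized for the actual host.

```yaml
# Example group_vars entry (value is illustrative) pinning the OSD memory
# target so it no longer depends on the calculated num_osds:
ceph_conf_overrides:
  osd:
    osd_memory_target: 4294967296  # 4 GiB per OSD daemon
```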

Environment:

  • Ansible version (e.g. ansible-playbook --version):
    ansible-playbook 2.10.17
    config file = None
    configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
    ansible python module location = /usr/local/lib/python3.8/dist-packages/ansible
    executable location = /usr/local/bin/ansible-playbook
    python version = 3.8.10 (default, May 26 2023, 14:05:08) [GCC 9.4.0]
    
  • ceph-ansible version (e.g. git head or tag or stable branch): stable-6.0 (same calculation in stable-7.0 and main, but unverified)
  • Ceph version (e.g. ceph -v): ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)
