Skip to content

jemalloc on aarch64 with 16K page size #1805

@kylewlacy

Description

@kylewlacy

We recently saw a failure during the latest round of Live Update builds, here: #1791. The failure ultimately stemmed from Biome, and I described why I think the failure appeared for the rulesync PR in this comment: #1791 (comment). The important takeaway is we now have 2 aarch64 runners: one with 4K pages and one with 16K pages.

Here's the upstream chain of events that leads to this issue for us:

So, since the default is based on the host's page size and we're now using a mixture of runners, the page size for jemalloc used in our Biome build depends on which runner picks up the job.


For compatibility, we should always use a 16K page size ideally-- at least for aarch64, where using a 16K page size is pretty common.

For now, the easy fix will be to update the Biome package to set JEMALLOC_SYS_WITH_LG_PAGE to use 16K pages (I think that means it should be set to the value 14, i.e. log2 of 16K... but please double check this!)

I can't think of any good defenses to prevent this problem in general. The best case would be if jemalloc just used/supported a 16K page size by default. Given that it doesn't, tikv-jemalloc-sys's behavior seems pretty sensible, but leaves us vulnerable to this exact problem. In the Rust ecosystem, it'd be possible (but a bit annoying) to update the cargoBuild recipe to detect if tikv-jemalloc-sys is included as a dependency, and to error if JEMALLOC_SYS_WITH_LG_PAGE isn't set (probably with a flag to bypass this check). In the C world, I'm not sure if we can meaningfully detect this issue across the board.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions