Optimized codesize benchmarks do not clearly show the power of individual optimizations #5945

Open
@Manishearth

Came up in #5935

ICU4X has test-c-tiny and test-js-tiny to show how far codesize can be optimized.

These are incremental, applying optimization on top of optimization to slowly reduce codesize. This shows a nice progression, but it does not help in understanding the effect of each optimization in isolation.

I think this is an important function of such a benchmark: many of these techniques are not uniformly available and impose additional constraints upon the build: some require nightly, some require paired Rust/Clang versions, some force build-std, some require a particular C compiler, some reduce debuggability, and so on.

Furthermore, a lot of these optimizations build on top of each other: using a release build will of course help LTO be more effective (percent-wise).

Providing numbers for every combination is going to be a lot of work and likely an overwhelming amount of data. However, I think what we could do is identify a list of optimizations that are potentially relevant but not necessarily always possible, and then provide numbers for:

  • plain release build
  • release build with each of these optimizations individually applied
    • for optimizations that build on each other (e.g. -Clinker-plugin-lto needs LTO), apply their dependencies too
  • release build with all but one of these optimizations applied
    • similar setup for dependent optimizations: remove both
  • release build with all optimizations applied

This would give us an idea of both the immediate wins of individual optimizations and how they work together cumulatively.
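To make the proposed matrix concrete, here is a minimal Python sketch of how the build configurations could be enumerated. Everything in it is illustrative: the function names, the labels, and the `requires` table are assumptions, not part of the existing benchmarks. It also assumes one reading of "remove both": dropping an optimization also drops everything that depends on it.

```python
from typing import Dict, FrozenSet, List, Tuple

def closure(name: str, requires: Dict[str, List[str]]) -> FrozenSet[str]:
    """Return `name` plus everything it transitively requires."""
    seen = set()
    stack = [name]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(requires.get(cur, ()))
    return frozenset(seen)

def build_matrix(requires: Dict[str, List[str]]) -> List[Tuple[str, FrozenSet[str]]]:
    """Enumerate: baseline, each optimization alone (plus its dependencies),
    all-but-one (also dropping dependents), and everything enabled."""
    all_opts = frozenset(requires)
    configs = [("baseline", frozenset())]
    for opt in requires:
        configs.append((f"only:{opt}", closure(opt, requires)))
    for opt in requires:
        # Drop `opt` and every optimization that (transitively) needs it.
        dropped = {o for o in requires if opt in closure(o, requires)}
        configs.append((f"all-but:{opt}", all_opts - dropped))
    configs.append(("all", all_opts))
    return configs

# Hypothetical table: each key is an optimization, each value lists what it needs.
requires = {
    "lto": [],
    "linker-plugin-lto": ["lto"],
    "panic-abort": [],
}
matrix = build_matrix(requires)
for label, enabled in matrix:
    print(label, sorted(enabled))
```

For n optimizations this yields 2n + 2 builds rather than 2^n, which is the point of the proposal: the amount of data grows linearly with the list.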

The optimizations I can identify are:

  • LTO (off, on, thin; it seems like thin gives us the best perf?)
    • -Clinker-plugin-lto
  • --gc-sections
    • --strip-all
  • panic=abort
    • panic-abort std
      • panic-immediate-abort std
  • one-step vs two-step clang
  • use of lld (?)
  • inclusion of debug symbols in the first place (same as strip? unclear)
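The nesting in the list above appears to encode which optimizations build on which. As a rough sketch under that assumption, the outline can be parsed into a parent table in Python. The entry names are shortened labels invented for this example, and treating every entry as a simple on/off toggle is a simplification (LTO has three settings, and the clang entry is a choice between two modes).

```python
from typing import Dict, List, Optional

# Shortened, hypothetical labels mirroring the nested list; two spaces per level.
OUTLINE = """\
lto
  linker-plugin-lto
gc-sections
  strip-all
panic=abort
  panic-abort-std
    panic-immediate-abort-std
clang-one-step
lld
debug-symbols
"""

def parse_outline(text: str, indent: int = 2) -> Dict[str, Optional[str]]:
    """Map each entry to its parent in the outline (None for top-level entries)."""
    requires: Dict[str, Optional[str]] = {}
    stack: List[str] = []  # stack[d] is the most recent entry at depth d
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // indent
        name = line.strip()
        del stack[depth:]  # forget entries at this depth or deeper
        requires[name] = stack[-1] if stack else None
        stack.append(name)
    return requires

def chain(name: str, requires: Dict[str, Optional[str]]) -> List[str]:
    """Return the dependency chain for `name`, outermost entry first."""
    out: List[str] = []
    cur: Optional[str] = name
    while cur is not None:
        out.append(cur)
        cur = requires[cur]
    return out[::-1]

requires = parse_outline(OUTLINE)
print(chain("panic-immediate-abort-std", requires))
print(len(requires), "entries")
```

With ten entries treated as toggles, the scheme of baseline, each alone, all but one, and all would come to 2 × 10 + 2 = 22 builds, which is one way to judge whether some entries should be merged.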

This list might be larger than necessary, so we could merge some entries if desired. I might also be missing something. I didn't include Rust debug vs release here because I don't think debug build codesize numbers really mean much, and I can't think of a use case for caring about those numbers.

Thoughts? @sffc

Metadata

    Labels

    C-test-infra (Component: Integration test infrastructure)
    S-medium (Size: Less than a week (larger bug fix or enhancement))
