
Rtti optimization #25707

Draft
Graveflo wants to merge 10 commits into nim-lang:devel from Graveflo:rtti-optimization

Conversation

Contributor

@Graveflo Graveflo commented Apr 6, 2026

Drafted. Might need a compile switch. Did not increase compiler boot times.

Summary

This PR changes the tiny RTTI display encoding and adds C backend lowering for common object-dispatch chains.

Goal: make common runtime subtype disambiguation patterns generate C that is closer to enum-style dispatch, especially for sibling and same-depth chains.

What changed

RTTI display

TNimTypeV2.display now uses compiler-assigned counters instead of sparse hash-derived values.

Each display entry packs two 16-bit discriminants:

  • low 16 bits: dense per-depth token
  • high 16 bits: dense sibling token under the immediate parent

This keeps the current runtime footprint while letting:

  • ordinary of checks use the per-depth lane
  • sibling-oriented dispatch use the sibling lane

C backend lowering

For same-selector object if/elif chains, the C backend now recognizes common dispatch shapes and emits a switch when possible.

Currently this is only enabled when tiny RTTI is enabled.

The useful fast paths are:

  • direct sibling chains
  • same-selector exact-type chains at one depth

Mixed-generation chains still fall back to ordinary condition chains.

Why

The old display values were sparse and not especially helpful to backend optimization.

  • the packed display scheme improves the discriminants used by runtime checks
  • the switch lowering is needed to achieve performance targets (particularly with GCC)

Limits and tradeoffs

This encoding packs two 16-bit lanes into each display slot, so it introduces a hard limit of high(uint16) for each packed discriminator space.

That means compilation fails if either of these exceeds 65535:

  • the number of types assigned in a per-depth bucket
  • the number of sibling ordinals assigned under one parent

This is enforced with a compile-time error.

This PR also does not try to optimize every object-dispatch shape. Mixed-generation chains have unaffected performance characteristics.

Benchmark notes

Two temporary benchmarks are included for investigation and review:

  • tests/benchmarks/typedispatch.nim
  • tests/benchmarks/typedispatch_shapes.nim

These are not intended to be permanent and will be removed later.

The benchmark is only a directional signal. It is small, shape-sensitive, and backend-sensitive. Similar trends were seen with Clang; the numbers below
are from GCC on the C backend.

A few representative results from typedispatch_shapes.nim:

  • baseline:
    • family root of: 5.139 ns/op
    • exact sibling of: 5.160 ns/op
    • exact root of: 6.661 ns/op
  • new, packed display + trivial switch lowering:
    • family root of: 0.913 ns/op
    • exact sibling of: 0.906 ns/op
    • exact root of: 7.113 ns/op
  • new, packed display + extended switch lowering:
    • family root of: 1.228 ns/op
    • exact sibling of: 1.161 ns/op
    • exact root of: 1.232 ns/op

For reference, the corresponding kind-based baselines in the same benchmark were around 0.87-1.01 ns/op.

The main observation is:

  • sibling and family dispatch become much closer to kind-based dispatch
  • same-depth exact-root chains also improve once lowered to switch
  • mixed-generation chains remain the slow path

@Graveflo Graveflo mentioned this pull request Apr 6, 2026
Contributor Author

Graveflo commented Apr 12, 2026

Similar performance benefits were observed using the C++ backend.

Edit: the PR still needs to be cleaned up.

Member

Araq commented Apr 12, 2026

I worry how this interacts with the IC mechanism; you basically optimize these via full-program knowledge -- something which bites IC.

@Graveflo
Contributor Author

> I worry how this interacts with the IC mechanism; you basically optimize these via full-program knowledge -- something which bites IC.

Yeah, I was thinking about this today. In the current state of the PR, type ordinals are stable as long as they are observed in the same order through sem. As I understand it, NIF should have all the information needed to build the tables. Right now I think this depends heavily on the backend artifacts being checked, since the tables themselves are not persisted. There might also be sensitivity problems, like changing the import order, or one change to an object hierarchy triggering a recompile of every module the type tree touches. I haven't tested it though, so I don't know how bad it would be.

This is mostly an additive change, so maybe a compiler switch for a tradeoff?

Member

Araq commented Apr 12, 2026

Well, yes, in principle we have the full program as the set of NIF files, but we must ensure it keeps working and isn't too messy. Hard to test with the hardly working IC impl, I know...
