Stack Allocation Enhancements

Stack allocation of non-escaping ref classes and boxed value classes was enabled in #103361, but only works in limited cases. This issue tracks further enhancements (see also #11192).

### Abilities:
- [x] stack allocation of constant-sized arrays of non-GC types
  - https://github.com/dotnet/runtime/pull/104906
  - #112527 
- [x] stack allocation of constant-sized arrays of GC types
  - https://github.com/dotnet/runtime/pull/111686
  - https://github.com/dotnet/runtime/pull/112250
  - #112676
  - #112711 
- [ ] enable array cases for R2R
- [ ] stack allocation of strings
- [ ] stack allocation of boxed nullables
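
For illustration, here is a minimal sketch of the code shape the array items above have in mind: a constant-sized array of a non-GC element type that is never stored, returned, or passed to another method. `ArrayCandidate` and `SumOfSquares` are hypothetical names, and whether the JIT actually places the array on the stack depends on runtime version, tiering, and configuration.

```C#
public static class ArrayCandidate
{
    // Hypothetical example: the length is a compile-time constant and the
    // array reference never leaves this method, so escape analysis may be
    // able to treat the array as non-escaping.
    public static int SumOfSquares()
    {
        int[] squares = new int[4];
        for (int i = 0; i < squares.Length; i++)
        {
            squares[i] = i * i;
        }

        int sum = 0;
        for (int i = 0; i < squares.Length; i++)
        {
            sum += squares[i];
        }

        // Only the int result leaves the method, not the array.
        return sum;
    }
}
```

Returning `squares` itself, or storing it in a static field, would make the array escape and rule out stack allocation.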

![image](https://github.com/user-attachments/assets/c37a6441-544c-4ab8-ac8b-542058645c90)

- [ ] zeroing strategy (zero in prolog vs zero at first use)
- [x] stack allocation of some delegates (and perhaps their closures)
  - https://github.com/dotnet/runtime/pull/115172 
  - (static method delegates not yet handled)

```C#
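    public static int Test()
    {
        int a = 100;
        // The lambda captures `a`, so a closure object and a delegate are created.
        Func<int> f = () => { return a; };
        // Neither the delegate nor the closure is stored or passed elsewhere.
        return f();
    }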
```

See [note](https://github.com/dotnet/runtime/issues/104936#issuecomment-2234009179) below.
- [ ] evaluate when/if it makes sense to transform stack objects to `localloc` instead of fixed allocations (at least for non-GC types; for GC types there's currently no way to do proper GC reporting)
  - https://github.com/dotnet/runtime/pull/112168
- [ ] Allocations in loops: 
  - https://github.com/dotnet/runtime/pull/106526
  - https://github.com/dotnet/runtime/pull/112168
  - https://devblogs.microsoft.com/java/improving-openjdk-scalar-replacement-part-1-3/
- [ ] Stack allocation of delegates in NAOT
  - https://github.com/dotnet/runtime/issues/110847
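
The "allocations in loops" item can be pictured with a sketch like the one below, where each iteration allocates an object that is dead by the end of that iteration. `Point` and `LoopCandidate` are hypothetical names. This only illustrates the code shape and makes no claim about what the JIT does with it today.

```C#
public sealed class Point
{
    public int X;
    public int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public static class LoopCandidate
{
    public static int SumCoordinates(int n)
    {
        int total = 0;
        for (int i = 0; i < n; i++)
        {
            // A fresh object per iteration. No reference survives the
            // iteration, so in principle a single stack slot could be
            // reused across iterations.
            Point p = new Point(i, i + 1);
            total += p.X + p.Y;
        }
        return total;
    }
}
```

If a reference to `p` were kept across iterations (for example in a list, or in a local read on the next iteration), the objects' lifetimes would overlap and a single reused slot would no longer be enough.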

### Analysis:
- [ ] make the analysis more sophisticated to handle increasingly more complex examples (eg fields of value classes, or fields of unescaped ref classes). See for instance
  - #84872
  - #61455
  - https://github.com/dotnet/runtime/pull/105162
  - #111838
  - [x] Field sensitive analysis for value classes. Some enabling PRs:
    - #113711 
    - #113772
    - #113808
    - And finally #113977
  - [ ] fields of unescaped ref classes
- [ ] special case calls to helpers where arguments don't escape (this will require ensuring that the helpers report gc arguments as interior to gc)
- [ ] or, add new helpers that do not need object references (eg box typecheck added in #103361, or say an array store covariance check helper that does not do a null check or bounds check or a store plus write barrier)
- [ ] make inliner more aware when it can enhance stack allocation
  - https://github.com/dotnet/runtime/issues/104479
  - https://github.com/dotnet/runtime/pull/110596
  - #114806 
  - https://github.com/dotnet/runtime/issues/116266
- [ ] improve address-exposed analysis (an exposed stack allocated ref class probably performs worse than a heap allocated one, in most cases) 
  - https://github.com/dotnet/runtime/issues/104250
- [ ] See if we can leverage `Span` capture to give us "cheap" interprocedural analysis
  - https://github.com/dotnet/runtime/pull/112543
  - Currently blocked as spans can "escape" up stack via unsafe constructs
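
As a rough sketch of what field-sensitive analysis for value classes is aiming at, consider a local struct whose field holds the only reference to a newly allocated array. `IntBuffer` and `FieldSensitiveCandidate` are hypothetical names. Without tracking struct fields, an analysis could conservatively assume that storing the reference into a field makes the array escape.

```C#
public struct IntBuffer
{
    // The only reference to the array lives in this field.
    public int[] Data;
}

public static class FieldSensitiveCandidate
{
    public static int LastElement()
    {
        IntBuffer buffer = new IntBuffer();

        // The array reference is stored into a field of a local struct.
        // The struct itself never leaves this method.
        buffer.Data = new int[3];
        buffer.Data[2] = 7;

        // Only an int is returned, so neither the struct nor the array escapes.
        return buffer.Data[2];
    }
}
```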

### Implementation:
- [ ] stop relying on top-level `ALLOCOBJ`
- [ ] stop relying on `ALLOCOBJ` assigned to single-def temp in importer
- [x] use custom GC layout for boxes; stop fetching placeholder type from the runtime
  - https://github.com/dotnet/runtime/issues/103362
  - https://github.com/dotnet/runtime/pull/114716
- [ ] make sure object stack allocation doesn't block fast tail call optimization unnecessarily (currently fast tail call optimization is disabled if there are any exposed locals). #111397 handles most of this, but if we introduce `stackalloc` it also inhibits implicit tail calls.
- [x] #115017 
- [ ] Verify that Tier1 instrumentation does not introduce probes that cause objects to escape
- [ ] Find some way to dead-store object fields that are never read (eg VTable slots, in current impl)
  - https://github.com/dotnet/runtime/issues/116163

### NAOT:
 - [ ] interprocedural escape analysis. May also be viable in jitted contexts, either as a hint for the inliner or as some sort of property we can guarantee on profiler-driven re-jit.
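
A minimal sketch of where interprocedural information would help: the object is passed to a callee that only reads it, but the callee is not inlined, so the caller cannot see this from its own body. `Counter`, `InterproceduralCandidate`, and `ReadPlusOne` are hypothetical names, and `NoInlining` merely stands in for any callee the compiler cannot or will not inline.

```C#
using System.Runtime.CompilerServices;

public sealed class Counter
{
    public int Value;
}

public static class InterproceduralCandidate
{
    // The callee reads its argument but never stores or returns it.
    // Proving that requires looking at the callee, not just the caller.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static int ReadPlusOne(Counter c)
    {
        return c.Value + 1;
    }

    public static int Run()
    {
        // Passing `c` to an opaque call would normally be treated as an
        // escape. A per-method "argument does not escape" summary could
        // lift that restriction.
        Counter c = new Counter { Value = 41 };
        return ReadPlusOne(c);
    }
}
```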

### Advanced:
 - [ ] partial escape analysis. Allocate objects that are unlikely to escape on the stack. Either compute the "escape frontier" of an object and copy it to the stack when that frontier is crossed, or else add capabilities to write barriers to note when a stack allocated object reference is going to be stored on the heap, and "promote" the object at that point (using GC info to rewrite the stack references to heap references). Likely requires PGO, to ensure we're not wrong too often.
 - [ ] for objects that don't escape but that we don't want to stack allocate, treat them as thread private: we can use more aggressive value numbering for instance, since we don't have to assume the field values can change asynchronously.
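
The partial escape case can be illustrated with a sketch in which an object escapes only on a rare path. `Message` and `PartialEscapeCandidate` are hypothetical names, and the assumption that the negative-code path is cold is exactly the kind of fact PGO would have to supply.

```C#
using System.Collections.Generic;

public sealed class Message
{
    public int Code;

    public Message(int code)
    {
        Code = code;
    }
}

public static class PartialEscapeCandidate
{
    private static readonly List<Message> s_errors = new List<Message>();

    public static int ErrorCount => s_errors.Count;

    public static int Process(int code)
    {
        Message m = new Message(code);

        if (code < 0)
        {
            // Rare path (assumed): the object is published to a
            // heap-reachable list, so it escapes here.
            s_errors.Add(m);
            return -1;
        }

        // Common path (assumed): the object never leaves the method.
        return m.Code * 2;
    }
}
```

Here the `s_errors.Add(m)` call marks the point the issue text calls the "escape frontier": up to that call the object could live on the stack, and only crossing it would require a heap copy.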

### Diagnostics:
 - [ ] if the VM, GC, or write barriers catch an escaped object-on-stack, they should provide a helpful assert instead of a generic GC-hole-like assert. We can either reserve a debug bit in the sync block or rely on the object's address being within the code heap.

FYI @dotnet/jit-contrib 
