
Optimize import substitution with zero-allocation ImportPathIter #100

Open
CrazyRoka wants to merge 1 commit into bevyengine:master from CrazyRoka:optimize-substitute-identifiers


@CrazyRoka

Optimize import substitution with zero-allocation ImportPathIter

Background

While profiling the Bevy game engine's "many cubes" example with the DHAT memory profiler, I found a large number of small allocations in the import substitution code. Specifically, to_owned() was called frequently in substitute_identifiers and was responsible for 2.34% of all allocated memory blocks.

Changes

This PR introduces a custom ImportPathIter enum (similar to the Either enum from the either crate) and refactors the substitute_identifiers function to use this new iterator. The main goals are to reduce allocations and improve overall performance.

Key changes include:

  • Introduced the ImportPathIter enum to wrap the different iterator types behind a single type, which avoids allocating a new Vec or Box.
  • Refactored substitute_identifiers to use ImportPathIter, reducing allocations.
  • Replaced vector cloning with an iterator-based approach for better performance.
  • Implemented minor optimizations, such as using contains instead of find for quote checks.
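
To illustrate the enum-as-iterator idea from the list above, here is a minimal sketch. It is not the code from this PR: the names PathIter and path_segments, the variants, and the splitting rule are all hypothetical, chosen only to show how one enum can unify two concrete iterator types without a Box<dyn Iterator> or an intermediate Vec.

```rust
use std::iter::{once, Once};
use std::str::Split;

/// Hypothetical stand-in for an `ImportPathIter`-style enum (not the PR's
/// actual definition): each variant holds a different concrete iterator.
enum PathIter<'a> {
    /// Yields the whole input once, unchanged.
    Single(Once<&'a str>),
    /// Yields the `::`-separated segments of the input.
    Segments(Split<'a, &'static str>),
}

impl<'a> Iterator for PathIter<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        // Static dispatch: forward to whichever iterator is wrapped.
        match self {
            PathIter::Single(it) => it.next(),
            PathIter::Segments(it) => it.next(),
        }
    }
}

/// Assumed rule, for illustration only: a path containing a quote is kept
/// whole, anything else is split on `::`. Both branches return the same
/// type, and every item borrows from `path`, so no String is created.
fn path_segments(path: &str) -> PathIter<'_> {
    // `contains` is enough here because only presence matters, not position.
    if path.contains('"') {
        PathIter::Single(once(path))
    } else {
        PathIter::Segments(path.split("::"))
    }
}

fn main() {
    assert!(path_segments("a::b::c").eq(["a", "b", "c"]));
    assert!(path_segments("\"my/file.wgsl\"").eq(["\"my/file.wgsl\""]));
    for segment in path_segments("bevy_pbr::mesh_functions") {
        println!("{segment}");
    }
}
```

Because the enum is an ordinary sized type, the caller can iterate over it directly on the stack. The trade-off is one match per call to next, which is usually cheaper than a heap allocation per call to the enclosing function.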

Impact

The targeted code path accounted for only a small share of allocated bytes (0.02% of the program's total) but a much larger share of allocation count:

  • Before: to_owned() was responsible for 2.34% of overall allocated memory blocks
  • After: to_owned() no longer appears in the profiling results for this code path

Testing

Before

  ├─▶ PP 1.5/9 (2 children) {
  │     Total:     385,755 bytes (0.02%, 26,305.12/s) in 53,169 blocks (2.34%, 3,625.66/s), avg size 7.26 bytes, avg lifetime 2.46 µs (0% of program duration)
  │     At t-gmax: 0 bytes (0%) in 0 blocks (0%), avg size 0 bytes
  │     At t-end:  0 bytes (0%) in 0 blocks (0%), avg size 0 bytes
  │     Allocated at {
  │       #1: 0x5564ed256876: <alloc::alloc::Global as core::alloc::Allocator>::allocate (alloc/src/alloc.rs:243:9)
  │       #2: 0x5564ed256876: alloc::raw_vec::RawVec<T,A>::try_allocate_in (alloc/src/raw_vec.rs:230:45)
  │       #3: 0x5564ed256876: alloc::raw_vec::RawVec<T,A>::with_capacity_in (alloc/src/raw_vec.rs:158:15)
  │       #4: 0x5564ed256876: alloc::vec::Vec<T,A>::with_capacity_in (src/vec/mod.rs:699:20)
  │       #5: 0x5564ed256876: <T as alloc::slice::hack::ConvertVec>::to_vec (alloc/src/slice.rs:162:25)
  │       #6: 0x5564ed256876: alloc::slice::hack::to_vec (alloc/src/slice.rs:111:9)
  │       #7: 0x5564ed256876: alloc::slice::<impl [T]>::to_vec_in (alloc/src/slice.rs:441:9)
  │       #8: 0x5564ed256876: alloc::slice::<impl [T]>::to_vec (alloc/src/slice.rs:416:14)
  │       #9: 0x5564ed256876: alloc::slice::<impl alloc::borrow::ToOwned for [T]>::to_owned (alloc/src/slice.rs:823:14)
  │       #10: 0x5564ed256876: alloc::str::<impl alloc::borrow::ToOwned for str>::to_owned (alloc/src/str.rs:211:62)
  │       #11: 0x5564ed256876: naga_oil::compose::parse_imports::substitute_identifiers (src/compose/parse_imports.rs:135:47)
  │     }
  │   }

After

I ran the application again under the profiler. The to_owned() call from substitute_identifiers was no longer present in the profiling results, confirming that these allocations have been eliminated.

@robtfm
Collaborator

robtfm commented Aug 18, 2024

does it have any impact on time? i think optimizing out individual string allocations is almost certainly not worthwhile.

the extra code complexity is quite minor though, if it was any more complex i think i would object more.
