Skip to content

Conversation

@NickGerleman
Copy link
Contributor

Summary:
LayoutableChildren<yoga::Node>::Iterator showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran Yoga microbenchmark with Yoga compiled with -O2, where we saw a regression of synthetic performance by ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite lower (~6%).

I was able to make some optimizations in the meantime against that, which still show some minor wins, reducing that overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

This change makes a few different optimizations

  1. Removes redundant copies
  2. Removes redundant index keeping
  3. Mark which branches are likely vs unlikely
  4. Shrink iterator size from 6 pointers to 3 pointers
  5. Avoid usage in pixel grid rounding (so we don't need to have cache read for style)

In "Huge nested layout" example

| Before display: contents support | After display: contents support | After optimizations |
| 9.84ms | 10.39ms | 10.23ms |

Changelog: [Internal]

Differential Revision: D65336148

@vercel
Copy link

vercel bot commented Nov 1, 2024

Deployment failed with the following error:

You don't have permission to create a Preview Deployment for this project.

View Documentation: https://vercel.com/docs/accounts/team-members-and-roles

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D65336148

NickGerleman added a commit to NickGerleman/react-native that referenced this pull request Nov 1, 2024
Summary:
X-link: facebook/yoga#1736


`LayoutableChildren<yoga::Node>::Iterator` showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran Yoga microbenchmark with Yoga compiled with `-O2`, where we saw a regression of synthetic performance by ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite lower (~6%).

I was able to make some optimizations in the meantime against that, which still show some minor wins, reducing that overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

This change makes a few different optimizations
1. Removes redundant copies
2. Removes redundant index keeping
3. Mark which branches are likely vs unlikely
4. Shrink iterator size from 6 pointers to 3 pointers
5. Avoid usage in pixel grid rounding (so we don't need to have cache read for style)

In "Huge nested layout" example

| Before display: contents support | After display: contents support | After optimizations |
| 9.84ms | 10.39ms | 10.23ms |

Changelog: [Internal]

Differential Revision: D65336148
Summary:

X-link: facebook/react-native#47358

`LayoutableChildren<yoga::Node>::Iterator` showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran Yoga microbenchmark with Yoga compiled with `-O2`, where we saw a regression of synthetic performance by ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite lower (~6%).

I was able to make some optimizations in the meantime against that, which still show some minor wins, reducing that overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

This change makes a few different optimizations
1. Removes redundant copies
2. Removes redundant index keeping
3. Mark which branches are likely vs unlikely
4. Shrink iterator size from 6 pointers to 3 pointers
5. Avoid usage in pixel grid rounding (so we don't need to have cache read for style)

In "Huge nested layout" example

| Before display: contents support | After display: contents support | After optimizations |
| 9.84ms | 10.39ms | 10.23ms |

Changelog: [Internal]

Differential Revision: D65336148
@vercel
Copy link

vercel bot commented Nov 1, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
yoga-website ✅ Ready (Inspect) Visit Preview 💬 Add feedback Nov 1, 2024 8:34pm

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D65336148

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 8f69ac7.

facebook-github-bot pushed a commit to facebook/litho that referenced this pull request Nov 5, 2024
Summary:
X-link: facebook/yoga#1736

X-link: facebook/react-native#47358

`LayoutableChildren<yoga::Node>::Iterator` showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran Yoga microbenchmark with Yoga compiled with `-O2`, where we saw a regression of synthetic performance by ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite lower (~6%).

I was able to make some optimizations in the meantime against that, which still show some minor wins, reducing that overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

Note that, in real scenarios, measure functions may dominate layout time, so display: contents does not mean end-to-end 4% regression, even after this change.

This change makes a few different optimizations
1. Removes redundant copies
2. Removes redundant index keeping
3. Mark which branches are likely vs unlikely
4. Shrink iterator size from 6 pointers to 3 pointers
5. Avoid usage in pixel grid rounding (so we don't need to have cache read for style)

In "Huge nested layout" example

| Before display: contents support | After display: contents support | After optimizations |
| 9.77ms | 10.39ms | 10.17ms |

Changelog: [Internal]

Reviewed By: rozele

Differential Revision: D65336148

fbshipit-source-id: 01c592771ed7accf2d87dddd5a3a9e0225098b56
facebook-github-bot pushed a commit to facebook/react-native that referenced this pull request Nov 5, 2024
Summary:
X-link: facebook/yoga#1736

Pull Request resolved: #47358

`LayoutableChildren<yoga::Node>::Iterator` showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran Yoga microbenchmark with Yoga compiled with `-O2`, where we saw a regression of synthetic performance by ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite lower (~6%).

I was able to make some optimizations in the meantime against that, which still show some minor wins, reducing that overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

Note that, in real scenarios, measure functions may dominate layout time, so display: contents does not mean end-to-end 4% regression, even after this change.

This change makes a few different optimizations
1. Removes redundant copies
2. Removes redundant index keeping
3. Mark which branches are likely vs unlikely
4. Shrink iterator size from 6 pointers to 3 pointers
5. Avoid usage in pixel grid rounding (so we don't need to have cache read for style)

In "Huge nested layout" example

| Before display: contents support | After display: contents support | After optimizations |
| 9.77ms | 10.39ms | 10.17ms |

Changelog: [Internal]

Reviewed By: rozele

Differential Revision: D65336148

fbshipit-source-id: 01c592771ed7accf2d87dddd5a3a9e0225098b56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants