@cameronwhite
Contributor

Description of Change(s)

When inserting many scenes at once, the loop that adds prefixes of the input scene roots could exhibit O(n^2) behavior, because it queries GetChildPrimPaths(parent) and linearly searches the result for each sibling path.

As an example, for native instancing with a large number of prototypes, the input scenes' roots may look like:

  • /UsdNiPropagatedPrototypes/__hash_1__/__Prototype_1/UsdNiInstancer
  • ...
  • /UsdNiPropagatedPrototypes/__hash_N__/__Prototype_1/UsdNiInstancer

This results in N calls to GetChildPrimPaths("/UsdNiPropagatedPrototypes"), each of which generates a list of N paths that is then searched linearly.

The change here is to make use of the visited set to also record the existence of siblings that were discovered via GetChildPrimPaths(parent).
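
The access pattern described above can be sketched with a toy model. This is only an illustration under stated assumptions, not the code in HdMergingSceneIndex: `ToyScene`, `CountKnownNaive`, `CountKnownWithVisited`, and `MakeSiblings` are hypothetical names, and the single flat parent stands in for /UsdNiPropagatedPrototypes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical toy scene: one parent prim with a flat list of children.
// It only counts how often the child list is requested.
struct ToyScene {
    std::vector<std::string> children;
    mutable std::size_t queries = 0;

    const std::vector<std::string>& GetChildPrimPaths() const {
        ++queries;
        return children;
    }
};

// Quadratic pattern: every sibling triggers a fresh query plus a linear search.
std::size_t CountKnownNaive(const ToyScene& scene,
                            const std::vector<std::string>& siblings) {
    std::size_t known = 0;
    for (const std::string& path : siblings) {
        const std::vector<std::string>& kids = scene.GetChildPrimPaths();
        if (std::find(kids.begin(), kids.end(), path) != kids.end()) {
            ++known;
        }
    }
    return known;
}

// Visited-set pattern: the first query records all siblings, so later
// lookups are hash lookups and need no further query.
std::size_t CountKnownWithVisited(const ToyScene& scene,
                                  const std::vector<std::string>& siblings) {
    std::unordered_set<std::string> visited;
    bool parentQueried = false;
    std::size_t known = 0;
    for (const std::string& path : siblings) {
        if (!parentQueried) {
            const std::vector<std::string>& kids = scene.GetChildPrimPaths();
            visited.insert(kids.begin(), kids.end());
            parentQueried = true;
        }
        if (visited.count(path) != 0) {
            ++known;
        }
    }
    return known;
}

// Hypothetical helper that builds N sibling paths under one parent.
std::vector<std::string> MakeSiblings(std::size_t n) {
    std::vector<std::string> paths;
    paths.reserve(n);
    for (std::size_t i = 1; i <= n; ++i) {
        paths.push_back("/UsdNiPropagatedPrototypes/__hash_" +
                        std::to_string(i) + "__");
    }
    return paths;
}
```

In this toy model, 1000 siblings cause the naive loop to issue 1000 child-list queries, each followed by a linear search, while the visited-set loop issues a single query and answers the rest from the set.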

Attached is an example scene with 10k prototypes (usdskel_instanced.zip), along with before & after stats of the total time spent in InsertInputScenes().

Before:

    0.004 ms     0.002 ms       1 samples    |   pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
    0.015 ms     0.012 ms       3 samples    |   |   pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
    0.003 ms     0.001 ms       2 samples    |   |   | pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
    0.000 ms     0.000 ms       1 samples    |   |   |   pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
    2.358 ms     0.054 ms       2 samples    |   | pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
   12.630 ms     6.018 ms   20000 samples    |   |   |   |   |   |   | pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
    0.267 ms     0.267 ms   10000 samples    |   |   |   |   |   |   |   pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
16965.960 ms   304.298 ms       1 samples    |   |   |   |   |   |   pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes

After:

    0.005 ms     0.002 ms       1 samples    |   pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
    0.016 ms     0.012 ms       3 samples    |   |   pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
    0.004 ms     0.002 ms       2 samples    |   |   | pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
    0.000 ms     0.000 ms       1 samples    |   |   |   pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
    2.875 ms     0.044 ms       2 samples    |   | pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
   11.634 ms     5.476 ms   20000 samples    |   |   |   |   |   |   | pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
    0.314 ms     0.314 ms   10000 samples    |   |   |   |   |   |   |   pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes
 2722.158 ms   284.471 ms       1 samples    |   |   |   |   |   |   pxrInternal_v0_25_8__pxrReserved__::HdMergingSceneIndex::InsertInputScenes

Note: most of the remaining time in InsertInputScenes() in this case comes from rebuilding the path table, which #3757 addresses.

Link to proposal (if applicable)

Fixes Issue(s)

N/A

Checklist

This could produce n^2 behavior from querying and searching through
GetChildPrimPaths(parent) for each sibling path.

As an example, for native instancing with a large number of prototypes,
the input scenes' roots are of the form:
- /UsdNiPropagatedPrototypes/__hash_1__/__Prototype_1/UsdNiInstancer
...
- /UsdNiPropagatedPrototypes/__hash_N__/__Prototype_1/UsdNiInstancer

This results in N calls to GetChildPrimPaths("/UsdNiPropagatedPrototypes"),
which generates a list of N paths that is linearly searched through.

The change here is to make use of the `visited` set to record
the existence of siblings that are discovered via GetChildPrimPaths(parent).

For a test scene using native instancing with 17500 prototypes, this
reduces the time spent in this block from around 50 s to 42 ms.
@jesschimein
Collaborator

Filed as internal issue #USD-11285

(This is an automated message. See here for more information.)

@unhyperbolic
Member

unhyperbolic commented Sep 22, 2025

There was a change in the semantics of HdSceneIndexBase::GetPrim:

/// \a primPath if and only if datasource is a non-null pointer.

In particular, GetPrim has to return a non-null data source if GetChildPrimPaths is non-empty.

I refactored the HdMergingSceneIndex to make use of it - and I think that makes the optimization in this pull request unnecessary. @cameronwhite can you confirm and close this pull request?
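
As a rough illustration of why such an invariant can help, the sketch below shows a toy scene in which every prim that has children is itself reported as existing, so an ancestor's existence can be answered with one lookup rather than by searching its parent's child list. This is an assumption about how the invariant might be exploited; the actual refactor is not shown in this thread, and `ToySceneIndex`, `AddPrim`, and `HasPrim` are hypothetical names, not Hydra API.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical toy scene. AddPrim registers the prim and every ancestor,
// so any path that has children is itself reported as existing.
class ToySceneIndex {
public:
    void AddPrim(const std::string& path) {
        std::string current = path;
        while (!current.empty()) {
            _prims.insert(current);
            const std::size_t slash = current.rfind('/');
            if (slash == std::string::npos) {
                break;
            }
            current.erase(slash);  // "/a/b" -> "/a", "/a" -> ""
        }
    }

    // Stand-in for "GetPrim(path) returns a non-null data source".
    bool HasPrim(const std::string& path) const {
        return _prims.count(path) != 0;
    }

private:
    std::set<std::string> _prims;
};
```

After adding one prototype root, each of its prefixes can be checked with a single `HasPrim` call, with no need to enumerate siblings.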

@cameronwhite
Contributor Author

> In particular, GetPrim has to return a non-null data source if GetChildPrimPaths is non-empty.
> I refactored the HdMergingSceneIndex to make use of it - and I think that makes the optimization in this pull request unnecessary.

Thanks! Yes, I agree this PR isn't needed any more with the latest changes 👍
