Today, CUB tests run sequentially because some of them use problem sizes large enough to require all of the GPU's VRAM. This limits our coverage of concurrency-related issues. @pauleonix found cases where compute sanitizer and sequential test runs pass, but parallel ctest runs lead to time-sharing of the GPU and expose a data race on the CUB side.
Running CUB tests in parallel will give us better coverage and faster CI. The current plan to achieve that is the following:
1. Extract `*_large` tests into standalone TUs that require the entire GPU and assign them appropriate `RESOURCE_GROUPS`.
2. Identify an appropriate concurrency level.
3. Make the concurrency level identified in the previous step opt-in for CI. Some runners are RAM-limited (Orin etc.), so tests should not run concurrently by default, to avoid OOM.
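
As a rough illustration of the `RESOURCE_GROUPS` part of the plan, here is a minimal CMake sketch. It is not the actual CUB build logic: the test names, the slot counts, and the `gpu_resources.json` file name are all assumptions made for the example. The idea is to model one GPU as a pool of slots, let each `*_large` test claim the whole pool, and let regular tests claim a small share.

```cmake
# Sketch only: test names and slot counts below are hypothetical.
# Both tests are assumed to have been registered with add_test() already.

# Describe one GPU with 100 slots in a ctest resource specification file.
file(WRITE "${CMAKE_BINARY_DIR}/gpu_resources.json" [=[
{
  "version": {"major": 1, "minor": 0},
  "local": [{"gpus": [{"id": "0", "slots": 100}]}]
}
]=])

# A *_large test claims every slot, so ctest will not schedule
# another GPU test alongside it.
set_property(TEST cub.test.device_reduce_large
             PROPERTY RESOURCE_GROUPS "gpus:100")

# A regular test claims a small share, so several can run at once.
set_property(TEST cub.test.device_reduce
             PROPERTY RESOURCE_GROUPS "gpus:10")
```

Under these assumptions, a CI job would opt in with something like `ctest --resource-spec-file gpu_resources.json -j 8`, while a plain `ctest` invocation keeps running tests one at a time. As far as I know, ctest ignores `RESOURCE_GROUPS` when no resource spec file is given, so passing `-j` alone would not keep the large tests isolated.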