Instance Redistricting for Dynamic Framebuffer Memory #272
ci: Disable debug info in Rust binaries See merge request StanfordLegion/legion!1811 (cherry picked from commit 82c2ede) 227303c ci: Disable all debug info in Rust binaries to save space. Co-authored-by: Elliott Slaughter
legion-ci: remove the specific tag definition for msvc jobs See merge request StanfordLegion/legion!1802
realm: fix examples and tests See merge request StanfordLegion/legion!1813
disable cuhook build by default, add option See merge request StanfordLegion/legion!1810 (cherry picked from commit eb1ec26) f6d559c disable cuhook build by default, add option 08572ac Merge branch 'master' into sm/fix_cuhook_make Co-authored-by: Seema Mirchandaney
Realm: disable barrier broadcast See merge request StanfordLegion/legion!1809 (cherry picked from commit 3eba361) 8bb9bfb realm: add broadcast kill-switch dd232c7 realm: add broadcast kill-switch 548e198 realm: remove assert 3609e63 realm: fix clang-format c33960f realm: undo cmake changes aa277d0 realm: remove empty line 487652d realm: disable fragmentation 3478e0e realm: disable broadcast_previous 539601f realm: disable tests 925078f realm: completely disable broadcast paths fe8daf5 realm: undo test changes bae24d7 realm: undo changes to runtime_impl.h ea18bec realm: fix radix 6542a35 realm: fix clang-format 166f0c2 realm: fix typo dace381 realm: undo lines 359e7c9 realm: undo namespace changes Co-authored-by: apryakhin
[P0] build: Turn RDTSC off by default until we can put the PPC check back See merge request StanfordLegion/legion!1816 (cherry picked from commit 9a4604e) 83ea9d6 build: Turn RDTSC off by default until we can put the PPC check back. Co-authored-by: Elliott Slaughter
* cuGetProcAddress broke its own rules for a couple of APIs: the old versions of those APIs were kept back, but cuGetProcAddress was left returning the newer ones, breaking the source compatibility guarantees for 13.0.
* The APIs that were held back gained an explicit context or location argument, so modify the calls to pass the required arguments while staying compatible with older toolkits.
#243) …ring the build. This is required for clangd, et al, to function.
Also adds the gh copilot thinking file.
Change the behavior to check the target was defined and the feature was enabled before setting it to be used/built. This is useful when the project is built as part of other projects that may have found a package but do not necessarily want that feature to be built by this project.
* Add sanitizer tests for ASAN, TSAN, and UBSAN.
* Unfortunately we have a leak in the Python module, so disable that.
* Fix a couple of found leaks.
* Add UCX and GASNet builds to the CI.
* Fix up the CUDA compile args to compile SASS for everything up to the last architecture, adding PTX for the latest only (fixing an issue with r580 when both SASS and PTX are available for a removed architecture).
* Add a REALM_DEFAULT_ARGS environment variable and put the default test arguments there, prepended to whatever is set in the test environment at runtime. This lets us set the default while still allowing the test environment to override it.
* By default this enables GPU tests with one GPU, and the test environment can override that to use more than one GPU.
This changes request_cancellation to return a bool stating success/fail and also refactors the flow to make it readable.
fixup! Add the ASSERT_REALM macro to machine_config_test to make it more readable (#259)
This eliminates the need for timed_wait as a separate function. Then update the tests to the new API definition.
… gpu dynamic framebuffer memory
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff            @@
##             main    #272     +/-   ##
=========================================
  Coverage   26.48%  26.48%
=========================================
  Files         185     185
  Lines       38774   38774
  Branches    14316   14192    -124
=========================================
+ Hits        10270   10271      +1
+ Misses      28127   27217    -910
- Partials      377    1286    +909
@lightsighter Say you have an initial 1GB CUDA allocation, and you redistrict it such that you're only using the first 100MB going forward. Can Realm still serve deferred allocations using the remaining 900MB? Or is it just dead memory at this point?
At the moment, the memory is just unusable. Supporting that will require a more sophisticated implementation of the GPU dynamic memory.
@lightsighter Does this need to be reviewed now, or are we still in the POC/testing stage?
It depends on how quickly @manopapad needs this to be supported. If he is willing to wait for @muraj's implementation of the new GPU dynamic framebuffer memory, then we can close this request. If he needs a stop-gap solution so that instance redistricting is functional (but not "performant", in the memory capacity sense), then we will need to review and merge this.
This branch adds support for instance redistricting on instances allocated in the dynamic GPU framebuffer memory. It does this by keeping the underlying CUDA allocation alive as long as any live instance still refers to it. Once all the instances redistricted on top of the base allocation have been destroyed, the underlying CUDA allocation is freed back to the CUDA driver. This unfortunately means that there is a potential for fragmentation, since a large initial allocation could end up with just a few small instances surviving after redistricting, but it is at least functionally correct.
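The lifetime rule described above can be illustrated with a small reference-counting sketch. This is not the code in this branch: `BaseAllocation`, `Instance`, `create_instance`, `redistrict`, `destroy`, and `unreachable_bytes` are hypothetical names invented for illustration, and `std::malloc`/`std::free` stand in for the CUDA allocation calls so the sketch runs without a GPU. It only shows the idea that the base allocation is released when the last instance referring to it goes away.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for one underlying device allocation.
// Assumption: std::malloc/std::free replace the real CUDA calls.
struct BaseAllocation {
  void *base;
  std::size_t bytes;
  bool *freed_flag; // optional observer, set to true when the memory is freed

  BaseAllocation(std::size_t n, bool *flag)
    : base(std::malloc(n)), bytes(n), freed_flag(flag) {}
  ~BaseAllocation() {
    std::free(base); // "returned to the driver" in this sketch
    if (freed_flag != nullptr) *freed_flag = true;
  }
  BaseAllocation(const BaseAllocation &) = delete;
  BaseAllocation &operator=(const BaseAllocation &) = delete;
};

// Hypothetical instance: a byte range inside a shared base allocation.
struct Instance {
  std::shared_ptr<BaseAllocation> backing; // keeps the base allocation alive
  std::size_t offset;
  std::size_t bytes;
};

// Create a fresh instance that owns a new base allocation.
inline Instance create_instance(std::size_t bytes, bool *freed_flag = nullptr) {
  return Instance{std::make_shared<BaseAllocation>(bytes, freed_flag), 0, bytes};
}

// Carve a new instance out of an existing one. Both share the same backing.
inline Instance redistrict(const Instance &parent, std::size_t offset,
                           std::size_t bytes) {
  assert(offset <= parent.bytes && bytes <= parent.bytes - offset);
  return Instance{parent.backing, parent.offset + offset, bytes};
}

// Drop this instance's reference. The base allocation is freed only when
// the last instance referring to it has been destroyed.
inline void destroy(Instance &inst) {
  inst.backing.reset();
  inst.offset = 0;
  inst.bytes = 0;
}

// Bytes of the base allocation that are unusable if `survivor` is the only
// instance left on top of it (the fragmentation cost discussed above).
inline std::size_t unreachable_bytes(const Instance &survivor) {
  return survivor.backing->bytes - survivor.bytes;
}
```

With sizes scaled down from the 1GB/100MB example in the conversation: after creating a 1000-byte instance, redistricting its first 100 bytes into a new instance, and destroying the original, the base allocation is still alive and `unreachable_bytes` reports 900. Only destroying the small instance releases the whole allocation. Using `std::shared_ptr` here is purely a convenience for the sketch; the branch may track the reference count differently.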