Description
In rapidsai/cudf#19895, I hit an issue with how rapidsmpf creates and sets new memory resources as part of the worker setup for the dask and single shufflers. In that PR, cudf-polars wants all allocations to be made on a StatisticsResourceAdaptor so it can track memory usage. This fails during a cudf-polars shuffle that uses rapidsmpf: rapidsmpf changes the memory resource out from under cudf-polars, and an error is raised.
Concretely, I'd like to change the worker setup function to check what memory resource it gets from rmm.mr.get_current_device_resource(). As long as that memory resource satisfies rapidsmpf's needs (IIUC, that it's an RmmResourceAdaptor), rapidsmpf should use it rather than creating a new one.
Alternatively, if we don't want to rely on passing the MR around via the global device resource, users could optionally pass in a concrete memory resource and rapidsmpf would use it, provided it's valid. Either way, rapidsmpf will still set the memory resource as RMM's current device resource, so maybe getting it from the global isn't so bad.
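A rough sketch of the selection logic described above, to make the proposal concrete. This is not rapidsmpf's actual setup code: `resolve_memory_resource`, its parameters, and the stand-in classes are hypothetical names, and raising on an invalid user-supplied resource is an assumption (the behavior for that case is still open). In the real worker setup, `get_current` would be `rmm.mr.get_current_device_resource` and `required_type` would be rapidsmpf's `RmmResourceAdaptor`.

```python
from typing import Any, Callable, Optional


def resolve_memory_resource(
    required_type: type,
    get_current: Callable[[], Any],
    make_default: Callable[[Any], Any],
    user_mr: Optional[Any] = None,
) -> Any:
    """Pick the memory resource the worker setup should use.

    Hypothetical helper: prefers an explicitly passed resource, then the
    current device resource, and only creates a new adaptor as a last resort.
    """
    if user_mr is not None:
        # Assumption: an explicitly passed but unsuitable resource is an error.
        if not isinstance(user_mr, required_type):
            raise TypeError(
                f"expected {required_type.__name__}, got {type(user_mr).__name__}"
            )
        return user_mr

    current = get_current()
    if isinstance(current, required_type):
        # Reuse what the caller (e.g. cudf-polars) already installed.
        return current

    # Otherwise wrap the current resource in a new adaptor.
    return make_default(current)


# Stand-ins so the sketch runs without a GPU; they are NOT the real
# rmm / rapidsmpf classes.
class FakeDeviceResource:
    pass


class FakeAdaptor(FakeDeviceResource):
    def __init__(self, upstream: FakeDeviceResource) -> None:
        self.upstream_mr = upstream
```

Whichever resource is returned, the setup function would still install it as RMM's current device resource, so callers that read the global afterwards see the same object.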
There's one complication around handling RMM statistics. A couple of spots in RMM specifically want an RmmResourceAdaptor rather than an arbitrary RMM memory resource. We'll need to either update those spots to handle RmmStatisticsAdaptor(RmmResourceAdaptor(...)), or extract the RmmResourceAdaptor from the RmmStatisticsAdaptor before handing it in.
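The second option (extracting the inner adaptor) could look roughly like the sketch below. It assumes each wrapping resource exposes the resource it wraps as an `upstream_mr` attribute, which is my understanding of how RMM's Python upstream adaptors behave; `find_upstream` and the stand-in classes are hypothetical names, not existing APIs.

```python
from typing import Any, Optional


def find_upstream(mr: Any, required_type: type, max_depth: int = 32) -> Optional[Any]:
    """Walk a chain of wrapped resources and return the first one that is
    an instance of ``required_type``, or None if there is none.

    Assumption: wrappers expose the wrapped resource as ``upstream_mr``.
    ``max_depth`` only guards against accidental cycles.
    """
    current = mr
    for _ in range(max_depth):
        if current is None:
            return None
        if isinstance(current, required_type):
            return current
        current = getattr(current, "upstream_mr", None)
    return None


# Stand-ins so the sketch runs without a GPU; they are NOT the real classes.
class FakeBase:
    pass


class FakeInnerAdaptor(FakeBase):
    def __init__(self, upstream: FakeBase) -> None:
        self.upstream_mr = upstream


class FakeStatsAdaptor(FakeBase):
    def __init__(self, upstream: FakeBase) -> None:
        self.upstream_mr = upstream
```

With this shape, the spots that need the specific adaptor type could call the helper on whatever resource they are handed, while allocations continue to go through the outer statistics wrapper.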