Improve loading of AMD performance libraries on Windows #524

@vosen

The distribution story for AMD perf libraries on Windows is complicated.
On Linux it is relatively simple: the library is in a system search path, and if it is not, you get an error from the loader on the command line.

On Windows we might want to ship the libraries ourselves, and even if we do not, we should make it easy for others to build packages.

For that reason, instead of linking directly to e.g. rocblas, we need to link by hand at runtime. The steps are approximately:

  • Add capability to zluda_bindgen to emit a cached FnTable. It should look something like this:
    Windows:
    struct FnTable {
        library: libloading::os::windows::Library,
        functions: Mutex<FnTableImpl>
    }
    
    impl FnTable {
        fn cuBlasFoo(&self, ...) { (take the mutex, load cached value or load from library) }
    }
    
    struct FnTableImpl {
        cuBlasFoo: Option<unsafe extern "C" fn(...) -> ...>,
        cuBlasBar: Option<unsafe extern "C" fn(...) -> ...>,
    }
    
    Linux:
    struct FnTable { }
    
    impl FnTable {
        fn cuBlasFoo(&self, ...) { (call cuBlasFoo directly) }
    }
    
  • Add a Windows-specific library load mode which:
    • Searches for roc???.dll in the same directory as the current .dll, in the system path, and in the location given by the HIP SDK environment variable (HIP_PATH)
    • Shows a message box explaining that it could not find the library and listing the paths it searched
  • Move all the code to use the new infrastructure
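
The search order and the error text from the Windows-specific load mode could be sketched roughly as below. This is only an illustration under assumptions that are not stated in the issue: "system path" is read as the PATH variable, the HIP SDK libraries are assumed to live in a `bin` directory under HIP_PATH, and `candidate_paths`, `find_library` and `not_found_message` are hypothetical names. The actual `libloading` call and the message box are left out so the sketch stays std-only.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Candidate locations for `dll_name`, in the order the issue lists them.
/// Inputs are passed in rather than read from the process, so the function
/// stays pure and easy to test.
fn candidate_paths(
    dll_name: &str,
    current_dll_dir: Option<&Path>,
    path_var: Option<&str>,
    hip_path: Option<&str>,
) -> Vec<PathBuf> {
    let mut candidates = Vec::new();
    // 1. Directory of the currently loaded .dll.
    if let Some(dir) = current_dll_dir {
        candidates.push(dir.join(dll_name));
    }
    // 2. System path (assumed here to mean the PATH variable).
    if let Some(path_var) = path_var {
        candidates.extend(env::split_paths(path_var).map(|dir| dir.join(dll_name)));
    }
    // 3. HIP SDK location; the `bin` subdirectory is an assumption.
    if let Some(hip) = hip_path {
        candidates.push(Path::new(hip).join("bin").join(dll_name));
    }
    candidates
}

/// First candidate that exists on disk, if any.
fn find_library(candidates: &[PathBuf]) -> Option<&PathBuf> {
    candidates.iter().find(|path| path.is_file())
}

/// Text for the error dialog: names the library and lists every path tried.
fn not_found_message(dll_name: &str, searched: &[PathBuf]) -> String {
    let mut msg = format!("Could not find {dll_name}. Searched:\n");
    for path in searched {
        msg.push_str(&format!("  {}\n", path.display()));
    }
    msg
}

fn main() {
    // Stand-in: a real .dll would ask the OS for its own module path
    // instead of using the executable's directory.
    let current_dir = env::current_exe()
        .ok()
        .and_then(|exe| exe.parent().map(Path::to_path_buf));
    let path_var = env::var("PATH").ok();
    let hip_path = env::var("HIP_PATH").ok();

    let candidates = candidate_paths(
        "rocblas.dll",
        current_dir.as_deref(),
        path_var.as_deref(),
        hip_path.as_deref(),
    );
    match find_library(&candidates) {
        Some(path) => println!("would load {}", path.display()),
        None => eprintln!("{}", not_found_message("rocblas.dll", &candidates)),
    }
}
```

Building the full candidate list before probing the disk means the same list can be handed to the message box unchanged, so the paths shown to the user are exactly the ones that were tried.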
