Add GWS Auto-Tuner #8

@jtesta

Description

The global work size (GWS) parameter in OpenCL tells a device how many work-items to launch per kernel invocation. Tuning this parameter can improve throughput substantially (sometimes by over 50%).

Currently, the optimal GWS for each GPU model is determined through manual experimentation and hard-coded in gws.c. This method does not scale well, as it leaves out many popular hardware models. A much better method is to add an auto-tuner that determines the optimal setting at run-time.

A proposed solution is this: each time the generation or lookup code is run, it checks whether an optimal setting is already known from a previous invocation. The lookup uses the following values as a unique key in a hash table: table parameters, device name, and driver version. The table parameters are part of the key because they have been observed to affect the optimal GWS, and the driver version is included because driver improvements can change it as well. If an optimal setting is already known, it is used; otherwise, variations of the GWS are tested until an optimal value is found.

The manual GWS command-line argument ("-gws") must be preserved so that the user can still override the auto-tuned setting.
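
To make the proposal above concrete, here is a minimal sketch of the lookup-then-tune flow, including the "-gws" override. Everything in it is hypothetical and not taken from the existing code: the names (`gws_cache`, `select_gws`, `tune_gws`, `fake_bench`), the `|`-joined key format, and the power-of-two search are all assumptions. For brevity it uses a small linear table where the proposal calls for a hash table, keeps the cache in memory only (persisting it between invocations is left out), and takes the benchmark as a callback so that no OpenCL calls are needed here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CACHE_ENTRIES 64
#define KEY_LEN 256

/* Hypothetical cache entry: "table params|device|driver" -> best GWS. */
typedef struct {
    char key[KEY_LEN];
    size_t gws;
} gws_cache_entry;

typedef struct {
    gws_cache_entry entries[MAX_CACHE_ENTRIES];
    size_t count;
} gws_cache;

/* Benchmark callback: returns a throughput figure (higher is better). */
typedef double (*gws_bench_fn)(size_t gws, void *ctx);

/* Linear search; a real implementation could hash the key instead. */
static const gws_cache_entry *cache_find(const gws_cache *c, const char *key) {
    for (size_t i = 0; i < c->count; i++)
        if (strcmp(c->entries[i].key, key) == 0)
            return &c->entries[i];
    return NULL;
}

/* Returns 0 on success, -1 if the table is full. */
static int cache_store(gws_cache *c, const char *key, size_t gws) {
    if (c->count >= MAX_CACHE_ENTRIES)
        return -1;
    snprintf(c->entries[c->count].key, KEY_LEN, "%s", key);
    c->entries[c->count].gws = gws;
    c->count++;
    return 0;
}

/* Try min_gws, 2*min_gws, ... up to max_gws and keep the fastest. */
static size_t tune_gws(gws_bench_fn bench, void *ctx,
                       size_t min_gws, size_t max_gws) {
    assert(min_gws > 0 && min_gws <= max_gws);
    size_t best = min_gws;
    double best_rate = bench(min_gws, ctx);
    for (size_t g = min_gws; g <= max_gws / 2; ) {
        g *= 2; /* cannot overflow: g <= max_gws / 2 before doubling */
        double rate = bench(g, ctx);
        if (rate > best_rate) {
            best_rate = rate;
            best = g;
        }
    }
    return best;
}

/* user_gws != 0 models an explicit "-gws" argument and always wins. */
size_t select_gws(gws_cache *cache, const char *table_params,
                  const char *device_name, const char *driver_version,
                  size_t user_gws, gws_bench_fn bench, void *ctx,
                  size_t min_gws, size_t max_gws) {
    if (user_gws != 0)
        return user_gws;

    char key[KEY_LEN];
    snprintf(key, sizeof(key), "%s|%s|%s",
             table_params, device_name, driver_version);

    const gws_cache_entry *hit = cache_find(cache, key);
    if (hit != NULL)
        return hit->gws; /* known from an earlier run: skip tuning */

    size_t best = tune_gws(bench, ctx, min_gws, max_gws);
    cache_store(cache, key, best); /* if the table is full, just re-tune next time */
    return best;
}

/* Synthetic benchmark for demonstration only: peaks at GWS 4096 and
 * counts how often it is called through the int pointed to by ctx. */
double fake_bench(size_t gws, void *ctx) {
    int *calls = ctx;
    (*calls)++;
    double d = (double)gws - 4096.0;
    return 1.0e9 / (1.0 + d * d);
}
```

With a zero-initialised `gws_cache`, a first call to `select_gws` with `user_gws` set to 0 runs the benchmark at each candidate size and stores the winner; a second call with the same table parameters, device name, and driver version returns the stored value without benchmarking again. In a real tuner the callback would time a short kernel run at the given GWS, and the device name and driver version would presumably come from `clGetDeviceInfo` (`CL_DEVICE_NAME`, `CL_DRIVER_VERSION`).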
