Quick Start Guide
Nitro is a machine learning-based performance tuning system for GPU applications. It dynamically selects the optimal algorithmic variant to execute based on characteristics of the input data set. This guide provides an overview of the framework and describes how to tune your applications using Nitro. For more details on Nitro, including performance results, please refer to our IPDPS 2014 paper.
Nitro has the following prerequisites:
- NVIDIA CUDA toolkit (tested with version 7.0+).
- Python 2.7+
- libSVM 3.12+
- (Optional, for the SpMV example) NVIDIA CUSP library
To build the benchmarks included in the examples/ subdirectory, the following environment variables must be set:
- Set $NITRO_ROOT to the directory where Nitro is installed.
- Add $NITRO_ROOT/src to $PYTHONPATH.
- Set $LIBSVM_PATH to the directory where libSVM is installed (make sure libSVM is built).
- Add the libSVM installation directory to $PATH and $LD_LIBRARY_PATH.
- Set $CUSP_PATH to the directory where the CUSP library is installed.
- Set $SHADER_MODEL to the SM architecture of your GPU (e.g., 35 for Tesla K20c).
Nitro consists of two parts:
- A header-only C++ template library
- A Python-based autotuning interface
An existing application can be integrated with Nitro by (1) specifying code variants, features and constraints within the application code using Nitro’s C++ library, (2) building a customized autotuner for the application using Nitro’s Python interface, and (3) specifying training data and (optionally) testing data. The Python interface allows users to customize various aspects of the tuning process such as toggling constraints, specifying which classifier to use, classifier properties etc.
Given a set of code/algorithmic variants (hereafter referred to as code variants) and a set of features associated with each, Nitro selects the optimal variant (with respect to some metric such as performance) for the given input and architecture. To accomplish this, Nitro builds a classification model from user-provided training data. This classification model is then used to predict the code variant to execute. Each feature is a user-defined function that describes a characteristic of the input, for example the average out-degree of a graph or the number of rows in a matrix.
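The selection idea described above can be sketched without Nitro itself. The code below is only an illustration under stated assumptions: Nitro's own models are built by its tuning process (libSVM is listed as a prerequisite), whereas this sketch swaps in a much simpler nearest-centroid classifier. The names CentroidModel, avg_out_degree, and predict_variant are hypothetical and are not part of Nitro's API.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a trained model: centroids[v] is the mean feature
// vector of the training inputs on which variant v performed best.
struct CentroidModel {
    std::vector<std::vector<double> > centroids;
};

// Example feature function: average out-degree of a graph stored in CSR form.
// row_offsets has (number of vertices + 1) entries.
double avg_out_degree(const std::vector<int>& row_offsets) {
    assert(row_offsets.size() >= 2);
    const std::size_t num_vertices = row_offsets.size() - 1;
    const int num_edges = row_offsets.back() - row_offsets.front();
    return static_cast<double>(num_edges) / static_cast<double>(num_vertices);
}

// Returns the index of the variant whose centroid is closest to the feature
// vector (squared Euclidean distance). Ties go to the lower index.
std::size_t predict_variant(const CentroidModel& model,
                            const std::vector<double>& features) {
    assert(!model.centroids.empty());
    std::size_t best = 0;
    double best_dist = std::numeric_limits<double>::infinity();
    for (std::size_t v = 0; v < model.centroids.size(); ++v) {
        assert(model.centroids[v].size() == features.size());
        double dist = 0.0;
        for (std::size_t i = 0; i < features.size(); ++i) {
            const double d = features[i] - model.centroids[v][i];
            dist += d * d;
        }
        if (dist < best_dist) {
            best_dist = dist;
            best = v;
        }
    }
    return best;
}
```

At run time, the feature functions would be evaluated on the actual input and the resulting vector passed to the predictor; the returned index identifies which variant to run.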
A Sparse Matrix-Vector Multiplication (SpMV) example application is included with this release to demonstrate how applications can be tuned using Nitro. It can be found at examples/spmv. The following subsections walk through the process of integrating Nitro into the C++ part of the application and then customizing the tuning process using the Python-based tuning interface.
Before specifying variants and features in the target C++ application, an object of type nitro::context must first be created. This object sets up initial state and manages the coordination among the various code variants executing within the application.
Code variants are represented by the nitro::code_variant class, which takes the following template parameters:
- Tuning policy: For a code variant named, say, variantX, a class of the same name is generated automatically in the nitro::tuning_policies namespace in nitro_config.h. The class is generated according to what is specified in the tune.py script (described in the next section).
- Argument type tuple: The types of the arguments of the variant must be wrapped in a thrust::tuple type and specified here. For example, if the variant takes argument types (float*, float*), then thrust::tuple<float*, float*> is the argument tuple type.
An example instantiation of the code_variant object would be:
using namespace nitro;
typedef thrust::tuple<float*, float*> ArgTuple;
code_variant<tuning_policies::test_variant, ArgTuple> v(cx, "test_variant");
This creates an object v of type nitro::code_variant that uses the tuning policy specified in nitro::tuning_policies::test_variant and takes in argument types (float*, float*). Note that the string specified as the second argument to the constructor must exactly match the name of the variant in the tune.py script.
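To illustrate why the argument types are passed as a single tuple type, here is a minimal sketch of how a template can unpack a tuple type into a call signature. It uses std::tuple in place of thrust::tuple so that it compiles without the CUDA toolkit, and the signature_of helper is hypothetical; it is not taken from Nitro's headers, and Nitro's real implementation may work differently.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

namespace sketch {

// Primary template, left undefined; only tuple types are supported.
template <typename ArgTuple>
struct signature_of;

// Partial specialization: unpack std::tuple<Args...> into double(Args...),
// the shape of a variant that takes Args and returns a double.
template <typename... Args>
struct signature_of<std::tuple<Args...> > {
    typedef double type(Args...);
    static constexpr std::size_t arity = sizeof...(Args);
};

}  // namespace sketch

// Stand-in for the thrust::tuple<float*, float*> typedef in the text.
typedef std::tuple<float*, float*> ArgTuple;

// Compile-time check that the tuple maps to the expected signature.
static_assert(
    std::is_same<sketch::signature_of<ArgTuple>::type,
                 double(float*, float*)>::value,
    "ArgTuple should unpack to double(float*, float*)");
```

Packing the types into one tuple lets a class template take a fixed number of template parameters (policy and argument tuple) regardless of how many arguments the variants accept.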
A code variant in Nitro can take arguments of any type but must return a double representing the performance of the variant on the given input. There are two ways in which a variant can be defined:
- Create a functor: The functor must derive from nitro::variant_type<T1, T2, ...>, where each Ti is an argument type of the variant. It must also override the function call operator, taking the correct argument types and returning a double. In our running example, this would be double operator()(float*, float*) { ... }.
- Wrap a function pointer: If the variant is defined as a function that accepts the relevant argument types and returns a double, Nitro provides a function pointer wrapper class, wrap_variant, that creates the correct functor type. For our example, this would be (assuming variant1 is one of the variants defined as a function): nitro::wrap_variant<float*, float*> _variant1(variant1);.
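The two definition styles can be compared side by side in a self-contained sketch. Everything in the sketch namespace below is a hypothetical stand-in written for this guide, not Nitro's actual variant_type or wrap_variant; the returned doubles are placeholder values where a real variant would return a measured metric such as elapsed time.

```cpp
#include <cassert>

namespace sketch {

// Stand-in for nitro::variant_type: an abstract callable that returns a double.
template <typename... Args>
struct variant_type {
    virtual ~variant_type() {}
    virtual double operator()(Args... args) = 0;
};

// Stand-in for nitro::wrap_variant: adapts a plain function pointer so it can
// be used wherever a variant_type<Args...> is expected.
template <typename... Args>
class wrap_variant : public variant_type<Args...> {
public:
    typedef double (*fn_type)(Args...);
    explicit wrap_variant(fn_type fn) : fn_(fn) { assert(fn != nullptr); }
    double operator()(Args... args) override { return fn_(args...); }
private:
    fn_type fn_;
};

}  // namespace sketch

// Style 1: a functor deriving from the variant base class.
struct scale_functor : sketch::variant_type<float*, float*> {
    double operator()(float* out, float* in) override {
        out[0] = 2.0f * in[0];
        return 1.5;  // placeholder performance value
    }
};

// Style 2: a plain function with the same argument types, wrapped later.
double scale_function(float* out, float* in) {
    out[0] = 2.0f * in[0];
    return 2.5;  // placeholder performance value
}
```

With these definitions, `sketch::wrap_variant<float*, float*> wrapped(scale_function);` yields an object that can be used through the same base-class interface as `scale_functor`.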
Variants defined using the notation explained in the previous subsection can be added to a code_variant object using the add_variant function. It accepts a pointer to the function object representing the variant and adds it to the internal list of variants maintained by the code_variant object.
Execution of the variant is accomplished by simply calling the code_variant object's function call operator with the required input arguments. Depending on the tuning context, Nitro automatically selects the correct internal variant to execute.
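The registration and dispatch pattern can be sketched as follows. This is a simplified mock, assuming a variant list held as non-owning pointers and a manually set selection in place of Nitro's model query; the sketch::code_variant class here is hypothetical and omits the tuning policy and context that the real nitro::code_variant takes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Minimal variant interface: callable with Args, returns a double.
template <typename... Args>
struct variant_type {
    virtual ~variant_type() {}
    virtual double operator()(Args... args) = 0;
};

// Mock of a code_variant-like container.
template <typename... Args>
class code_variant {
public:
    // Stores a non-owning pointer; the caller keeps the variant alive.
    void add_variant(variant_type<Args...>* v) {
        assert(v != nullptr);
        variants_.push_back(v);
    }
    // Stand-in for the model prediction: the choice is set by hand here.
    void select(std::size_t index) {
        assert(index < variants_.size());
        selected_ = index;
    }
    // Calling the object forwards the arguments to the selected variant.
    double operator()(Args... args) {
        assert(!variants_.empty());
        return (*variants_[selected_])(args...);
    }
    std::size_t size() const { return variants_.size(); }
private:
    std::vector<variant_type<Args...>*> variants_;
    std::size_t selected_ = 0;
};

}  // namespace sketch

// Two variants computing the same sum in different orders.
struct forward_sum : sketch::variant_type<const float*, int, float*> {
    double operator()(const float* data, int n, float* out) override {
        float s = 0.0f;
        for (int i = 0; i < n; ++i) s += data[i];
        *out = s;
        return 1.0;  // placeholder performance value
    }
};

struct backward_sum : sketch::variant_type<const float*, int, float*> {
    double operator()(const float* data, int n, float* out) override {
        float s = 0.0f;
        for (int i = n - 1; i >= 0; --i) s += data[i];
        *out = s;
        return 2.0;  // placeholder performance value
    }
};
```

The caller only ever invokes the container object; which registered variant actually runs is decided inside it, which is what lets a tuned executable switch variants per input without changes at the call site.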
Input features can be specified in a way similar to variants, using either a functor or the wrap_feature meta-function. Feature functions accept the same parameters as variants and must return a real (double) value representing the calculated feature for that input data. For example:
double feature1(float* a, float *b) {
...
return foo;
}
...
wrap_feature<float*, float*> _feature1(feature1);
test_variant.add_input_feature(&_feature1);
For certain inputs, it's possible that a variant produces wrong results, or takes unacceptably long to execute. To handle such cases, Nitro supports the specification of constraint functions. Constraint functions can be added to code variants using the add_constraint function, which accepts a constraint function and the specific variant for which it is valid. Constraints are automatically evaluated by Nitro and force the computation to revert to the default variant if a constraint fails in the deployed executable. In the SpMV example, the dia_cutoff constraint ensures that the DIA variant doesn't execute if the constraint evaluates to false.
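The fallback behavior described above can be sketched as follows. This is a mock under stated assumptions: constrained_selector, matrix_info, and dia_cutoff_sketch are hypothetical names, and the cutoff of 16 diagonals is invented for the illustration; it is not the rule used by the dia_cutoff constraint in the SpMV example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical summary of an input matrix.
struct matrix_info {
    int num_rows;
    int num_diagonals;
};

typedef bool (*constraint_fn)(const matrix_info&);

// Keeps a list of constraints per variant and a default variant to fall
// back to when a constraint on the predicted variant fails.
class constrained_selector {
public:
    constrained_selector(std::size_t num_variants, std::size_t default_variant)
        : constraints_(num_variants), default_variant_(default_variant) {
        assert(default_variant < num_variants);
    }
    void add_constraint(std::size_t variant, constraint_fn c) {
        assert(variant < constraints_.size());
        assert(c != nullptr);
        constraints_[variant].push_back(c);
    }
    // Returns the predicted variant if all of its constraints hold for this
    // input; otherwise returns the default variant.
    std::size_t resolve(std::size_t predicted, const matrix_info& in) const {
        assert(predicted < constraints_.size());
        const std::vector<constraint_fn>& cs = constraints_[predicted];
        for (std::size_t i = 0; i < cs.size(); ++i) {
            if (!cs[i](in)) return default_variant_;
        }
        return predicted;
    }
private:
    std::vector<std::vector<constraint_fn> > constraints_;
    std::size_t default_variant_;
};

// Invented cutoff: allow a diagonal-format variant only for few diagonals.
bool dia_cutoff_sketch(const matrix_info& m) {
    return m.num_diagonals <= 16;
}

}  // namespace sketch
```

Checking constraints after prediction means a wrong model prediction can cost performance but cannot route an input to a variant that is known to fail on it.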
Tuning parameters may be specified using Nitro's Python interface. This includes training inputs, the machine learning algorithm, etc. For SpMV, the configuration file is tune.py. The final call to the tune() function begins the autotuning process.
Running the tuning script using
$ python tune.py
invokes the autotuner. Training data is automatically collected, and variant selection models are placed in the models/ subdirectory.
Once the tuning process is complete, an adaptive executable named spmv_tuned is automatically generated, which is capable of querying the constructed model at runtime and selecting the appropriate variant to execute.