Vikunja is a performance-portable algorithm library that defines functions operating on ranges of elements for a variety of purposes. It supports execution on multi-core CPUs and various GPUs.
Vikunja uses alpaka to implement platform-independent primitives such as reduce or transform.
Alpaka requires a Boost installation.
git clone --depth 1 --branch 0.8.0 https://github.com/alpaka-group/alpaka.git
mkdir alpaka/build
cd alpaka/build
cmake ..
cmake --install .
For more information, see the alpaka GitHub repository. It is recommended to use the latest release version. Vikunja supports alpaka from version 0.6 up to version 0.8.
git clone https://github.com/alpaka-group/vikunja.git
mkdir vikunja/build
cd vikunja/build
cmake ..
cmake --install .

To build and run the tests:
cd vikunja/build
cmake .. -DBUILD_TESTING=ON
ctest

To enable the examples:
cmake .. -Dvikunja_BUILD_EXAMPLES=ON
Examples can be found in the folder example/.
The following source code shows an application that uses vikunja to replace all values in a vector with their absolute values.
#include <vikunja/transform/transform.hpp>
#include <alpaka/alpaka.hpp>
#include <algorithm>
#include <iostream>
#include <random>
int main()
{
    // Define the accelerator.
    // The accelerator decides on which processor type the vikunja algorithm will be executed.
    // The accelerators must be enabled during the CMake configuration to be available.
    //
    // It is possible to choose from a set of accelerators:
    // - AccGpuCudaRt
    // - AccGpuHipRt
    // - AccCpuThreads
    // - AccCpuFibers
    // - AccCpuOmp2Threads
    // - AccCpuOmp2Blocks
    // - AccOmp5
    // - AccCpuTbbBlocks
    // - AccCpuSerial
    using Acc = alpaka::AccCpuOmp2Blocks<alpaka::DimInt<1u>, int>;

    // Create a device that executes the algorithm.
    // For example, it can be a CPU or GPU number 0 or 1 in a multi-GPU system.
    auto const devAcc = alpaka::getDevByIdx<Acc>(0u);
    // The host device is required if the devAcc does not use the same memory as the host.
    // For example, if the host is a CPU and the device is a GPU.
    auto const devHost(alpaka::getDevByIdx<alpaka::PltfCpu>(0u));

    // All algorithms must be enqueued so that they are executed in the correct order.
    using QueueAcc = alpaka::Queue<Acc, alpaka::Blocking>;
    QueueAcc queueAcc(devAcc);

    // Dimension of the problem. 1D in this case (inherited from the Accelerator).
    using Dim = alpaka::Dim<Acc>;
    // The index type needs to fit the problem size.
    // A smaller index type can reduce the execution time.
    // In this case the index type is inherited from the Accelerator: int.
    using Idx = alpaka::Idx<Acc>;
    // Type of the user data.
    using Data = int;

    // The extent stores the problem size.
    using Vec = alpaka::Vec<Dim, Idx>;
    Vec extent(Vec::all(static_cast<Idx>(10)));

    // Allocate memory for the device.
    auto deviceMem(alpaka::allocBuf<Data, Idx>(devAcc, extent));
    // The memory is accessed via a pointer.
    Data* deviceNativePtr = alpaka::getPtrNative(deviceMem);
    // Allocate memory for the host.
    auto hostMem(alpaka::allocBuf<Data, Idx>(devHost, extent));
    Data* hostNativePtr = alpaka::getPtrNative(hostMem);

    // Initialize the host memory with random values from -10 to 10.
    std::uniform_int_distribution<Data> distribution(-10, 10);
    std::default_random_engine generator;
    std::generate(
        hostNativePtr,
        hostNativePtr + extent.prod(),
        [&distribution, &generator]() { return distribution(generator); });

    // Copy data to the device.
    alpaka::memcpy(queueAcc, deviceMem, hostMem, extent);

    // Use a lambda function to define the transformation function.
    // Returns the absolute value of each input.
    auto abs = [] ALPAKA_FN_HOST_ACC(auto const& acc, Data const j) { return alpaka::math::abs(acc, j); };

    vikunja::transform::deviceTransform<Acc>(
        devAcc, // The device that executes the algorithm.
        queueAcc, // Queue in which the algorithm is enqueued.
        extent.prod(), // Problem size
        deviceNativePtr, // Input memory
        deviceNativePtr, // Output memory (same buffer, so the transform runs in place)
        abs); // Operator

    // Copy data to the host.
    alpaka::memcpy(queueAcc, hostMem, deviceMem, extent);

    // Print the transformed values.
    for(Idx i = 0; i < extent.prod(); ++i)
    {
        std::cout << hostNativePtr[i] << " ";
    }
    std::cout << std::endl;

    return 0;
}

CMakeLists.txt
cmake_minimum_required(VERSION 3.18)
project(vikunjaAbs)
find_package(vikunja REQUIRED)
alpaka_add_executable(${CMAKE_PROJECT_NAME} main.cpp)
target_link_libraries(${CMAKE_PROJECT_NAME} PRIVATE vikunja::vikunja)

Build instructions:
# the source folder contains the main.cpp and the CMakeLists.txt
cd <folder/with/source/code>
mkdir build && cd build
# configure build with OpenMP backend enabled
cmake .. -DALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLE=ON
# compile application
cmake --build .
# run application
./vikunjaAbs # output: 10 8 5 1 1 6 10 4 4 9

- You can find the general documentation here: https://vikunja.readthedocs.io/en/latest/
- You can find the API documentation here: https://vikunja.readthedocs.io/en/latest/doxygen/index.html
- Simeon Ehrig*
- Dr. Michael Bussmann
- Hauke Mewes
- René Widera
- Bernhard Manfred Gruber
- Jan Stephan
- Dr. Jiří Vyskočil
- Matthias Werner
