Commit baf9b25

Merge pull request #254 from ComputationalRadiationPhysics/develop
Merge develop to master
2 parents 5687201 + f27f048 commit baf9b25

263 files changed: 16566 additions and 5738 deletions.

.travis.yml

Lines changed: 328 additions & 225 deletions
Large diffs are not rendered by default.

CMakeLists.txt

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+#
+# Copyright 2015 Benjamin Worpitz
+#
+# This file is part of alpaka.
+#
+# alpaka is free software: you can redistribute it and/or modify
+# it under the terms of the GNU Lesser General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# alpaka is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public License
+# along with alpaka.
+# If not, see <http://www.gnu.org/licenses/>.
+#
+
+################################################################################
+# Required CMake version.
+################################################################################
+
+CMAKE_MINIMUM_REQUIRED(VERSION 2.8.12)
+
+PROJECT("alpakaAll")
+
+################################################################################
+# Add subdirectories.
+################################################################################
+
+ADD_SUBDIRECTORY("example/")
+ADD_SUBDIRECTORY("test/")

Findalpaka.cmake

Lines changed: 6 additions & 3 deletions
@@ -26,19 +26,22 @@
 # under the environment variable BOOST_ROOT.
 #
 # ALPAKA_ACC_CPU_B_SEQ_T_FIBERS_ENABLE will require Boost.Fiber to be built.
-# ALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLE and ALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLE will require a OpenMP 2.0 capable compiler.
-# ALPAKA_ACC_CPU_BT_OMP4_ENABLE will require a OpenMP 4.0 capable compiler.
-# ALPAKA_ACC_GPU_CUDA_ENABLE will require CUDA 7.0 to be installed.
+# ALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLE and ALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLE will require a OpenMP 2.0+ capable compiler.
+# ALPAKA_ACC_CPU_BT_OMP4_ENABLE will require a OpenMP 4.0+ capable compiler.
+# ALPAKA_ACC_GPU_CUDA_ENABLE will require CUDA 7.0+ to be installed.
+# ALPAKA_ACC_CPU_B_TBB_T_SEQ_ENABLE will require TBB 2.2+ to be installed
 #
 # Set the following CMake variables BEFORE calling find_packages to
 # change the behaviour of this module:
+# - ``ALPAKA_ACC_GPU_CUDA_ONLY_MODE`` {ON, OFF}
 # - ``ALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLE`` {ON, OFF}
 # - ``ALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLE`` {ON, OFF}
 # - ``ALPAKA_ACC_CPU_B_SEQ_T_FIBERS_ENABLE`` {ON, OFF}
 # - ``ALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLE`` {ON, OFF}
 # - ``ALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLE`` {ON, OFF}
 # - ``ALPAKA_ACC_CPU_BT_OMP4_ENABLE`` {ON, OFF}
 # - ``ALPAKA_ACC_GPU_CUDA_ENABLE`` {ON, OFF}
+# - ``ALPAKA_ACC_CPU_B_TBB_T_SEQ_ENABLE`` {ON, OFF}
 # - ``ALPAKA_CUDA_VERSION`` {7.0, ...}
 # - ``ALPAKA_CUDA_ARCH`` {sm_20, sm...}
 # - ``ALPAKA_CUDA_FAST_MATH`` {ON, OFF}
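
The module comment in this hunk asks for the back-end switches to be set before the package is searched for. A minimal sketch of what that could look like in a consuming project follows. Only the `ALPAKA_ACC_*` variable names come from the diff; the project name, the checkout location, and the chosen combination of back-ends are assumptions for illustration.

```cmake
CMAKE_MINIMUM_REQUIRED(VERSION 3.3.0)
PROJECT("myAlpakaApp")  # hypothetical project name

# Assumption: alpaka is checked out next to this project, so that
# Findalpaka.cmake (located in the repository root) is found via CMAKE_MODULE_PATH.
SET(ALPAKA_ROOT "${CMAKE_CURRENT_LIST_DIR}/../alpaka" CACHE PATH "Location of the alpaka checkout (hypothetical)")
LIST(APPEND CMAKE_MODULE_PATH "${ALPAKA_ROOT}")

# Select the back-ends BEFORE the package is searched for.
SET(ALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLE ON  CACHE BOOL "Enable the serial back-end")
SET(ALPAKA_ACC_CPU_B_TBB_T_SEQ_ENABLE ON  CACHE BOOL "Enable the TBB blocks back-end (needs TBB 2.2+)")
SET(ALPAKA_ACC_GPU_CUDA_ENABLE        OFF CACHE BOOL "Disable the CUDA back-end")

# Module-mode search: picks up Findalpaka.cmake from CMAKE_MODULE_PATH.
FIND_PACKAGE(alpaka REQUIRED)
```

Cache entries are used here on the assumption that the find module declares these switches as options; with the CMake versions of that era, an `OPTION()` call discards a plain (non-cache) variable of the same name, whereas an existing cache entry is left untouched.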

README.md

Lines changed: 92 additions & 33 deletions
@@ -1,6 +1,9 @@
 **alpaka** - Abstraction Library for Parallel Kernel Acceleration
 =================================================================
 
+[![Build Status](https://travis-ci.org/ComputationalRadiationPhysics/alpaka.svg?branch=develop)](https://travis-ci.org/ComputationalRadiationPhysics/alpaka)
+[![Build status](https://ci.appveyor.com/api/projects/status/xjeyugcg1cb0662s/branch/develop?svg=true)](https://ci.appveyor.com/project/BenjaminW3/alpaka-vuiya/branch/develop)
+
 The **alpaka** library is a header-only C++11 abstraction library for accelerator development.
 
 Its aim is to provide performance portability across accelerators through the abstraction (not hiding!) of the underlying levels of parallelism.
@@ -42,73 +45,129 @@ Accelerator Back-ends
 |Accelerator Back-end|Lib/API|Devices|Execution strategy grid-blocks|Execution strategy block-threads|
 |---|---|---|---|---|
 |Serial|n/a|Host CPU (single core)|sequential|sequential (only 1 thread per block)|
-|OpenMP 2.0 blocks|OpenMP 2.0|Host CPU (multi core)|parallel (preemptive multitasking)|sequential (only 1 thread per block)|
-|OpenMP 2.0 threads|OpenMP 2.0|Host CPU (multi core)|sequential|parallel (preemptive multitasking)|
-|OpenMP 4.0 (CPU)|OpenMP 4.0|Host CPU (multi core)|parallel (undefined)|parallel (preemptive multitasking)|
+|OpenMP 2.0+ blocks|OpenMP 2.0+|Host CPU (multi core)|parallel (preemptive multitasking)|sequential (only 1 thread per block)|
+|OpenMP 2.0+ threads|OpenMP 2.0+|Host CPU (multi core)|sequential|parallel (preemptive multitasking)|
+|OpenMP 4.0+ (CPU)|OpenMP 4.0+|Host CPU (multi core)|parallel (undefined)|parallel (preemptive multitasking)|
 | std::thread | std::thread |Host CPU (multi core)|sequential|parallel (preemptive multitasking)|
-| Boost.Fiber | boost::fibers::fiber |Host CPU (single core)|sequential|parallel (cooperative multitasking)|
-|CUDA 7.0|CUDA 7.0|NVIDIA GPUs SM 2.0+|parallel (undefined)|parallel (lock-step within warps)|
+|CUDA 7.0+|CUDA 7.0+|NVIDIA GPUs SM 2.0+|parallel (undefined)|parallel (lock-step within warps)|
+|TBB 2.2+ blocks|TBB 2.2+|Host CPU (multi core)|parallel (preemptive multitasking)|sequential (only 1 thread per block)|
 
 
 Supported Compilers
 -------------------
 
 This library uses C++11 (or newer when available).
 
-|Accelerator Back-end|gcc 4.9.2|gcc 5.2|clang 3.5/3.6|clang 3.7|MSVC 2015|
-|---|---|---|---|---|---|
-|Serial|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|
-|OpenMP 2.0 blocks|:white_check_mark:|:white_check_mark:|:x:|:white_check_mark:|:white_check_mark:|
-|OpenMP 2.0 threads|:white_check_mark:|:white_check_mark:|:x:|:white_check_mark:|:white_check_mark:|
-|OpenMP 4.0 (CPU)|:white_check_mark:|:white_check_mark:|:x:|:x:|:x:|
-| std::thread |:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|
-| Boost.Fiber |:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|
-|CUDA 7.0|:white_check_mark:|:x:|:x:|:x:|:x:|
-
-**NOTE**: :bangbang: Currently the *CUDA accelerator back-end* can not be enabled together with the *std::thread accelerator back-end* or the *Boost.Fiber accelerator back-end* due to bugs in the NVIDIA nvcc compiler :bangbang:
-
-Build status master branch: [![Build Status](https://travis-ci.org/ComputationalRadiationPhysics/alpaka.svg?branch=master)](https://travis-ci.org/ComputationalRadiationPhysics/alpaka)
-
-Build status develop branch: [![Build Status](https://travis-ci.org/ComputationalRadiationPhysics/alpaka.svg?branch=develop)](https://travis-ci.org/ComputationalRadiationPhysics/alpaka)
+|Accelerator Back-end|gcc 4.9.2|gcc 5.3|gcc 6.1|clang 3.5/3.6|clang 3.7|clang 3.8|MSVC 2015|
+|---|---|---|---|---|---|---|---|
+|Serial|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|
+|OpenMP 2.0+ blocks|:white_check_mark:|:white_check_mark:|:white_check_mark:|:x:|:white_check_mark:|:white_check_mark:|:white_check_mark:|
+|OpenMP 2.0+ threads|:white_check_mark:|:white_check_mark:|:white_check_mark:|:x:|:white_check_mark:|:white_check_mark:|:white_check_mark:|
+|OpenMP 4.0+ (CPU)|:white_check_mark:|:white_check_mark:|:white_check_mark:|:x:|:x:|:x:|:x:|
+| std::thread |:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|
+|CUDA 7.0+|:white_check_mark: (nvcc 7.0+)|:x:|:x:|:x:|:x:|:white_check_mark: (native)|:x:|
+|TBB 2.2+|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:grey_question:|
 
 
 Dependencies
 ------------
 
-[Boost](http://boost.org/) 1.56+ is the only mandatory external dependency.
+[Boost](http://boost.org/) 1.59+ is the only mandatory external dependency.
 The **alpaka** library itself just requires header-only libraries.
 However some of the accelerator back-end implementations require different boost libraries to be built.
 
-When an accelerator back-end using *Boost.Fiber* is enabled, the develop branch of boost and the proposed boost library [`boost-fibers`](https://github.com/olk/boost-fiber) (develop branch) are required.
-`boost-fibers`, `boost-context` and all of its dependencies are required to be build in C++14 mode `./b2 cxxflags="-std=c++14"`.
-
 When an accelerator back-end using *CUDA* is enabled, version *7.0* of the *CUDA SDK* is the minimum requirement.
+*NOTE*: When using *CUDA* 7.0, the *CUDA accelerator back-end* can not be enabled together with the *std::thread accelerator back-end* or the *Boost.Fiber accelerator back-end* due to bugs in the nvcc compiler.
 
-When an accelerator back-end using *OpenMP 2.0/4.0* is enabled, the compiler and the platform have to support the corresponding *OpenMP* version.
+When an accelerator back-end using *OpenMP* is enabled, the compiler and the platform have to support the corresponding minimum *OpenMP* version.
+
+When an accelerator back-end using *TBB* is enabled, the compiler and the platform have to support the corresponding minimum *TBB* version.
 
 
 Usage
 -----
 
 The library is header only so nothing has to be build.
 CMake 3.3.0+ is required to provide the correct defines and include paths.
-Just call `ALPAKA_ADD_EXECUTABLE` instead of `CUDA_ADD_EXECUTABLE` or `ADD_EXECUTABLE` and the difficulties of the CUDA nvcc compier in handling `.cu` and `.cpp` files is automatically taken care of.
-Examples how to utilize alpaka within CMake can be found in the `examples` folder.
-
+Just call `ALPAKA_ADD_EXECUTABLE` instead of `CUDA_ADD_EXECUTABLE` or `ADD_EXECUTABLE` and the difficulties of the CUDA nvcc compiler in handling `.cu` and `.cpp` files are automatically taken care of.
 Source files do not need any special file ending.
+Examples of how to utilize alpaka within CMake can be found in the `example` folder.
+
 The whole alpaka library can be included with: `#include <alpaka/alpaka.hpp>`
-Code not intended to be utilized by users is hidden in the `detail` namespace.
+Code that is not intended to be utilized by the user is hidden in the `detail` namespace.
+
+
+Introduction
+------------
+
+For a quick introduction, feel free to playback the recording of our presentation at
+[GTC 2016](http://mygtc.gputechconf.com/quicklink/858sI36):
+
+- E. Zenker, R. Widera, G. Juckeland et al.,
+*Porting the Plasma Simulation PIConGPU to Heterogeneous Architectures with Alpaka*,
+[video link (39 min)](http://on-demand.gputechconf.com/gtc/2016/video/S6298.html)
+
+
+Citing alpaka
+-------------
+
+Currently all authors of **alpaka** are scientists or connected with
+research. For us to justify the importance and impact of our work, please
+consider citing us accordingly in your derived work and publications:
+
+```latex
+% Peer-Reviewed Publication %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%
+% Peer reviewed and accepted publication in
+% "The Sixth International Workshop on
+% Accelerators and Hybrid Exascale Systems (AsHES)"
+% at the
+% "30th IEEE International Parallel and Distributed
+% Processing Symposium" in Chicago, IL, USA
+@inproceedings{ZenkerAsHES2016,
+author = {Erik Zenker and Benjamin Worpitz and Ren{\'{e}} Widera
+and Axel Huebl and Guido Juckeland and
+Andreas Kn{\"{u}}pfer and Wolfgang E. Nagel and Michael Bussmann},
+title = {Alpaka - An Abstraction Library for Parallel Kernel Acceleration},
+archivePrefix = "arXiv",
+eprint = {1602.08477},
+keywords = {Computer science;CUDA;Mathematical Software;nVidia;OpenMP;Package;
+performance portability;Portability;Tesla K20;Tesla K80},
+day = {23},
+month = {May},
+year = {2016},
+publisher = {IEEE Computer Society},
+url = {http://arxiv.org/abs/1602.08477},
+}
+
+
+% Original Work: Benjamin Worpitz' Master Thesis %%%%%%%%%%
+%
+@MasterThesis{Worpitz2015,
+author = {Benjamin Worpitz},
+title = {Investigating performance portability of a highly scalable
+particle-in-cell simulation code on various multi-core
+architectures},
+school = {{Technische Universit{\"{a}}t Dresden}},
+month = {Sep},
+year = {2015},
+type = {Master Thesis},
+doi = {10.5281/zenodo.49768},
+url = {http://dx.doi.org/10.5281/zenodo.49768}
+}
+```
 
 
 Authors
 -------
 
-### Maintainers and core developers
+### Maintainers and Core Developers
 
-- Benjamin Worpitz
+- Benjamin Worpitz (original author)
+- Erik Zenker
+- Rene Widera
 
-### Participants, Former Members and Thanks
+### Former Members, Contributions and Thanks
 
 - Dr. Michael Bussmann
-- Rene Widera
 - Axel Huebl
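
The Usage section of the README in this diff recommends calling `ALPAKA_ADD_EXECUTABLE` in place of `ADD_EXECUTABLE` or `CUDA_ADD_EXECUTABLE`. A rough sketch of a consuming `CMakeLists.txt` under that advice is given here. The target name, source path, and checkout location are hypothetical, and the sketch assumes the macro accepts the same argument shape as `ADD_EXECUTABLE`, which the README implies but this diff does not show.

```cmake
CMAKE_MINIMUM_REQUIRED(VERSION 3.3.0)  # minimum stated in the README
PROJECT("vectorAddSketch")             # hypothetical project name

# Assumption: Findalpaka.cmake lives in the root of a sibling alpaka checkout.
LIST(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_LIST_DIR}/../alpaka")
FIND_PACKAGE(alpaka REQUIRED)

# Used instead of ADD_EXECUTABLE / CUDA_ADD_EXECUTABLE. Per the README, the
# source file needs no special ending even when the CUDA back-end is enabled.
ALPAKA_ADD_EXECUTABLE("vectorAddSketch" "src/main.cpp")
```

How include paths, definitions, and libraries reach the target depends on what the find module exports, which is not visible in this diff; the projects in the `example` folder are the place to check.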
