@@ -7,12 +7,12 @@ lang: en

  # Kokkos overview

- - Kokkos is a C++ performance portability ecosystem
+ - Kokkos is a C++ performance portability ecosystem
  - Linux Foundation project, originally started at Sandia National Laboratories in 2011
  - Part of the US Department of Energy's Exascale Computing Project, developed in several
    supercomputing centers in the US and in Europe
  - Abstractions for both parallel execution of code and data management
- - Designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources
+ - Designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources
  - Currently there are CUDA, HIP, SYCL, HPX, OpenMP and C++ threads backends

  # Kokkos ecosystem
@@ -135,14 +135,14 @@ Kokkos::initialize(Kokkos::InitializationSettings()
  .set_device_id(0) /* select the device (e.g., GPU 0 out of 4 GPUs) */
  .set_disable_warnings(false) /* keep warning messages enabled */
  .set_num_threads(1) /* set the number of threads */
- .set_print_configuration(true)); /* print the configuration after initialization */
+ .set_print_configuration(true)); /* print the configuration after initialization */
  ```
  - Kokkos docs: [https://kokkos.github.io/kokkos-core-wiki/API/core/Initialize-and-Finalize.html](https://kokkos.github.io/kokkos-core-wiki/API/core/Initialize-and-Finalize.html)
  </small>



- # Kokkos programmin - hello example
+ # Kokkos programming - hello example
  - The following is a full example of a Kokkos program that initializes Kokkos and prints the execution space and memory space instances
  ```
  #include <Kokkos_Core.hpp>
@@ -188,17 +188,17 @@ int main(int argc, char* argv[]) {
  # Views - examples

  ```cpp
- Kokkos::View<int*> a("a", n); // 1D array with runtime dimension
+ Kokkos::View<int*> a("a", n); // 1D array with runtime dimension
  Kokkos::View<double*[3]> b("b", n); // 2D n x 3 array with compile-time dimension
  Kokkos::View<double**, Kokkos::HostSpace> h_b("h_b", n, m);
  Kokkos::View<double***, Kokkos::SharedSpace> s_b("s_b", n, m, k);
- Kokkos::View<double**, Kokkos::Device<Kokkos::Serial, Kokkos::SharedSpace> >
+ Kokkos::View<double**, Kokkos::Device<Kokkos::Serial, Kokkos::SharedSpace> >
    s2_b("s2_b", n, m); // Specify execution space

  std::cout << "Execution space of a: " <<
    decltype(a)::execution_space::name() << std::endl;
  std::cout << "Memory space of a: " <<
-   decltype(a)::memory_space::name() << std::endl;
+   decltype(a)::memory_space::name() << std::endl;
  ```
  ```
  $ srun ... ./views
@@ -208,7 +208,7 @@ Memory space of a: HIP

  # Accessing entries

- - The elements of a view are accessed using parentheses enclosing a comma-delimited list of integer indices
+ - The elements of a view are accessed using parentheses enclosing a comma-delimited list of integer indices
    (similar to Fortran and C++23 `mdspan`)
  - A View’s entries can be accessed only from an execution space that is allowed to access that View’s memory space

@@ -228,13 +228,13 @@ d_a(1, 1) = 12; // Error, can be accessed only from code running on device
  ```
  Kokkos::View<int*> a("a", 10);
  Kokkos::View<int*, Kokkos::HostSpace> b("b", 10);
- Kokkos::deep_copy(a, b); // copy contents of b into a
+ Kokkos::deep_copy(a, b); // copy contents of b into a
  ```

  # Memory management with raw pointers

  - Kokkos also supports raw pointers
- - With raw pointers, one can simply allocate and deallocate memory by
+ - With raw pointers, one can simply allocate and deallocate memory by

  <small>

@@ -266,7 +266,7 @@ where `Kokkos::SharedSpace` maps to any potentially available memory of "Unified

  # Parallel operations

- - Kokkos provides three different parallel operations: `parallel_for`, `parallel_reduce`, and `parallel_scan`
+ - Kokkos provides three different parallel operations: `parallel_for`, `parallel_reduce`, and `parallel_scan`
  - The `parallel_for` operation is used to execute a loop in parallel
  - The `parallel_reduce` operation is used to execute a loop in parallel and reduce the results to a single value
  - The `parallel_scan` operation implements a prefix scan
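
To make the three patterns concrete, the following is a serial sketch in plain C++ of what each one computes. It deliberately contains no Kokkos calls: the function names (`for_pattern`, `reduce_pattern`, `scan_pattern`) are hypothetical, the loops run in order on the host, and the scan is shown in its exclusive form as one possible choice. It is only meant as a reference for the results one would expect from the parallel versions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// "for" pattern: apply an independent operation to every index.
// Here each element is doubled in place; no iteration depends on another.
void for_pattern(std::vector<int>& data) {
  for (std::size_t i = 0; i < data.size(); ++i) {
    data[i] *= 2;
  }
}

// "reduce" pattern: every index contributes to one final value (a sum here).
int reduce_pattern(const std::vector<int>& data) {
  int sum = 0;
  for (int v : data) {
    sum += v;
  }
  return sum;
}

// "scan" pattern: exclusive prefix sum,
// out[i] = data[0] + ... + data[i - 1], with out[0] = 0.
std::vector<int> scan_pattern(const std::vector<int>& data) {
  std::vector<int> out(data.size());
  int running = 0;
  for (std::size_t i = 0; i < data.size(); ++i) {
    out[i] = running;    // sum of everything before index i
    running += data[i];
  }
  return out;
}
```

For the input `{1, 2, 3, 4}`, `for_pattern` leaves `{2, 4, 6, 8}`, `reduce_pattern` on that result returns 20, and `scan_pattern` returns `{0, 2, 6, 12}`.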
@@ -311,7 +311,7 @@ Kokkos::parallel_reduce(n, KOKKOS_LAMBDA(const int i, int &lsum) {
  - The iteration space of a parallel operation is defined by an *execution policy*
  - Kokkos provides several possibilities
  - integer: 1D iteration from 0 to count
- - RangePolicy: 1D iteration from start to end
+ - RangePolicy: 1D iteration from start to end
  - MDRangePolicy: multi-dimensional iteration space
  - ...
  - Kokkos promises nothing about the loop order or the amount of work which actually runs concurrently
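
The policies above differ only in which indices they hand to the loop body. As a rough illustration, the plain C++ sketch below enumerates those index sets. The helper names (`range_space`, `integer_space`, `mdrange_space`) are hypothetical and are not part of Kokkos. The upper bounds are treated as exclusive. The order in which the indices are listed is just one possibility, since no loop order is guaranteed.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Indices covered by a RangePolicy-style range: start, ..., end - 1.
std::vector<int> range_space(int start, int end) {
  assert(start <= end);
  std::vector<int> idx;
  for (int i = start; i < end; ++i) {
    idx.push_back(i);
  }
  return idx;
}

// Indices covered by a plain integer count: 0, ..., count - 1.
std::vector<int> integer_space(int count) {
  return range_space(0, count);
}

// Index pairs covered by a 2D MDRangePolicy-style range:
// all (i, j) with lower.first <= i < upper.first
// and lower.second <= j < upper.second.
std::vector<std::pair<int, int>> mdrange_space(std::pair<int, int> lower,
                                               std::pair<int, int> upper) {
  std::vector<std::pair<int, int>> idx;
  for (int i = lower.first; i < upper.first; ++i) {
    for (int j = lower.second; j < upper.second; ++j) {
      idx.emplace_back(i, j);
    }
  }
  return idx;
}
```

For example, `integer_space(3)` yields 0, 1, 2 and `range_space(2, 5)` yields 2, 3, 4, while `mdrange_space({0, 0}, {2, 3})` yields six index pairs. A loop body must therefore be correct for any visiting order of these indices.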
@@ -356,12 +356,12 @@ auto h_a2 = Kokkos::create_mirror(d_a) // Always allocate h_a2
  - Kokkos allows one to work with slices (similar to Python and Fortran) via *subviews*.
  - A subview is always a reference, *i.e.* modifying data via a subview also modifies the original array
  - Slices are defined with `std::make_pair`
- - A special `Kokkos::ALL()`
+ - A special `Kokkos::ALL()`
  ```
  Kokkos::View<double**> a("a", 10, 10);
- // a(2:4, 3:7) slice
+ // a(2:4, 3:7) slice
  auto a_slice = Kokkos::subview(a, std::make_pair(2, 4), std::make_pair(3, 7));
- // a(:, 5) slice
+ // a(:, 5) slice
  auto a_slice2 = Kokkos::subview(a, Kokkos::ALL(), 5);
  ```

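
The two properties of subviews mentioned above (the ranges are half-open, and the slice aliases the parent's data) can be mimicked with a small non-owning wrapper in plain C++. The sketch below is not Kokkos code: `Slice2D` and `make_slice` are hypothetical names, and the parent is assumed to be stored row-major in a `std::vector`, whereas the actual layout of a `Kokkos::View` depends on the memory space.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a 2D subview: it owns no data, only a pointer
// into the parent's storage together with the parent's row length.
struct Slice2D {
  double* base;        // first element of the slice inside the parent
  std::size_t stride;  // number of columns in the parent (row-major)
  std::size_t rows;    // extent of the slice in dimension 0
  std::size_t cols;    // extent of the slice in dimension 1

  double& operator()(std::size_t i, std::size_t j) const {
    assert(i < rows && j < cols);
    return base[i * stride + j];
  }
};

// Build a slice of a row-major parent with `ncols` columns.
// Both ranges are half-open: [first, second).
Slice2D make_slice(std::vector<double>& parent, std::size_t ncols,
                   std::pair<std::size_t, std::size_t> r,
                   std::pair<std::size_t, std::size_t> c) {
  assert(r.first <= r.second && c.first <= c.second && c.second <= ncols);
  assert(r.second * ncols <= parent.size());
  return Slice2D{parent.data() + r.first * ncols + c.first, ncols,
                 r.second - r.first, c.second - c.first};
}
```

With a 10 x 10 parent, `make_slice(a, 10, {2, 4}, {3, 7})` gives a 2 x 4 slice, matching the extents of the pair ranges `(2, 4)` and `(3, 7)` used in the subview example. Writing to `s(0, 0)` changes the parent element at row 2, column 3, because no copy is made.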