Skip to content

A C++17 Data Stream Processing Parallel Library for Multicores and GPUs

License

LGPL-3.0, MIT licenses found

Licenses found

LGPL-3.0
LICENSE.LGPL
MIT
LICENSE.MIT
Notifications You must be signed in to change notification settings

Della97/WindFlow-Kafka

 
 

Repository files navigation

License: LGPL v3 License: MIT Release Hits Say Thanks! Donate

Introduction

WindFlow is a C++17 header-only library for parallel data stream processing targeting heterogeneous shared-memory architectures equipped with multi-core CPUs and NVIDIA GPUs. The library provides traditional stream processing operators like map, flatmap, filter, fold/reduce as well as sliding-window operators. The API allows building streaming applications through the MultiPipe and the PipeGraph programming constructs. The first is used to create parallel pipelines, while the second allows several MultiPipe instances to be interconnected through merge and split operations, in order to create complex directed acyclic graphs of interconnected operators.

WindFlow does not support streaming analytics applications only (e.g., the ones written with relational algebra query languages), but rather general-purpose streaming applications can be supported through operators with user-defined custom logics. In terms of runtime system, WindFlow is suitable for embedded architectures equipped with low-power multi-core CPUs and integrated NVIDIA GPUs (like the Jetson family of NVIDIA boards). However, it works also on traditional multi-core servers equipped with discrete NVIDIA GPUs.

The web site of the library is available at: https://paragroup.github.io/WindFlow/.

Dependencies

The library requires the following dependencies:

  • a C++ compiler with full support for C++17 (WindFlow tests have been successfully compiled with both GCC and CLANG)
  • FastFlow version >= 3.0 (https://github.com/fastflow/fastflow)
  • CUDA (for using operators targeting GPUs)
  • libtbb-dev for using efficient concurrent containers required by the GPU operators
  • libgraphviz-dev and rapidjson-dev when compiling with -DWF_TRACING_ENABLED to report statistics and using the Web Dashboard
  • librdkafka-dev for using the integration with Kafka (special Kafka Source and Sink operators)
  • doxygen (to generate the documentation)

Important about the FastFlow dependency -> after downloading FastFlow, the user needs to configure the library for the underlying multi-core environment. By default, FastFlow pins its threads onto the cores of the machine. To make FastFlow aware of the ordering of cores, and their correspondence in CPUs and NUMA regions, it is important to run (just one time) the script "mapping_string.sh" in the folder fastflow/ff before compiling your programs.

Macros

WindFlow, and its underlying level FastFlow, come with some important macros that can be used during compilation to enable specific behaviors:

  • -DWF_TRACING_ENABLED -> enables tracing (logging) at the WindFlow level (operator replicas), and allows streaming applications to continuously report statistics to a Web Dashboard (which is a separate sub-project). Outputs are also written in log files at the end of the processing
  • -DTRACE_FASTFLOW -> enables tracing (logging) at the FastFlow level (raw threads and FastFlow nodes). Outputs are written in log files at the end of the processing
  • -DFF_BOUNDED_BUFFER -> enables the use of bounded lock-free queues for pointer passing between threads. Otherwise, queues are unbounded (no backpressure mechanism)
  • -DDEFAULT_BUFFER_CAPACITY=VALUE -> set the size of the lock-free queues capacity. The default size of the queues is of 2048 entries. We suggest the users to greatly reduce this size in applications that use GPU operators (e.g., using values between 16 to 128 depending on the available GPU memory)
  • -DNO_DEFAULT_MAPPING -> if set, FastFlow threads are not pinned onto the CPU cores and are scheduled by the Operating System
  • -DBLOCKING_MODE -> if set, FastFlow queues use the blocking concurrency mode (pushing to a full queue or polling from an empty queue might suspend the underlying thread). If not set, waiting conditions are implemented by busy-waiting spin loops.

Some macros are useful to configure the run-time system when GPU operators are utilized in your applications. The default version of the GPU support is based on explicit CUDA memory management and overlapped data transfers, which is a version suitable for a wide range of NVIDIA GPU models. However, the developer could want to switch to a different implementation that makes use of the CUDA unified memory support. This can be done by compiling with the macro -DWF_GPU_UNIFIED_MEMORY. Unified memory support has a variable performance dependening on the underlying GPU models and the version of the CUDA driver.

Build the Examples

WindFlow is a header-only template library. To build your applications you have to include the main header of the library (windflow.hpp). For using the operators targeting GPUs, you further have to include the windflow_gpu.hpp header file and compile using the nvcc CUDA compiler (or through clang with CUDA support). The source code in this repository includes several examples that can be used to understand the use of the API and the advanced features of the library. The examples can be found in the tests folder. To compile them:

    $ cd <WINDFLOW_ROOT>
    $ mkdir ./build
    $ cd build
    $ cmake ..
    $ make -j<no_cores> # compile all the tests (not the doxygen documentation)
    $ make all_cpu -j<no_cores> # compile only CPU tests
    $ make all_gpu -j<no_cores> # compile only GPU tests
    $ make docs # generate the doxygen documentation (if doxygen has been installed)

Docker Images

Two Docker images are available in the WindFlow GitHub repository. The images contain all the synthetic tests compiled and ready to be executed. To build the first image (the one without tests using GPU operators) execute the following commands:

    $ cd <WINDFLOW_ROOT>
    $ cd dockerimages
    $ docker build -t windflow_nogpu -f Dockerfile_nogpu .
    $ docker run windflow_nogpu ./bin/graph_tests/test_graph_1 -r 1 -l 10000 -k 10

The last command executes one of the synthetic experiments (test_graph_1). You can execute any of the compiled tests in the same mannner.

The second image contains all synthetic tests with GPU operators. To use your GPU device with Docker, please follow the guidelines in the following page (https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html). Then, you can build the image and run the container as follows:

    $ cd <WINDFLOW_ROOT>
    $ cd dockerimages
    $ docker build -t windflow_gpu -f Dockerfile_gpu .
    $ docker run --gpus all windflow_gpu ./bin/graph_tests_gpu/test_graph_gpu_1 -r 1 -l 10000 -k 10

Again, the last command executes one of the synthetic experiments (test_graph_gpu_1). You can execute any of the compiled tests in the same mannner.

Web Dashboard

WindFlow has its own Web Dashboard that can be used to profile the execution of running WindFlow applications. The dashboard code is in the sub-folder WINDFLOW_ROOT/dashboard. It is a Java package based on Spring (for the Web Server) and developed using React for the front-end part. To start the Web Dashboard run the following commands:

    cd <WINDFLOW_ROOT>/dashboard/Server
    mvn spring-boot:run

The web server listens on the default port 8080 of the machine. To change the port, and other configuration parameters, users should modify the configuration file WINDFLOW_ROOT/dashboard/Server/src/main/resources/application.properties for the Spring server (e.g., to change the HTTP port), and the file WINDFLOW_ROOT/dashboard/Server/src/main/java/com/server/CustomServer/Configuration/config.json for the internal server receiving reports of statistics from the WindFlow applications (e.g., to change the port used by applications to report statistics to the dashboard).

WindFlow applications compiled with the macro -DWF_TRACING_ENABLED try to connect to the Web Dashboard and report statistics to it every second. By default, the applications assume that the dashboard is running on the local machine. To change the hostname and the port number, developers can use the macros WF_DASHBOARD_MACHINE=hostname/ip_addr and WF_DASHBOARD_PORT=port_number.

About the License

From version 3.1.0, WindFlow is released with a double license: LGPL-3 and MIT. Programmers should check the licenses of the other libraries used as dependencies.

Cite our Work

In order to cite our work, we kindly ask interested people to use the following reference:

@article{9408386,
  author={Mencagli, Gabriele and Torquati, Massimo and Cardaci, Andrea and Fais, Alessandra and Rinaldi, Luca and Danelutto, Marco},
  journal={IEEE Transactions on Parallel and Distributed Systems},
  title={WindFlow: High-Speed Continuous Stream Processing With Parallel Building Blocks},
  year={2021},
  volume={32},
  number={11},
  pages={2748-2763},
  doi={10.1109/TPDS.2021.3073970}
}

Requests for Modifications

If you are using WindFlow for your purposes and you are interested in specific modifications of the API (or of the runtime system), please send an email to the maintainer.

Contributors

The main developer and maintainer of WindFlow is Gabriele Mencagli (Department of Computer Science, University of Pisa, Italy).

About

A C++17 Data Stream Processing Parallel Library for Multicores and GPUs

Resources

License

LGPL-3.0, MIT licenses found

Licenses found

LGPL-3.0
LICENSE.LGPL
MIT
LICENSE.MIT

Stars

Watchers

Forks

Packages

No packages published

Languages

  • C++ 91.0%
  • JavaScript 4.3%
  • Java 2.6%
  • CMake 1.7%
  • HTML 0.2%
  • CSS 0.1%
  • Other 0.1%