cwida/FastLanesGpu-Damon2025

G-ALP

G-ALP is a compression scheme for floating-point data on GPUs.

  • G-ALP is a light-weight compression encoding.
  • G-ALP is designed for OLAP workloads, prioritizing high compression and high decompression throughput.
  • G-ALP has a single value decoding API, allowing for fine-grained access to compressed data.
  • G-ALP's decoding API is kernel-agnostic.
  • G-ALP is based on ALP, but with data-parallel exception patching.
  • G-ALP is part of the development of the FastLanes project.
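
To make the ALP idea behind the bullets above concrete, the following C++ sketch encodes floats as integers using a decimal exponent and stores values that do not survive the round trip as exceptions, which are patched back in during decoding. It is a simplified host-side illustration, not the G-ALP implementation: the names `AlpColumn`, `alp_encode`, and `alp_decode` are hypothetical, a single fixed exponent is assumed, the integers are not bit-packed, and the patching loop is sequential rather than data-parallel.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for an ALP-style encoded column (illustration only).
struct AlpColumn {
    double scale = 1.0;                   // 10^exponent used during encoding
    std::vector<int32_t> digits;          // encoded integers (not bit-packed here)
    std::vector<uint32_t> exception_pos;  // positions of values that failed to encode
    std::vector<float> exception_val;     // original values at those positions
};

// Encode each float as round(v * 10^exponent); values that do not decode
// back to exactly the same float are stored as exceptions.
AlpColumn alp_encode(const std::vector<float>& values, int exponent) {
    assert(exponent >= 0 && exponent <= 9);
    AlpColumn col;
    col.scale = std::pow(10.0, exponent);
    col.digits.reserve(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        const float v = values[i];
        const double scaled = static_cast<double>(v) * col.scale;
        // Reject NaN, infinities, out-of-range values, and negative zero.
        bool ok = std::isfinite(scaled) && std::fabs(scaled) < 2147483647.0 &&
                  !(v == 0.0f && std::signbit(v));
        int32_t d = 0;
        if (ok) {
            d = static_cast<int32_t>(std::llround(scaled));
            ok = static_cast<float>(d / col.scale) == v;  // round-trip check
        }
        if (ok) {
            col.digits.push_back(d);
        } else {
            col.digits.push_back(0);  // placeholder, overwritten when patching
            col.exception_pos.push_back(static_cast<uint32_t>(i));
            col.exception_val.push_back(v);
        }
    }
    return col;
}

// Decode all integers, then patch the exceptions back into place.
std::vector<float> alp_decode(const AlpColumn& col) {
    std::vector<float> out(col.digits.size());
    for (std::size_t i = 0; i < col.digits.size(); ++i) {
        out[i] = static_cast<float>(col.digits[i] / col.scale);
    }
    for (std::size_t k = 0; k < col.exception_pos.size(); ++k) {
        out[col.exception_pos[k]] = col.exception_val[k];
    }
    return out;
}
```

With an exponent of 2, a value such as `1.5f` is stored as the integer 150, while a value with more decimal digits than the exponent covers, such as `3.14159f`, ends up in the exception list.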

G-ALP matches nv-zstd's compression ratio while outperforming all other encodings in decompression throughput in the benchmark that decompresses into GPU RAM.

[Figure: scatter chart with compression ratio on the x-axis and decompression throughput on the y-axis, for various compression schemes. G-ALP performs best in decompression throughput while attaining a high compression ratio.]

G-ALP is almost an order of magnitude faster in a filter benchmark on compressed data, because it fuses the decompression kernel with the processing kernel.

[Figure: bar chart with filter throughput on the y-axis, for various compression schemes. G-ALP performs best, as it fuses the decompression kernel with the data processing kernel.]

Fusing the decompression and processing kernel is possible with the kernel-agnostic single value decoding API. The decoding API makes no assumption about the kernel that calls it, and requires no access to special memory regions (shared memory or constant memory). This allows any kernel to decompress G-ALP encoded data.
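
The idea of a kernel-agnostic single-value decoding call can be sketched as follows. This C++ sketch runs on the host for illustration; on a GPU, each thread would make the same call for its own indices inside whatever kernel consumes the data. `CompressedColumn`, `decode_one`, and `count_greater_than` are hypothetical names, and the binary-search lookup of exceptions is an assumption made for the sketch, not G-ALP's data-parallel patching scheme.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical compressed column: integers plus a sorted exception list.
struct CompressedColumn {
    double scale;                         // original value = digits[i] / scale
    std::vector<int32_t> digits;          // encoded integers
    std::vector<uint32_t> exception_pos;  // sorted positions of exceptions
    std::vector<float> exception_val;     // exception values, same order
};

// Decode a single value. The function only reads the column it is given,
// so any caller (here a plain loop, on a GPU any kernel) can use it.
float decode_one(const CompressedColumn& col, std::size_t i) {
    assert(i < col.digits.size());
    const auto it = std::lower_bound(col.exception_pos.begin(),
                                     col.exception_pos.end(),
                                     static_cast<uint32_t>(i));
    if (it != col.exception_pos.end() && *it == i) {
        return col.exception_val[it - col.exception_pos.begin()];
    }
    return static_cast<float>(col.digits[i] / col.scale);
}

// "Fused" filter: decode and test each value in one pass, without ever
// materializing the decompressed column.
std::size_t count_greater_than(const CompressedColumn& col, float threshold) {
    std::size_t hits = 0;
    for (std::size_t i = 0; i < col.digits.size(); ++i) {
        if (decode_one(col, i) > threshold) {
            ++hits;
        }
    }
    return hits;
}
```

Because `count_greater_than` decodes each value on demand, the decompressed column is never written to memory, which is the effect that fusing decompression with the processing kernel aims for.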

This repository contains the source code and benchmark results for the paper "G-ALP: Rethinking Light-weight Encodings for GPUs", presented at DaMoN 2025 in Berlin. The benchmarks can be reproduced on a machine with an NVIDIA GPU using the benchmarking script provided in the repo. This repo implements only a GPU decompressor for G-ALP.

Compilation

Software requirements:

  • nvcc
  • nvCOMP
  • clang++-14 (to compile ALP)
  • g++-12 (to compile nvCOMP)

To compile all executables, run:

make all -j 8

The full compilation takes a while; -j 8 lets make run up to eight compilation jobs in parallel.

To compile only the compressor benchmarks used for real-data benchmarking:

make compressors-benchmark

Benchmarks

The benchmarks require the Nsight Compute CLI (NCU). To test on real datasets, place binary files containing single-precision float arrays in the folder binary-columns. NCU requires sudo to read performance counters.
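
A column file for binary-columns could be produced along these lines. The sketch assumes the benchmarks expect a raw, headerless array of 32-bit floats in the machine's native byte order, which is not stated explicitly here; the helper names `write_float_column` and `read_float_column` and the file name in the usage note are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Write the floats back to back, with no header (assumed file layout).
bool write_float_column(const std::string& path, const std::vector<float>& values) {
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(float)));
    return static_cast<bool>(out);
}

// Read a file written by write_float_column; returns an empty vector on error.
std::vector<float> read_float_column(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) {
        return {};
    }
    const std::streamsize bytes = in.tellg();
    if (bytes < 0) {
        return {};
    }
    // A valid column file holds a whole number of 4-byte floats.
    assert(static_cast<std::size_t>(bytes) % sizeof(float) == 0);
    std::vector<float> values(static_cast<std::size_t>(bytes) / sizeof(float));
    in.seekg(0);
    in.read(reinterpret_cast<char*>(values.data()), bytes);
    return values;
}
```

For example, `write_float_column("binary-columns/example.bin", values)` would store `values` using 4 bytes per element.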

To compile the code and run all benchmarks:

make benchmark-all

To run only the compressor benchmarks:

make benchmark-compressors
