Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides GPU-accelerated operations with lazy evaluation semantics, so operations can be chained without materializing unnecessary intermediate arrays.
- Fused Operations - Operations are fused when possible
- GPU Acceleration - Built on CUDA/Thrust for high performance
- Chainable API - Clean, expressive syntax for complex operations
- Header-Only - Simple integration with #include "parrot.hpp"
¹ Lazy-ish means that any operation that can be evaluated lazily is evaluated lazily.
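To make the idea of lazy, fused evaluation concrete, here is a minimal CPU-only sketch in standard C++. It is an illustration of the general technique, not Parrot's implementation: the names `Lazy`, `make_lazy`, `from_vector`, `materialize`, and `fused_demo` are hypothetical and do not exist in Parrot. Each operator only builds a closure; the arithmetic runs once, in a single loop, when the result is materialized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: none of these names are part of Parrot.
// A lazy expression stores its length and a callable that computes element i.
template <typename F>
struct Lazy {
    std::size_t n;
    F at;  // computes element i on demand
};

template <typename F>
Lazy<F> make_lazy(std::size_t n, F f) {
    return Lazy<F>{n, f};
}

// Wrap an existing vector without copying it (the vector must outlive the expression).
inline auto from_vector(const std::vector<int>& v) {
    return make_lazy(v.size(), [&v](std::size_t i) { return v[i]; });
}

// Element-wise product: builds a new closure, performs no arithmetic yet.
template <typename F, typename G>
auto operator*(Lazy<F> a, Lazy<G> b) {
    assert(a.n == b.n);
    return make_lazy(a.n, [a, b](std::size_t i) { return a.at(i) * b.at(i); });
}

// Element-wise sum: same pattern as the product.
template <typename F, typename G>
auto operator+(Lazy<F> a, Lazy<G> b) {
    assert(a.n == b.n);
    return make_lazy(a.n, [a, b](std::size_t i) { return a.at(i) + b.at(i); });
}

// The only place where elements are actually computed and stored.
template <typename F>
std::vector<int> materialize(const Lazy<F>& e) {
    std::vector<int> out(e.n);
    for (std::size_t i = 0; i < e.n; ++i) {
        out[i] = e.at(i);
    }
    return out;
}

std::vector<int> fused_demo() {
    const std::vector<int> A{3, 4, 0, 8, 2};
    const std::vector<int> B{6, 7, 2, 1, 8};
    const std::vector<int> C{2, 5, 7, 4, 3};
    // Building the expression allocates no intermediate array for B * C.
    auto expr = from_vector(B) * from_vector(C) + from_vector(A);
    // One pass evaluates B[i] * C[i] + A[i] for each element.
    return materialize(expr);
}
```

Calling `fused_demo()` returns the elements 15, 39, 14, 12, 26. On a GPU the same principle would let a chain of element-wise operations run as a single kernel rather than one kernel per operation, though how Parrot achieves this internally is not covered by this sketch.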
#include "parrot.hpp"
int main() {
    // Create arrays
    auto A = parrot::array({3, 4, 0, 8, 2});
    auto B = parrot::array({6, 7, 2, 1, 8});
    auto C = parrot::array({2, 5, 7, 4, 3});
    // Chain operations
    (B * C + A).print(); // Output: 15 39 14 12 26
}

#include "parrot.hpp"
using namespace parrot::literals;
auto softmax(auto matrix) {
    auto cols = matrix.ncols();
    auto z = matrix - matrix.maxr(2_ic).replicate(cols);
    auto num = z.exp();
    auto den = num.sum(2_ic);
    return num / den.replicate(cols);
}
int main() {
    auto matrix = parrot::range(6).as<float>().reshape({2, 3});
    softmax(matrix).print();
}

- CMake (3.10+)
- C++ compiler with C++20 support
- NVIDIA CUDA Toolkit 13.0 or later
- NVIDIA GPU with compute capability 7.0+
# Clone the repository
git clone <repository-url>
cd parrot
# Create build directory
mkdir build && cd build
# Configure and build
cmake ..
cmake --build . -j$(nproc)
# Run tests
ctest

For detailed build instructions, see BUILDING.md.
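Once the build works, one way to sanity-check the softmax example above is to compare its printed output with a plain CPU reference. The sketch below is standard C++ with no Parrot or CUDA dependency; `softmax_rows` is a hypothetical helper, not part of Parrot. It assumes that `maxr(2_ic)` and `sum(2_ic)` in the example reduce along each row, and that the matrix is stored row-major; neither assumption is confirmed here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not part of Parrot.
// Applies a numerically stable softmax independently to each row of a
// row-major matrix with the given number of rows and columns.
std::vector<float> softmax_rows(const std::vector<float>& m,
                                std::size_t rows, std::size_t cols) {
    assert(cols > 0);
    assert(m.size() == rows * cols);
    std::vector<float> out(m.size());
    for (std::size_t r = 0; r < rows; ++r) {
        const float* in = m.data() + r * cols;
        float* o = out.data() + r * cols;
        // Subtract the row maximum so the largest exponent is exp(0) = 1.
        const float row_max = *std::max_element(in, in + cols);
        float den = 0.0f;
        for (std::size_t c = 0; c < cols; ++c) {
            o[c] = std::exp(in[c] - row_max);
            den += o[c];
        }
        // Normalize so each row sums to 1.
        for (std::size_t c = 0; c < cols; ++c) {
            o[c] /= den;
        }
    }
    return out;
}
```

For a 2x3 input whose rows are `0 1 2` and `3 4 5`, each row of the result is approximately 0.090, 0.245, 0.665. Whether `parrot::range(6)` starts at 0 or 1 does not affect this, because softmax is unchanged when a constant is added to every element of a row.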
Parrot provides significant code reduction compared to other CUDA libraries:
| Library | Code Reduction |
|---|---|
| Thrust | ~10x less code |
See detailed comparisons in our documentation.
- Full Documentation - Complete API reference and examples
- Examples - Standalone Parrot examples
- vs Thrust - Side-by-side comparisons with Thrust
The examples/ directory contains:
- getting_started/ - Simple examples to get started
- thrust/ - Parrot implementations of Thrust examples
# All tests
ctest
# Individual test categories
./test_basic # Basic operations
./test_sorting # Sorting algorithms
./test_math # Mathematical operations
./test_reductions # Reduction operations

# Run clang-tidy
./scripts/run-clang-tidy.sh
# Auto-fix issues
./scripts/run-clang-tidy.sh --fix

# Install dependencies
uv venv
uv pip install -r requirements.txt
# Build docs
cd scripts && ./build-docs.sh

parrot/
├── parrot.hpp              # Main header (single-file library)
├── thrustx.hpp             # Extended Thrust utilities
├── examples/               # Example code
│   ├── getting_started/    # Simple getting-started examples
│   ├── thrust/             # Parrot versions of Thrust examples
│   └── real_world/         # Parrot versions of examples from open source projects
├── tests/                  # Unit tests
├── docs/                   # Documentation source
└── scripts/                # Development scripts
We welcome contributions! Please see our CONTRIBUTING.md guide for details on:
- Setting up the development environment
- Code standards and review process
- Running tests and code quality checks
- Building documentation
All contributions must comply with the Apache License 2.0.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
This project includes third-party software components. See THIRD_PARTY_LICENSES for complete license information and attributions.
Built on top of NVIDIA Thrust and CUDA.
