Skip to content
/ parrot Public

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.

License

Notifications You must be signed in to change notification settings

NVlabs/parrot

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

159 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Sean Parrot

Parrot

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.

✨ Features

  • Fused Operations - Operations are fused when possible
  • GPU Acceleration - Built on CUDA/Thrust for high performance
  • Chainable API - Clean, expressive syntax for complex operations
  • Header-Only - Simple integration with #include "parrot.hpp"

ΒΉ Lazy-ish means that any operation that can be lazily evaluated is lazily evaluated.

πŸš€ Quick Start

#include "parrot.hpp"

int main() {
    // Create arrays
    auto A = parrot::array({3, 4, 0, 8, 2});
    auto B = parrot::array({6, 7, 2, 1, 8});
    auto C = parrot::array({2, 5, 7, 4, 3});
    
    // Chain operations
    (B * C + A).print();  // Output: 15 39 14 12 26
}

Advanced Example: Row-wise Softmax

#include "parrot.hpp"

using namespace parrot::literals;

auto softmax(auto matrix) {
    auto cols = matrix.ncols();
    auto z    = matrix - matrix.maxr(2_ic).replicate(cols);
    auto num  = z.exp();
    auto den  = num.sum(2_ic);
    return num / den.replicate(cols);
}

int main() {
    auto matrix = parrot::range(6).as<float>().reshape({2, 3});
    softmax(matrix).print();
}

πŸ—οΈ Building

Prerequisites

  • CMake (3.10+)
  • C++ compiler with C++20 support
  • NVIDIA CUDA Toolkit 13.0 or later
  • NVIDIA GPU with compute capability 7.0+

Build Steps

# Clone the repository
git clone <repository-url>
cd parrot

# Create build directory
mkdir build && cd build

# Configure and build
cmake ..
cmake --build . -j$(nproc)

# Run tests
ctest

For detailed build instructions, see BUILDING.md.

πŸ“Š Comparisons

Parrot provides significant code reduction compared to other CUDA libraries:

Library Code Reduction
Thrust ~10x less code

See detailed comparisons in our documentation.

πŸ“š Documentation

πŸ§ͺ Examples

The examples/ directory contains:

  • getting_started/ - Simple examples to get started
  • thrust/ - Parrot implementations of Thrust examples

πŸ› οΈ Development

Running Tests

# All tests
ctest

# Individual test categories
./test_basic      # Basic operations
./test_sorting    # Sorting algorithms
./test_math       # Mathematical operations
./test_reductions # Reduction operations

Code Quality

# Run clang-tidy
./scripts/run-clang-tidy.sh

# Auto-fix issues
./scripts/run-clang-tidy.sh --fix

Building Documentation

# Install dependencies
uv venv
uv pip install -r requirements.txt

# Build docs
cd scripts && ./build-docs.sh

πŸ“ Repository Structure

parrot/
β”œβ”€β”€ parrot.hpp              # Main header (single-file library)
β”œβ”€β”€ thrustx.hpp             # Extended Thrust utilities
β”œβ”€β”€ examples/               # Example code
β”‚   β”œβ”€β”€ getting_started/    # Simple getting-started examples
β”‚   β”œβ”€β”€ thrust/             # Parrot versions of Thrust examples
β”‚   β”œβ”€β”€ real_world/         # Parrot versions of examples from open source projects
β”œβ”€β”€ tests/                  # Unit tests
β”œβ”€β”€ docs/                   # Documentation source
β”œβ”€β”€ scripts/                # Development scripts

🀝 Contributing

We welcome contributions! Please see our CONTRIBUTING.md guide for details on:

  • Setting up the development environment
  • Code standards and review process
  • Running tests and code quality checks
  • Building documentation

All contributions must comply with the Apache License 2.0.

πŸ“„ License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Third-Party Licenses

This project includes third-party software components. See THIRD_PARTY_LICENSES for complete license information and attributions.

πŸ™ Acknowledgments

Built on top of NVIDIA Thrust and CUDA.


πŸ“– Read the Full Documentation β†’

About

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published