# AI Development Guide for PopSift

This guide defines how AI-assisted code generation should be done in this repository.
It ensures that contributions (from GitHub Copilot, ChatGPT, Claude, etc.) follow a **consistent, modern, and maintainable style**.

---

## General Principles

- Always prioritize **readability** and **clarity** over micro-optimizations.
- Follow **modern C++17 best practices**.
- Keep host-side C++ and CUDA device code **cleanly separated**.
- Prefer **modularity**: each class or major component should live in its own file.
- Code should be **self-documenting** whenever possible, with clear naming and structure.

---

## C++ Guidelines

- **Standard**: Use **C++17**. Prefer `constexpr`, `auto`, `enum class`, range-based for loops, and smart pointers (`std::unique_ptr`, `std::shared_ptr`).
- **Memory Management**: Use RAII. Avoid raw `new`/`delete` except in CUDA contexts where unavoidable.
- **Error Handling**:
  - Use exceptions in host C++ code.
  - In CUDA, check and propagate error codes using helper utilities/macros. Never ignore errors.
- **Namespaces**: Group related functions/classes logically. Avoid polluting the global namespace.
- **Headers**:
  - Keep headers minimal; forward declare instead of including heavy dependencies.
  - Each header should be guarded with `#pragma once`.
- **Style** (see the sketch after this list):
  - `snake_case` for variables and functions.
  - `CamelCase` for class and struct names.
  - `ALL_CAPS` for macros and compile-time constants.

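The following sketch pulls these conventions together in a single header. It is illustrative only: `popsift::example`, `PyramidBuilder`, and `build_level` are hypothetical names, not part of PopSift's API.

```cpp
#pragma once

#include <memory>
#include <vector>

namespace popsift::example { // hypothetical namespace, for illustration only

// Compile-time constant instead of a macro.
constexpr int DEFAULT_OCTAVE_COUNT = 4;

// Scoped enumeration instead of a plain enum.
enum class ImageMode { Grayscale, Rgb };

// CamelCase type name; resources are owned via RAII, no raw new/delete.
class PyramidBuilder
{
public:
    explicit PyramidBuilder( int octave_count = DEFAULT_OCTAVE_COUNT );

    // snake_case function name; ownership is explicit in the return type.
    std::unique_ptr<std::vector<float>> build_level( int level ) const;

private:
    int _octave_count; // member data stays private
};

} // namespace popsift::example
```
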
---

## CUDA Guidelines

- Separate **kernels** from host orchestration code.
- Name kernels descriptively, e.g. `compute_gradient_kernel`.
- Document assumptions about:
  - Thread/block layout
  - Shared memory usage
  - Synchronization requirements
- Use `__restrict__` and `constexpr` where appropriate for performance and clarity.
- Prefer small, focused kernels over overly complex ones.
- Always validate CUDA API calls (see the sketch after this list).

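As an illustration of these points, here is a minimal sketch of a small kernel plus its host-side launcher. The `CHECK_CUDA` macro is a stand-in for whatever error-checking helpers the repository already provides; real contributions should use those instead of duplicating it.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>

// Illustrative error-checking macro; prefer the project's existing helpers.
#define CHECK_CUDA( call )                                              \
    do {                                                                \
        const cudaError_t err = ( call );                               \
        if( err != cudaSuccess ) {                                      \
            fprintf( stderr, "CUDA error %s at %s:%d\n",                \
                     cudaGetErrorString( err ), __FILE__, __LINE__ );   \
            exit( EXIT_FAILURE );                                       \
        }                                                               \
    } while( 0 )

/* Assumptions (document them like this):
 * - 1D launch, one thread per output element.
 * - No shared memory, no inter-thread synchronization required.
 */
__global__ void compute_gradient_kernel( const float* __restrict__ input,
                                         float*       __restrict__ output,
                                         int                       count )
{
    const int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if( idx > 0 && idx < count - 1 ) {
        output[idx] = 0.5f * ( input[idx + 1] - input[idx - 1] ); // central difference
    }
}

// Host orchestration lives outside the kernel and validates every call.
void compute_gradient( const float* d_in, float* d_out, int count )
{
    const int block = 128;
    const int grid  = ( count + block - 1 ) / block;
    compute_gradient_kernel<<<grid, block>>>( d_in, d_out, count );
    CHECK_CUDA( cudaGetLastError() );      // catches launch configuration errors
    CHECK_CUDA( cudaDeviceSynchronize() ); // catches execution errors
}
```
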
---

## Threading Guidelines

- **Host Threading**: Use `std::thread` and synchronization primitives from `<mutex>`.
- **CUDA Streams**: Use multiple streams for concurrent kernel execution (see the sketch after this list).
- **Thread Safety**: Document thread safety guarantees for all public APIs.
- **Avoid**: Raw pthreads or platform-specific threading APIs.

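A minimal sketch of the stream guideline, using a hypothetical `scale_kernel`; error checking is omitted for brevity but should follow the CUDA guidelines above, and truly asynchronous copies additionally require page-locked (pinned) host memory.

```cuda
#include <cuda_runtime.h>

// Trivial kernel used only to illustrate per-stream work.
__global__ void scale_kernel( float* data, int count )
{
    const int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if( idx < count ) data[idx] *= 2.0f;
}

void process_two_batches( const float* h_a, float* d_a,
                          const float* h_b, float* d_b, int count )
{
    cudaStream_t stream_a, stream_b;
    cudaStreamCreate( &stream_a );
    cudaStreamCreate( &stream_b );

    const size_t bytes = count * sizeof( float );
    const int    block = 256;
    const int    grid  = ( count + block - 1 ) / block;

    // Copies and kernels issued on different streams may overlap.
    cudaMemcpyAsync( d_a, h_a, bytes, cudaMemcpyHostToDevice, stream_a );
    scale_kernel<<<grid, block, 0, stream_a>>>( d_a, count );

    cudaMemcpyAsync( d_b, h_b, bytes, cudaMemcpyHostToDevice, stream_b );
    scale_kernel<<<grid, block, 0, stream_b>>>( d_b, count );

    cudaStreamSynchronize( stream_a );
    cudaStreamSynchronize( stream_b );
    cudaStreamDestroy( stream_a );
    cudaStreamDestroy( stream_b );
}
```
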
---

## Modularity and Organization

- Keep code **organized by functionality** (e.g., detection, description, GPU utilities).
- Avoid very long functions (>50 lines); refactor into helpers when possible.
- Prefer **free functions** in namespaces over singletons or unnecessary wrapper classes (see the sketch after this list).
- Keep algorithms and data structures reusable when possible.

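For example, a reusable helper can simply be a free function in a project namespace rather than a method on a singleton; the namespace and function below are hypothetical.

```cpp
#include <vector>

namespace popsift::filters { // hypothetical namespace, for illustration only

// Free function: no hidden global state, trivial to unit-test and reuse.
std::vector<float> normalize( std::vector<float> values )
{
    float sum = 0.0f;
    for( float v : values ) sum += v;
    if( sum != 0.0f )
        for( float& v : values ) v /= sum;
    return values;
}

} // namespace popsift::filters
```

Call sites stay explicit (`popsift::filters::normalize( weights )`) instead of reaching through a `Filters::instance()`-style singleton.
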
---

## Performance Guidelines

- **Memory Access Patterns**: Prefer coalesced memory access in CUDA kernels. Document stride patterns.
- **Shared Memory**: Use shared memory for data reuse within thread blocks. Document how bank conflicts are avoided (see the sketch after this list).
- **Register Usage**: Monitor register pressure in kernels. Aim for high occupancy.
- **Asynchronous Operations**: Use CUDA streams to overlap computation and memory transfers.
- **Profiling**: Profile with `nvprof` or Nsight before optimizing. Document performance assumptions.
- **Memory Bandwidth**: Treat memory bandwidth as the primary bottleneck for most kernels.

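The sketch below shows what documenting stride patterns, shared memory usage, and bank conflicts can look like in practice; the kernel, block size, and radius are illustrative, not taken from PopSift.

```cuda
#include <cuda_runtime.h>

constexpr int BLOCK  = 128; // illustrative block size
constexpr int RADIUS = 2;   // illustrative filter radius

/* Assumptions:
 * - 1D launch with blockDim.x == BLOCK, one output element per thread.
 * - Global loads are coalesced: consecutive threads read consecutive floats.
 * - Shared memory: (BLOCK + 2*RADIUS) floats per block; no bank conflicts,
 *   since consecutive threads touch consecutive 32-bit words.
 */
__global__ void box_filter_kernel( const float* __restrict__ input,
                                   float*       __restrict__ output,
                                   int                       count )
{
    __shared__ float tile[BLOCK + 2 * RADIUS];

    const int gid = blockIdx.x * blockDim.x + threadIdx.x;
    const int lid = threadIdx.x + RADIUS;

    // Stage the main tile (coalesced) and let a few threads load the halo.
    tile[lid] = ( gid < count ) ? input[gid] : 0.0f;
    if( threadIdx.x < RADIUS ) {
        const int left  = gid - RADIUS;
        const int right = gid + BLOCK;
        tile[threadIdx.x]                  = ( left  >= 0 )    ? input[left]  : 0.0f;
        tile[threadIdx.x + BLOCK + RADIUS] = ( right < count ) ? input[right] : 0.0f;
    }
    __syncthreads(); // the whole tile must be staged before any thread reads it

    if( gid < count ) {
        float sum = 0.0f;
        for( int k = -RADIUS; k <= RADIUS; ++k ) sum += tile[lid + k];
        output[gid] = sum / ( 2 * RADIUS + 1 ); // inputs are reused from shared memory
    }
}
```
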
---

## Documentation

- Use **Doxygen-style comments** for public APIs, classes, and CUDA kernels (see the example after this list).
- Document algorithm choices and any CUDA-specific design tradeoffs.
- Update examples and the README when new features are introduced.
- With each update, also update the changelog, following the [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) format:
  - For each new feature, bug fix, or breaking change, add a corresponding entry to the changelog.
  - The description should be short but informative, followed by the relevant PR link.

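A possible Doxygen comment for a device kernel is sketched below; the kernel and its parameters are hypothetical and shown only to illustrate the expected level of detail.

```cuda
/**
 * @brief Computes per-pixel gradient orientations for one pyramid level
 *        (hypothetical kernel, shown only to illustrate the comment style).
 *
 * @param[in]  image   Device pointer to the input level (pitch-linear layout).
 * @param[out] theta   Device pointer receiving one orientation per pixel.
 * @param[in]  width   Image width in pixels.
 * @param[in]  height  Image height in pixels.
 *
 * @note Launch with a 2D grid of 32x8 thread blocks, one thread per pixel.
 *       No shared memory is used and no inter-block synchronization is needed.
 */
__global__ void orientation_kernel( const float* image, float* theta,
                                    int width, int height );
```
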
---

## Git Guidelines

- **Branch Names**: `feature/description`, `fix/issue-number`, `refactor/component`.
- **Commit Messages**: Use the conventional prefix tags `[feat]`, `[fix]`, `[refactor]`, `[doc]`, etc.
- **File Organization**: Keep related files in logical directories.
- **Ignore Patterns**: Update `.gitignore` for build artifacts and IDE files.

---

## Testing

- Provide unit tests for new functionality whenever possible.
- CUDA-specific code should fail gracefully on systems without CUDA (see the sketch after this list).
- All new code should compile cleanly with:

  ```bash
  cmake -DCMAKE_BUILD_TYPE=Release ..
  make -j
  ```

  and should not introduce new warnings with `-Wall -Wextra -pedantic`.

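One way to fail gracefully on machines without CUDA is to probe the device count before touching the GPU, as in this sketch (the helper name is illustrative):

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Returns true if at least one CUDA device is usable, so tests and examples
// can skip GPU code paths instead of crashing on CUDA-less systems.
bool cuda_available()
{
    int device_count = 0;
    const cudaError_t err = cudaGetDeviceCount( &device_count );
    if( err != cudaSuccess ) {
        fprintf( stderr, "CUDA unavailable: %s\n", cudaGetErrorString( err ) );
        return false;
    }
    return device_count > 0;
}
```
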
---

## Commit & PR Guidelines

- Keep commits small and focused (one feature or fix per commit).
- Do not commit untracked files that are not relevant.
- PRs should include:
  - A clear description of the changes
  - Explanations for algorithmic choices or CUDA-specific design decisions
  - Updated tests or examples if applicable
- Code must pass existing CI checks before merging.