This file provides guidance to coding agents when working with code in this repository.

    # Configure and build
    cmake -S . -B build -DBUILD_SHARED_LIBS=ON -DDAQIRI_BUILD_PYTHON=OFF -DDAQIRI_MGR="dpdk socket rdma"
    cmake --build build -j
    cmake --install build --prefix /opt/daqiri

    # Container build (compiles patched DPDK from source)
    BASE_TARGET=dpdk DAQIRI_MGR="dpdk socket rdma" scripts/build-container.sh

CMake options (full table in `docs/getting-started.md`):
- `DAQIRI_MGR`: space-separated backend list. Valid values: `dpdk`, `socket`, `rdma`. The default in `src/CMakeLists.txt:137` is `"dpdk socket"` (which, due to the rule below, effectively builds all three).
- `DAQIRI_BUILD_PYTHON`: builds `pybind11` bindings from `python/`.
- `DAQIRI_BUILD_EXAMPLES`: builds the benchmark executables (default `ON`).
- `DAQIRI_REORDER_GPU_PROFILE`: enables CUDA event timing in the DPDK reorder kernels (off by default).
CUDA architectures are hardcoded to 80;90;121 (A100, H100, GB10) in src/CMakeLists.txt:25. Change this when targeting other GPUs.
Socket → RDMA dependency: the socket backend reuses the RoCE transport from the RDMA implementation, so src/CMakeLists.txt:144-152 automatically prepends rdma to DAQIRI_MGR_LIST whenever socket is requested. The reverse is not true — listing rdma alone does not pull in socket.
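The dependency rule is easy to misread, so here is a minimal C++ model of it. This is only a sketch: the real logic is CMake code in `src/CMakeLists.txt`, `resolve_backends` is a hypothetical name, and skipping the prepend when `rdma` is already listed is an assumption rather than something this file confirms.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Toy model of the configure-time rule; not part of DAQIRI.
// Input mirrors the space-separated DAQIRI_MGR string.
std::vector<std::string> resolve_backends(const std::string& daqiri_mgr) {
    std::vector<std::string> list;
    std::istringstream in(daqiri_mgr);
    for (std::string name; in >> name;) {
        list.push_back(name);
    }
    const auto has = [&list](const std::string& name) {
        return std::find(list.begin(), list.end(), name) != list.end();
    };
    // socket reuses the RoCE transport from rdma, so rdma is prepended.
    // Assumption: nothing is added when rdma is already in the list.
    if (has("socket") && !has("rdma")) {
        list.insert(list.begin(), "rdma");
    }
    return list;
}
```

Under this model, `resolve_backends("dpdk socket")` yields `rdma dpdk socket`, while `resolve_backends("rdma")` stays `rdma` because the rule is one-directional.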
There is no unit test suite. Verification is done via the benchmark executables in `examples/`, driven by YAML configs. Build outputs (`examples/CMakeLists.txt:59-71`):

| Executable | Source | Typical config |
|---|---|---|
| `daqiri_bench_raw_gpudirect` | `raw_gpudirect_bench.cpp` | `daqiri_bench_raw_tx_rx.yaml`, `daqiri_bench_raw_tx_rx_spark.yaml`, `daqiri_bench_raw_sw_loopback.yaml`, `daqiri_bench_raw_rx_multi_q.yaml` |
| `daqiri_bench_raw_hds` | `raw_hds_bench.cpp` | `daqiri_bench_raw_tx_rx_hds.yaml` |
| `daqiri_bench_raw_reorder_seq` | `raw_reorder_seq_bench.cpp` | `daqiri_bench_raw_tx_rx_reorder_seq_1024*.yaml`, `daqiri_bench_raw_rx_reorder_seq_*.yaml` |
| `daqiri_bench_raw_reorder_quantize` | `raw_reorder_quantize_bench.cpp` | `daqiri_bench_raw_tx_rx_reorder_quantize_seq_batch.yaml` |
| `daqiri_bench_rdma` | `rdma_bench.cpp` | `daqiri_bench_rdma_tx_rx.yaml`, `daqiri_bench_rdma_tx_rx_spark.yaml` |
| `daqiri_bench_socket` | `socket_bench.cpp` | `daqiri_bench_socket_{udp,tcp}_tx_rx.yaml` |

The four `raw_*` benches share `raw_bench_common.{cpp,h}` and accept `--seconds N`. `daqiri_bench_rdma` and `daqiri_bench_socket` also take `--mode {tx,rx,both}`.

    ./build/examples/daqiri_bench_raw_gpudirect ./build/examples/daqiri_bench_raw_tx_rx.yaml --seconds 10
    ./build/examples/daqiri_bench_socket ./build/examples/daqiri_bench_socket_udp_tx_rx.yaml --seconds 10 --mode both

YAML files contain `<angle-bracket>` placeholders (PCIe addresses, CPU cores, MACs, IPs) that must be replaced for your system. `daqiri_bench_raw_sw_loopback.yaml` requires no physical link and is the fastest way to smoke-test a build.
Configs named raw_rx_* are RX-only — they initialize the RX path and wait for external traffic, so a standalone run can exit cleanly with 0 packets. Use the tx_rx configs for closed-loop smoke tests.
clang-format is required for contributions (CONTRIBUTING.md):

    git-clang-format --style file   # format staged changes
    clang-format -style=file -i -fallback-style=none <files>

DAQIRI is a single C++/CUDA shared library (`libdaqiri.so`) that exposes a C++ API through `#include <daqiri/daqiri.h>`. The public surface is intentionally flat: free-function helpers (`get_rx_burst`, `get_packet_ptr`, `set_udp_header`, …) that all operate on opaque DAQIRI-owned buffers. Applications never touch backend types directly.
`src/manager.h` defines `daqiri::Manager`, an (almost) abstract base class with ~50 virtual methods covering init, RX/TX burst dequeue/enqueue, header-fill helpers, buffer free, and RDMA connection setup. Backends live in `src/managers/<name>/` (`dpdk/`, `rdma/`, `socket/`) and are selected at CMake configure time via `DAQIRI_MGR`. Each backend produces its own static library (`daqiri_dpdk`, `daqiri_rdma`, `daqiri_socket`) that is linked into `daqiri_common`, and each adds a `DAQIRI_MGR_<NAME>=1` compile definition (see `src/CMakeLists.txt:156-183`).
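Because each backend contributes a compile definition, code can branch on which backends were built. The sketch below assumes the macro suffix is the upper-cased backend name (`DAQIRI_MGR_DPDK`, `DAQIRI_MGR_RDMA`, `DAQIRI_MGR_SOCKET`); that spelling is a guess from the `DAQIRI_MGR_<NAME>` pattern and should be checked against `src/CMakeLists.txt`. `compiled_backends` is a hypothetical helper, not a DAQIRI function.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: reports which backend macros were defined at
// compile time. Macro names are assumed, not confirmed by the source.
std::vector<std::string> compiled_backends() {
    std::vector<std::string> names;
#if defined(DAQIRI_MGR_DPDK)
    names.push_back("dpdk");
#endif
#if defined(DAQIRI_MGR_RDMA)
    names.push_back("rdma");
#endif
#if defined(DAQIRI_MGR_SOCKET)
    names.push_back("socket");
#endif
    return names;  // empty when compiled outside the DAQIRI build
}
```

Compiled on its own, without the project's definitions, the function returns an empty list.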
ManagerFactory (also in manager.h) is a singleton that instantiates the active backend. daqiri_init(...) resolves which backend to use from the NetworkConfig and then delegates everything through the Manager vtable. There is only ever one active Manager per process.
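The shape described above (one abstract interface, one factory, one active instance per process) can be sketched in a few lines. Every name here (`ToyManager`, `ToyDpdk`, `ToySocket`, `ToyFactory`) is a hypothetical stand-in; the real `daqiri::Manager` has far more methods, and the real factory resolves the backend from `NetworkConfig`, not from a string.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Stand-in for the abstract backend interface.
class ToyManager {
public:
    virtual ~ToyManager() = default;
    virtual std::string name() const = 0;
};

class ToyDpdk final : public ToyManager {
public:
    std::string name() const override { return "dpdk"; }
};

class ToySocket final : public ToyManager {
public:
    std::string name() const override { return "socket"; }
};

// Stand-in for the singleton factory. Not thread-safe; illustration only.
class ToyFactory {
public:
    // The first call selects the backend; later calls return the same
    // instance and ignore the argument, so only one manager is ever active.
    static ToyManager& get(const std::string& requested = "") {
        static std::unique_ptr<ToyManager> active;
        if (!active) {
            if (requested == "dpdk") {
                active = std::make_unique<ToyDpdk>();
            } else if (requested == "socket") {
                active = std::make_unique<ToySocket>();
            } else {
                throw std::invalid_argument("unknown backend: " + requested);
            }
        }
        return *active;
    }
};
```

Calling `ToyFactory::get("dpdk")` and then `ToyFactory::get("socket")` returns the same DPDK-backed object both times, which is the single-active-manager property in miniature.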
All packet data flows through BurstParams, a batch of packets. Only pointers are passed between NIC, DAQIRI internals, and the application — the caller reads directly from the buffers the NIC DMA'd into. The caller must explicitly free bursts; a missed free drains the pool and produces NO_FREE_BURST_BUFFERS / NO_FREE_PACKET_BUFFERS errors and NIC drops. See docs/api-guide.md.
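To see why a missed free matters, consider a toy fixed-size pool. This is a sketch under stated assumptions: `ToyBurstPool` is hypothetical and does not reflect how DAQIRI tracks buffers internally; it only shows that a bounded pool stops handing out bursts until the caller returns them.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed-size pool; not a DAQIRI type.
class ToyBurstPool {
public:
    explicit ToyBurstPool(std::size_t capacity) {
        for (std::size_t i = 0; i < capacity; ++i) {
            free_ids_.push_back(i);
        }
    }

    // Hands out a burst slot. Returns false when the pool is drained,
    // the situation that surfaces as NO_FREE_BURST_BUFFERS in DAQIRI.
    bool acquire(std::size_t& id) {
        if (free_ids_.empty()) {
            return false;
        }
        id = free_ids_.back();
        free_ids_.pop_back();
        return true;
    }

    // Returns a slot to the pool; the caller owes one release per acquire.
    void release(std::size_t id) { free_ids_.push_back(id); }

    std::size_t available() const { return free_ids_.size(); }

private:
    std::vector<std::size_t> free_ids_;
};
```

With a capacity of two, a third `acquire` fails until one of the first two slots is released.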
A single packet can span multiple segments (contiguous memory regions), each in CPU or GPU memory. The header-data split (HDS) mode puts headers in segment 0 (CPU) and payload in segment 1 (GPU), enabling GPUDirect zero-copy payload paths. Batched-GPU and CPU-only modes use a single segment.
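The segment layouts can be pictured with a small data model. The types and helpers below (`MemKind`, `ToySegment`, `ToyPacket`, `make_hds_packet`, `make_single_segment_packet`) are hypothetical; DAQIRI's real packet layout is opaque to applications, so treat this purely as an illustration of the two shapes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only.
enum class MemKind { Cpu, Gpu };

struct ToySegment {
    MemKind kind;        // where this contiguous region lives
    std::size_t length;  // bytes in the region
};

struct ToyPacket {
    std::vector<ToySegment> segments;

    std::size_t total_length() const {
        std::size_t sum = 0;
        for (const ToySegment& s : segments) {
            sum += s.length;
        }
        return sum;
    }
};

// HDS: headers in segment 0 (CPU), payload in segment 1 (GPU).
ToyPacket make_hds_packet(std::size_t header_len, std::size_t payload_len) {
    ToyPacket p;
    p.segments.push_back(ToySegment{MemKind::Cpu, header_len});
    p.segments.push_back(ToySegment{MemKind::Gpu, payload_len});
    return p;
}

// Batched-GPU and CPU-only modes keep the whole packet in one segment.
ToyPacket make_single_segment_packet(MemKind kind, std::size_t len) {
    ToyPacket p;
    p.segments.push_back(ToySegment{kind, len});
    return p;
}
```

For example, `make_hds_packet(42, 8000)` describes a packet whose 42 header bytes stay in CPU memory while the 8000 payload bytes sit in GPU memory.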
src/manager.h and src/common.cpp use rte_ring / rte_mbuf directly, so DPDK is a build dependency of daqiri_common even when building only the RDMA or socket backend. src/CMakeLists.txt:60-67 requires pkg-config + libdpdk and falls back to /opt/mellanox/dpdk/.... The container build uses patched DPDK from dpdk_patches/ (dmabuf.patch, dpdk.nvidia.patch) — the dmabuf patch removes the peermem kernel-module requirement for GPUDirect.
src/kernels.cu hosts the CUDA reorder paths used by the raw_reorder_* benches. Compile with -DDAQIRI_REORDER_GPU_PROFILE=ON to instrument them with CUDA event timing.
Vendored under third_party/ as submodules (.gitmodules): yaml-cpp (config parser) and spdlog (logging). CMake prefers these over system copies. Missing them is a fatal error.
- TX header fill currently supports UDP only (see README).
- No build or test CI yet: contributors and reviewers verify code changes manually (CONTRIBUTING.md). The only CI is the docs workflow (`.github/workflows/docs.yml`).
The web docs live in docs/ and are built with MkDocs Material. The site config is mkdocs.yml.
Structure:
- `docs/index.html`: custom HTML landing page (hand-maintained, not generated by MkDocs)
- `docs/daqiri-api.html`: standalone HTML API reference (hand-maintained)
- `docs/api-guide.md`, `docs/getting-started.md`, `docs/configuration.md`: core markdown docs
- `docs/tutorials/`: tutorial walkthroughs (background, system config, benchmarking, config files)
- `docs/stylesheets/extra.css`: custom theme overrides
Keeping docs in sync with code: before committing changes, scan for the recurring drift hotspots:
- Backend list (`src/managers/*/`): README Backends table, `docs/getting-started.md`, `docs/configuration.md`
- CMake options / `DAQIRI_MGR` default (`src/CMakeLists.txt:137`): README Quick Start, `docs/getting-started.md`, this file's Build & run section
- Benchmark binary or YAML names (`examples/`): the benchmark table above, `docs/tutorials/benchmarking_examples.md`, and the "Choosing an example config" decision tree in `docs/tutorials/configuration-walkthrough.md` (every YAML must have a leaf; CI's `scripts/check_doc_refs.py` enforces coverage)
- Public API include (`#include <daqiri/daqiri.h>`; source files under `include/daqiri/`): `docs/api-guide.md`, `docs/daqiri-api.html`
- Doc reorganization (any rename in `docs/`): `docs/index.html` landing page, `mkdocs.yml` nav, README Documentation table
The full mapping with rationale lives in the docs-sync agent rule. Internal-link, anchor, and nav drift is enforced by CI (.github/workflows/docs.yml); content drift (stale binary names, defaults) is still a manual check at commit time.
Deployment: .github/workflows/docs.yml runs mkdocs gh-deploy --force on pushes to main, publishing to the gh-pages branch. GitHub Pages serves from gh-pages.
From CONTRIBUTING.md:
- DCO sign-off required: every commit must have `Signed-off-by:` (use `git commit -s`). Unsigned commits will be rejected.
- Commit titles in imperative mood, prefixed with the GitHub issue number: `#<Issue Number> - <Title>`.
- An issue must exist and be approved before coding.
- Prefer toggling features via new CMake options (with backward-compatible defaults) rather than wrapping entire files in `#if` guards. Use `#if` only for minor in-file changes.
- Keep PRs narrowly scoped: one concern per PR, dependencies noted in the description.
- When opening a PR that touches `src/`, `examples/`, or `mkdocs.yml`, scan the doc-sync agent rule and update affected docs in the same PR.
Always compile inside the container built from the project's Dockerfile. Start the container in privileged mode with all GPUs passed through, and pass the hugepages mounted on the host into the container via a volume mount. For compiling, start the container as the current user; for running, the container should run as root.