[INFO] quack-rs: SDK for Creating DuckDB Extensions in Rust Without Glue Code #1496

@tomtom215

Hi all,

I know this may not be the best place to post this, and I was not sure where it would be most effective, so please forgive me. I figured the people watching this repo would understand what I am sharing and why it might be useful to them. I am sharing it to help others, to improve the ecosystem overall, and to give something back to open source.

I want to share my new personal open-source project, quack-rs, and get some real-world testing and feedback on it. This post is not meant to be marketing or self-serving. Maintainers, feel free to close or delete this, and reach out to me if there is a better route or if you want to discuss it further outside of this GitHub issue.

Recently, I built an all-Rust DuckDB extension called duckdb-behaviorial, which has been published and is available as a Community Extension. To be honest, I hit far more issues than the fight was worth while getting everything set up and working across Rust, FFI, and undocumented C API behavior.

The DuckDB community extensions FAQ states:

Writing a Rust-based DuckDB extension requires writing glue code in C++ and will force you to build through DuckDB's CMake & C++ based extension template. We understand that this is not ideal and acknowledge the fact that Rust developers prefer to work on pure Rust codebases.

So I created quack-rs, to save myself the trouble if I write another Rust extension and, more importantly, to help other Rust developers who want to create DuckDB Community Extensions. It also enforces some best practices and structure, so that every project using it starts at a higher quality than one with handwritten glue code, without hitting all of the walls I ran into before even getting to the business logic. I have already begun porting duckdb-behaviorial to it and have been able to remove more than 100 lines of raw FFI glue code with minimal migration effort, which shows that it really works. I will push the updated version to the community extension repo this weekend.

To prove that I am well intentioned and hoping to help the community and DuckDB itself, quack-rs is MIT licensed and already published to crates.io.

If the maintainers of DuckDB (or the C API maintainers) want structured feedback on everything that caused friction and required a solution like quack-rs, you can check out my LESSONS.md.

To save you from even clicking a link to my repo - here is a copy and paste from the main overview section of my README.md:


Why quack-rs?

The DuckDB community extensions FAQ states:

Writing a Rust-based DuckDB extension requires writing glue code in C++ and will
force you to build through DuckDB's CMake & C++ based extension template. We understand
that this is not ideal and acknowledge the fact that Rust developers prefer to work on
pure Rust codebases.

The DuckDB C Extension API (available since v1.1) changes this. quack-rs wraps that API
and eliminates every rough edge, so you write zero lines of C or C++.

What extension authors face without quack-rs

| Problem | Without quack-rs | With quack-rs |
|---|---|---|
| Entry point boilerplate | ~40 lines of unsafe extern "C" code | 1 macro call |
| State init/destroy | Raw Box::into_raw / Box::from_raw | FfiState<T> handles all of it |
| Boolean reads | UB if read as bool directly | VectorReader::read_bool uses u8 != 0 |
| NULL output | Silent corruption if ensure_validity_writable skipped | VectorWriter::set_null calls it automatically |
| LogicalType memory | Leak if not freed | LogicalType implements Drop |
| Aggregate combine | Config fields lost on segment-tree merges | Testable with AggregateTestHarness |
| FFI panics | Process abort or undefined behavior | init_extension never panics |
| Table functions | ~100 lines of raw bind/init/scan callbacks | TableFunctionBuilder 5-method chain |
| Replacement scans | Undocumented vtable + manual string allocation | ReplacementScanBuilder 4-method chain |
| Complex types (STRUCT/LIST/MAP) | Manual offset arithmetic over child vectors | StructVector, ListVector, MapVector helpers |
| Complex param/return types | Raw duckdb_create_logical_type + manual lifecycle | param_logical(LogicalType) / returns_logical(LogicalType) on all builders |
| Extension naming | Rejected by DuckDB CI with no explanation | validate_extension_name catches issues before submission |
| description.yml | No tooling to validate before submission | validate_description_yml_str validates the whole file |
| New project setup | Hours of boilerplate + reading DuckDB internals | generate_scaffold produces all 11 required files |
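To make the "Boolean reads" row concrete, here is a minimal sketch in plain Rust of the `u8 != 0` idea. It does not use quack-rs or DuckDB. The `read_bool` function below is a hypothetical stand-in, not the signature of `VectorReader::read_bool`, and the byte values are made up for illustration. The underlying fact is that a Rust `bool` is only valid with the bit patterns 0 and 1, so bytes that arrive over FFI are safer to read as `u8` and then compare with zero.

```rust
/// Hypothetical helper (not quack-rs API): decodes row `idx` of a BOOLEAN
/// column whose storage is viewed as raw bytes.
///
/// Reading the byte as `u8` and comparing with zero is well-defined for any
/// byte value, whereas reinterpreting a byte such as 2 as `bool` would be
/// undefined behavior in Rust.
fn read_bool(bytes: &[u8], idx: usize) -> bool {
    bytes[idx] != 0
}

fn main() {
    // Illustrative buffer: includes bytes other than 0 and 1 on purpose.
    let column = [0u8, 1, 2, 255];

    let decoded: Vec<bool> = (0..column.len())
        .map(|i| read_bool(&column, i))
        .collect();

    // Every non-zero byte is treated as true.
    assert_eq!(decoded, vec![false, true, true, true]);
}
```

The same pattern applies to any data crossing an FFI boundary: read it as the widest type that is valid for every bit pattern, then convert.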

What quack-rs Solves

Building a DuckDB extension in Rust, from project setup to community submission, requires navigating undocumented C API contracts, FFI memory rules, and data-encoding specifics found only in DuckDB's source code. Mistakes surface as silent corruption, process aborts, or unexplained CI rejections rather than compiler errors.

quack-rs eliminates these barriers systematically across the complete extension lifecycle: scaffolding, function registration, type-safe data access, aggregate testing, metadata validation, and community submission readiness. Every abstraction is backed by a documented, reproducible pitfall in LESSONS.md, making correct behavior automatic and incorrect behavior a compile-time error wherever the type system permits.

The result is that any Rust developer can build, test, and ship a production-quality DuckDB extension without prior knowledge of DuckDB internals. This covers every extension type exposed by DuckDB's public C Extension API: scalar, aggregate, table, cast, replacement scan, and SQL macro functions.
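As an illustration of "making correct behavior automatic", here is a rough sketch of the RAII pattern behind claims such as "LogicalType implements Drop" and "nulls pointers after free". It is plain Rust with no DuckDB dependency. `TypeHandle`, `mock_destroy`, and `freed_count` are hypothetical names invented for this example; the real quack-rs types may look quite different.

```rust
use std::cell::Cell;

thread_local! {
    // Counts calls to the mock destructor so the example can observe them.
    static FREED: Cell<u32> = Cell::new(0);
}

/// Hypothetical stand-in for a C-style destroy function that takes a
/// pointer to the handle and nulls it after freeing.
fn mock_destroy(handle: &mut *mut u8) {
    if !handle.is_null() {
        // SAFETY: the pointer was produced by Box::into_raw in TypeHandle::new
        // and has not been freed yet (it is nulled right after this call).
        unsafe { drop(Box::from_raw(*handle)) };
        *handle = std::ptr::null_mut();
        FREED.with(|f| f.set(f.get() + 1));
    }
}

/// Hypothetical owning wrapper around a raw handle.
struct TypeHandle {
    raw: *mut u8,
}

impl TypeHandle {
    fn new() -> Self {
        TypeHandle { raw: Box::into_raw(Box::new(0u8)) }
    }
}

impl Drop for TypeHandle {
    // The destructor runs automatically when the wrapper goes out of scope,
    // so callers cannot forget to free the handle.
    fn drop(&mut self) {
        mock_destroy(&mut self.raw);
    }
}

fn freed_count() -> u32 {
    FREED.with(|f| f.get())
}

fn main() {
    {
        let _ty = TypeHandle::new();
        assert_eq!(freed_count(), 0); // still alive inside the scope
    }
    assert_eq!(freed_count(), 1); // freed exactly once at scope exit
}
```

Because the destroy function nulls the pointer, a second call on the same handle would be a no-op instead of a double free.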

quack-rs encapsulates 15 documented FFI pitfalls — hard-won knowledge from building
real DuckDB extensions in Rust:

L1  COMBINE must propagate ALL config fields (not just data)
L2  State destroy double-free → FfiState<T> nulls pointers after free
L3  No panics across FFI → init_extension uses Result throughout
L4  ensure_validity_writable required before NULL output → VectorWriter handles it
L5  Boolean reading must use u8 != 0 → VectorReader enforces this
L6  Function set name must be set on EACH member → Set builders enforce on every member
L7  LogicalType memory leak → LogicalType implements Drop

P1  Library name must match [lib] name in Cargo.toml exactly
P2  C API version ("v1.2.0") ≠ DuckDB release version ("v1.4.4" / "v1.5.0")
P3  E2E SQLLogicTests required for community submission
P4  extension-ci-tools submodule must be initialized
P5  SQLLogicTest output must match DuckDB CLI output exactly
P6  Function registration can fail silently → builders check return values
P7  DuckDB strings use 16-byte format with inline and pointer variants
P8  INTERVAL is { months: i32, days: i32, micros: i64 } — not a single i64
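Item L1 in the list above (COMBINE must propagate all config fields) is easy to get wrong, so here is a small sketch of the idea in plain Rust. It assumes, as the "Aggregate combine" table row suggests, that combine may be called with a freshly initialized target state that never saw an update. `SessionState` and its fields are hypothetical and exist only for this example; no DuckDB or quack-rs API is used.

```rust
/// Hypothetical aggregate state with one configuration field and one
/// data field.
#[derive(Debug, Default, Clone, PartialEq)]
struct SessionState {
    /// Configuration: set from the function arguments during update.
    gap_micros: Option<i64>,
    /// Data: number of rows seen so far.
    events: u64,
}

impl SessionState {
    fn update(&mut self, gap_micros: i64) {
        self.gap_micros = Some(gap_micros);
        self.events += 1;
    }

    /// Merges `source` into `target`.
    fn combine(source: &SessionState, target: &mut SessionState) {
        // The target may never have been updated, so its configuration can
        // still be unset. Copy it from the source, not just the data.
        if target.gap_micros.is_none() {
            target.gap_micros = source.gap_micros;
        }
        target.events += source.events;
    }
}

fn main() {
    let mut source = SessionState::default();
    source.update(1_800_000_000);
    source.update(1_800_000_000);

    // A fresh target, as it might appear in an intermediate merge.
    let mut target = SessionState::default();
    SessionState::combine(&source, &mut target);

    assert_eq!(
        target,
        SessionState { gap_micros: Some(1_800_000_000), events: 2 }
    );
}
```

A combine that only added `events` would leave `gap_micros` as `None` in the target, and a finalize step that depends on it would then produce wrong results without any error.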
