Experimental in-database LLM and deep-learning inference via a native relational prediction operator, built on top of DuckDB, a high-performance analytical database system.
```bibtex
@misc{ipdb2026arxiv,
      title={iPDB -- Optimizing SQL Queries with ML and LLM Predicates},
      author={Udesh Kumarasinghe and Tyler Liu and Chunwei Liu and Walid G. Aref},
      year={2026},
      eprint={2601.16432},
      archivePrefix={arXiv},
      primaryClass={cs.DB},
      url={https://arxiv.org/abs/2601.16432},
}
```

We release a duckdb-python package built with iPDB as the underlying engine. Please fetch the latest version from Releases (i.e., `duckdb-<latest_version>.tar.gz`). To install the Python package, run:
```shell
pip install duckdb-<latest_version>.tar.gz
```

The above command builds the wheel and installs iPDB as a Python package. The API is identical to the DuckDB Python API; however, this version additionally supports all the semantic operators.
You can get an up-and-running instance of iPDb by either building from source or building with Docker.
This step is optional if you are only inferencing with LLMs (via API or local models).
First, configure the ONNX runtime library, which we'll use for pre-trained models stored in .onnx format. You have multiple options to get an up-and-running instance of the ONNX runtime library.
Detailed instructions are available here https://onnxruntime.ai/docs/build/inferencing.html.
Download and extract a compatible release for your development platform from the onnxruntime GitHub releases.
The above link directs to version `1.19.2` instead of the latest version, as this is the version tested to work (on `linux-amd64`).
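As a quick sanity check before pointing the build at it, you can confirm the extracted release has the expected layout. This is a minimal sketch; the path below is a hypothetical example, so substitute wherever you extracted the archive.

```shell
# Hypothetical extraction path; substitute your own.
ONNX_DIR="$HOME/onnxruntime-linux-x64-1.19.2"
# The build expects headers under include/ and the shared library under lib/.
for d in include lib; do
  if [ -d "$ONNX_DIR/$d" ]; then echo "found: $d"; else echo "missing: $d"; fi
done
```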
For both of the above options, set the ONNX runtime installed path as an env variable.
export ONNX_INSTALL_PREFIX=<onnx_runtime_installed_path>This step is optional if you are only inferencing with LLM APIs or pre-trained ONNX models.
First, fetch and build the llama.cpp library, which we'll use for inferencing local large language models. You have multiple options to get an up-and-running instance of the llama.cpp library.
Clone the llama.cpp repository.
```shell
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
```

Build the project using CMake (CPU build).
```shell
cmake -B build
cmake --build build --config Release -j 8
```
`Release` builds are preferred to achieve the best inference performance.
Please refer to the detailed build guide in the llama.cpp repository for other backends (e.g., GPU), debug builds, and troubleshooting.
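Once the build finishes, you can locate the resulting shared library before wiring it into iPDb. A minimal sketch; the exact subdirectory varies between llama.cpp versions, so this simply searches the whole build tree:

```shell
# Search the CMake build tree for the llama shared library; recent llama.cpp
# versions put it under build/bin, older ones under build/src (assumption).
BUILD_DIR="build"
find "$BUILD_DIR" -name 'libllama*' 2>/dev/null || true
echo "searched: $BUILD_DIR"
```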
Set the llama.cpp build path as an environment variable.

```shell
export LIBLLAMA_INSTALL_PREFIX="<llama_cpp_repo>/build"
```

DuckDB uses a make script for its builds. We have extended it with additional parameters to configure iPDb.
```shell
make debug GEN=ninja -j12 CORE_EXTENSIONS='httpfs' ENABLE_PREDICT=1 PREDICTOR_IMPL=llama_cpp ENABLE_LLM_API=True DISABLE_SANITIZER=1
```
- `GEN=ninja -j12` (optional; drop if you don't have ninja set up): use ninja for build parallelization (with `-j12` for 12 workers).
- `ENABLE_PREDICT=1`: enables the ML extension.
- `PREDICTOR_IMPL=onnx`: chooses the internal ML platform. Available options:
  - `onnx` - use ONNX Runtime to infer pre-trained `.onnx` models (Step 1.1 required).
  - `llama_cpp` - use llama.cpp to infer LLMs in `GGUF` format.
- `ENABLE_LLM_API=1`: enables LLM calling via OpenAI-API-compatible endpoints over the network (sets the `CORE_EXTENSIONS='httpfs'` option automatically).
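For example, a build targeting only pre-trained ONNX models could look like the following. This is a hypothetical combination of the flags above; it assumes the standard DuckDB `release` target and that Step 1.1 has been completed.

```shell
make release GEN=ninja -j12 ENABLE_PREDICT=1 PREDICTOR_IMPL=onnx
```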
The previously available `torchscript` option, for inferring pre-trained PyTorch models exported with TorchScript, is [DEPRECATED] and no longer supported.
The majority of publicly available remote LLMs require an API key from the respective developer to use their capabilities. iPDb lets users define the LLM API using the CREATE SECRET statement, where users can define either in-memory or persisted API keys that can be reused with different models. Please acquire the respective key and use the following SQL syntax for defining API keys.
```sql
CREATE PERSISTENT SECRET openai_key (TYPE http, bearer_token '<openai_api_key>');
CREATE PERSISTENT SECRET google_key (TYPE http, bearer_token '<google_api_key>');
```

Alternatively, an API key can be set as an environment variable as follows so that iPDb can identify it before inference. However, this limits the models to a single vendor, as only one key is available.
```shell
export OPENAI_API_KEY="<api_key>"
```

Add the above to your shell config file (e.g., `.bashrc`) for permanent availability.
Currently, the Docker script is written to build only with ONNX model support. Please build from source for LLM inference.
Make sure Docker is set up correctly.
Clone the iPDb repository. It should include a Dockerfile and a .dockerignore.
Build the Docker image by running (we'll create an image named `ipdb`; Docker image names must be lowercase):

```shell
docker build -t ipdb .
```

Run the Docker image:

```shell
docker run -it -v <absolute_path_to_data_repo>:/data --name=ipdb_container ipdb /bin/bash
```

For subsequent runs, you can just start the stopped container using its name:

```shell
docker start -ai ipdb_container
```

Both of the above commands open an interactive shell in the Docker container. Here, we mount a data directory where we will store the pre-trained models. If you just want to check whether iPDb works, run the above command without `-v <absolute_path_to_data_repo>:/data`.
Run iPDb:
```shell
./build/debug/iPDb <your_database>
```

Once you have a working iPDb instance, you can experiment with SQL queries that are capable of in-database inference.
Within the iPDb shell,
- Create and populate tables with feature data (say, the iris data).
- Upload the model to the database via the `CREATE MODEL` statement.

```sql
CREATE TABULAR MODEL iris_cls PATH '<model_path>' ON TABLE iris OUTPUT (class INTEGER);
```
- Run a prediction query.

```sql
SELECT * FROM PREDICT(iris_cls, iris) AS p WHERE p.class = 2;
```
Syntax trees and examples of both the CREATE MODEL and PREDICT statements are available here (open via draw.io).
Make sure you have built iPDb with the options that enable remote LLM calling:

```shell
ENABLE_PREDICT=1 ENABLE_LLM_API=1
```
Additionally, make sure that either SECRETs (refer to the "API LLM Calling" section) or the OPENAI_API_KEY environment variable is correctly set with the OpenAI API key.
Within the iPDb shell,
- Create and populate tables with data (say, a `job` table with a `description` column containing a job listing document).
- Upload the model to the database via the `CREATE MODEL` statement.

```sql
CREATE LLM MODEL o4_mini PATH 'o4-mini' ON PROMPT API 'https://api.openai.com' SECRET openai_key;
```
Notice that PATH accepts the model name accepted by the API (e.g., o4-mini, gpt-4.1).
Furthermore, the model is uploaded as ON PROMPT, which makes the query execution pipeline infer the LLM's input/output columns from the prompt during query execution.
Additionally, the user should set API to the base URL of the respective LLM.
- Run a prediction query.

```sql
SELECT * FROM LLM o4_mini (PROMPT 'extract the {s:location} and {d:salary} for job {description}', job);
```
Here, notice that we have an additional PROMPT clause within the PREDICT statement. Inside the prompt, the user can define input columns by mentioning each column within braces, i.e., {column} (e.g., {description} in the above query). Similarly, the user can define output columns with braces in the format {data_type:column_name} (e.g., {s:location} returns a VARCHAR column with the location, and {d:salary} returns an INTEGER column with the salary).
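To make the placeholder convention concrete, here is an illustrative sketch of how such a prompt can be split into input columns and typed output columns. This is not iPDb's actual parser, and the type codes handled are only the two shown above:

```shell
prompt='extract the {s:location} and {d:salary} for job {description}'
# Bare {column} placeholders are input columns; {type:name} are typed outputs.
printf '%s\n' "$prompt" | grep -o '{[^}]*}' | while read -r ph; do
  ph=${ph#\{}; ph=${ph%\}}          # strip the surrounding braces
  case $ph in
    s:*) echo "output ${ph#s:} VARCHAR" ;;   # s: -> VARCHAR column
    d:*) echo "output ${ph#d:} INTEGER" ;;   # d: -> INTEGER column
    *)   echo "input  $ph" ;;                # bare name -> input column
  esac
done
```

Running this prints `output location VARCHAR`, `output salary INTEGER`, and `input  description`, matching the columns the query above produces and consumes.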
- More details on building DuckDB from source are here.
- Follow the LibTorch tutorial on native model inference.
- If you are new to deep-learning inference, start with beginner-friendly Python-based training and inference examples here.
- Introduction to native C++ model exporting and inference with TorchScript is available here. The most feature-rich version of iPDb is implemented via the ONNX Runtime rather than LibTorch, but this guide is far better than anything the ONNX examples offer (the concepts are the same).
- The high-level duckdb execution model is explained in the official documentation and these slides.
- Read the source for the implementation of iPDb. Extensive documentation of this extension is WIP.
The native relational prediction operator is realized using the following systems and frameworks.
- DuckDB (1.3.0) - Relational database: https://duckdb.org
- ONNX Runtime (1.19) - Efficient, generalizable deep-learning runtime: https://onnx.ai
- llama.cpp (latest) - Local LLM inference: https://github.com/ggml-org/llama.cpp
Please refer to the Official DuckDB repository for documentation of the base source code (https://github.com/duckdb/duckdb).