Use of LLM-based inference is evolving beyond its origins in chat. Today's use cases combine multiple inference calls, tool calls, and database lookups; RAG, agentic AI, and deep research are three examples of these more sophisticated workloads.
The goal of this project is to facilitate optimizations that drastically reduce the cost of inference for RAG, agentic AI, and deep research (by 10x) without harming accuracy. Our approach is to generalize the interface to inference servers via the Span Query.
In a span query, chat is a special case of a more general form. To the right is a visualization of a span query for a "judge/generator" (a.k.a. "LLM-as-a-judge").
Learn more about span query syntax and semantics
SPNL is a library for creating, optimizing, and tokenizing span queries. The library is surfaced for consumption as:
vLLM image | vLLM patch | CLI image | CLI image with Ollama | Rust crate | Python pip | Playground
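As a quick orientation, the snippet below sketches how the Python package and Rust crate might be pulled in. The package and crate names here are assumptions based on the project name; follow the links above for the published names.

# Assumed package/crate names -- check the "Python pip" and "Rust crate" links above
pip install spnl      # Python package (name assumed)
cargo add spnl        # Rust crate (name assumed)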
The spnl CLI provides commands for running span queries and managing vLLM deployments. On macOS, you can install it via Homebrew:
# Add the tap
brew tap IBM/spnl https://github.com/IBM/spnl
# Install the spnl CLI
brew install spnl

For other platforms, you can download the latest spnl CLI from the SPNL releases page.
The spnl CLI also makes it easy to deploy and manage vLLM inference servers on Kubernetes or Google Compute Engine. See the vLLM documentation for detailed instructions.
Quick example:
# Bring up a vLLM server on Kubernetes (requires HuggingFace token)
spnl vllm up my-deployment --target k8s --hf-token YOUR_HF_TOKEN
# Bring down the vLLM server
spnl vllm down my-deployment --target k8s

To kick the tires with the spnl CLI running Ollama:
podman run --rm -it ghcr.io/ibm/spnl-ollama --verbose

This will run a judge/generator email example. You can also point it at a JSON file containing a span query.
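For example, assuming the container accepts a path to a query file as an argument and that you mount the file into the container (both are assumptions, not documented here), the invocation might look like:

# Hypothetical invocation: mount a local span-query file into the container
# and pass its path (the argument form is an assumption)
podman run --rm -it -v "$PWD":/queries ghcr.io/ibm/spnl-ollama /queries/my-query.json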
For comprehensive CLI documentation including all commands, options, and examples, see docs/cli.md.
Quick reference:
# Run a query
spnl run [OPTIONS]
# Manage vLLM deployments
spnl vllm <up|down> [OPTIONS]
# Get help
spnl --help
spnl run --help
spnl vllm --help

First, configure your environment for Rust. You can then build the CLI with cargo build -p spnl-cli, which will produce ./target/debug/spnl. Adding --release will produce an optimized build in ./target/release/spnl.
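For reference, the two build invocations described above are:

# Debug build; produces ./target/debug/spnl
cargo build -p spnl-cli

# Optimized release build; produces ./target/release/spnl
cargo build -p spnl-cli --release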