The Quereus runtime executes query plans through a three-phase process: Planning (AST → Plan Nodes), Emission (Plan Nodes → Instructions), and Execution (Instructions → Results).
Core SQL data types that can be stored and manipulated:
type SqlValue = string | number | bigint | boolean | Uint8Array | null;Input types that instructions can receive as arguments:
type RuntimeValue = SqlValue | Row | AsyncIterable<Row> | ((ctx: RuntimeContext) => OutputValue);Output types that instructions can produce:
type OutputValue = MaybePromise<RuntimeValue>;The runtime uses TypeScript's structural typing for type safety. Key classes and interfaces:
PlanNode: Base class for all plan nodesVoidNode: Plan nodes that don't produce output (DDL, DML)RelationalNode: Plan nodes that produce rows (must implementgetAttributes())ExpressionNode: Plan nodes that produce scalar values
// src/planner/nodes/my-operation-node.ts
import { RelationalNode } from './plan-node.js';
import { PlanNodeType } from './plan-node-type.js';
import { Cached } from '../../util/cached.js';
export class MyOperationNode extends PlanNode implements UnaryRelationalNode {
readonly nodeType = PlanNodeType.MyOperation;
private attributesCache: Cached<Attribute[]>;
constructor(
scope: Scope,
public readonly source: RelationalPlanNode,
public readonly operationParam: string
) {
super(scope, source.getTotalCost() + 10); // Add operation cost
this.attributesCache = new Cached(() => this.buildAttributes());
}
private buildAttributes(): Attribute[] {
// Define how this node creates/transforms attributes
// Option 1: Preserve source attributes (like FilterNode, SortNode)
return this.source.getAttributes();
// Option 2: Create new attributes (like ProjectNode)
// return this.projections.map((proj, index) => ({
// id: PlanNode.nextAttrId(),
// name: proj.alias ?? `col_${index}`,
// type: proj.node.getType(),
// sourceRelation: `${this.nodeType}:${this.id}`
// }));
}
getAttributes(): Attribute[] {
return this.attributesCache.value;
}
getType(): RelationType {
// Define output relation type
return this.source.getType(); // Or build custom type
}
// ... other required methods
}// src/planner/nodes/plan-node-type.ts
export enum PlanNodeType {
// ... existing types
MyOperation = 'MyOperation',
}// src/planner/building/my-operation.ts
import type { PlanningContext } from '../planning-context.js';
import * as AST from '../../parser/ast.js';
import { MyOperationNode } from '../nodes/my-operation-node.js';
import { buildSelectStmt } from './select.js';
export function buildMyOperationStmt(ctx: PlanningContext, stmt: AST.MyOperationStmt): MyOperationNode {
// Build child nodes
const sourceNode = buildSelectStmt(ctx, stmt.inputQuery);
// Validate parameters
if (!stmt.operationParam) {
throw new QuereusError('Operation parameter required', StatusCode.ERROR);
}
return new MyOperationNode(ctx.scope, sourceNode, stmt.operationParam);
}All plan nodes follow standardized output conventions for consistent query plan display and debugging.
Each plan node provides three complementary sources of information:
{
id: string, // Unique node identifier
nodeType: PlanNodeType, // Node type enum (displayed by viewer)
description: string, // toString() output
logical: Record<string, any>, // getLogicalProperties() output
physical?: PhysicalProperties // Physical execution properties (when optimized)
}Purpose: Provide concise, human-readable descriptions for quick plan comprehension.
Rules:
- Never include node type, ID, or parentheses
- Keep ≤ 80 characters when practical
- Start with SQL keyword or principal action
- Show only essential information (predicates, projections, etc.)
- Don't duplicate information from logical/physical properties
Examples:
// TableReferenceNode
toString(): "main.users"
// FilterNode
toString(): "where age > 40"
// ProjectNode
toString(): "select name, count(*) as total"
// SortNode
toString(): "order by name desc, age asc"
// AggregateNode
toString(): "group by dept_id agg count(*) as count, sum(salary) as total"Purpose: Provide comprehensive logical information for detailed plan analysis.
Rules:
- Always return an object (never undefined)
- Use camelCased keys with semantic meaning
- Return primitive JSON types when possible (strings, numbers, arrays)
- Include logically important information not in description
- Don't duplicate physical properties (estimatedRows, ordering, etc.)
Examples:
// FilterNode
getLogicalProperties(): {
predicate: "age > 40"
}
// AggregateNode
getLogicalProperties(): {
groupBy: ["dept_id"],
aggregates: [
{ expression: "COUNT(*)", alias: "count" },
{ expression: "SUM(salary)", alias: "total" }
]
}Use consistent formatting helpers from src/util/plan-formatter.ts:
import {
formatExpression, // ScalarPlanNode → string
formatExpressionList, // ScalarPlanNode[] → "expr1, expr2, ..."
formatProjection, // Expression + alias → "expr AS alias"
formatSortKey, // Expression + direction + nulls → "expr DESC NULLS LAST"
formatScalarType // ScalarType → "INTEGER" | "TEXT" | etc.
} from '../../util/plan-formatter.js';export class MyOperationNode extends PlanNode {
// ... constructor and other methods
override toString(): string {
// Concise description focusing on key operation details
return `MY_OP ${this.operationParam}`;
}
override getLogicalProperties(): Record<string, unknown> {
return {
operation: this.operationParam,
targetColumns: this.columns.map(col => col.name),
// Include other logical details...
};
}
}This standardized format ensures plan viewers receive consistent, comprehensive information for both quick scanning (description) and deep analysis (logical + physical properties).
// src/runtime/emit/my-operation.ts
import type { MyOperationNode } from '../../planner/nodes/my-operation-node.js';
import type { Instruction, RuntimeContext } from '../types.js';
import type { RowDescriptor } from '../../planner/nodes/plan-node.js';
import type { EmissionContext } from '../emission-context.js';
import { emitPlanNode } from '../emitters.js';
import { createRowSlot } from '../context-helpers.js';
export function emitMyOperation(plan: MyOperationNode, ctx: EmissionContext): Instruction {
const sourceInstruction = emitPlanNode(plan.source, ctx);
// Create row descriptor for source attributes
const sourceRowDescriptor: RowDescriptor = [];
const sourceAttributes = plan.source.getAttributes();
sourceAttributes.forEach((attr, index) => {
sourceRowDescriptor[attr.id] = index;
});
// Create output row descriptor (if this node transforms attributes)
const outputRowDescriptor: RowDescriptor = [];
const outputAttributes = plan.getAttributes();
outputAttributes.forEach((attr, index) => {
outputRowDescriptor[attr.id] = index;
});
// Common run function pattern: streaming with row slot
async function* run(rctx: RuntimeContext, inputRows: AsyncIterable<Row>): AsyncIterable<Row> {
const rowSlot = createRowSlot(rctx, sourceRowDescriptor);
try {
for await (const row of inputRows) {
rowSlot.set(row);
const processedRow = processRow(row, plan.operationParam);
yield processedRow;
}
} finally {
rowSlot.close();
}
}
// For scalar operations:
// function run(rctx: RuntimeContext, inputValue: SqlValue): SqlValue {
// return processValue(inputValue, plan.operationParam);
// }
// For void operations (DDL/DML):
// async function run(rctx: RuntimeContext, inputRows: AsyncIterable<Row>): Promise<void> {
// await processRowsWithContext(rctx, sourceRowDescriptor, inputRows, async (row) => {
// await performSideEffect(row);
// });
// return undefined;
// }
// Emit child instructions
const sourceInstruction = emitPlanNode(plan.source, ctx);
return {
params: [sourceInstruction],
run,
note: `myOperation(${plan.operationParam})`
};
}// src/runtime/register.ts
import { emitMyOperation } from './emit/my-operation.js';
export function registerEmitters() {
// ... existing registrations
registerEmitter(PlanNodeType.MyOperation, emitMyOperation as EmitterFunc);
}Use context helpers to manage row contexts safely and efficiently:
Pattern 1: High-volume streaming (createRowSlot) — preferred for all streaming emitters
import { createRowSlot } from '../context-helpers.js';
// Used by scan, join, filter, project, and distinct emitters.
// Installs the context entry once; updates by cheap field write per row.
const rowSlot = createRowSlot(rctx, rowDescriptor);
try {
for await (const row of sourceRows) {
rowSlot.set(row); // Cheap update - no Map mutation
yield processRow(row);
}
} finally {
rowSlot.close();
}Pattern 2: One-off / low-frequency context (withRowContext / withAsyncRowContext)
import { withRowContext, withAsyncRowContext } from '../context-helpers.js';
// Best for single-row evaluations such as constraint checks, DML context
// setup, or any place where Map.set+delete once is negligible.
// Synchronous evaluation
const result = withRowContext(rctx, rowDescriptor, () => row, () => {
return evaluateExpression(rctx);
});
// Async evaluation
const result = await withAsyncRowContext(rctx, rowDescriptor, () => row, async () => {
return await evaluateAsyncExpression(rctx);
});Column references are resolved automatically using attribute IDs. The runtime now searches the context from newest → oldest, so the most recently-pushed scope wins:
// In emitColumnReference (built-in):
function run(ctx: RuntimeContext): SqlValue {
// Deterministic lookup: newest (innermost) scope wins
return resolveAttribute(ctx, plan.attributeId, plan.expression.name);
}The Scheduler executes instructions in dependency order:
- Flattening: Converts instruction tree to linear array
- Dependency Resolution: Ensures instructions execute after their dependencies
- Async Handling: Uses
Promise.all()for concurrent dependency resolution - Memory Management: Clears instruction arguments after execution
- Row Descriptors: Always create row descriptors mapping attribute IDs to column indices
- Context Cleanup: Use try/finally blocks to ensure context cleanup
- Return Types: Match your function signature to expected output type
- Async Iterables: Use
async function*for row-producing operations - Error Handling: Throw
QuereusErrorwith appropriateStatusCode - Attribute Preservation: Understand whether your node preserves or creates new attributes
Quereus resolves all schema dependencies during the planning phase and tracks them for automatic plan invalidation:
All schema objects are resolved during planning and stored directly in plan nodes:
// TableReferenceNode stores pre-resolved objects
class TableReferenceNode {
constructor(
scope: Scope,
public readonly tableSchema: TableSchema,
public readonly vtabModule: VirtualTableModule,
public readonly vtabAuxData?: unknown
) { ... }
}
// ScalarFunctionCallNode stores pre-resolved function
class ScalarFunctionCallNode {
constructor(
scope: Scope,
public readonly expression: AST.FunctionExpr,
public readonly functionSchema: FunctionSchema,
public readonly operands: ScalarPlanNode[]
) { ... }
}The planning context tracks all schema dependencies:
// During planning
const functionSchema = resolveFunctionSchema(ctx, 'sum', 1);
const tableSchema = resolveTableSchema(ctx, 'users');
const vtabModule = resolveVtabModule(ctx, 'memory');
// Dependencies tracked automatically
ctx.schemaDependencies.recordDependency({
type: 'function',
objectName: 'sum/1'
}, functionSchema);Prepared statements automatically invalidate when dependencies change:
// Schema change triggers automatic plan invalidation
schemaManager.createTable(...); // Emits 'table_added' event
// → Statements using affected schema objects recompile automaticallyQuereus implements a robust attribute-based context system that eliminates the architectural deficiencies of traditional node-based column reference resolution.
Core Design Principles:
- Stable Attribute IDs: Every column is identified by a unique, stable attribute ID that persists across plan transformations and optimizations.
- Deterministic Resolution: Column references use attribute IDs for lookup, eliminating the need for node type checking or fragile node-based resolution.
- Context Isolation: Each row context is isolated using row descriptors that map attribute IDs to column indices.
- Transformation Safety: Plan transformations (logical→physical) preserve attribute IDs, ensuring column references remain valid.
RowDescriptor: Maps attribute IDs to column indices in a row
type RowDescriptor = number[]; // attributeId → columnIndex mappingRowGetter: Function that provides access to the current row
type RowGetter = () => Row;RuntimeContext: Uses attribute-based context mapping
interface RuntimeContext {
db: Database;
stmt: Statement;
params: SqlParameters;
context: RowContextMap; // Row contexts with O(1) attribute index
}Every relational plan node must implement getAttributes(): Attribute[] to define its output schema:
interface Attribute {
id: number; // Stable, unique identifier
name: string; // Column name
type: ScalarType; // Column type
sourceRelation: string; // For debugging/tracing
}Key principles:
- Attribute IDs are stable across plan transformations
- Column references use attribute IDs for resolution, not node references
- Optimizer preserves attribute IDs when converting logical to physical nodes
- No node type checking required in
emitColumnReference
Quereus provides comprehensive debugging infrastructure for diagnosing context-related issues, which are common when developing new emitters or troubleshooting column reference resolution problems.
quereus:runtime:context: General context lifecycle operations
quereus:runtime:context:lookup: Column reference resolution attempts
# Enable all context tracing
set DEBUG=quereus:runtime:context* && yarn test"No row context found" Errors:
- Enable
DEBUG=quereus:runtime:context:lookupto see what contexts are available - Check if the expected attribute ID is present in any context
- Verify context push/pop timing with
DEBUG=quereus:runtime:context
Context Lifecycle Issues:
- Enable
DEBUG=quereus:runtime:contextto trace context management - Look for mismatched PUSH/POP operations
- Verify contexts are available when column references are evaluated
Best Practices for Emitter Authors:
- Always use the logging helpers:
logContextPush()andlogContextPop() - Include meaningful notes that identify the operation context
- Log attribute information when setting up row descriptors
- Always use context helpers (
withRowContext,withAsyncRowContext,createRowSlot) - Never call
rctx.context.set/deletedirectly - Choose the appropriate helper based on your use case
- Include meaningful notes in your instruction's
notefield
Quereus implements a precise distinction between bags (multisets) and sets in its relational model, aligning with Third Manifesto principles and enabling sophisticated query optimizations.
Set: A relation that guarantees unique rows (no duplicates)
- All rows are distinct according to the relation's primary key(s)
- Example: Result of
SELECT DISTINCT, aggregation results, base tables
Bag: A relation that can contain duplicate rows
- Multiple identical rows are possible
- Example: Result of
SELECT * FROM table, table function outputs
Every relational plan node specifies whether it produces a set or bag via the isSet property:
interface RelationType {
...
isSet: boolean; // true = set (unique rows), false = bag (duplicates possible)
...
}Nodes that produce Sets (isSet: true): - TableScanNode, AggregateNode/StreamAggregateNode, SingleRowNode, SequencingNode
Nodes that may produce Bags (isSet: false): - TableFunctionCallNode (depends on function declaration), ProjectNode (depending on whether key columns are preserved, and whether distinct), FilterNode (reflects input), SortNode (reflects input), WindowNode, ValuesNode (assumed to be bag, but we could check statically)
SequencingNode is a special operation that converts any bag into a set by adding a unique row number column (sequenceColumnName)
Runtime Behavior:
// Emitter adds row numbers to each row
async function* run(ctx: RuntimeContext, source: AsyncIterable<Row>): AsyncIterable<Row> {
let rowNumber = 1;
for await (const sourceRow of source) {
yield [...sourceRow, rowNumber++] as Row;
}
}The bag/set distinction enables important optimizations:
Set-Specific Optimizations:
- Duplicate elimination can be skipped for sets
- Certain join algorithms are more efficient with sets
- Set operations (UNION, INTERSECT) have different complexity
Bag-Aware Planning:
- Streaming operations can be more efficient on bags
- Memory usage optimizations for bag operations
- Different sorting strategies for bags vs sets
This design aligns with Third Manifesto principles:
- Clear Semantics: Explicit distinction between sets and bags
- Type Safety: RelationType captures bag/set information at compile time
- Algebraic Foundation: Operations preserve or transform bag/set properties predictably
- Optimization Enabling: Type information guides query optimization decisions
Quereus implements a uniform OLD/NEW attribute model for all mutation operations (INSERT, UPDATE, DELETE) that eliminates conditional context management and provides consistent symbol resolution.
Always-Present Attributes: Every mutation operation has both OLD and NEW attributes for every table column, regardless of operation type:
- INSERT: OLD attributes are constant NULL, NEW attributes contain inserted values
- UPDATE: OLD attributes contain pre-update values, NEW attributes contain post-update values
- DELETE: OLD attributes contain deleted values, NEW attributes are constant NULL
Flat Row Composition: At runtime, mutation contexts use a flat row format:
[oldCol0, oldCol1, ..., oldColN, newCol0, newCol1, ..., newColN]
During statement building, mutation operations generate:
oldRowDescriptor: Maps OLD attribute IDs to indices 0..n-1 in flat rownewRowDescriptor: Maps NEW attribute IDs to indices n..2n-1 in flat row- Layered scope registration where unqualified column references default to the meaningful values:
- INSERT/UPDATE: NEW attributes (since OLD may be NULL/irrelevant)
- DELETE: OLD attributes (since NEW is always NULL)
Context Setup: Single flat context eliminates attribute ID collisions:
// Use withRowContext for constraint evaluation
const flatRow = composeOldNewRow(oldRow, newRow, columnCount);
await withAsyncRowContext(rctx, flatRowDescriptor, () => flatRow, async () => {
await evaluateConstraints(rctx);
});Symbol Resolution: Column references resolve deterministically:
- Unqualified
column→ NEW.column (INSERT/UPDATE) or OLD.column (DELETE) - Qualified
OLD.column→ OLD section of flat row - Qualified
NEW.column→ NEW section of flat row
Constraint Evaluation: All constraints (CHECK, NOT NULL) evaluate against the flat row context without conditional logic. CHECK constraints that reference other relations automatically defer to transaction boundaries via the DeferredConstraintQueue, so emitters simply enqueue the evaluator and continue streaming. Deferred rows reuse a single runtime context and row slot for efficiency while preserving scope isolation.
- Eliminates Context Conflicts: Single flat descriptor prevents attribute ID collisions
- Simplifies Emitters: No conditional OLD/NEW context setup across mutation types
- Consistent Symbol Space: OLD/NEW always available, always defined for all operations
- Easier Reasoning: Users can reliably reference OLD/NEW in any mutation context
- Future-Proof: Supports triggers, defaults, and other features that need OLD/NEW access
The previous model used conditional OLD/NEW descriptors with metadata properties:
// OLD MODEL - conditional contexts
if (plan.oldRowDescriptor) {
rctx.context.set(plan.oldRowDescriptor, () => updateData.oldRow);
}
// Plus hidden __updateRowData properties
// CURRENT MODEL - always-present flat context with helpers
const flatRow = composeOldNewRow(oldRow, newRow, columnCount);
const slot = createRowSlot(rctx, flatRowDescriptor);
try {
for await (const flatRow of flatRows) {
slot.set(flatRow);
yield flatRow;
}
} finally {
slot.close();
}This eliminates the break-fix cycle where attribute ID conflicts caused unpredictable column resolution behavior.
Quereus supports table-level mutation context variables that provide per-operation parameters for default values and constraints. This feature integrates seamlessly with the existing attribute-based context system.
Mutation context allows you to:
- Define reusable parameters in table definitions
- Pass different values for each DML operation
- Use context in default value expressions
- Reference context in CHECK constraints (both immediate and deferred)
- Provide runtime-specific validation rules
Planning Phase:
- Context variables are parsed from
WITH CONTEXT (...)clauses - Variables converted to attributes with unique attribute IDs
- Context scope created using
RegisteredScope - Both unqualified (
varName) and qualified (context.varName) symbols registered - Context variables registered BEFORE OLD/NEW columns (giving them shadowing precedence)
Runtime Phase:
- Context values evaluated once per statement (not per row)
- Context stored in row descriptor using attribute ID mapping
- Context made available via
createRowSlot()for the statement lifetime - Context composed with OLD/NEW rows for constraint evaluation:
[context..., old..., new...]
Mutation context variables are registered in scopes using the same mechanism as table columns:
// In constraint-builder.ts
contextAttributes.forEach((attr, contextVarIndex) => {
const contextVar = tableSchema.mutationContext![contextVarIndex];
const varNameLower = contextVar.name.toLowerCase();
// Register both unqualified and qualified names
constraintScope.registerSymbol(varNameLower, (exp, s) =>
new ColumnReferenceNode(s, exp as AST.ColumnExpr, attr.type, attr.id, contextVarIndex)
);
constraintScope.registerSymbol(`context.${varNameLower}`, (exp, s) =>
new ColumnReferenceNode(s, exp as AST.ColumnExpr, attr.type, attr.id, contextVarIndex)
);
});Resolution Order:
- Context variables registered first (in constraint scopes)
- OLD/NEW columns registered after
- Unqualified references resolve to context if name matches
- Qualified
context.varNamealways resolves to context
Context Evaluation:
// In constraint-check emitter
// Evaluate context once per statement
const contextRow: Row = [];
for (const contextEvaluator of contextEvalFunctions) {
const value = await contextEvaluator(rctx) as SqlValue;
contextRow.push(value);
}
// Install context for statement duration
const contextSlot = createRowSlot(rctx, contextDescriptor);
contextSlot.set(contextRow);
try {
// Process rows - context available to all child operations
for await (const row of inputRows) {
// Defaults and constraints can reference context variables
}
} finally {
contextSlot.close();
}Combined Row Composition: For constraint evaluation, context is composed with OLD/NEW rows:
const combinedRow = [...contextRow, ...oldRow, ...newRow];
const combinedDescriptor = composeCombinedDescriptor(contextDescriptor, flatRowDescriptor);Descriptor Composition:
function composeCombinedDescriptor(
contextDescriptor: RowDescriptor,
flatRowDescriptor: RowDescriptor
): RowDescriptor {
const combined: RowDescriptor = [];
const contextLength = Object.keys(contextDescriptor).length;
// Context attributes: indices 0..contextLength-1
for (const attrIdStr in contextDescriptor) {
const attrId = parseInt(attrIdStr);
combined[attrId] = contextDescriptor[attrId];
}
// OLD/NEW attributes: offset by contextLength
for (const attrIdStr in flatRowDescriptor) {
const attrId = parseInt(attrIdStr);
combined[attrId] = flatRowDescriptor[attrId] + contextLength;
}
return combined;
}Mutation context is captured and preserved for deferred constraints:
Queueing:
rctx.db._queueDeferredConstraintRow(
baseTable,
constraintName,
row.slice() as Row,
flatRowDescriptor,
evaluator,
connectionId,
contextRow, // Captured context values
contextDescriptor // Context row descriptor
);Evaluation at COMMIT:
// Compose context with flat row for deferred evaluation
const evaluationRow = entry.contextRow
? [...entry.contextRow, ...entry.row]
: entry.row;
const evaluationDescriptor = entry.contextRow && entry.contextDescriptor
? composeCombinedDescriptor(entry.contextDescriptor, entry.descriptor)
: entry.descriptor;
// Evaluate with context available
const slot = createRowSlot(runtimeCtx, evaluationDescriptor);
slot.set(evaluationRow);
const value = await entry.evaluator(runtimeCtx);DML Nodes (InsertNode, UpdateNode, DeleteNode):
mutationContextValues?: Map<string, ScalarPlanNode>- Value expressions for each variablecontextAttributes?: Attribute[]- Attribute metadata for context variablescontextDescriptor?: RowDescriptor- Maps attribute IDs to row indices
ConstraintCheckNode:
- Receives mutation context from parent DML node
- Stores context for use during emission
- Passes context through optimizer transformations
Attribute-Based Context:
- Mutation context uses the same attribute ID system as OLD/NEW rows
- Context attributes have unique, stable IDs
- No special handling needed - integrates with existing
resolveAttribute()
Row Descriptors:
- Context uses standard row descriptors
- Context row composed with OLD/NEW rows for constraint evaluation
- Single combined descriptor provides unified attribute lookup
Transaction Support:
- Context evaluated per statement
- Captured for deferred constraints
- Preserved across savepoints (part of queued row data)
When adding new mutation operations:
- Process
stmt.contextValuesin the builder - Create context attributes with unique IDs
- Build context expression plan nodes
- Create context scope and register variables (both forms)
- Pass context scope when evaluating defaults
- Pass context attributes to
buildConstraintChecks() - Create context descriptor from attributes
- Pass mutation context to plan node constructors
- Pass mutation context to ConstraintCheckNode
Key Points:
- Context is evaluated once per statement (performance)
- Context persists for entire statement via row slot
- Context composed with OLD/NEW for constraints
- Deferred constraints capture and preserve context
- Use existing context helpers - no special APIs needed
Quereus enforces that all expressions in CHECK constraints and DEFAULT values must be deterministic. This ensures that captured statements at the VTable update boundary are fully deterministic and replayable.
Non-deterministic expressions (like random(), datetime('now')) produce different values on each execution. If these were allowed in constraints or defaults:
- Replaying captured statements would produce different results
- Constraint validation could be inconsistent
- Audit logs would not be reproducible
Rejected in Constraints and Defaults:
random(),randomblob()- Random value generationdate('now'),time('now'),datetime('now'),julianday('now')- Current time functions- User-defined functions marked as non-deterministic
- Any expression containing non-deterministic sub-expressions
Allowed in Constraints and Defaults:
- Constant literals:
42,'hello',true - Deterministic built-in functions:
upper(),lower(),abs(),round() - Column references:
NEW.price,OLD.quantity - Mutation context variables:
context.timestamp,context.user_id - User-defined functions marked as deterministic (default)
Instead of using non-deterministic functions directly, pass values via mutation context:
-- ❌ REJECTED: Non-deterministic default
create table orders (
id integer primary key,
created_at text default datetime('now') -- ERROR
);
-- ✅ ACCEPTED: Use mutation context
create table orders (
id integer primary key,
created_at text default timestamp
) with context (
timestamp text
);
-- Pass the timestamp when inserting
insert into orders (id)
with context timestamp = datetime('now')
values (1);Determinism is tracked through the PhysicalProperties system:
interface PhysicalProperties {
deterministic: boolean; // Same inputs → same outputs
readonly: boolean; // No side effects
idempotent: boolean; // Safe to call multiple times
constant: boolean; // Directly produces constant result
}Propagation Rules:
- Function nodes check the
FunctionFlags.DETERMINISTICflag - Non-deterministic functions mark
deterministic: false - Properties propagate bottom-up through the expression tree
- Parent nodes inherit the most restrictive properties from children
User-Defined Functions:
// Non-deterministic UDF
db.createScalarFunction("my_random",
{ numArgs: 0, deterministic: false },
() => Math.random()
);
// Deterministic UDF (default)
db.createScalarFunction("my_upper",
{ numArgs: 1, deterministic: true }, // or omit (defaults to true)
(text) => String(text).toUpperCase()
);CREATE TABLE:
- DEFAULT expressions are rejected if they reference bind parameters
(
?,:name) or table columns; both are detected via an AST pre-walk before expression building. - DEFAULT expressions are then built and rejected if their physical
deterministicproperty is false (e.g.random()). - CHECK constraints are walked at DDL time: any function call is looked up
against the registry and rejected unless it has the
DETERMINISTICflag. Bind parameters (?,:name) are also rejected at DDL time. Column references inside CHECK predicates are validated later, at INSERT/UPDATE time, when the row scope is established.
These DDL-time guards currently fire only on CREATE TABLE. The ALTER TABLE paths (ADD COLUMN, ADD CONSTRAINT, ALTER COLUMN ... SET DEFAULT) do not yet route through the same validators — extending them is
a known follow-up.
INSERT/UPDATE:
- DEFAULT expressions validated when building row expansion
- CHECK constraints validated when building constraint checks (full column-scope resolution happens here)
ALTER TABLE ADD CONSTRAINT:
- Validation deferred to first INSERT/UPDATE (constraints may reference NEW/OLD)
// Streaming pattern with row slot
async function* run(rctx: RuntimeContext, input: AsyncIterable<Row>): AsyncIterable<Row> {
const slot = createRowSlot(rctx, rowDescriptor);
try {
for await (const row of input) {
slot.set(row);
yield processRow(row, rctx);
}
} finally {
slot.close();
}
}function run(rctx: RuntimeContext, ...args: SqlValue[]): SqlValue {
// Compute result
return result;
}async function run(rctx: RuntimeContext, input: AsyncIterable<Row>): Promise<undefined> {
// Process each row with proper context
for await (const row of input) {
await withAsyncRowContext(rctx, rowDescriptor, () => row, async () => {
await performMutation(row, rctx);
});
}
return undefined;
}The Quereus optimizer transforms logical plan nodes into physical execution plans between the builder and runtime phases. This section covers the key aspects relevant to runtime emitter development.
The optimizer uses a single plan node hierarchy with logical-to-physical transformation:
- Logical nodes: Created by the builder - may or may not have physical emitters
- Physical nodes: Transformed by the optimizer with execution properties
- Attribute preservation: Column references use stable attribute IDs that survive optimization
Key optimizer guarantees for emitter authors:
- Every node reaching the emitter phase has
physicalproperties set - Attribute IDs remain stable across all transformations
- Column references can rely on deterministic attribute ID lookup
- The optimizer respects virtual table capabilities via
BestAccessPlan
Physical properties capture execution characteristics used by both optimizer and runtime:
interface PhysicalProperties {
ordering?: Ordering[]; // Output row ordering
estimatedRows?: number; // Cardinality estimate
uniqueKeys?: number[][]; // Attribute IDs forming unique keys
deterministic: boolean; // Pure and repeatable
readonly: boolean; // No side effects
}These can be overridden through overriding the computePhysical() plan node method, otherwise these are inherited from child nodes or are defaults.
computePhysical(): Partial<PhysicalProperties> {
return {
readonly: false, // Side-effecting (should only be set if the node directly mutates)
estimatedRows: this.source.estimatedRows,
uniqueKeys: this.source.getType().keys.map(key => key.map(colRef => colRef.index)),
};
}The runtime's column reference resolution relies on the optimizer's attribute ID preservation:
- Each column has a unique, stable attribute ID assigned during planning
- The optimizer's
withChildren()infrastructure preserves these IDs - Runtime column lookup uses attribute IDs, not names or positions
- This enables robust resolution across arbitrary plan transformations
For comprehensive optimizer details, see the Optimizer Documentation.
Quereus can reuse a single incremental runtime to power multiple features that react to base-table changes: transaction-deferred assertions, materialized views, and future trigger-like facilities. The core idea is to execute only the affected slice of a registered query at transaction boundaries using binding-aware residual plans.
- Reuse the same delta infrastructure across assertions and views
- Execute parameterized residuals per affected key/group; fall back to global when required
- Respect savepoints; changes rolled back via SAVEPOINT should not be visible to COMMIT-time checks
- ChangeCapture (existing): per-transaction change log tracking primary-key tuples per base table; savepoint aware
- BindingInference: classifies a plan’s table references as row-specific, group-specific, or global (see optimizer doc) and identifies binding keys (PK/unique or group-by/partition keys)
- ParameterizedPlanCache: per-registrant (assertion/view) and per relationKey, store prepared residual plans with parameter slots aligned to key order
- DeltaExecutor: at COMMIT, select impacted registrants, decide global vs per-binding execution, early-exit on first violation (assertions) or produce delta rows (views)
- Assertions: run residuals and fail on first non-empty result (error → rollback)
- Materialized Views (future): compute ΔView and merge into cached table (insert/update/delete)
- On SAVEPOINT: push a new change layer
- On ROLLBACK TO: discard the top layer
- On RELEASE: merge the top layer into the previous one
explain_assertion(name)exposes classification and prepared parameter layout for assertions- Future:
explain_view_delta(name)for materialized views
This design keeps runtime responsibilities focused on execution and caching, while the optimizer provides binding inference and plan shaping. See the optimizer document for analysis details.
SQL requires different coercion strategies for different contexts. Quereus handles coercion at two levels:
- Plan-time coercion — Cross-category comparisons (numeric vs textual) are resolved by the planner, which inserts explicit
CastNodes so the runtime never needs implicit coercion for comparisons or BETWEEN. - Runtime coercion — Arithmetic and aggregate contexts still use centralized utilities from
src/util/coercion.ts.
Comparison Context (plan-time):
- When one operand is numeric and the other textual, the planner wraps the textual operand in a CastNode targeting the numeric type
- Example:
42 = '42'→ planner rewrites to42 = cast('42' as INTEGER), both sides are numeric at runtime - No runtime coercion is needed; the generic comparison path only handles temporal checks
Arithmetic Context (coerceToNumberForArithmetic):
- Converts all values to numbers for arithmetic operations
- Non-numeric strings become 0 (SQL standard behavior)
- Example:
'abc' + 0→ 0,'123' + 0→ 123 - Used in: +, -, *, /, % operators
Aggregate Context (coerceForAggregate):
- Function-specific coercion for aggregate arguments
- COUNT functions skip coercion, numeric aggregates (SUM/AVG) coerce strings
- Used in: aggregate function argument processing
import { coerceToNumberForArithmetic, coerceForAggregate } from '../../util/coercion.js';
// In arithmetic operations:
const n1 = coerceToNumberForArithmetic(v1);
const n2 = coerceToNumberForArithmetic(v2);
const result = n1 + n2;
// In aggregate functions:
const coercedArg = coerceForAggregate(rawValue, functionName);
accumulator = schema.stepFunction(accumulator, coercedArg);Critical Rule: Never implement custom coercion logic in individual emitters. Always use centralized utilities (for arithmetic/aggregates) or rely on planner-inserted CastNodes (for comparisons) to ensure consistent behavior across the system.
Wrong:
const seen = new Set<string>();
const key = JSON.stringify(value);
if (seen.has(key)) continue; // Skip duplicate
seen.add(key);Problems:
- Doesn't follow SQL comparison rules
1and"1"have different JSON representations but may be equal in SQL- Doesn't respect collation rules
Correct — pre-resolve comparators at emit time to avoid runtime overhead:
import { BTree } from 'inheritree';
import { createCollationRowComparator, resolveCollation, BINARY_COLLATION } from '../util/comparison.js';
// At emit time: pre-resolve collation-based row comparator
const collationRowComparator = createCollationRowComparator(
attributes.map(attr => attr.type.collationName ? resolveCollation(attr.type.collationName) : BINARY_COLLATION)
);
// At runtime: use pre-resolved comparator in BTree
const distinctTree = new BTree<Row, Row>(
(row: Row) => row,
collationRowComparator
);
const existingPath = distinctTree.insert(row);
if (!existingPath.on) {
continue; // Skip duplicate
}For typed contexts (where runtime types are guaranteed, e.g. GROUP BY keys):
import { createTypedComparator, resolveCollation } from '../util/comparison.js';
// At emit time: pre-resolve typed comparator from expression type
const exprType = expr.getType();
const collationFunc = exprType.collationName ? resolveCollation(exprType.collationName) : undefined;
const comparator = createTypedComparator(exprType.logicalType, collationFunc);Based on real implementation experiences, here are key concepts and common mistakes to avoid when developing runtime emitters.
❌ NEVER call instructions directly:
// WRONG - bypasses scheduler
const result = await conditionInstruction.run(rctx, ...args);
if (result) {
// This breaks the execution model
}✅ ALWAYS use scheduler callbacks:
// CORRECT - scheduler handles execution and dependency resolution
if (conditionCallback) {
const conditionResult = await conditionCallback(rctx);
conditionMet = !!conditionResult;
}Why this matters:
- The scheduler manages instruction dependencies and execution order
- Direct calls bypass dependency resolution and can cause race conditions
- Callbacks ensure proper context setup and error handling
When debugging column resolution issues, understand the scope hierarchy:
Scope Resolution Order:
MultiScopechecks child scopes in order (first match wins)AliasedScopehandles qualified references (table.column)RegisteredScopecontains actual column-to-attribute mappings
Common scope resolution bugs:
- Missing scope in MultiScope: Check that all relevant scopes are included
- Wrong scope order: Earlier scopes shadow later ones - order matters
- Projection scope issues: After
ProjectNode, ensure both projection outputs AND original qualified columns are accessible
Debugging pattern:
// Add targeted debugging for specific symbols
if (symbolKey === 'problematic.column') {
console.log('Scope resolution for', symbolKey, 'in', this.scopes.length, 'scopes');
}Context Setup Pattern:
// Always use context helpers for row context
// Pattern 1: Streaming with row slot
const slot = createRowSlot(rctx, rowDescriptor);
try {
for await (const row of rows) {
slot.set(row);
const result = await processRow(row, rctx);
yield result;
}
} finally {
slot.close();
}
// Pattern 2: One-off async evaluation
const result = await withAsyncRowContext(rctx, rowDescriptor, () => row, async () => {
// Async processing with automatic cleanup
return await processRow(row, rctx);
});
// Pattern 3: One-off synchronous evaluation
const slot = createRowSlot(rctx, rowDescriptor);
try {
for await (const row of rows) {
slot.set(row);
yield processRow(row, rctx);
}
} finally {
slot.close(); // CRITICAL: Always clean up
}Common context bugs:
- Forgetting cleanup: Memory leaks and stale context references
- Wrong row descriptor: Attribute IDs don't match actual row structure
- Context timing: Setting up context too late or cleaning up too early
Effective debugging approaches:
- Start with scope resolution: Most column reference errors are scope issues
- Check context timing: Verify context is available when column references execute
- Use targeted logging: Debug specific symbols rather than everything
- Verify row descriptors: Ensure attribute IDs match actual row structure
- Test instruction isolation: Verify emitters work independently before integration
Debugging environment variables:
# Context lifecycle and column resolution
DEBUG=quereus:runtime:context* yarn test
# Specific operation tracing
DEBUG=quereus:runtime:emit:join yarn test
# Full runtime tracing (verbose)
DEBUG=quereus:runtime* yarn testRecursive CTE Execution Pattern
Quereus provides helper functions in src/runtime/context-helpers.ts to simplify context operations and ensure consistent behavior:
createRowSlot(rctx, descriptor)
- Creates a mutable slot for efficient streaming operations
- Installs context once, updates by reference (no Map mutations per row)
- Used by all high-frequency streaming emitters: scan, join, filter, project, distinct
- Must call
close()to clean up
resolveAttribute(rctx, attributeId, columnName?)
- Looks up an attribute ID in the current context via O(1) attribute index
- Falls back to linear scan newest → oldest when the indexed slot isn't populated yet
- Throws descriptive error if not found
withRowContext(rctx, descriptor, rowGetter, fn)
- Executes a function with a row context
- Executes a synchronous function with a row context
- Ensures proper cleanup in finally block
- Use for synchronous expression evaluation
withAsyncRowContext(rctx, descriptor, rowGetter, fn)
- Executes an async function with a row context
- Ensures proper cleanup in finally block
- Use for async operations (e.g., constraint checks)
Example usage:
import { createRowSlot, withRowContext, withAsyncRowContext, resolveAttribute } from '../context-helpers.js';
// Pattern 1: Streaming with row slot
async function* run(rctx: RuntimeContext, rows: AsyncIterable<Row>): AsyncIterable<Row> {
const slot = createRowSlot(rctx, rowDescriptor);
try {
for await (const row of rows) {
slot.set(row);
const value = someExpression(rctx); // Column refs auto-resolve
yield processRow(row, value);
}
} finally {
slot.close();
}
}
// Pattern 2: Synchronous expression evaluation
function evaluateSync(rctx: RuntimeContext, row: Row): SqlValue {
return withRowContext(rctx, rowDescriptor, () => row, () => {
// Synchronous expression evaluation
return someExpression(rctx);
});
}
// Pattern 4: Async operation with context
async function evaluateAsync(rctx: RuntimeContext, row: Row): Promise<SqlValue> {
return await withAsyncRowContext(rctx, rowDescriptor, () => row, async () => {
// Async operation (e.g., constraint check)
return await someAsyncOperation(rctx);
});
}