A deployable event-ordering runtime for distributed systems that still use clocks, but cannot rely on one globally synchronized clock as the truth model.
causal-order helps developers design and run event processing, replay, and recovery flows without assuming the system has one perfect global time source. It does not replace clocks or timestamps; it provides a deployable ordering layer when timestamp order alone is not enough to explain what happened or keep event integrity honest.
Website: https://causal-order.gazali.one
It helps you:
- order what can be ordered
- preserve concurrency only when it can be justified honestly
- flag what is suspicious
- keep proof, inference, fallback, and unknown distinct from one another
Distributed systems often produce misleading timelines:
- clocks drift across regions
- replayed events can look newer than original events
- offline devices sync late
- ingestion order differs from creation order
- some events are truly concurrent
causal-order exists to make that uncertainty visible instead of hiding it.
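To make the drift problem concrete, here is a minimal standalone sketch that does not use the library at all. It assumes two hand-written events where the child's node clock runs behind the parent's, and a hypothetical helper `findCausalInversions()` that reports entries a plain timestamp sort placed before their own parent.

```js
// Two events: the payment was caused by the order, but the payments
// node's clock runs 400 ms behind the orders node's clock.
const events = [
  { id: "evt-1", nodeId: "orders-api", physicalTimeMs: 1714971840500n },
  {
    id: "evt-2",
    nodeId: "payments-worker",
    physicalTimeMs: 1714971840100n,
    parentEventId: "evt-1",
  },
];

// Naive approach: sort by wall-clock timestamp only.
const byTimestamp = [...events].sort((a, b) =>
  a.physicalTimeMs < b.physicalTimeMs ? -1 : a.physicalTimeMs > b.physicalTimeMs ? 1 : 0
);

// Hypothetical helper: find entries sorted ahead of their own parent.
function findCausalInversions(sorted) {
  const position = new Map(sorted.map((event, index) => [event.id, index]));
  return sorted.filter(
    (event) =>
      event.parentEventId !== undefined &&
      position.has(event.parentEventId) &&
      position.get(event.parentEventId) > position.get(event.id)
  );
}

const inversions = findCausalInversions(byTimestamp);
console.log(byTimestamp.map((e) => e.id)); // [ 'evt-2', 'evt-1' ]
console.log(inversions.map((e) => e.id)); // [ 'evt-2' ]
```

The timestamp sort is deterministic and looks tidy, yet it puts the effect before its cause. That is the kind of timeline the package is meant to surface rather than hide.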
causal-order is built around a simple rule:
Be easy to use at the surface, but hard to misuse into false certainty.
In practice, that means:
- not every event set should be forced into one total order
- explicit causal evidence outranks clock appearance
- cross-node events without supported causal evidence should usually remain `unknown`
- shared `traceId` or `partition` metadata does not, by itself, imply causality
- streaming finality is operational, not causal truth
Supported causal evidence today is intentionally narrow:
- `parentEventId`
- `dependencyEventIds`
- same-node monotonic `sequence`
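The evidence rules above can be sketched as a small pairwise check. This is an illustration only, not the library's implementation: `describeEvidence()` is a hypothetical helper, and the event fields follow the envelope shape used in this README.

```js
// Hypothetical helper: reports which kind of supported evidence, if any,
// says that `earlier` happened before `later`. Not part of causal-order.
function describeEvidence(earlier, later) {
  if (later.parentEventId === earlier.id) return "parentEventId";
  if ((later.dependencyEventIds ?? []).includes(earlier.id)) {
    return "dependencyEventIds";
  }
  if (
    earlier.nodeId === later.nodeId &&
    typeof earlier.sequence === "bigint" &&
    typeof later.sequence === "bigint" &&
    earlier.sequence < later.sequence
  ) {
    return "same-node sequence";
  }
  // A shared traceId, a shared partition, or a smaller timestamp
  // is deliberately not treated as evidence.
  return "unknown";
}

const a = { id: "a", nodeId: "n1", sequence: 1n, traceId: "t-1" };
const b = { id: "b", nodeId: "n1", sequence: 2n };
const c = { id: "c", nodeId: "n2", parentEventId: "a" };
const d = { id: "d", nodeId: "n2", traceId: "t-1" };

console.log(describeEvidence(a, b)); // same-node sequence
console.log(describeEvidence(a, c)); // parentEventId
console.log(describeEvidence(a, d)); // unknown
```

Note that `a` and `d` share a `traceId`, yet the sketch still answers `unknown`, which mirrors the rule that trace metadata alone does not imply causality.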
This library is not trying to eliminate clocks. It is trying to stop treating wall-clock agreement as the truth model for a distributed system.
Given a set of distributed events, the library returns:
- `ordered`: events with `orderIndex`, `orderBasis`, and `confidence`
- `anomalies`: invalid, suspicious, or operationally important records
- `stats`: summary counts for the batch
Confidence is explicit:
- `proven`: explicit causal evidence exists
- `derived`: order was inferred from useful but weaker metadata
- `fallback`: deterministic ordering was imposed for stability
- `unknown`: the library cannot honestly justify the claim
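One way to act on these confidence levels is to tally them and pull out the entries that deserve a second look. The sketch below works on a hand-written stand-in for `result.ordered` that carries only the `event` and `confidence` fields; `countByConfidence()` and `needsReview()` are hypothetical helpers, not package exports.

```js
// Hand-written stand-in for `result.ordered`; only the fields used here.
const ordered = [
  { event: { id: "evt-1" }, confidence: "derived" },
  { event: { id: "evt-2" }, confidence: "proven" },
  { event: { id: "evt-3" }, confidence: "fallback" },
  { event: { id: "evt-4" }, confidence: "unknown" },
];

// Count how many entries fall under each confidence level.
function countByConfidence(entries) {
  const counts = { proven: 0, derived: 0, fallback: 0, unknown: 0 };
  for (const entry of entries) {
    if (Object.hasOwn(counts, entry.confidence)) counts[entry.confidence] += 1;
  }
  return counts;
}

// Entries whose position was imposed or could not be justified.
function needsReview(entries) {
  return entries
    .filter((e) => e.confidence === "fallback" || e.confidence === "unknown")
    .map((e) => e.event.id);
}

console.log(countByConfidence(ordered));
console.log(needsReview(ordered)); // [ 'evt-3', 'evt-4' ]
```

Treating `fallback` and `unknown` as the review set is a policy choice made for this sketch; a stricter workflow might also review `derived` entries.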
npm install causal-order
ESM only.
Focused imports are also available when you want a narrower public entrypoint:
import { orderEvents } from "causal-order/order"
import { orderEvents as batchOrderEvents } from "causal-order/batch"
import { orderEventStream } from "causal-order/stream"
import { createProcessingTimeWatermark } from "causal-order/watermarks"
import { translateBatch } from "causal-order/translate"
import { createHlcClock } from "causal-order/clock"
causal-order is the core runtime in the causal-order package ecosystem.
Additional packages extend the runtime with focused operational capabilities.
| Package | Purpose |
|---|---|
| `causal-order` | Core causal event ordering runtime |
| `@causal-order/dedupe` | Duplicate-event filtering before causal ordering |
See: https://www.npmjs.com/package/@causal-order/dedupe
Current runtime posture:
- published package support starts at Node.js `>=20`
- active development and performance validation target Node.js `24`
- CI still exercises Node.js `18`, `20`, and `24` to catch regressions around the supported floor
- the package is ESM only
For the fuller compatibility and support boundary, see COMPATIBILITY.md.
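If you want a deployment to fail early on an unsupported runtime, a small startup guard is one option. This is a sketch under assumptions: `isSupportedNode()` and `MINIMUM_NODE_MAJOR` are hypothetical names, and the value `20` simply mirrors the support floor stated above.

```js
// Hypothetical startup guard; the floor mirrors the stated Node.js >=20.
const MINIMUM_NODE_MAJOR = 20;

// Accepts a version string such as "20.11.0" (the format of
// process.versions.node, which has no leading "v").
function isSupportedNode(version) {
  const major = Number.parseInt(version.split(".")[0], 10);
  return Number.isInteger(major) && major >= MINIMUM_NODE_MAJOR;
}

if (!isSupportedNode(process.versions.node)) {
  console.warn(
    `Node.js ${process.versions.node} is below the supported floor (${MINIMUM_NODE_MAJOR}).`
  );
}
```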
import { orderEvents } from "causal-order"
const events = [
{
id: "evt-1",
nodeId: "orders-api",
clock: {
physicalTimeMs: 1714971840123n,
logicalCounter: 0,
nodeId: "orders-api",
},
sequence: 1n,
payload: { type: "order.created" },
},
{
id: "evt-2",
nodeId: "payments-worker",
clock: {
physicalTimeMs: 1714971840125n,
logicalCounter: 1,
nodeId: "payments-worker",
},
parentEventId: "evt-1",
payload: { type: "payment.captured" },
},
]
const result = orderEvents(events, {
strict: false,
detectAnomalies: true,
})
console.log(result.ordered)
console.log(result.anomalies)
Example output shape:
[
{
event: events[0],
orderIndex: 0n,
orderBasis: "sequence",
confidence: "derived",
},
{
event: events[1],
orderIndex: 1n,
orderBasis: "causal",
confidence: "proven",
causalEvidence: [{ type: "parent_event", parentEventId: "evt-1" }],
},
]
The important part is not just the order. It is the explanation of why that order exists and how trustworthy it is.
The default batch-ordering posture is designed to keep uncertainty visible instead of flattening it away:
const result = orderEvents(events, {
strict: false,
detectAnomalies: true,
})
The main options to understand are:
- think of the option boundary this way:
  - `strict` controls fail-fast behavior for ordering and validation
  - `allowUnknownOrder` controls whether unresolved placement stays explicit in non-strict output
  - `detectAnomalies` controls how much diagnostic analysis is emitted with the result
- `strict: false` keeps the run warning-visible by default so invalid or unresolved cases can surface as structured anomalies instead of stopping the whole batch immediately
  - translation fail-fast behavior is configured separately through `translateBatch()` policy rather than through this same option
- `allowUnknownOrder` defaults to an uncertainty-visible posture where unresolved placement can still be emitted with warning-level visibility in non-strict mode
- `detectAnomalies: true` keeps anomaly reporting on by default because the package treats anomaly visibility as part of the ordinary answer, not as a debugging extra
Two boundary rules matter here:
- setting `allowUnknownOrder: false` strengthens severity posture for unresolved output, but it does not invent stronger certainty or silently rewrite the ordering result
- setting `detectAnomalies: false` reduces emitted diagnostic output, but it does not make the result truer, cleaner, or more causally justified
If you are unsure, start with warning-visible defaults and tighten later once the surrounding workflow is actually prepared to reject uncertain input.
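The "start visible, tighten later" advice can be captured as a pair of option presets. This is a sketch, not package code: `chooseOrderOptions()` and the stage names `"explore"` and `"gate"` are hypothetical, and it assumes that `allowUnknownOrder: true` and `strict: true` are the boolean counterparts of the values discussed above.

```js
// Hypothetical helper: picks an options object for a pipeline stage.
// Option names come from this README; stage names are assumptions.
function chooseOrderOptions(stage) {
  if (stage === "gate") {
    // The surrounding workflow is prepared to reject uncertain input.
    return { strict: true, allowUnknownOrder: false, detectAnomalies: true };
  }
  // Default ("explore"): keep uncertainty visible as structured anomalies.
  return { strict: false, allowUnknownOrder: true, detectAnomalies: true };
}

console.log(chooseOrderOptions("explore"));
console.log(chooseOrderOptions("gate"));
```

Both presets keep `detectAnomalies` on, since turning it off only reduces diagnostic output and does not make the result more justified.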
When you need a direct helper rather than a full ordering pass, prefer the primary pairwise methods:
import { compareByHlc, compareDeterministically } from "causal-order"
const hlcRelation = compareByHlc(eventA.clock, eventB.clock)
const fallbackOrder = compareDeterministically(eventA, eventB, "event_id")
console.log(hlcRelation)
console.log(fallbackOrder)
As of 0.5.0, the preferred names are:
- `compareByHlc()` for direct HLC comparison
- `compareDeterministically()` for deterministic fallback comparison
The stable surface now keeps the primary names only.
Focused entrypoints and the root causal-order import both emphasize the same current API story instead of mixing canonical names with transitional aliases.
applyTieBreaker() remains available as a lower-level helper when you specifically
want the tie-break step on its own rather than the full deterministic fallback comparison.
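For readers new to hybrid logical clocks, the conventional comparison can be sketched as follows. This is not the library's `compareByHlc()`, whose return shape is not shown in this README. `compareHlcSketch()` is a hypothetical function that returns `-1`, `0`, or `1`, using the clock fields from the Quick Start example.

```js
// Sketch of a conventional HLC total order: physical time first,
// then logical counter, then node id as the final tie-break.
// Hypothetical function; not the library's compareByHlc().
function compareHlcSketch(a, b) {
  if (a.physicalTimeMs !== b.physicalTimeMs) {
    return a.physicalTimeMs < b.physicalTimeMs ? -1 : 1;
  }
  if (a.logicalCounter !== b.logicalCounter) {
    return a.logicalCounter < b.logicalCounter ? -1 : 1;
  }
  if (a.nodeId !== b.nodeId) {
    return a.nodeId < b.nodeId ? -1 : 1;
  }
  return 0;
}

const ordersClock = {
  physicalTimeMs: 1714971840123n,
  logicalCounter: 0,
  nodeId: "orders-api",
};
const paymentsClock = {
  physicalTimeMs: 1714971840125n,
  logicalCounter: 1,
  nodeId: "payments-worker",
};

console.log(compareHlcSketch(ordersClock, paymentsClock)); // -1
console.log(compareHlcSketch(paymentsClock, ordersClock)); // 1
console.log(compareHlcSketch(ordersClock, ordersClock)); // 0
```

A smaller HLC value is consistent with "happened before" but does not prove it: two unrelated events on different nodes also compare as ordered. That is why a clock comparison on its own is weaker than explicit causal evidence.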
When your data is not already in the library's event-envelope shape, use translateBatch() first and then pass the translated records into orderEvents().
import { orderEvents, translateBatch } from "causal-order"
const records = [
{
eventId: "evt-1",
source: "orders-api",
occurredAt: "1714971840123",
sequence: 1n,
body: { type: "order.created" },
},
{
eventId: "evt-2",
source: "payments-worker",
occurredAt: 1714971840125,
sequence: 1n,
parent: "evt-1",
body: { type: "payment.captured" },
},
]
const translated = translateBatch(records, {
getEventId: (record) => record.eventId,
getNodeId: (record) => record.source,
getPhysicalTime: (record) => record.occurredAt,
getSequence: (record) => record.sequence,
getParentEventId: (record) => record.parent,
getPayload: (record) => record.body,
})
console.log(translated.anomalies)
const ordered = orderEvents(translated.translated, {
strict: false,
detectAnomalies: true,
})
console.log(ordered.ordered)
What matters most:
- required mappers: `getEventId`, `getNodeId`, `getPhysicalTime`
- accepted timestamp inputs: `bigint`, safe integer `number`, or canonical integer `string`
- rejected timestamp inputs include `Date`, ISO timestamp strings, decimals, exponent notation, and unsafe integers
- translated results split accepted records from structured translation anomalies
Keep pre-translation shaping narrow. Pass original source records into translateBatch() whenever possible and let it own coercion, rejection, and anomaly reporting.
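The accept and reject rules for timestamps can be illustrated with a standalone sketch. This is not how `translateBatch()` is implemented: `coerceTimestampMs()` is a hypothetical function, and the sketch assumes that "canonical integer string" means plain decimal digits with no sign, whitespace, or leading zeros, and that negative values are rejected.

```js
// Assumption: "canonical" means digits only, no sign, no leading zeros.
const CANONICAL_INTEGER = /^(0|[1-9][0-9]*)$/;

// Hypothetical coercion sketch; returns { ok, value } or { ok, reason }.
function coerceTimestampMs(value) {
  if (typeof value === "bigint") {
    return value >= 0n ? { ok: true, value } : { ok: false, reason: "negative" };
  }
  if (typeof value === "number") {
    // Rejects decimals, NaN, Infinity, and integers beyond 2^53 - 1.
    if (!Number.isSafeInteger(value)) return { ok: false, reason: "not a safe integer" };
    if (value < 0) return { ok: false, reason: "negative" };
    return { ok: true, value: BigInt(value) };
  }
  if (typeof value === "string") {
    // Rejects ISO timestamps, decimals, and exponent notation such as "1e3".
    return CANONICAL_INTEGER.test(value)
      ? { ok: true, value: BigInt(value) }
      : { ok: false, reason: "not a canonical integer string" };
  }
  // Date objects and everything else land here.
  return { ok: false, reason: "unsupported type" };
}

console.log(coerceTimestampMs("1714971840123")); // { ok: true, value: 1714971840123n }
console.log(coerceTimestampMs(1714971840125)); // { ok: true, value: 1714971840125n }
console.log(coerceTimestampMs("2024-05-06T05:04:00Z").ok); // false
console.log(coerceTimestampMs(new Date(0)).ok); // false
```

Returning a structured rejection instead of throwing keeps the failure reportable alongside accepted records, which is the same spirit as splitting translated results from translation anomalies.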
If you need the deeper ingress-policy details, see:
- API: translateBatch()
- Guide: Policy Guidance
For operational review, replay audits, or emitted-batch inspection, the current release includes a small additive helper layer on top of the core runtime output:
import {
inspectOrderResult,
orderEvents,
summarizeTranslationAnomalies,
translateBatch,
} from "causal-order"
const translated = translateBatch(records, config)
const translationSummary = summarizeTranslationAnomalies(translated.anomalies)
const ordered = orderEvents(translated.translated, {
strict: false,
detectAnomalies: true,
})
const inspection = inspectOrderResult(ordered)
console.log(translationSummary)
console.log(inspection)
The 1.0.0 helper layer is intentionally narrow:
- `summarizeEventAnomalies()`
- `summarizeTranslationAnomalies()`
- `explainOrderedEvent()`
- `inspectOrderResult()`
- `inspectOrderBatch()`
These helpers summarize or explain existing package output. They do not hide anomalies, rewrite ordered state, or invent stronger causal claims than the runtime already supports.
If you need adjacent adapters, workflow glue, or domain policy on top of this surface, see:
For large or unbounded event flows, use orderEventStream() instead of assuming everything belongs in one in-memory batch.
That includes both ordinary day-to-day stream processing and delayed reconnect, offline sync, or recovery flows where late arrivals are part of normal operations.
Each emitted batch is a StreamOrderBatch carrying the currently ready ordered events plus stream-specific metadata such as watermark, optional correction, and isFinal.
import { orderEventStream } from "causal-order"
for await (const batch of orderEventStream(source(), {
batchSize: 100,
maxLateArrivalMs: 30_000n,
lateArrivalPolicy: "flag",
strict: false,
})) {
console.log(batch.events)
console.log(batch.anomalies)
console.log(batch.watermark, batch.isFinal)
}
Keep this mental model in mind:
- the watermark controls operational readiness, not causal truth
- late events are handled by explicit policy rather than being silently hidden
- non-final output may need later reconciliation, especially in reconnect-heavy flows
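The first two points can be illustrated with a toy readiness buffer. It shows the general watermark idea only and is not how `orderEventStream()` works internally: `createReadinessBuffer()` is a hypothetical function, the watermark is assumed to be the highest timestamp seen minus the allowed lateness, and late events are flagged but still kept, loosely in the spirit of a "flag" policy.

```js
// Compare two bigint timestamps for Array.prototype.sort.
const byTime = (a, b) =>
  a.physicalTimeMs < b.physicalTimeMs ? -1 : a.physicalTimeMs > b.physicalTimeMs ? 1 : 0;

// Hypothetical toy buffer: releases events once the watermark passes them.
function createReadinessBuffer(maxLateArrivalMs) {
  let maxSeen = null;
  let pending = [];
  return {
    // Returns { late: true } when the event arrives behind the watermark.
    push(event) {
      const watermark = maxSeen === null ? null : maxSeen - maxLateArrivalMs;
      const late = watermark !== null && event.physicalTimeMs < watermark;
      if (maxSeen === null || event.physicalTimeMs > maxSeen) {
        maxSeen = event.physicalTimeMs;
      }
      pending.push(event); // flagged, not dropped
      return { late };
    },
    // Emits everything at or below the current watermark.
    drain() {
      if (maxSeen === null) return { watermark: null, ready: [] };
      const watermark = maxSeen - maxLateArrivalMs;
      const ready = pending.filter((e) => e.physicalTimeMs <= watermark).sort(byTime);
      pending = pending.filter((e) => e.physicalTimeMs > watermark);
      return { watermark, ready };
    },
  };
}

const buffer = createReadinessBuffer(30_000n);
buffer.push({ id: "a", physicalTimeMs: 100_000n });
buffer.push({ id: "b", physicalTimeMs: 140_000n });
const first = buffer.drain(); // watermark 110000n, ready: a
const lateResult = buffer.push({ id: "c", physicalTimeMs: 105_000n });
console.log(first.ready.map((e) => e.id), lateResult.late); // [ 'a' ] true
```

Event `a` is released because enough time has passed operationally, not because anything proved that no earlier event exists. Event `c` then arrives with a timestamp behind the watermark, which is exactly the case a late-arrival policy has to handle explicitly.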
For the full stream contract, see:
causal-order is primarily for deployable operational event processing in distributed systems that cannot rely on one perfect global clock.
It is meant to be deployable as the event-ordering engine in that workflow, not as a complete end-to-end event platform by itself.
One of its main applications is straightforward deployment as the ordering layer for distributed event workflows that still need honest causal ordering at runtime.
That includes:
- continuous stream processing with explicit late-arrival and reconciliation behavior
- delayed reconnect and recovery workflows
- offline sync inspection
- replay analysis
Other strong use cases include:
- multi-region debugging
- audit timeline reconstruction
- late-arrival stream handling
- distributed incident analysis
It is especially useful when:
- events come from multiple services, devices, or regions
- timestamps are not enough on their own
- ordering claims need explanation
- concurrency matters
- suspicious metadata should not be silently normalized
It is less useful when:
- you already have authoritative causal ordering elsewhere
- your data has already been normalized into the exact ordering truth you trust by a consensus layer such as Raft or Paxos
- you only need a plain timestamp sort
If a consensus system has already settled the ordering question cleanly for the stream you care about, causal-order is usually not the interesting part of the stack anymore.
The library is strongest when ordering truth is still messy, partial, cross-boundary, or worth explaining.
Evaluate the package:
- start with What This Library Is
- read Quick Start Scenarios
- check Supported Vs Unsupported Usage
- scan Examples And Entrypoints
Build a first flow:
- start with Package Surface Overview
- use Policy Guidance to choose strictness and late-arrival behavior
- keep Upgrade Expectations nearby if you are adopting it into a maintained system
- use Mental Model and Clocks, Causality, And Why HLC when you need deeper design context
Operate or debug a deployed workflow:
- use Replay Inspection Workflow
- use Streaming Reconciliation Workflow
- use Incident Review Guide
- use Anomaly Interpretation Guide
- use Operator Metrics Guide
- use Streaming Recovery And Resync and Streaming Finality for stream-specific behavior
Study failure patterns and workloads:
- Case Studies
- Replay Corruption
- Multi-Region Drift
- False Audit Timelines
- Offline Sync Anomalies
- Causal Inversion
- AWS-Inspired DynamoDB Outage Exercise
- Stress Hardening
- After-Hours Batch Processing
- Realistic Workloads
Runnable examples:
- Examples Index
- Minimal Ingress Example
- Ingress Replay Pipeline Example
- Local Durable Buffer Replay Example
- False Audit Timeline Example
- Offline Sync Anomalies Example
- Streaming Recovery And Resync Example
The runnable examples are written from the package consumer point of view.
They use the public causal-order package surface so copied example code still looks like the right starting point in a real project.
1.0.0 is the current published causal-order release.
Current package posture:
- `causal-order` is ready to use as a deployable event-ordering runtime today when you want confidence-aware ordering, anomaly visibility, and explicit causal justification in a real workflow
- deployment is a first-class application of the package, not only a forensic or post-incident use case
- bounded batch recovery, replay, reconciliation, and audit-style workloads are the clearest production-credible starting point in the current contract
- streaming is also part of the public contract, with the current hardening and runtime-stability guides defining how to deploy it with explicit lateness, correction, and reconciliation posture
- the repo already includes heavier deployment-facing evidence, including named `250k` batch and stream validation profiles and a documented `1,000,000`-event AWS-inspired streaming outage exercise
- raw-record translation into the event envelope and its machine-readable failure contract are now part of the package surface rather than repo-local work
Confidence ladder:
- `CI` covers everyday correctness, docs sync, and package-facing examples
- `Post-Merge 150k Confidence` is the routine stronger automated confidence run
- `Manual 250k Confidence` is the heavier on-demand batch and stream validation path
- `Manual AWS Incident Confidence` is the outage-shape streaming confidence run with GC-observed summary artifacts
1.0.0 finalizes:
- the first stable public contract for the bounded batch, streaming, translation, validation, and inspection surfaces
- the last planned helper-alias cleanup by removing `compareClocks()` after the `0.9.x` deprecation line
- the stable package-facing docs and website alignment around `causal-order` as a deployable event-ordering runtime
For the 1.0.0 package-facing release surface, see:
- Extension Boundary Guide
- Policy Guidance
- Replay Inspection Workflow
- Streaming Reconciliation Workflow
- Incident Review Guide
- Anomaly Interpretation Guide
- Operator Metrics Guide
- Stress Hardening
- AWS-Inspired DynamoDB Outage Exercise
If you are working in the repository itself, start with:
The most useful local gates are:
npm run check
npm test
npm run release:check
MIT. See LICENSE.
See SECURITY.md for supported versions and private vulnerability reporting guidance.
See CONTRIBUTING.md for repository workflow, verification expectations, and documentation update guidance.