causal-order

A deployable event-ordering runtime for distributed systems that still use clocks, but cannot rely on one globally synchronized clock as the truth model.

causal-order helps developers design and run event processing, replay, and recovery flows without assuming the system has one perfect global time source. It does not replace clocks or timestamps; it provides a deployable ordering layer when timestamp order alone is not enough to explain what happened or keep event integrity honest.

Website: https://causal-order.gazali.one

It helps you:

  • order what can be ordered
  • preserve concurrency only when it can be justified honestly
  • flag what is suspicious
  • keep the difference between proof, inference, fallback, and unknown explicit

Why This Exists

Distributed systems often produce misleading timelines:

  • clocks drift across regions
  • replayed events can look newer than original events
  • offline devices sync late
  • ingestion order differs from creation order
  • some events are truly concurrent

causal-order exists to make that uncertainty visible instead of hiding it.
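
As a small illustration of the first failure mode above, the sketch below sorts two hypothetical events by wall-clock time alone. The event shape, the `wallClockMs` field, and the 300 ms skew are assumptions made up for this example; nothing here calls the library.

```javascript
// Hypothetical events: the "payments" clock runs behind the "orders" clock,
// so the effect carries an earlier wall-clock time than its cause.
const events = [
  { id: "order.created", nodeId: "orders", wallClockMs: 1_000_500, parentEventId: null },
  { id: "payment.captured", nodeId: "payments", wallClockMs: 1_000_200, parentEventId: "order.created" },
];

// Naive timeline: sort by wall-clock time only.
const byTimestamp = [...events].sort((a, b) => a.wallClockMs - b.wallClockMs);

// An event that appears before its declared parent marks a misleading timeline.
const position = new Map(byTimestamp.map((e, i) => [e.id, i]));
const inversions = byTimestamp.filter(
  (e) => e.parentEventId !== null && position.get(e.parentEventId) > position.get(e.id)
);

console.log(byTimestamp.map((e) => e.id)); // payment.captured is listed first
console.log(inversions.map((e) => e.id)); // the payment is flagged as inverted
```

The timestamp sort places the payment before the order it depends on, which is the kind of result the library is meant to surface rather than silently accept.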

Mental Model

causal-order is built around a simple rule:

Be easy to use at the surface, but hard to misuse into false certainty.

In practice, that means:

  • not every event set should be forced into one total order
  • explicit causal evidence outranks clock appearance
  • cross-node events without supported causal evidence should usually remain unknown
  • shared traceId or partition metadata does not, by itself, imply causality
  • streaming finality is operational, not causal truth

Supported causal evidence today is intentionally narrow:

  • parentEventId
  • dependencyEventIds
  • same-node monotonic sequence

This library is not trying to eliminate clocks. It is trying to stop treating wall-clock agreement as the truth model for a distributed system.
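
The rules above can be sketched as a pairwise check. This is an illustration under stated assumptions, not the library's algorithm: `classifyPair` and its return shape are hypothetical, and it only looks at direct references, not transitive chains.

```javascript
// Illustrative only: classifies two event envelopes using the three
// evidence types listed above. Hypothetical helper, not a library export.
function classifyPair(a, b) {
  const dependsOn = (child, parent) =>
    child.parentEventId === parent.id ||
    (child.dependencyEventIds ?? []).includes(parent.id);

  if (dependsOn(b, a)) return { relation: "a_before_b", evidence: "explicit_reference" };
  if (dependsOn(a, b)) return { relation: "b_before_a", evidence: "explicit_reference" };

  // Sequence numbers are only compared between events from the same node.
  const bothSequenced = typeof a.sequence === "bigint" && typeof b.sequence === "bigint";
  if (a.nodeId === b.nodeId && bothSequenced && a.sequence !== b.sequence) {
    return {
      relation: a.sequence < b.sequence ? "a_before_b" : "b_before_a",
      evidence: "same_node_sequence",
    };
  }

  // A shared traceId or a later-looking clock is deliberately not consulted.
  return { relation: "unknown", evidence: null };
}

const created = { id: "evt-1", nodeId: "orders-api", sequence: 1n, traceId: "t-1" };
const captured = { id: "evt-2", nodeId: "payments-worker", parentEventId: "evt-1", traceId: "t-1" };
const refunded = { id: "evt-3", nodeId: "refunds-worker", traceId: "t-1" };

console.log(classifyPair(created, captured).relation); // "a_before_b"
console.log(classifyPair(captured, refunded).relation); // "unknown"
```

The last pair shares a traceId but no supported evidence, so the sketch leaves it unknown instead of guessing from metadata.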

What You Get

Given a set of distributed events, the library returns:

  • ordered: events with orderIndex, orderBasis, and confidence
  • anomalies: invalid, suspicious, or operationally important records
  • stats: summary counts for the batch

Confidence is explicit:

  • proven: explicit causal evidence exists
  • derived: order was inferred from useful but weaker metadata
  • fallback: deterministic ordering was imposed for stability
  • unknown: the library cannot honestly justify the claim
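
One way a consumer might use these labels is to bucket ordered entries before acting on them. The sketch below assumes only the entry shape described above (`orderIndex`, `orderBasis`, `confidence`); `groupByConfidence` is a hypothetical helper, not a package export, and the sample entries are invented.

```javascript
// Hypothetical helper: buckets ordered entries by their confidence label.
function groupByConfidence(ordered) {
  const groups = { proven: [], derived: [], fallback: [], unknown: [] };
  for (const entry of ordered) {
    // Unexpected labels are treated as unknown rather than trusted.
    const key = Object.hasOwn(groups, entry.confidence) ? entry.confidence : "unknown";
    groups[key].push(entry);
  }
  return groups;
}

// Invented sample entries in the documented shape.
const sampleOrdered = [
  { event: { id: "evt-1" }, orderIndex: 0n, orderBasis: "sequence", confidence: "derived" },
  { event: { id: "evt-2" }, orderIndex: 1n, orderBasis: "causal", confidence: "proven" },
  { event: { id: "evt-3" }, orderIndex: 2n, orderBasis: "tie_break", confidence: "fallback" },
];

const groups = groupByConfidence(sampleOrdered);
console.log(groups.proven.map((e) => e.event.id)); // ["evt-2"]
console.log(groups.unknown.length); // 0
```

A workflow could then, for example, act automatically on proven entries and route fallback or unknown entries to review.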

Install

npm install causal-order

ESM only.

Focused imports are also available when you want a narrower public entrypoint:

import { orderEvents } from "causal-order/order"
import { orderEvents as batchOrderEvents } from "causal-order/batch"
import { orderEventStream } from "causal-order/stream"
import { createProcessingTimeWatermark } from "causal-order/watermarks"
import { translateBatch } from "causal-order/translate"
import { createHlcClock } from "causal-order/clock"

Package Ecosystem

causal-order is the core runtime in the causal-order package ecosystem.

Additional packages extend the runtime with focused operational capabilities.

Package                Purpose
causal-order           Core causal event ordering runtime
@causal-order/dedupe   Duplicate-event filtering before causal ordering

See: https://www.npmjs.com/package/@causal-order/dedupe

Runtime Policy

Current runtime posture:

  • published package support starts at Node.js >=20
  • active development and performance validation target Node.js 24
  • CI still exercises Node.js 18, 20, and 24 to catch regressions around the supported floor
  • the package is ESM only

For the fuller compatibility and support boundary, see COMPATIBILITY.md.

Quick Example

import { orderEvents } from "causal-order"

const events = [
  {
    id: "evt-1",
    nodeId: "orders-api",
    clock: {
      physicalTimeMs: 1714971840123n,
      logicalCounter: 0,
      nodeId: "orders-api",
    },
    sequence: 1n,
    payload: { type: "order.created" },
  },
  {
    id: "evt-2",
    nodeId: "payments-worker",
    clock: {
      physicalTimeMs: 1714971840125n,
      logicalCounter: 1,
      nodeId: "payments-worker",
    },
    parentEventId: "evt-1",
    payload: { type: "payment.captured" },
  },
]

const result = orderEvents(events, {
  strict: false,
  detectAnomalies: true,
})

console.log(result.ordered)
console.log(result.anomalies)

Example output shape:

[
  {
    event: events[0],
    orderIndex: 0n,
    orderBasis: "sequence",
    confidence: "derived",
  },
  {
    event: events[1],
    orderIndex: 1n,
    orderBasis: "causal",
    confidence: "proven",
    causalEvidence: [{ type: "parent_event", parentEventId: "evt-1" }],
  },
]

The important part is not just the order. It is the explanation of why that order exists and how trustworthy it is.

Default Option Posture

The default batch-ordering posture is designed to keep uncertainty visible instead of flattening it away:

const result = orderEvents(events, {
  strict: false,
  detectAnomalies: true,
})

The main options to understand are:

  • strict controls fail-fast behavior for ordering and validation
  • allowUnknownOrder controls whether unresolved placement stays explicit in non-strict output
  • detectAnomalies controls how much diagnostic analysis is emitted with the result

Their defaults work as follows:

  • strict: false keeps the run warning-visible by default, so invalid or unresolved cases can surface as structured anomalies instead of stopping the whole batch immediately. Translation fail-fast behavior is configured separately through translateBatch() policy rather than through this option
  • allowUnknownOrder defaults to an uncertainty-visible posture where unresolved placement can still be emitted with warning-level visibility in non-strict mode
  • detectAnomalies: true keeps anomaly reporting on by default because the package treats anomaly visibility as part of the ordinary answer, not as a debugging extra

Two boundary rules matter here:

  • setting allowUnknownOrder: false strengthens severity posture for unresolved output, but it does not invent stronger certainty or silently rewrite the ordering result
  • setting detectAnomalies: false reduces emitted diagnostic output, but it does not make the result truer, cleaner, or more causally justified

If you are unsure, start with warning-visible defaults and tighten later once the surrounding workflow is actually prepared to reject uncertain input.
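
The "start visible, tighten later" advice could be captured as named presets. This is a sketch built only from the three options described above; the stage names and the `optionsFor` helper are hypothetical, and how the library behaves for each combination is defined by its own documentation, not by this example.

```javascript
// Hypothetical rollout presets using only the documented option names.
const postures = {
  // Stage 1: warning-visible defaults; unresolved cases surface as anomalies.
  observe: { strict: false, detectAnomalies: true },
  // Stage 2: unresolved placement is reported with a stronger severity posture.
  tighten: { strict: false, allowUnknownOrder: false, detectAnomalies: true },
  // Stage 3: fail fast once the surrounding workflow can reject uncertain input.
  enforce: { strict: true, detectAnomalies: true },
};

function optionsFor(stage) {
  const options = postures[stage];
  if (!options) throw new Error(`unknown rollout stage: ${stage}`);
  // Return a copy so callers cannot mutate the shared preset.
  return { ...options };
}

console.log(optionsFor("observe"));
// Intended use: orderEvents(events, optionsFor("observe"))
```

detectAnomalies stays on in every stage here because turning it off only reduces diagnostic output; it does not make the ordering more justified.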

Pairwise Helpers

When you need a direct helper rather than a full ordering pass, prefer the primary pairwise methods:

import { compareByHlc, compareDeterministically } from "causal-order"

const hlcRelation = compareByHlc(eventA.clock, eventB.clock)
const fallbackOrder = compareDeterministically(eventA, eventB, "event_id")

console.log(hlcRelation)
console.log(fallbackOrder)

As of 0.5.0, the preferred names are:

  • compareByHlc() for direct HLC comparison
  • compareDeterministically() for deterministic fallback comparison

The stable surface now keeps the primary names only. Focused entrypoints and the root causal-order import both emphasize the same current API story instead of mixing canonical names with transitional aliases. applyTieBreaker() remains available as a lower-level helper when you specifically want the tie-break step on its own rather than the full deterministic fallback comparison.
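
For readers unfamiliar with hybrid logical clocks, the sketch below shows the conventional HLC comparison over the clock shape used in the Quick Example (physicalTimeMs, logicalCounter, nodeId). `compareHlcSketch` is a hypothetical stand-in: it is not compareByHlc(), and the library's return value may have a different shape.

```javascript
// Conventional HLC ordering: physical time, then logical counter, then node id.
// Returns -1, 0, or 1. Hypothetical helper, not the library's compareByHlc().
function compareHlcSketch(a, b) {
  if (a.physicalTimeMs !== b.physicalTimeMs) {
    return a.physicalTimeMs < b.physicalTimeMs ? -1 : 1;
  }
  if (a.logicalCounter !== b.logicalCounter) {
    return a.logicalCounter < b.logicalCounter ? -1 : 1;
  }
  // Node id is only a deterministic tie-break; it carries no causal meaning.
  if (a.nodeId === b.nodeId) return 0;
  return a.nodeId < b.nodeId ? -1 : 1;
}

const clockA = { physicalTimeMs: 1714971840123n, logicalCounter: 0, nodeId: "orders-api" };
const clockB = { physicalTimeMs: 1714971840123n, logicalCounter: 1, nodeId: "payments-worker" };

console.log(compareHlcSketch(clockA, clockB)); // -1
console.log(compareHlcSketch(clockB, clockA)); // 1
```

An HLC order is consistent with causality but does not prove it: a smaller clock value on another node does not by itself show that one event caused the other.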

Raw Record Translation

When your data is not already in the library's event-envelope shape, use translateBatch() first and then pass the translated records into orderEvents().

import { orderEvents, translateBatch } from "causal-order"

const records = [
  {
    eventId: "evt-1",
    source: "orders-api",
    occurredAt: "1714971840123",
    sequence: 1n,
    body: { type: "order.created" },
  },
  {
    eventId: "evt-2",
    source: "payments-worker",
    occurredAt: 1714971840125,
    sequence: 1n,
    parent: "evt-1",
    body: { type: "payment.captured" },
  },
]

const translated = translateBatch(records, {
  getEventId: (record) => record.eventId,
  getNodeId: (record) => record.source,
  getPhysicalTime: (record) => record.occurredAt,
  getSequence: (record) => record.sequence,
  getParentEventId: (record) => record.parent,
  getPayload: (record) => record.body,
})

console.log(translated.anomalies)

const ordered = orderEvents(translated.translated, {
  strict: false,
  detectAnomalies: true,
})

console.log(ordered.ordered)

What matters most:

  • required mappers: getEventId, getNodeId, getPhysicalTime
  • accepted timestamp inputs: bigint, safe integer number, or canonical integer string
  • rejected timestamp inputs include Date, ISO timestamp strings, decimals, exponent notation, and unsafe integers
  • translated results split accepted records from structured translation anomalies

Keep pre-translation shaping narrow. Pass original source records into translateBatch() whenever possible and let it own coercion, rejection, and anomaly reporting.
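
To make the timestamp rules above concrete, here is a standalone classifier that mirrors them. It is an illustration, not the coercion translateBatch() performs: `classifyTimestamp` and its reason strings are hypothetical, and it assumes "canonical integer string" means plain decimal digits with no sign, whitespace, or leading zeros.

```javascript
// Illustrative classifier mirroring the documented timestamp rules.
// Assumption: a canonical integer string is unsigned decimal digits
// without leading zeros. Hypothetical helper, not a library export.
function classifyTimestamp(value) {
  if (typeof value === "bigint") return { ok: true, value };
  if (typeof value === "number") {
    // Rejects decimals, NaN, Infinity, and integers beyond 2^53 - 1.
    return Number.isSafeInteger(value)
      ? { ok: true, value: BigInt(value) }
      : { ok: false, reason: "not_a_safe_integer" };
  }
  if (typeof value === "string") {
    // Rejects ISO strings, decimals, and exponent notation such as "1e3".
    return /^(0|[1-9][0-9]*)$/.test(value)
      ? { ok: true, value: BigInt(value) }
      : { ok: false, reason: "non_canonical_string" };
  }
  // Date objects and every other type land here.
  return { ok: false, reason: "unsupported_type" };
}

console.log(classifyTimestamp("1714971840123").ok); // true
console.log(classifyTimestamp("2024-05-06T05:04:00Z").ok); // false
console.log(classifyTimestamp(new Date(0)).reason); // "unsupported_type"
```

Rejecting ambiguous inputs up front, rather than coercing them, keeps a silently misparsed timestamp from entering the ordering pass.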

If you need the deeper ingress-policy details, see:

Operational Inspection Helpers

For operational review, replay audits, or emitted-batch inspection, the current release includes a small additive helper layer on top of the core runtime output:

import {
  inspectOrderResult,
  orderEvents,
  summarizeTranslationAnomalies,
  translateBatch,
} from "causal-order"

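// `records` and `config` stand for the raw records and mapper object from the Raw Record Translation example
const translated = translateBatch(records, config)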
const translationSummary = summarizeTranslationAnomalies(translated.anomalies)

const ordered = orderEvents(translated.translated, {
  strict: false,
  detectAnomalies: true,
})

const inspection = inspectOrderResult(ordered)

console.log(translationSummary)
console.log(inspection)

The 1.0.0 helper layer is intentionally narrow:

  • summarizeEventAnomalies()
  • summarizeTranslationAnomalies()
  • explainOrderedEvent()
  • inspectOrderResult()
  • inspectOrderBatch()

These helpers summarize or explain existing package output. They do not hide anomalies, rewrite ordered state, or invent stronger causal claims than the runtime already supports.

If you need adjacent adapters, workflow glue, or domain policy on top of this surface, see:

Streaming Overview

For large or unbounded event flows, use orderEventStream() instead of assuming everything belongs in one in-memory batch.

That includes both ordinary day-to-day stream processing and delayed reconnect, offline sync, or recovery flows where late arrivals are part of normal operations. Each emitted batch is a StreamOrderBatch carrying the currently ready ordered events plus stream-specific metadata such as watermark, optional correction, and isFinal.

import { orderEventStream } from "causal-order"

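// source() stands in for your own event source (assumed here to be an async iterable of event envelopes)
for await (const batch of orderEventStream(source(), {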
  batchSize: 100,
  maxLateArrivalMs: 30_000n,
  lateArrivalPolicy: "flag",
  strict: false,
})) {
  console.log(batch.events)
  console.log(batch.anomalies)
  console.log(batch.watermark, batch.isFinal)
}

Keep this mental model in mind:

  • the watermark controls operational readiness, not causal truth
  • late events are handled by explicit policy rather than being silently hidden
  • non-final output may need later reconciliation, especially in reconnect-heavy flows
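
The first two points can be illustrated with a generic allowed-lateness tracker. This sketch uses an event-time watermark for simplicity, which may differ from the watermark strategies the library ships; `createLatenessTracker` and its fields are hypothetical and do not reflect the stream engine's internals.

```javascript
// Generic watermark with an allowed-lateness window (all values in ms, bigint).
// Hypothetical helper for illustration; not the library's stream engine.
function createLatenessTracker(maxLateArrivalMs) {
  let maxSeenMs = null;
  return {
    observe(event) {
      const t = event.clock.physicalTimeMs;
      // The watermark trails the newest observed time by the allowed lateness.
      const watermark = maxSeenMs === null ? null : maxSeenMs - maxLateArrivalMs;
      // "flag" style policy: a late event is marked, not dropped or hidden.
      const late = watermark !== null && t < watermark;
      if (maxSeenMs === null || t > maxSeenMs) maxSeenMs = t;
      return { id: event.id, late, watermark };
    },
  };
}

const tracker = createLatenessTracker(30_000n);
const observed = [
  { id: "a", clock: { physicalTimeMs: 100_000n } },
  { id: "b", clock: { physicalTimeMs: 140_000n } },
  { id: "c", clock: { physicalTimeMs: 105_000n } }, // arrives after b, 35 s older
].map((event) => tracker.observe(event));

console.log(observed.map((r) => r.late)); // [false, false, true]
```

The flag only says that event c arrived outside the operational window. It says nothing about whether c happened before or after b in causal terms.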

For the full stream contract, see:

When To Use It

causal-order is primarily for operational event processing in distributed systems that cannot rely on one perfect global clock. It is meant to be deployed as the event-ordering engine in such a workflow, not as a complete end-to-end event platform. Its main application is serving as the ordering layer for distributed event workflows that still need honest causal ordering at runtime.

That includes:

  • continuous stream processing with explicit late-arrival and reconciliation behavior
  • delayed reconnect and recovery workflows
  • offline sync inspection
  • replay analysis

Other strong use cases include:

  • multi-region debugging
  • audit timeline reconstruction
  • late-arrival stream handling
  • distributed incident analysis

It is especially useful when:

  • events come from multiple services, devices, or regions
  • timestamps are not enough on their own
  • ordering claims need explanation
  • concurrency matters
  • suspicious metadata should not be silently normalized

It is less useful when:

  • you already have authoritative causal ordering elsewhere
  • your data has already been normalized into the exact ordering truth you trust by a consensus layer such as Raft or Paxos
  • you only need a plain timestamp sort

If a consensus system has already settled the ordering question cleanly for the stream you care about, causal-order is usually not the interesting part of the stack anymore. The library is strongest when ordering truth is still messy, partial, cross-boundary, or worth explaining.

Get Started

Evaluate the package:

Build a first flow:

Operate or debug a deployed workflow:

Study failure patterns and workloads:

Runnable examples:

The runnable examples are written from the package consumer point of view. They use the public causal-order package surface so copied example code still looks like the right starting point in a real project.

Status

1.0.0 is the current published causal-order release.

Current package posture:

  • causal-order is ready to use as a deployable event-ordering runtime today when you want confidence-aware ordering, anomaly visibility, and explicit causal justification in a real workflow
  • deployment is a first-class application of the package, not only a forensic or post-incident use case
  • bounded batch recovery, replay, reconciliation, and audit-style workloads are the clearest production-credible starting point in the current contract
  • streaming is also part of the public contract, with the current hardening and runtime-stability guides defining how to deploy it with explicit lateness, correction, and reconciliation posture
  • the repo already includes heavier deployment-facing evidence, including named 250k batch and stream validation profiles and a documented 1,000,000-event AWS-inspired streaming outage exercise
  • raw-record translation into the event envelope and its machine-readable failure contract are now part of the package surface rather than repo-local work

Confidence ladder:

  • CI covers everyday correctness, docs sync, and package-facing examples
  • Post-Merge 150k Confidence is the routine stronger automated confidence run
  • Manual 250k Confidence is the heavier on-demand batch and stream validation path
  • Manual AWS Incident Confidence is the outage-shape streaming confidence run with GC-observed summary artifacts

1.0.0 finalizes:

  • the first stable public contract for the bounded batch, streaming, translation, validation, and inspection surfaces
  • the last planned helper-alias cleanup by removing compareClocks() after the 0.9.x deprecation line
  • the stable package-facing docs and website alignment around causal-order as a deployable event-ordering runtime

For the 1.0.0 package-facing release surface, see:

Repository Development

If you are working in the repository itself, start with:

The most useful local gates are:

npm run check
npm test
npm run release:check

License

MIT. See LICENSE.

Security

See SECURITY.md for supported versions and private vulnerability reporting guidance.

Contributing

See CONTRIBUTING.md for repository workflow, verification expectations, and documentation update guidance.