This folder is the "how it works and why it's fast" layer of the docs. The project README tells you what it does; the Performance page tells you how fast it goes and what we learned measuring it. This folder answers the third question: why.
If you're here you probably:
- Want to know whether to trust the numbers
- Are considering using (or writing) a JavaScript CMS and want to see what "hand-tuned for the JIT" actually means in practice
- Are curious why jsColorEngine's native-JS hot path can keep up with an Emscripten wasm32 build of LittleCMS
- Are reviewing a PR against the kernel code and want the rationale for a design decision
Four things we learned writing this that we didn't expect going in. Each one has a deep-dive page behind it if you want the receipts.
- Modern JS JITs emit wicked good assembly when the loop is well-shaped. TurboFan lowered our hand-unrolled tetrahedral interpolator to inline x64 SSE/AVX with no spills, no boxing, no allocations, and zero int↔float conversions. If you blanked the filenames you couldn't tell the int kernel's hot body from C. The reputation that numeric JS is slow is about a decade out of date on monomorphic typed-array loops. → JIT inspection
- Dedicated, unrolled hot loops are still the win. The "tidy it up into a helper function" instinct is wrong here. Our unrolled 6-branch kernels fit in 24–34 % of a 32 KB L1i; re-rolling them would save 5 KB but cost 5–10 cycles per pixel in branch mispredicts. The counter-intuitive code is the fast code, and there's a 200-line comment block at the top of `src/Transform.js` to make sure nobody "cleans it up". → LUT modes, Architecture
- WASM SIMD does 4× the work per cycle and it feels like cheating. Channel-parallel v128 lanes + `v128.load64_zero` instead of a gather got us 3.0–3.5× over JS `'int'` on the 3D tetrahedral kernel, bit-exact, in 1.3 KB of `.wasm`. The hardware really does have this much headroom; you just have to pick the right axis to vectorise along. (We got the axis wrong the first time. See the page.) → WASM kernels
- Plain-optimised JS can beat an Emscripten-wasm32 port of a battle-hardened C library. jsColorEngine's `'int'` JS kernel beats `lcms-wasm` on every direction we benchmarked. That isn't "JS vs C"; it's "one specialised kernel per LUT shape tuned for V8" vs "a general-purpose C codebase compiled through Emscripten, carrying all of lcms2's dispatcher generality, with no SIMD and no Fast Float plugin". The comparison is really about specialisation, not language choice. → Performance
- And the answers match too: jsCE's float pipeline agrees with lcms's float pipeline within visual-noise levels across a 130-file ICC oracle suite (worst case 0.06 ΔE76 on Lab outputs, 1.24 LSB on RGB, 0.04 % ink on CMYK). The one structural divergence (~17 LSB on a niche grey-1c Perceptual path) is fully diagnosed and parked with a documented reason. We're a faithful float-precision peer, not a port. → Accuracy
The rest of this folder is the evidence. If any of these claims smell wrong, the detail pages carry the asm dumps, bench scripts, and repro recipes.
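For readers who have not met the "6-branch" tetrahedral kernel mentioned above, here is a minimal textbook sketch of the technique. It is not the code in `src/Transform.js`: the function name `tetrahedralSample`, the single-channel `Float64Array` LUT, and the `(x * n + y) * n + z` layout are all assumptions made for illustration. It also uses a small `at()` helper for readability, which is exactly the kind of tidy-up the hot kernels are described as avoiding.

```javascript
// Textbook tetrahedral interpolation over a single-channel 3D LUT.
// Assumed layout (illustrative only): lut[(x * n + y) * n + z], n >= 2,
// inputs r, g, b in [0, 1].
function tetrahedralSample(lut, n, r, g, b) {
  // Scale inputs to grid coordinates.
  const fx = r * (n - 1), fy = g * (n - 1), fz = b * (n - 1);

  // Base cell, clamped so that the +1 neighbour is always inside the grid.
  const x0 = Math.min(Math.floor(fx), n - 2);
  const y0 = Math.min(Math.floor(fy), n - 2);
  const z0 = Math.min(Math.floor(fz), n - 2);

  // Fractional position inside the cell (can reach 1.0 at the top edge).
  const rx = fx - x0, ry = fy - y0, rz = fz - z0;

  // Readability helper: fetch a cell corner by offset.
  const at = (dx, dy, dz) => lut[((x0 + dx) * n + (y0 + dy)) * n + (z0 + dz)];

  const c000 = at(0, 0, 0), c111 = at(1, 1, 1);
  let c1, c2, c3; // per-axis deltas along the chosen tetrahedron's edges

  // Six orderings of (rx, ry, rz) select one of six tetrahedra.
  if (rx >= ry) {
    if (ry >= rz) {            // rx >= ry >= rz
      const c100 = at(1, 0, 0), c110 = at(1, 1, 0);
      c1 = c100 - c000; c2 = c110 - c100; c3 = c111 - c110;
    } else if (rx >= rz) {     // rx >= rz > ry
      const c100 = at(1, 0, 0), c101 = at(1, 0, 1);
      c1 = c100 - c000; c2 = c111 - c101; c3 = c101 - c100;
    } else {                   // rz > rx >= ry
      const c001 = at(0, 0, 1), c101 = at(1, 0, 1);
      c1 = c101 - c001; c2 = c111 - c101; c3 = c001 - c000;
    }
  } else {
    if (rx >= rz) {            // ry > rx >= rz
      const c010 = at(0, 1, 0), c110 = at(1, 1, 0);
      c1 = c110 - c010; c2 = c010 - c000; c3 = c111 - c110;
    } else if (ry >= rz) {     // ry >= rz > rx
      const c010 = at(0, 1, 0), c011 = at(0, 1, 1);
      c1 = c111 - c011; c2 = c010 - c000; c3 = c011 - c010;
    } else {                   // rz > ry > rx
      const c001 = at(0, 0, 1), c011 = at(0, 1, 1);
      c1 = c111 - c011; c2 = c011 - c001; c3 = c001 - c000;
    }
  }
  return c000 + c1 * rx + c2 * ry + c3 * rz;
}

// Usage: a LUT sampled from a linear function is reproduced up to rounding.
const n = 5;
const lut = new Float64Array(n * n * n);
for (let x = 0; x < n; x++)
  for (let y = 0; y < n; y++)
    for (let z = 0; z < n; z++)
      lut[(x * n + y) * n + z] = (0.2 * x + 0.3 * y + 0.5 * z) / (n - 1);

const sample = tetrahedralSample(lut, n, 0.37, 0.81, 0.12);
```

Each branch reads only four corners of the cell rather than the eight that trilinear interpolation needs, which is part of why the branchy form is attractive for LUT-heavy colour transforms.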
| Page | What it covers |
|---|---|
| Architecture | Pipeline model: how an ICC profile becomes a kernel. Stages, LUT build, kernel dispatch, accuracy vs image paths |
| LUT modes | float / int / int-wasm-scalar / int-wasm-simd — what each mode is, when it's picked, how it's bit-exact vs the reference |
| JIT inspection | V8 emitted x64 assembly walked line-by-line. Working-set size, instruction mix, move classification, the "named temps" micro-test. Why the scalar JS kernel is as fast as it is |
| WASM kernels | Hand-written .wat for 3D and 4D tetrahedral interp. SIMD channel-parallel layout, rolling-shutter pack, the V8 inliner lesson. Reproduction recipes |
| Compiled pipeline (POC) | transform.compile() — turning the runtime stage walker into one straight-line JS function per profile chain. 1.75× on sRGB→CMYK, three measurement methods, and the path to getSource() / toModule() |
| Accuracy | jsColorEngine vs Little CMS — the bench/lcms_compat harness, methodology, headline numbers (130/150 files sub-LSB), the one localised divergence we found, and the design philosophy that keeps jsCE an independent engine rather than an lcms reimplementation |
| LUTs | Custom LUT creation, TIFF-based visual editing, lcms-wasm bridge, portable JSON serialisation format, and the architecture for CMS-agnostic LUT capture and redistribution. Companion how-to: samples/lutbuilder.md. |
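The accuracy summary earlier on this page quotes a worst case in ΔE76. For readers new to the metric, ΔE76 (CIE76) is the Euclidean distance between two CIELAB triples. The sketch below shows that formula plus a hypothetical `worstDeltaE76` helper. Both function names are illustrative and are not taken from the `bench/lcms_compat` harness.

```javascript
// CIE76 colour difference: Euclidean distance between two [L, a, b] triples.
function deltaE76(labA, labB) {
  const dL = labA[0] - labB[0];
  const da = labA[1] - labB[1];
  const db = labA[2] - labB[2];
  return Math.sqrt(dL * dL + da * da + db * db);
}

// Hypothetical helper: worst-case ΔE76 over paired Lab outputs from two
// engines (for example, the same inputs pushed through two pipelines).
function worstDeltaE76(labsA, labsB) {
  let worst = 0;
  for (let i = 0; i < labsA.length; i++) {
    worst = Math.max(worst, deltaE76(labsA[i], labsB[i]));
  }
  return worst;
}
```

A difference of around 1 ΔE76 is commonly treated as roughly the threshold of a just-noticeable difference, which gives a sense of scale for a 0.06 worst case.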
If jsColorEngine is your first brush with colour management, or you want to go deeper on the topics we build on:
- LittleCMS: littlecms.com · source (mm2/Little-CMS) · API manual PDF. LittleCMS is the reference open-source CMS in C. A lot of jsColorEngine's core concepts (pipeline stages, tetrahedral interpolation, rendering intents) are lifted from LittleCMS. This is not a port, but reading `lcms2/src/cmsintrp.c` is time well spent.
- International Color Consortium (ICC): color.org, the standards body. Good starting point for "what even is an ICC profile".
- ICC specifications: color.org/specification · direct: ICC.1:2010 (v4.3) PDF · ICC.1:2001-04 (v2.4) PDF. The authoritative specs. v4 is the current standard; v2 is still overwhelmingly dominant in the wild. jsColorEngine decodes both.
- CIE colour science: CIE publications. Reference for the Lab, XYZ, chromatic adaptation and ΔE formulas the engine uses internally.
- WebAssembly SIMD: WebAssembly SIMD proposal · v128 opcode table. Relevant if you want to read the `.wat` in `src/wasm/` and understand the instruction choices in WASM kernels.