Uint8Array conversion to and from base64, base32, base58, hex, utf8, utf16, bech32 and wif
And a TextEncoder / TextDecoder polyfill
Performs proper input validation, ensures no garbage-in-garbage-out
Tested on Node.js, Deno, Bun, browsers (including Servo), Hermes, QuickJS and barebone engines in CI (how?)
- 10-20x faster than Buffer polyfill
- 2-10x faster than iconv-lite
The above was for the js fallback
It's up to 100x when native impl is available
e.g. in utf8fromString on Hermes / React Native or fromHex in Chrome
Also:
- 3-8x faster than bs58
- 10-30x faster than @scure/base (or >100x on Node.js <25)
- Faster in utf8toString / utf8fromString than Buffer or TextDecoder / TextEncoder on Node.js
See Performance for more info
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding.js'

Less than half the bundle size of text-encoding, whatwg-encoding or iconv-lite (gzipped or not), and is much faster. See also lite version.
Spec compliant, passing WPT and covered with extra tests.
Moreover, tests for this library uncovered bugs in all major implementations.
Faster than Node.js native implementation on Node.js.
These are only provided as a compatibility layer, prefer hardened APIs instead in new code.
- TextDecoder can (and should) be used with { fatal: true } option for all purposes demanding correctness / lossless transforms
- TextEncoder does not support a fatal mode per spec, it always performs replacement.
That is not suitable for hashing, cryptography or consensus applications.
Otherwise there would be non-equal strings with equal signatures and hashes: the collision is caused by the lossy transform of a JS string to bytes. Those also survive e.g. JSON.stringify / JSON.parse or being sent over network.
Use strict APIs in new applications, see utf8fromString / utf16fromString below.
Those throw on non-well-formed strings by default.
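
The collision described above can be reproduced with a short sketch. It is written against the standard TextEncoder / TextDecoder globals so that it runs anywhere; the assumption is that the classes imported from @exodus/bytes/encoding.js behave the same way, since both follow the spec. The helper name bytesEqual is made up for this illustration.

```javascript
// Uses the platform globals. Assumed drop-in alternative:
// import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding.js'

// Illustrative helper (not a library API): compare two Uint8Arrays byte by byte.
function bytesEqual(a, b) {
  return a.length === b.length && a.every((byte, i) => byte === b[i])
}

const encoder = new TextEncoder()

// A lone surrogate is not a well-formed string.
// TextEncoder has no fatal mode, so it silently encodes it as U+FFFD.
const lone = '\uD800'
const replacement = '\uFFFD'
const collide =
  lone !== replacement && bytesEqual(encoder.encode(lone), encoder.encode(replacement))
console.log(collide) // true: two different strings, identical bytes

// The lone surrogate survives a JSON round trip unchanged.
const survivesJSON = JSON.parse(JSON.stringify(lone)) === lone
console.log(survivesJSON) // true

// Decoding has the same problem unless fatal mode is requested.
const invalid = Uint8Array.of(0xff) // never valid in UTF-8
const lossy = new TextDecoder('utf-8').decode(invalid) // '\uFFFD'

let threw = false
try {
  new TextDecoder('utf-8', { fatal: true }).decode(invalid)
} catch (err) {
  threw = err instanceof TypeError // fatal mode rejects invalid input
}
console.log(threw) // true
```

Anything that hashes or signs the encoded bytes would treat the two strings in the first check as identical, which is why a strict encoder is the safer choice for such uses.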
If you don't need support for legacy multi-byte encodings, you can use the lite import:
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding-lite.js'

This reduces the bundle size 10x:
from 90 KiB gzipped for @exodus/bytes/encoding.js to 9 KiB gzipped for @exodus/bytes/encoding-lite.js.
(For comparison, text-encoding module is 190 KiB gzipped, and iconv-lite is 194 KiB gzipped).
It still supports utf-8, utf-16le, utf-16be and all single-byte encodings specified by the spec,
the only difference is support for legacy multi-byte encodings.
Create a decoder for a supported one-byte encoding.
Returns a function decode(arr) that decodes bytes to a string.
Create a decoder for a supported legacy multi-byte encoding.
Returns a function decode(arr, stream = false) that decodes bytes to a string.
That function will have state while stream = true is used.
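
What "state while stream = true" means in practice can be illustrated with the streaming option of the standard TextDecoder. This is an analogy only: it uses a different API from the decode(arr, stream) function described above, and UTF-8 instead of a legacy multi-byte encoding, on the assumption that the state carried between calls plays the same role.

```javascript
// Platform global; not the decode(arr, stream) function described above.
const decoder = new TextDecoder('utf-8', { fatal: true })

// The euro sign is E2 82 AC in UTF-8. Split it across two chunks.
const first = decoder.decode(Uint8Array.of(0xe2, 0x82), { stream: true })
const second = decoder.decode(Uint8Array.of(0xac), { stream: true })

// A final call without the stream flag flushes the decoder and resets its state.
const tail = decoder.decode()

console.log(JSON.stringify([first, second, tail])) // ["","€",""]
```

The first call returns an empty string because the incomplete sequence is held as internal state until the next chunk completes it.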
Decode windows-1252 bytes to a string.
Also supports ascii and latin-1 as those are strict subsets of windows-1252.
There is no loose variant for this encoding, all bytes can be decoded.
Same as windows1252toString = createSinglebyteDecoder('windows-1252').
Implements the Encoding standard: TextDecoder, TextEncoder, some hooks (see below).
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding.js'

// Hooks for standards
import { getBOMEncoding, legacyHookDecode, normalizeEncoding } from '@exodus/bytes/encoding.js'

TextDecoder implementation/polyfill.
TextEncoder implementation/polyfill.
Implements the "get an encoding" algorithm of the Encoding standard, which resolves a string label to an encoding.
Converts an encoding label to its name, as an ASCII-lowercased string.
If an encoding with that label does not exist, returns null.
This is the same as the decoder.encoding getter, except that it:
- Supports the replacement encoding and its labels
- Does not throw for invalid labels and instead returns null
All encoding names are also valid labels for corresponding encodings.
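
The comparison with the decoder.encoding getter can be made concrete with a rough approximation built on the standard TextDecoder global. The function name labelToName is hypothetical and this is not how the library implements the lookup. Unlike the behavior described above, this sketch returns null for the replacement encoding and its labels, because the standard TextDecoder constructor rejects them.

```javascript
// Hypothetical helper: approximate label -> name lookup via decoder.encoding.
// Not the library's implementation, and it does NOT support 'replacement'.
function labelToName(label) {
  try {
    // The constructor lowercases and trims the label, then resolves it.
    return new TextDecoder(label).encoding
  } catch {
    // Unknown labels make the constructor throw; map that to null instead.
    return null
  }
}

console.log(labelToName('UTF8')) // 'utf-8'
console.log(labelToName('utf-16')) // 'utf-16le'
console.log(labelToName('not-an-encoding')) // null
```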
Implements the BOM sniff legacy hook.
Given a TypedArray or an ArrayBuffer instance input, returns either of:
- 'utf-8', if input starts with UTF-8 byte order mark.
- 'utf-16le', if input starts with UTF-16LE byte order mark.
- 'utf-16be', if input starts with UTF-16BE byte order mark.
- null otherwise.
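
The return values listed above correspond to a simple prefix check. The following is a minimal sketch of that check, assuming the three byte order marks EF BB BF, FF FE and FE FF; the function name sniffBOM is made up here and the code is not the library's implementation.

```javascript
// Illustrative BOM sniffing (hypothetical helper, not the library's code).
function sniffBOM(input) {
  // Accept both TypedArray / DataView views and plain ArrayBuffer instances.
  const bytes = ArrayBuffer.isView(input)
    ? new Uint8Array(input.buffer, input.byteOffset, input.byteLength)
    : new Uint8Array(input)

  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return 'utf-8'
  }
  if (bytes.length >= 2) {
    if (bytes[0] === 0xff && bytes[1] === 0xfe) return 'utf-16le'
    if (bytes[0] === 0xfe && bytes[1] === 0xff) return 'utf-16be'
  }
  return null
}

console.log(sniffBOM(Uint8Array.of(0xef, 0xbb, 0xbf, 0x61))) // 'utf-8'
console.log(sniffBOM(Uint8Array.of(0xff, 0xfe).buffer)) // 'utf-16le'
console.log(sniffBOM(Uint8Array.of(0x61, 0x62))) // null
```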
Implements the decode legacy hook.
Given a TypedArray or an ArrayBuffer instance input and an optional fallbackEncoding
normalized encoding name, sniffs encoding from BOM with fallbackEncoding fallback and then
decodes the input using that encoding, skipping BOM if it was present.
Notes:
- BOM-sniffed encoding takes precedence over fallbackEncoding option per spec. Use with care.
- fallbackEncoding must be ASCII-lowercased encoding name, e.g. a result of normalizeEncoding(label) call.
- Always operates in non-fatal mode, aka replacement. It can convert different byte sequences to equal strings.
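
Why the notes above call for care can be seen by decoding the same bytes two ways with the standard TextDecoder global. This sketch does not call the hook itself; it only compares the result of the encoding a BOM would select against the result of a fallback that a caller might have expected to apply.

```javascript
// UTF-16LE byte order mark followed by "hi" encoded as UTF-16LE.
const input = Uint8Array.of(0xff, 0xfe, 0x68, 0x00, 0x69, 0x00)

// What the BOM selects: utf-16le, with the BOM itself skipped.
const asSniffed = new TextDecoder('utf-16le').decode(input)
console.log(asSniffed) // 'hi'

// What a caller passing 'windows-1252' as the fallback might have expected:
// every byte becomes its own character, including the BOM bytes and the NULs.
const asFallback = new TextDecoder('windows-1252').decode(input)
console.log(asFallback.length) // 6

// Non-fatal mode: two different invalid inputs decode to the same string.
const a = new TextDecoder('utf-8').decode(Uint8Array.of(0xff))
const b = new TextDecoder('utf-8').decode(Uint8Array.of(0xfe))
console.log(a === b) // true, both are '\uFFFD'
```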
This method is similar to the following code, except that it doesn't support encoding labels and only expects lowercased encoding name:
new TextDecoder(getBOMEncoding(input) ?? fallbackEncoding ?? 'utf-8').decode(input)

import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding-lite.js'

// Hooks for standards
import { getBOMEncoding, legacyHookDecode, normalizeEncoding } from '@exodus/bytes/encoding-lite.js'

The exact same exports as @exodus/bytes/encoding.js are also exported from
@exodus/bytes/encoding-lite.js, with the difference that the lite version does not load
multi-byte TextDecoder encodings by default to reduce bundle size 10x.
The only affected encodings are: gbk, gb18030, big5, euc-jp, iso-2022-jp, shift_jis
and their labels when used with TextDecoder.
Legacy single-byte encodings are loaded by default in both cases.
TextEncoder and hooks for standards (including normalizeEncoding) do not have any behavior
differences in the lite version and support the full range of inputs.
To avoid inconsistencies, the exported classes and methods are exactly the same objects.
> lite = require('@exodus/bytes/encoding-lite.js')
[Module: null prototype] {
TextDecoder: [class TextDecoder],
TextEncoder: [class TextEncoder],
getBOMEncoding: [Function: getBOMEncoding],
legacyHookDecode: [Function: legacyHookDecode],
normalizeEncoding: [Function: normalizeEncoding]
}
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
Uncaught:
Error: Legacy multi-byte encodings are disabled in /encoding-lite.js, use /encoding.js for full encodings range support
> full = require('@exodus/bytes/encoding.js')
[Module: null prototype] {
TextDecoder: [class TextDecoder],
TextEncoder: [class TextEncoder],
getBOMEncoding: [Function: getBOMEncoding],
legacyHookDecode: [Function: legacyHookDecode],
normalizeEncoding: [Function: normalizeEncoding]
}
> full.TextDecoder === lite.TextDecoder
true
> new full.TextDecoder('big5').decode(Uint8Array.of(0x25))
'%'
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
'%'