Replacement in single-byte encodings should not produce two output bytes per one input codepoint #368

@ChALkeR

> [...String.fromCodePoint(0x10000)].length
1 // 1 codepoint
> require('iconv-lite').encode(String.fromCodePoint(0x10000), 'utf32', { addBOM: false })
<Buffer 00 00 01 00> // 1 codepoint
> require('iconv-lite').encode(String.fromCodePoint(0x10000), 'windows1252').toString()
'??' // two replacements?

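This happens for every codepoint above 0xFFFF, i.e. every codepoint that a JavaScript string stores as a surrogate pair of two UTF-16 code units.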

Single-byte encodings map each byte directly to a codepoint, so there is no reason for the encoder to emit two replacement characters for a single unmappable codepoint.
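
To illustrate the difference, here is a minimal sketch of the two behaviours. It assumes (not confirmed here) that the double replacement comes from walking the string by UTF-16 code unit. `encodeByCodeUnit` and `encodeByCodePoint` are hypothetical helpers for ISO-8859-1 only and are not iconv-lite's actual implementation.

```javascript
// Hypothetical single-byte encoder for ISO-8859-1 (Latin-1), written only to
// illustrate the replacement logic; this is NOT iconv-lite's code.
const REPLACEMENT = 0x3f; // '?'

// Walks the string by UTF-16 code unit: each half of a surrogate pair is
// unmappable on its own, so one astral codepoint yields two replacements.
function encodeByCodeUnit(str) {
  const out = [];
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    out.push(unit <= 0xff ? unit : REPLACEMENT);
  }
  return Buffer.from(out);
}

// Walks the string by codepoint: for...of on a string yields whole
// codepoints, so each unmappable codepoint yields exactly one replacement.
function encodeByCodePoint(str) {
  const out = [];
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    out.push(cp <= 0xff ? cp : REPLACEMENT);
  }
  return Buffer.from(out);
}

const input = 'a' + String.fromCodePoint(0x10000) + 'b';
console.log(encodeByCodeUnit(input).toString('latin1'));  // a??b
console.log(encodeByCodePoint(input).toString('latin1')); // a?b
```

The codepoint-based variant matches the expectation above: one input codepoint produces one output byte, whether it is mappable or replaced.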
