Fewer allocations in Transliterator runtime #3978

Open
@skius

Description

Segments (regex capture groups) currently use a Vec<String>. That should almost certainly be a SmallVec; it could potentially even be a ShortSlice.
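
To illustrate the idea without pulling in a dependency, here is a minimal std-only sketch of an inline-first container. Everything in it (`Segments`, `INLINE_CAP`, `spilled`) is a hypothetical stand-in, not the transliterator's actual types, and the inline capacity of 2 is an arbitrary assumption. A real change would presumably use the `smallvec` crate rather than a hand-rolled enum.

```rust
/// Assumed inline capacity; the right number depends on how many
/// segments typical rules actually capture.
const INLINE_CAP: usize = 2;

/// Hypothetical stand-in for something like `SmallVec<[String; 2]>`.
#[derive(Debug)]
enum Segments {
    /// Up to `INLINE_CAP` segments stored without a heap-allocated list.
    Inline { buf: [String; INLINE_CAP], len: usize },
    /// Fallback once the inline storage is full.
    Heap(Vec<String>),
}

impl Segments {
    fn new() -> Self {
        // `String::new()` does not allocate, so an empty `Segments` is free.
        Segments::Inline { buf: Default::default(), len: 0 }
    }

    fn push(&mut self, segment: String) {
        match self {
            Segments::Inline { buf, len } if *len < INLINE_CAP => {
                buf[*len] = segment;
                *len += 1;
            }
            Segments::Inline { buf, len } => {
                // Inline storage is full: move everything into a Vec.
                let mut spilled = Vec::with_capacity(*len + 1);
                for slot in buf.iter_mut().take(*len) {
                    spilled.push(std::mem::take(slot));
                }
                spilled.push(segment);
                *self = Segments::Heap(spilled);
            }
            Segments::Heap(v) => v.push(segment),
        }
    }

    fn as_slice(&self) -> &[String] {
        match self {
            Segments::Inline { buf, len } => &buf[..*len],
            Segments::Heap(v) => v.as_slice(),
        }
    }

    /// True once the container has fallen back to a heap-allocated Vec.
    fn spilled(&self) -> bool {
        matches!(self, Segments::Heap(_))
    }
}
```

Note that this only removes the allocation for the list itself; each `String` segment still owns its own buffer, so storing ranges into the input instead of owned strings would be a separate question.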

Component-based hardcoded transliterators like NFC allocate, because the Writable/Write-based API is incompatible with the in-place modifications we do when transliterating. I don't know exactly how the normalization algorithms work, but if they are forward-only (no lookback), we could expose that and avoid allocating an intermediate String.
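
As a rough sketch of the difference, assuming a transform really is forward-only: the functions below are hypothetical, and the per-char uppercase mapping is only a stand-in that is trivially forward-only. It does not implement NFC or any normalization, and whether the real normalizers could be driven this way is exactly the open question above.

```rust
use std::fmt::{self, Write};

/// Stand-in for a forward-only transform: the output for each input char
/// depends only on that char, never on output that was already emitted.
fn map_char(c: char) -> impl Iterator<Item = char> {
    c.to_uppercase()
}

/// Roughly the current shape: build an intermediate `String`, which the
/// caller then splices into its own buffer.
fn transform_to_string(input: &str) -> String {
    input.chars().flat_map(map_char).collect()
}

/// Forward-only shape: stream the output straight into the caller's sink,
/// so no intermediate `String` is allocated by the transform itself.
fn transform_into<W: Write>(input: &str, sink: &mut W) -> fmt::Result {
    for c in input.chars() {
        for out in map_char(c) {
            sink.write_char(out)?;
        }
    }
    Ok(())
}
```

With the second shape the caller can append directly to a destination it already owns, for example `transform_into(source, &mut dest)` where `dest` is an existing `String`.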

    Labels

    C-transliterator (Component: transliterator), C-unicode (Component: Props, sets, tries)