Add UTF-8 capabilities #3

Open
@javierguerragiraldez

Lua 5.3 already includes some UTF-8 support:

  • the '\u{XXX}' escape in string literals, which embeds the UTF-8 encoding of a code point
  • %U in lua_pushfstring (C API), which formats a code point as its UTF-8 byte sequence
  • utf8 library (for codepoint handling, no Unicode semantics)
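
As a minimal sketch of what the items above look like from the Lua side (assuming Lua 5.3 or later; the sample string is made up for illustration):

```lua
-- "\u{XXX}" escapes embed the UTF-8 encoding directly in the literal.
local s = "\u{48}\u{E9}\u{20AC}"        -- U+0048, U+00E9, U+20AC

-- The # operator counts bytes; utf8.len counts code points.
local nbytes = #s                        -- 1 + 2 + 3 = 6 bytes
local nchars = utf8.len(s)               -- 3 code points

-- utf8.codes iterates over (byte position, code point) pairs.
local positions, codepoints = {}, {}
for pos, cp in utf8.codes(s) do
  positions[#positions + 1] = pos
  codepoints[#codepoints + 1] = cp
end

-- utf8.char is the inverse: code points in, UTF-8 string out.
local rebuilt = utf8.char(table.unpack(codepoints))
```

The library works on code points only: it knows nothing about case, combining sequences, or any other Unicode property.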

Surprisingly, it doesn't seem to include:

  • %U in string.format
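
Until something like that exists, the gap can be papered over in plain Lua. The following is only a sketch: `format_u` is a hypothetical helper name, not part of Lua, and it assumes Lua 5.3 or later. It rewrites each `%U` into `%s` and converts the matching argument with `utf8.char`.

```lua
-- Hypothetical helper (not part of Lua): string.format plus a %U specifier.
local function format_u(fmt, ...)
  local args = table.pack(...)
  local i = 0  -- index of the argument consumed by the current specifier

  -- Match "%", optional flags/width/precision, then the conversion letter
  -- (or a second "%" for the literal-percent escape).
  fmt = fmt:gsub("%%([^%a%%]*)([%a%%])", function(flags, conv)
    if conv == "%" then
      return nil                          -- "%%" consumes no argument; keep as is
    end
    i = i + 1
    if conv == "U" then
      args[i] = utf8.char(args[i])        -- code point -> UTF-8 string
      return "%" .. flags .. "s"          -- let string.format handle it as %s
    end
    -- any other specifier: no return value, so gsub keeps the original text
  end)

  return string.format(fmt, table.unpack(args, 1, args.n))
end

local euro = format_u("%U = U+%04X", 0x20AC, 0x20AC)
local pct  = format_u("100%% %U", 0x41)
```

Here `euro` holds the euro sign followed by ` = U+20AC`, and `pct` holds `100% A`. Width and precision on `%U` then apply to bytes rather than characters, which is one reason native support would be preferable.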

Other things that could be managed by a separate / optional library:

  • conversion between encodings (Windows still uses a mixture of UCS-2 and UTF-16)
  • collation
  • normalization, case folding
  • text boundaries
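
Of these, encoding conversion is the one that needs no Unicode data tables, so it can already be written against the stock 5.3 library. A rough sketch, assuming Lua 5.3 or later (`utf8_to_utf16le` is a hypothetical name, and invalid input simply raises whatever error `utf8.codes` raises):

```lua
-- Hypothetical helper: convert a UTF-8 string to UTF-16LE bytes.
local function utf8_to_utf16le(s)
  local out = {}
  for _, cp in utf8.codes(s) do
    if cp >= 0x10000 then
      -- Outside the BMP: split into a high/low surrogate pair.
      cp = cp - 0x10000
      out[#out + 1] = string.pack("<I2I2",
        0xD800 + (cp >> 10),              -- high surrogate: top 10 bits
        0xDC00 + (cp & 0x3FF))            -- low surrogate: bottom 10 bits
    else
      out[#out + 1] = string.pack("<I2", cp)  -- one 16-bit little-endian unit
    end
  end
  return table.concat(out)
end

local ascii = utf8_to_utf16le("A")            -- bytes 41 00
local emoji = utf8_to_utf16le("\u{1F600}")    -- bytes 3D D8 00 DE
```

Collation, normalization, case folding and text boundaries are different: they depend on the Unicode character database, which is where the table size comes from.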

The most obvious objection to including these capabilities in the language is the need for big tables. I think it would be valuable to evaluate what the basic language can do to make a binding as transparent as possible, without a hard dependency.
