Consider compile-time constraints on collator options #1903

Open
@hsivonen

@sffc wrote:

Thought: I see that CollatorOptions has a lot of knobs in the form of runtime functions. I am wondering whether there is room to move more of these options to compile-time type parameters, so that we can do static code analysis and identify what code and/or data we can remove. At runtime, you load smaller chunks of data, which is good, but where possible we'd like to go further and make things as good as possible for dead-code elimination.
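To make the suggestion concrete, here is a minimal sketch of what moving one knob to a compile-time type parameter could look like. Everything in it (`ToyCollator`, the `AlternateHandling` marker trait, `is_variable`) is hypothetical and is not the ICU4X API; the comparison is a toy code-point comparison, not real collation. Because `A::SHIFTED` is an associated constant, each monomorphized `compare` contains only one branch, which is the dead-code elimination the proposal is after.

```rust
use std::cmp::Ordering;
use std::marker::PhantomData;

/// Hypothetical marker trait: one collator knob lifted to the type level.
pub trait AlternateHandling {
    const SHIFTED: bool;
}

/// Marker type: variable characters take part in the comparison.
pub struct NonIgnorable;
/// Marker type: variable characters are skipped.
pub struct Shifted;

impl AlternateHandling for NonIgnorable {
    const SHIFTED: bool = false;
}
impl AlternateHandling for Shifted {
    const SHIFTED: bool = true;
}

/// Toy collator, parameterized by the alternate-handling choice.
pub struct ToyCollator<A: AlternateHandling> {
    _alternate: PhantomData<A>,
}

impl<A: AlternateHandling> ToyCollator<A> {
    pub fn new() -> Self {
        Self { _alternate: PhantomData }
    }

    /// Compares by code point. `A::SHIFTED` is known at compile time,
    /// so the branch not taken can be removed per instantiation.
    pub fn compare(&self, left: &str, right: &str) -> Ordering {
        if A::SHIFTED {
            let l = left.chars().filter(|c| !is_variable(*c));
            let r = right.chars().filter(|c| !is_variable(*c));
            l.cmp(r)
        } else {
            left.chars().cmp(right.chars())
        }
    }
}

/// Assumption for illustration only: whitespace and ASCII punctuation
/// stand in for the real set of variable collation elements.
fn is_variable(c: char) -> bool {
    c.is_whitespace() || c.is_ascii_punctuation()
}
```

With this shape, `ToyCollator::<Shifted>::new()` treats "de-luge" and "deluge" as equal, while `ToyCollator::<NonIgnorable>::new()` does not, and a binary that only ever names `NonIgnorable` never instantiates the shifted path.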

I replied:

In principle, the run-time flags can alternatively come as defaults from language-specific collation data. That makes me generally pessimistic about compile-time type parameters.

If you want to eliminate alternate shifted at compile time, you need to promise not to collate Thai. If you want to eliminate backward second level at compile time, you need to promise not to collate Canadian French. If you want to eliminate case first handling, you need to promise not to collate Danish or Maltese.

Even strength has the potential to be overridden by a language-specific default for Japanese. Currently, CLDR explicitly states the tertiary strength for Japanese, which is also the general default, so the override is a no-op for now. As I understand it, tertiary is specified for Japanese for performance reasons, but de jure correctness would call for the identical level.
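The objection can be sketched as an option-resolution step. This is an illustration under stated assumptions, not the ICU4X implementation: `ToyOptions`, `ResolvedOptions`, `locale_defaults`, and `resolve` are hypothetical names, and the hard-coded table stands in for defaults that would really come from language-specific collation data loaded at run time. The locale-to-knob pairs are the ones named above; case-first (Danish, Maltese) and strength would follow the same pattern.

```rust
/// Hypothetical caller-supplied options; `None` means "use the default".
#[derive(Clone, Copy, Debug, Default)]
pub struct ToyOptions {
    pub alternate_shifted: Option<bool>,
    pub backward_second_level: Option<bool>,
}

/// Hypothetical fully resolved options the comparison code would consume.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ResolvedOptions {
    pub alternate_shifted: bool,
    pub backward_second_level: bool,
}

/// Stand-in for defaults read from language-specific collation data.
/// In a real collator these arrive with the data at run time.
fn locale_defaults(locale: &str) -> ToyOptions {
    match locale {
        // Thai data turns on alternate shifted.
        "th" => ToyOptions {
            alternate_shifted: Some(true),
            ..ToyOptions::default()
        },
        // Canadian French data turns on backward second level.
        "fr-CA" => ToyOptions {
            backward_second_level: Some(true),
            ..ToyOptions::default()
        },
        _ => ToyOptions::default(),
    }
}

/// Explicit options win, then locale defaults, then the root default.
pub fn resolve(locale: &str, explicit: ToyOptions) -> ResolvedOptions {
    let defaults = locale_defaults(locale);
    ResolvedOptions {
        alternate_shifted: explicit
            .alternate_shifted
            .or(defaults.alternate_shifted)
            .unwrap_or(false),
        backward_second_level: explicit
            .backward_second_level
            .or(defaults.backward_second_level)
            .unwrap_or(false),
    }
}
```

Since `resolve("th", ToyOptions::default())` yields `alternate_shifted: true` without the caller asking for it, a build that compiled the shifted path out could only be correct if the caller also promised never to pass Thai data.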

Labels: A-design (Area: Architecture or design), C-collator (Component: Collation, normalization), S-medium (Size: Less than a week; larger bug fix or enhancement), help wanted (Issue needs an assignee), question (Unresolved questions; type unclear)
