Possible bug in line break trie data #29

Open
@toptensoftware

I was just going through cleaning up my version of this code that I ported to C# and noticed a possible problem with the way the trie data is generated for the line break data.

See this line of code:

    if ((type != null) && (rangeType !== type)) {

This code seems to be trying to coalesce consecutive ranges with the same class into a single trie.setRange call. The problem is that it checks for a change of class, but not that the ranges are consecutive and without gaps, so any code points sitting in a gap between two same-class ranges would be swept into the merged range.

Shouldn't that line read something like this:

    if ((type != null) && ((rangeType !== type) || (parseInt(rangeStart, 16) !== parseInt(end, 16) + 1))) {
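To make the gap problem concrete, here is a minimal sketch of the coalescing loop with and without an adjacency check. The `coalesce` helper, its `requireAdjacent` flag, and the sample entries are hypothetical and made up for illustration; the sketch uses plain integers rather than the hex strings in the generator, records the calls in an array instead of calling `trie.setRange`, and assumes the last pending range is flushed after the loop.

```javascript
// Hypothetical helper: turn [start, end, type] entries into setRange-style calls.
// With requireAdjacent = false it only flushes when the class changes.
// With requireAdjacent = true it also flushes when the next range is not adjacent.
function coalesce(entries, requireAdjacent) {
  const calls = [];
  let start = null;
  let end = null;
  let type = null;

  for (const [rangeStart, rangeEnd, rangeType] of entries) {
    const classChanged = rangeType !== type;
    const hasGap = requireAdjacent && rangeStart !== end + 1;

    // Only flush when a range is actually pending (type !== null).
    if (type !== null && (classChanged || hasGap)) {
      calls.push([start, end, type]);
      type = null;
    }
    if (type === null) {
      start = rangeStart;
      type = rangeType;
    }
    end = rangeEnd;
  }

  // Assumption: the final pending range is flushed after the loop.
  if (type !== null) {
    calls.push([start, end, type]);
  }
  return calls;
}

// Made-up sample data: two 'AL' ranges with 0x20..0x2f missing between them.
const entries = [
  [0x10, 0x1f, 'AL'],
  [0x30, 0x3f, 'AL'],
  [0x40, 0x4f, 'BA'],
];

const withoutCheck = coalesce(entries, false);
const withCheck = coalesce(entries, true);
console.log(withoutCheck); // the two 'AL' ranges merge, covering the gap
console.log(withCheck);    // the two 'AL' ranges stay separate
```

Without the check, the unlisted code points 0x20..0x2f end up inside the merged 'AL' range; with it, they are left at whatever default the trie was built with.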

Also, it seems the UnicodeTrieBuilder already coalesces runs so this isn't even necessary. I reduced it to this:

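    // Each data line is "XXXX;CLASS # ..." or "XXXX..YYYY;CLASS # ...".
    var re = /^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(.*?)\s*#/gm;
    var m;
    while ((m = re.exec(data)) !== null) {
      var from = parseInt(m[1], 16);
      // A single code point has no "..YYYY" part, so the range is one wide.
      var to = m[2] === undefined ? from : parseInt(m[2], 16);
      var prop = m[3];

      lineBreakClassesTrie.setRange(from, to, LineBreakClass[prop], true);
    }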
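As a quick check of what that regular expression captures, here is a small sketch that runs it over a few sample lines laid out like LineBreak.txt. The `parseRanges` helper and the sample string are assumptions for illustration; it collects `[from, to, prop]` triples instead of calling `setRange`, so it runs without the trie builder.

```javascript
// Hypothetical helper: extract [from, to, prop] triples from LineBreak.txt-style text.
function parseRanges(data) {
  const re = /^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(.*?)\s*#/gm;
  const ranges = [];
  let m;
  while ((m = re.exec(data)) !== null) {
    const from = parseInt(m[1], 16);
    const to = m[2] === undefined ? from : parseInt(m[2], 16);
    ranges.push([from, to, m[3]]);
  }
  return ranges;
}

// Illustrative sample in the data file's layout, not the full file.
const sample = [
  '# lines starting with "#" do not match, since they lack a leading hex code point',
  '0000..0008;CM     # Cc     [9] <control>..<control>',
  '0009;BA     # Cc         <control>',
  '000A;LF     # Cc         <control>',
].join('\n');

const ranges = parseRanges(sample);
console.log(ranges);
```

Each triple maps directly onto one `setRange(from, to, LineBreakClass[prop], true)` call in the loop above.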

Unfortunately none of this resolves any of those 30 non-passing tests.
