
make_tokenizer doesn't deal with binary tokenizers #46

@gsnedders

Description

from funcparserlib.lexer import make_tokenizer

tokenize = make_tokenizer([
    (u'x', (br'\xff\n',)),
])

tokens = list(tokenize(b"\xff\n"))

throws

  File "/Users/gsnedders/Documents/other-projects/funcparserlib/funcparserlib/funcparserlib/tests/test_parsing.py", line 76, in test_tokenize_bytes
    tokens = list(tokenize(b"\xff\n"))
  File "/Users/gsnedders/Documents/other-projects/funcparserlib/funcparserlib/funcparserlib/lexer.py", line 107, in f
    t = match_specs(compiled, str, i, (line, pos))
  File "/Users/gsnedders/Documents/other-projects/funcparserlib/funcparserlib/funcparserlib/lexer.py", line 91, in match_specs
    nls = value.count(u'\n')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)

match_specs needs to handle both unicode and bytes line-feed characters. It currently counts newlines with value.count(u'\n') unconditionally; under Python 2, counting a unicode needle in a byte string triggers an implicit ascii decode of the bytes, which blows up on the b'\xff' byte, as the traceback above shows.
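
A minimal sketch of one possible fix, following the names in the traceback (advance_position is a hypothetical helper, not the actual lexer.py code): pick the newline literal based on the type of the matched value, so that count and rfind never mix bytes and unicode.

def advance_position(value, line, pos):
    # Choose the newline literal matching the type of value, so that
    # bytes input is never implicitly decoded as ascii (the source of
    # the UnicodeDecodeError above).
    nl = b'\n' if isinstance(value, bytes) else u'\n'
    nls = value.count(nl)
    if nls == 0:
        return line, pos + len(value)
    # Column is measured from the last newline in the matched value.
    return line + nls, len(value) - value.rfind(nl) - 1

With this, advance_position(b"\xff\n", 1, 0) returns (2, 0) without touching any codec, and the unicode path behaves exactly as before.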
