Norminette crashes against .c/.h files that aren't plain text #567

@grogu-42


Describe the bug
Running norminette against a file that has a .c/.h extension but isn't plain text makes it crash with the following output:

Setting locale to en_US
Traceback (most recent call last):
  File "/tmp/env/bin/norminette", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/tmp/env/lib/python3.12/site-packages/norminette/__main__.py", line 136, in main
    tokens = list(lexer)
             ^^^^^^^^^^^
  File "/tmp/env/lib/python3.12/site-packages/norminette/lexer/lexer.py", line 536, in __iter__
    while token := self.get_next_token():
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/env/lib/python3.12/site-packages/norminette/lexer/lexer.py", line 512, in get_next_token
    while self.raw_peek():
          ^^^^^^^^^^^^^^^
  File "/tmp/env/lib/python3.12/site-packages/norminette/lexer/lexer.py", line 104, in raw_peek
    if (pos := self.__pos + offset) < len(self.file.source):
                                          ^^^^^^^^^^^^^^^^
  File "/tmp/env/lib/python3.12/site-packages/norminette/file.py", line 20, in source
    self._source = file.read()
                   ^^^^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd8 in position 96: invalid continuation byte

Additional info
norminette 3.3.59, Python 3.12.3, Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.39
I also tested 3.3.55: it has the same issue, but prints only the error without a stack trace:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd8 in position 96: invalid continuation byte

Reproduction steps

  • Copy any executable and append a .c or .h extension to its name
  • Run norminette on it
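
For anyone without a convenient binary at hand, the underlying decode failure can be reproduced in isolation. This is only a sketch: the file name `not_text.c` and both helper functions are made up for illustration, and the bytes are chosen so that the invalid byte lands at the same position as in the trace above. It does not invoke norminette; it only reads the file in text mode as UTF-8, which is what the traceback suggests `file.py` ends up doing.

```python
import os
import tempfile


def make_fake_c_file(directory: str) -> str:
    """Write bytes that are not valid UTF-8 to a file with a .c extension."""
    path = os.path.join(directory, "not_text.c")  # hypothetical file name
    # 0xd8 opens a two-byte UTF-8 sequence, but the following 0x00 is not a
    # valid continuation byte, so decoding fails at offset 96.
    with open(path, "wb") as f:
        f.write(b"\x7fELF" + b"\x00" * 92 + b"\xd8\x00")
    return path


def read_as_utf8(path: str) -> str:
    """Read the whole file in text mode, decoding it as UTF-8."""
    with open(path, encoding="utf-8") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fake = make_fake_c_file(tmp)
        try:
            read_as_utf8(fake)
        except UnicodeDecodeError as exc:
            print(f"{fake}: {exc}")
```

Running norminette on a file produced this way should hit the same `UnicodeDecodeError`, assuming it opens sources as UTF-8 text.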

Additional context
Is it possible to analyze the file before parsing and output a valid error message if it isn't a plain text file? Or is it intended behavior to let it crash?
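
One possible shape for such a check, as a rough sketch rather than a patch against norminette's actual code: read the file as bytes, attempt the UTF-8 decode explicitly, and turn a failure into a readable message instead of a traceback. The function name `load_text_source` and the wording of the error messages are assumptions made up for this example.

```python
from typing import Optional, Tuple


def load_text_source(path: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (source, None) on success or (None, error_message) on failure.

    Hypothetical helper: the name and message format are illustrative only.
    """
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError as exc:
        return None, f"Error: cannot read {path}: {exc.strerror}"
    try:
        # Decode explicitly so a binary file is reported, not crashed on.
        return raw.decode("utf-8"), None
    except UnicodeDecodeError as exc:
        bad = raw[exc.start]
        return None, (
            f"Error: {path} is not valid UTF-8 text "
            f"(byte 0x{bad:02x} at offset {exc.start})"
        )
```

A caller could print the message and skip the file (or exit with a non-zero status) whenever the second element is not `None`. Note that decoding the raw bytes this way does not apply the newline translation that text-mode `open()` performs, so a real fix might prefer catching `UnicodeDecodeError` around the existing read instead.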
