fix(sim): decode simulator output on Windows #1167
base: master
Changes from 1 commit
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -11,6 +11,7 @@` |
| 11 | 11 | ` import sys` |
| 12 | 12 | ` import os` |
| 13 | 13 | ` from os import environ, listdir, pathsep` |
| | 14 | `+import locale` |
| 14 | 15 | ` import subprocess` |
| 15 | 16 | ` from pathlib import Path` |
| 16 | 17 | ` from typing import List` |
| | | `@@ -360,14 +361,36 @@ def check_output(command, env=None):` |
| 360 | 361 | `     """` |
| 361 | 362 | `     Wrapper arround subprocess.check_output` |
| 362 | 363 | `     """` |
| | 364 | `+    def _decode(data: bytes) -> str:` |
| | 365 | `+        """Decode tool output robustly across platforms.` |
| | 366 | `+` |
| | 367 | `+        Some simulators on Windows emit output in a legacy code page (e.g. cp1252),` |
| | 368 | `+        which can raise UnicodeDecodeError if decoded as strict UTF-8.` |
| | 369 | `+        """` |
| | 370 | `+` |
| | 371 | `+        encodings_to_try = (` |
| | 372 | `+            "utf-8",` |
| | 373 | `+            "utf-8-sig",` |
| | 374 | `+            locale.getpreferredencoding(False) or "utf-8",` |
| | 375 | `+            "cp1252",` |
| | 376 | `+        )` |
| | 377 | `+` |
| | 378 | `+        for encoding in encodings_to_try:` |
| | 379 | `+            try:` |
| | 380 | `+                return data.decode(encoding)` |
| | 381 | `+            except UnicodeDecodeError:` |
| | 382 | `+                continue` |
| | 383 | `+` |
| | 384 | `+        return data.decode("utf-8", errors="replace")` |
| | 385 | `+` |
| 363 | 386 | `     try:` |
| 364 | 387 | `         output = subprocess.check_output( # pylint: disable=unexpected-keyword-arg` |
| 365 | 388 | `             command, env=env, stderr=subprocess.STDOUT` |
| 366 | 389 | `         )` |
| 367 | 390 | `     except subprocess.CalledProcessError as err:` |
| 368 | | `-        err.output = err.output.decode("utf-8")` |
| | 391 | `+        err.output = _decode(err.output)` |
| 369 | 392 | `         raise err` |
| 370 | | `-    return output.decode("utf-8")` |
| | 393 | `+    return _decode(output)` |
| 371 | 394 | |
| 372 | 395 | |
| 373 | 396 | ` def check_executable(simulator_name, prefix, executable_name):` |
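To try the fallback chain from this diff outside the project, a standalone sketch could look like the following. The function name `decode_with_fallback` and the sample inputs are illustrative assumptions, not part of the patch, and the result for legacy bytes depends on the locale encoding of the machine running it.

```python
import locale


def decode_with_fallback(data: bytes) -> str:
    """Hypothetical standalone copy of the fallback chain shown in the diff."""
    encodings_to_try = (
        "utf-8",
        "utf-8-sig",
        locale.getpreferredencoding(False) or "utf-8",
        "cp1252",
    )
    for encoding in encodings_to_try:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; try the next one.
            continue
    # Last resort: never raise, substitute U+FFFD for undecodable bytes.
    return data.decode("utf-8", errors="replace")


# Valid UTF-8 is returned by the first entry in the chain.
utf8_result = decode_with_fallback("Größe".encode("utf-8"))

# Bytes from a legacy code page are not valid UTF-8, so strict decoding
# would raise; the chain falls through to a later entry instead.
legacy_result = decode_with_fallback("Größe".encode("cp1252"))

print(repr(utf8_result), repr(legacy_result))
```

The final `errors="replace"` step matters because cp1252 itself leaves a few byte values undefined, so even the last named encoding can fail.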
If the bytes are encoded with utf-8-sig, decoding them as utf-8 succeeds without errors, but the byte order mark survives as an extra leading character (U+FEFF).
If plain utf-8 is decoded as utf-8-sig, it works most of the time, because the decoder sees that the expected leading bytes are missing and simply decodes the data as utf-8.
The exception is a utf-8 string that legitimately starts with the bytes expected at the beginning of a utf-8-sig string, in which case that leading character is silently dropped. Rare, but fully legal.
To be strict, I suggest using just utf-8. The worst case is that some extra characters get printed.
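The three cases described above can be checked with a short, self-contained sketch. The sample string is an arbitrary assumption; only standard-library codecs are used.

```python
import codecs

text = "hello"
plain = text.encode("utf-8")
# utf-8-sig encoding is plain UTF-8 prefixed with the BOM bytes EF BB BF.
with_bom = codecs.BOM_UTF8 + plain

# Case 1: utf-8-sig bytes decoded as utf-8 succeed, but U+FEFF is kept.
as_utf8 = with_bom.decode("utf-8")

# Case 2: plain utf-8 decoded as utf-8-sig; no BOM is present, so nothing
# is stripped and the text comes back unchanged.
as_sig = plain.decode("utf-8-sig")

# Case 3: a UTF-8 string that legitimately starts with U+FEFF loses that
# character when decoded as utf-8-sig.
legit = "\ufeffhello".encode("utf-8")
stripped = legit.decode("utf-8-sig")

print(repr(as_utf8), repr(as_sig), repr(stripped))
```

Since the BOM bytes are themselves valid UTF-8, any input that decodes as utf-8-sig also decodes as utf-8. In a chain that tries utf-8 first, a later utf-8-sig entry therefore never gets to succeed.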
Thanks, that is a fair point. I will drop utf-8-sig from the chain: