Fix escaped untracked files #440
Related Issue: #439
diff_cover/util.py
    the filename.
    """
    if filename.startswith(chr(34)) and filename.endswith(chr(34)):
        filename = ast.literal_eval(filename)
So I am freaked out by including literal_eval in the code, even if it's safer than eval.
I am not super familiar with C-style escapes, so I poked around a bit. Not finding much, I ended up talking to Copilot. After some chat it came up with this:
def to_unescaped_filename(filename: str) -> str:
    """Try to unescape the given filename.

    Some filenames given by git might be escaped with C-style escape sequences
    and surrounded by double quotes.
    """
    if not (filename.startswith('"') and filename.endswith('"')):
        return filename

    # Remove surrounding quotes
    unquoted = filename[1:-1]

    # Handle C-style escape sequences
    result = []
    i = 0
    while i < len(unquoted):
        if unquoted[i] == '\\' and i + 1 < len(unquoted):
            # Handle common C escape sequences
            next_char = unquoted[i + 1]
            result.append({
                '\\': '\\',
                '"': '"',
                "'": "'",
                'n': '\n',
                't': '\t',
                'r': '\r',
                'b': '\b',
                'f': '\f',
            }.get(next_char, next_char))
            i += 2
        else:
            result.append(unquoted[i])
            i += 1

    return ''.join(result)
I checked out your branch and ran the tests. Any reason to think this would be a problem?
Definitely understand the hesitation. I only suggested it because the other solutions seemed to rely on certain encodings, and I'm not sure how that behaves from system to system.
I tried to find the C-style encoding that the git man page talks about. The first thing I stumbled upon was
https://github.com/git/git/blob/a554262210b4a2ee6fa2d594e1f09f5830888c56/quote.c#L267
Not sure if that is the relevant code, but tbh I am not sure what exactly is happening here anyway.
But they look into a char array which looks somewhat similar to the dict you use:
https://github.com/git/git/blob/a554262210b4a2ee6fa2d594e1f09f5830888c56/quote.c#L222
Maybe adding 'a' and removing ' would make it an exact inverse of that table; I'm not really sure here.
Otherwise the function looks sensible and the tests pass. Also checked if I encounter any exceptions with the files that currently lead to problems (should be covered by the tests, but you never know) and all works fine.
Even if the function isn't perfect, the current state isn't either, so it at least fixes a known subset.
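For reference, here is a rough sketch of what a closer inverse of that lookup table might look like. It is not the code in this PR. The name `unquote_c_style_sketch` and the `_SIMPLE_ESCAPES` table are made up for illustration, and the escape set (`a b t n v f r`, `"`, `\`, plus three-digit octal escapes starting with 0-3) is my reading of the linked quote.c, so treat it as an assumption. It also assumes filenames are UTF-8, which is not guaranteed on every system.

```python
# Hypothetical helper, not part of diff_cover. Escape set assumed from git's quote.c.
_SIMPLE_ESCAPES = {
    "a": 0x07, "b": 0x08, "t": 0x09, "n": 0x0A,
    "v": 0x0B, "f": 0x0C, "r": 0x0D, '"': 0x22, "\\": 0x5C,
}


def unquote_c_style_sketch(filename: str) -> str:
    """Undo git-style C quoting, including octal byte escapes (assumes UTF-8 names)."""
    if len(filename) < 2 or not (filename.startswith('"') and filename.endswith('"')):
        return filename

    body = filename[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\" or i + 1 >= len(body):
            # Ordinary character (or a trailing lone backslash): copy as-is
            out += ch.encode("utf-8")
            i += 1
            continue

        octal = body[i + 1 : i + 4]
        nxt = body[i + 1]
        if len(octal) == 3 and octal[0] in "0123" and all(c in "01234567" for c in octal):
            # \ooo encodes one raw byte of the filename
            out.append(int(octal, 8))
            i += 4
        elif nxt in _SIMPLE_ESCAPES:
            out.append(_SIMPLE_ESCAPES[nxt])
            i += 2
        else:
            # Unknown escape: keep the character after the backslash
            out += nxt.encode("utf-8")
            i += 2

    # Bytes that are not valid UTF-8 become U+FFFD instead of raising
    return out.decode("utf-8", errors="replace")
```

Collecting bytes first and decoding once at the end is what lets a multi-byte character such as `\303\244` come back as a single character. A per-character lookup would leave the digits in the name.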
I'm happy to evolve this over time if other people find issues. How about putting this code into this PR? Then I can merge it.
Sorry that it took me a while. I added your changes in a separate commit, and rebased on the latest commit from your main branch.
Files that are escaped by git have to be unescaped so that the diff_reporter can open the file.
Instead of using ast.literal_eval, use lookup-based unescape logic.
Alright, let's get this merged. Thanks for the PR!