Description
What version of ripgrep are you using?
ripgrep 12.1.1 (rev 7cb2113)
-SIMD -AVX (compiled)
+SIMD -AVX (runtime)
How did you install ripgrep?
Unzipped the Windows 64-bit release archives (the MSVC and GNU builds behave identically; the output below is from MSVC).
What operating system are you using ripgrep on?
Windows 10 Pro
Build 18362.19h1_release.190318-1202
Describe your bug.
Umlauts are output inconsistently on Windows, depending on whether ripgrep is run from CMD or from Git Bash.
What are the steps to reproduce the behavior?
Both shells are configured to use Lucida Console font.
Using standard CMD (PowerShell not tested):
cd ripgrep-test
type subdir\ExampleWithUmlauts.cs
- Displays garbled umlauts
rg SubjectCodes
- Displays file name with Windows-like backslash
- Displays correct umlauts
Using Git Bash (MINGW64):
cd ripgrep-test
cat subdir/ExampleWithUmlauts.cs
- Displays correct umlauts
rg SubjectCodes
- Displays file name with an unwanted backslash instead of a UNIX-like (forward) slash
- Displays garbled umlauts
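One plausible explanation for the garbled umlauts, offered here as an assumption and not confirmed anywhere in this report, is that UTF-8 bytes are being interpreted in a legacy Windows code page by one side of the pipeline. The following minimal Python sketch shows what that kind of mismatch does to a word from the example file. The helper name `garble` and the choice of code pages (cp850 and cp1252) are illustrative only.

```python
def garble(text: str, wrong_codec: str) -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong codec.

    This mimics a terminal or tool that assumes a legacy code page
    while the data it receives is actually UTF-8.
    """
    return text.encode("utf-8").decode(wrong_codec)


def ungarble(garbled: str, wrong_codec: str) -> str:
    """Reverse garble(): recover the original text from the mojibake."""
    return garbled.encode(wrong_codec).decode("utf-8")


if __name__ == "__main__":
    word = "Verkäufer"
    # cp850 is a common OEM console code page, cp1252 a common ANSI one
    # (assumed here; the report does not state which code pages are active).
    for codec in ("cp850", "cp1252"):
        broken = garble(word, codec)
        print(codec, repr(broken), "->", repr(ungarble(broken, codec)))
```

If the garbled output in either shell matches one of these patterns, that would point to a code page mismatch between ripgrep's output and the terminal, not to a problem in the file itself.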
What is the actual behavior?
CMD:
C:\Users\Sandra.Eickel\Documents\ripgrep-test>type subdir\ExampleWithUmlauts.cs
namespace ZUGFeRD_Test
{
class ZugFerd1ExtendedWarenrechnungGenerator
{
private InvoiceDescriptor _generateDescriptor()
{
desc.AddNote("Es bestehen Rabatt- oder Bonusvereinbarungen.", SubjectCodes.AAK, ContentCodes.ST3);
desc.AddNote("Der Verkäufer bleibt Eigentümer der Waren bis zu vollständigen Erfüllung der Kaufpreisforderung.", SubjectCodes.AAJ, ContentCodes.EEV);
desc.AddNote("MUSTERLIEFERANT GMBH BAHNHOFSTRASSE 99 99199 MUSTERHAUSEN Geschäftsführung: Max Mustermann USt-IdNr: DE123456789 Telefon: +49 932 431 0 www.musterlieferant.de HRB Nr. 372876 Amtsgericht Musterstadt GLN 4304171000002 WEEE-Reg-Nr.: DE87654321",
SubjectCodes.REG);
desc.AddNote("Leergutwert: 46,50");
};
}
}
C:\Users\Sandra.Eickel\Documents\ripgrep-test>rg --debug SubjectCodes
DEBUG|grep_regex::literal|crates\regex\src\literal.rs:58: literal prefixes detected: Literals { lits: [Complete(SubjectCodes)], limit_size: 250, limit_class: 10 }
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, subdir\ExampleWithUmlauts.cs
7: desc.AddNote("Es bestehen Rabatt- oder Bonusvereinbarungen.", SubjectCodes.AAK, ContentCodes.ST3);
8: desc.AddNote("Der Verkäufer bleibt Eigentümer der Waren bis zu vollständigen Erfüllung der Kaufpreisforderung.", SubjectCodes.AAJ, ContentCodes.EEV);
10: SubjectCodes.REG);
12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
Git Bash:
$ cat subdir/ExampleWithUmlauts.cs
namespace ZUGFeRD_Test
{
class ZugFerd1ExtendedWarenrechnungGenerator
{
private InvoiceDescriptor _generateDescriptor()
{
desc.AddNote("Es bestehen Rabatt- oder Bonusvereinbarungen.", SubjectCodes.AAK, ContentCodes.ST3);
desc.AddNote("Der Verkäufer bleibt Eigentümer der Waren bis zu vollständigen Erfüllung der Kaufpreisforderung.", SubjectCodes.AAJ, ContentCodes.EEV);
desc.AddNote("MUSTERLIEFERANT GMBH BAHNHOFSTRASSE 99 99199 MUSTERHAUSEN Geschäftsführung: Max Mustermann USt-IdNr: DE123456789 Telefon: +49 932 431 0 www.musterlieferant.de HRB Nr. 372876 Amtsgericht Musterstadt GLN 4304171000002 WEEE-Reg-Nr.: DE87654321",
SubjectCodes.REG);
desc.AddNote("Leergutwert: 46,50");
};
}
}
$ rg --debug SubjectCodes
DEBUG|grep_regex::literal|crates\regex\src\literal.rs:58: literal prefixes detected: Literals { lits: [Complete(SubjectCodes)], limit_size: 250, limit_class: 10 }
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
subdir\ExampleWithUmlauts.cs
7: desc.AddNote("Es bestehen Rabatt- oder Bonusvereinbarungen.", SubjectCodes.AAK, ContentCodes.ST3);
8: desc.AddNote("Der Verkäufer bleibt Eigentümer der Waren bis zu vollständigen Erfüllung der Kaufpreisforderung.", SubjectCodes.AAJ, ContentCodes.EEV);
10: SubjectCodes.REG);
What is the expected behavior?
It is nice that ripgrep produces readable umlauts when run from CMD, but its behaviour under Git Bash is less helpful. There it should also output readable umlauts, and it should print paths with UNIX-like forward slashes instead of problematic backslashes, including drive prefixes in the form "/c/path" for "C:\path".
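To make the requested path format concrete, here is a minimal Python sketch of the mapping from a Windows path to the Git Bash (MSYS) form. It is only an illustration of the expected output, not ripgrep code; the function name `to_msys_path` is hypothetical, and UNC paths and drive-relative paths such as "C:foo" are deliberately not handled.

```python
import re


def to_msys_path(windows_path: str) -> str:
    """Convert a Windows path like C:\\path to the MSYS form /c/path."""
    # Replace every backslash with a forward slash.
    path = windows_path.replace("\\", "/")
    # Turn a leading drive prefix such as "C:" into "/c", but only when
    # it is followed by a slash or the end of the string.
    match = re.match(r"^([A-Za-z]):(?=/|$)", path)
    if match:
        path = "/" + match.group(1).lower() + path[2:]
    return path


if __name__ == "__main__":
    print(to_msys_path(r"C:\path"))                       # /c/path
    print(to_msys_path(r"subdir\ExampleWithUmlauts.cs"))  # subdir/ExampleWithUmlauts.cs
```

Relative paths only have their separators changed, which matches the "subdir/ExampleWithUmlauts.cs" form that `cat` accepts in Git Bash above.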
I did not test paths containing spaces or umlauts, which, at least for spaces, might have to be treated differently depending on the environment ...
See also #234 and #530 - even if those refer to file name globbing differences.