document Windows locale and path separator settings #1667

Open
@SandraEickel

What version of ripgrep are you using?

ripgrep 12.1.1 (rev 7cb2113)
-SIMD -AVX (compiled)
+SIMD -AVX (runtime)

How did you install ripgrep?

Unzipped the Windows 64-bit builds (MSVC and GNU behave identically; I am using MSVC).

What operating system are you using ripgrep on?

Windows 10 Pro
Build 18362.19h1_release.190318-1202

Describe your bug.

Umlauts are displayed inconsistently on Windows, depending on whether ripgrep is run from CMD or Git Bash.

What are the steps to reproduce the behavior?

Both shells are configured to use Lucida Console font.

Using standard CMD (PowerShell not tested):
cd ripgrep-test

type subdir\ExampleWithUmlauts.cs

  • Displays garbled umlauts

rg SubjectCodes

  • Displays file name with Windows-like backslash
  • Displays correct umlauts

Using Git Bash (MINGW64):
cd ripgrep-test

cat subdir/ExampleWithUmlauts.cs

  • Displays correct umlauts

rg SubjectCodes

  • Displays file name with an unwanted backslash instead of a UNIX-like (forward) slash
  • Displays garbled umlauts
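Garbled umlauts of this kind usually come from bytes being written in one encoding and displayed in another. The sketch below only illustrates that mechanism. The encodings it uses (UTF-8 for the file, code page 850 for the console) and the helper name `misdecode` are assumptions, not findings from this report.

```python
# Illustration only: both encodings are assumptions, not taken from the report.
def misdecode(text: str, written_as: str = "utf-8", read_as: str = "cp850") -> str:
    """Encode text one way and decode the same bytes another way."""
    return text.encode(written_as).decode(read_as)


if __name__ == "__main__":
    original = "Verkäufer"
    garbled = misdecode(original)
    # The single character "ä" becomes two unrelated characters.
    print(garbled)
    # Reversing the two steps recovers the original text.
    print(garbled.encode("cp850").decode("utf-8"))
```

If the mismatch runs the other way (for example a Windows-1252 file shown in a UTF-8 terminal), the symptom differs, so the actual encodings would have to be checked in each shell.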

What is the actual behavior?

CMD:

C:\Users\Sandra.Eickel\Documents\ripgrep-test>type subdir\ExampleWithUmlauts.cs
namespace ZUGFeRD_Test
{
    class ZugFerd1ExtendedWarenrechnungGenerator
    {
        private InvoiceDescriptor _generateDescriptor()
        {
            desc.AddNote("Es bestehen Rabatt- oder Bonusvereinbarungen.", SubjectCodes.AAK, ContentCodes.ST3);
            desc.AddNote("Der Verkäufer bleibt Eigentümer der Waren bis zu vollständigen Erfüllung der Kaufpreisforderung.", SubjectCodes.AAJ, ContentCodes.EEV);
            desc.AddNote("MUSTERLIEFERANT GMBH BAHNHOFSTRASSE 99 99199 MUSTERHAUSEN Geschäftsführung: Max Mustermann USt-IdNr: DE123456789 Telefon: +49 932 431 0 www.musterlieferant.de HRB Nr. 372876 Amtsgericht Musterstadt GLN 4304171000002 WEEE-Reg-Nr.: DE87654321",
                         SubjectCodes.REG);
            desc.AddNote("Leergutwert: 46,50");
        };
    }
}

C:\Users\Sandra.Eickel\Documents\ripgrep-test>rg --debug SubjectCodes
DEBUG|grep_regex::literal|crates\regex\src\literal.rs:58: literal prefixes detected: Literals { lits: [Complete(SubjectCodes)], limit_size: 250, limit_class: 10 }
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, subdir\ExampleWithUmlauts.cs
7:            desc.AddNote("Es bestehen Rabatt- oder Bonusvereinbarungen.", SubjectCodes.AAK, ContentCodes.ST3);
8:            desc.AddNote("Der Verkäufer bleibt Eigentümer der Waren bis zu vollständigen Erfüllung der Kaufpreisforderung.", SubjectCodes.AAJ, ContentCodes.EEV);
10:                         SubjectCodes.REG);
12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes

Git Bash:

$ cat subdir/ExampleWithUmlauts.cs
namespace ZUGFeRD_Test
{
    class ZugFerd1ExtendedWarenrechnungGenerator
    {
        private InvoiceDescriptor _generateDescriptor()
        {
            desc.AddNote("Es bestehen Rabatt- oder Bonusvereinbarungen.", SubjectCodes.AAK, ContentCodes.ST3);
            desc.AddNote("Der Verkäufer bleibt Eigentümer der Waren bis zu vollständigen Erfüllung der Kaufpreisforderung.", SubjectCodes.AAJ, ContentCodes.EEV);
            desc.AddNote("MUSTERLIEFERANT GMBH BAHNHOFSTRASSE 99 99199 MUSTERHAUSEN Geschäftsführung: Max Mustermann USt-IdNr: DE123456789 Telefon: +49 932 431 0 www.musterlieferant.de HRB Nr. 372876 Amtsgericht Musterstadt GLN 4304171000002 WEEE-Reg-Nr.: DE87654321",
                         SubjectCodes.REG);
            desc.AddNote("Leergutwert: 46,50");
        };
    }
}

$ rg --debug SubjectCodes
DEBUG|grep_regex::literal|crates\regex\src\literal.rs:58: literal prefixes detected: Literals { lits: [Complete(SubjectCodes)], limit_size: 250, limit_class: 10 }
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|crates\globset\src\lib.rs:431: built glob set; 0 literals, 0 basenames, 12 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
subdir\ExampleWithUmlauts.cs
7:            desc.AddNote("Es bestehen Rabatt- oder Bonusvereinbarungen.", SubjectCodes.AAK, ContentCodes.ST3);
8:            desc.AddNote("Der Verkäufer bleibt Eigentümer der Waren bis zu vollständigen Erfüllung der Kaufpreisforderung.", SubjectCodes.AAJ, ContentCodes.EEV);
10:                         SubjectCodes.REG);

What is the expected behavior?

It is nice that ripgrep prints readable umlauts when run from CMD, but its behaviour in Git Bash is not helpful. There it should also print readable umlauts, and it should print paths with UNIX-like forward slashes instead of backslashes, including drive prefixes in the form "/c/path" for "C:\path".
I did not test paths containing spaces or umlauts; these (spaces at least) might have to be treated differently depending on the environment.
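To make the requested path form concrete, here is a minimal sketch of the rewriting described above. ripgrep has a `--path-separator` flag for the separator itself. The "/c/path" drive form is only what this report asks for, so the hypothetical helper `to_msys_path` below post-processes a printed path and is not part of ripgrep.

```python
import re

# Hypothetical helper (not part of ripgrep): rewrite a Windows-style path
# such as "C:\path" into the "/c/path" form used by Git Bash.
_DRIVE = re.compile(r"^([A-Za-z]):[\\/]")


def to_msys_path(path: str) -> str:
    # Turn a leading drive prefix like "C:\" into "/c/".
    converted = _DRIVE.sub(lambda m: "/" + m.group(1).lower() + "/", path)
    # Replace the remaining backslashes with forward slashes.
    return converted.replace("\\", "/")


if __name__ == "__main__":
    print(to_msys_path(r"subdir\ExampleWithUmlauts.cs"))
    print(to_msys_path(r"C:\path"))
```

Relative paths such as `subdir\ExampleWithUmlauts.cs` only have their separators changed. Spaces and umlauts pass through untouched, so any quoting would still be up to the calling shell.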

ripgrep-test.zip

See also #234 and #530, although those concern differences in file name globbing.

Labels

doc: An issue with or an improvement to documentation.
help wanted: Others are encouraged to work on this issue.