
Handle smart case per each pattern separately #1980


Closed
xzfc wants to merge 1 commit


@xzfc xzfc commented Aug 29, 2021

When -S/--smart-case is enabled, analyze and apply a case-insensitivity flag to each search pattern separately rather than globally.

For the PCRE2 engine, wrap each case-insensitive pattern in (?i:…). For example, rg --pcre2 --smart-case -e foo -e bAr gives (?i:foo)|bAr.
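The wrapping step can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: `is_smart_case_insensitive` and `combine_smart_case` are hypothetical names, and the smart-case test is deliberately simplified to "the pattern contains no uppercase character" (a real check would also have to skip escapes such as `\W`).

```rust
/// Simplified smart-case test (assumption): a pattern is searched
/// case-insensitively when it contains no uppercase character.
/// Escapes such as `\W` are NOT special-cased in this sketch.
fn is_smart_case_insensitive(pattern: &str) -> bool {
    !pattern.chars().any(char::is_uppercase)
}

/// Wrap each case-insensitive pattern in a scoped `(?i:...)` group,
/// leave the others untouched, and join everything with `|`.
fn combine_smart_case(patterns: &[&str]) -> String {
    patterns
        .iter()
        .map(|p| {
            if is_smart_case_insensitive(p) {
                format!("(?i:{})", p)
            } else {
                p.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join("|")
}

fn main() {
    // Mirrors `-e foo -e bAr` from the description above.
    let combined = combine_smart_case(&["foo", "bAr"]);
    println!("{}", combined); // prints "(?i:foo)|bAr"
}
```

Because the flag is scoped to its own group, `foo` matches regardless of case while `bAr` stays case-sensitive.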

For the default engine, parse each pattern into its own HIR, then combine the results with Hir::alternation. The PCRE2 approach would also work here, but it turned out to be slower on large pattern files.

Fixes #1791


As a side effect, the default engine now validates each pattern separately, so a group can no longer be split across two patterns. The following command therefore no longer works:

$ rg -e "something(" -e ")something"
regex parse error:
    something(
             ^
error: unclosed group
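A toy sketch of why this command was accepted before, assuming the patterns were previously joined with `|` into one expression before parsing (as the PCRE2 example `(?i:foo)|bAr` suggests). `groups_balanced` is a hypothetical helper that only counts unescaped parentheses. It stands in for a real regex parser and ignores character classes.

```rust
/// Toy check (assumption): are the unescaped parentheses balanced?
/// This is not a regex parser and ignores character classes like `[(]`.
fn groups_balanced(pattern: &str) -> bool {
    let mut depth: i32 = 0;
    let mut chars = pattern.chars();
    while let Some(c) = chars.next() {
        match c {
            // Skip whatever character follows a backslash.
            '\\' => {
                chars.next();
            }
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth < 0 {
                    return false; // closing paren with no open group
                }
            }
            _ => {}
        }
    }
    depth == 0
}

fn main() {
    let patterns = ["something(", ")something"];

    // Joined into one expression, the group opens in the first
    // pattern and closes in the second, so the whole is balanced.
    let joined = patterns.join("|");
    println!("joined {:?}: {}", joined, groups_balanced(&joined));

    // Checked one at a time, each pattern is unbalanced on its own.
    for p in patterns.iter() {
        println!("separate {:?}: {}", p, groups_balanced(p));
    }
}
```

The joined form `something(|)something` passes the check, while each pattern fails it individually, which matches the "unclosed group" error reported above.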

This could also be considered a first step toward fixing #478.

@BurntSushi
Owner

Thanks for working on this. Unfortunately, a refactor I did made this PR unmergeable as-is. Actually, the specific kind of change pursued here is probably simpler to do now. With that said, I'm unsure of whether I want to parse every pattern individually like this. It could be a big additional cost when searching a lot of patterns. However, it is something I've been mulling over for making error reporting better. Anyway, I'm going to close this PR out for now but I might re-visit the idea later.

@BurntSushi BurntSushi closed this Jul 8, 2023
Successfully merging this pull request may close these issues.

Support apply smart case to each individual search pattern separately