Description
When a non-Unicode regex that contains named backreferences is parsed with the strict: true
option, a syntax error is always thrown, even if the regex is valid.
Example:
const { RegExpValidator } = require("regexpp")
const validator = new RegExpValidator({ strict: true, ecmaVersion: 2020 })
validator.validatePattern(/(?<foo>A)\k<foo>/.source, undefined, undefined, false)
This produces the following error:
SyntaxError: Invalid regular expression: /(?<foo>A)\k<foo>/: Invalid escape
at RegExpValidator.raise ([...]\regexpp\.temp\src\validator.ts:847:15)
at RegExpValidator.consumeAtomEscape ([...]\regexpp\.temp\src\validator.ts:1475:18)
at RegExpValidator.consumeReverseSolidusAtomEscape ([...]\regexpp\.temp\src\validator.ts:1245:22)
at RegExpValidator.consumeAtom ([...]\regexpp\.temp\src\validator.ts:1213:18)
at RegExpValidator.consumeTerm ([...]\regexpp\.temp\src\validator.ts:1027:23)
at RegExpValidator.consumeAlternative ([...]\regexpp\.temp\src\validator.ts:1000:53)
at RegExpValidator.consumeDisjunction ([...]\regexpp\.temp\src\validator.ts:976:18)
at RegExpValidator.consumePattern ([...]\regexpp\.temp\src\validator.ts:901:14)
at RegExpValidator.validatePattern ([...]\regexpp\.temp\src\validator.ts:531:14)
at validateRegExpPattern (my-project\app.ts:12:75)
However, the regex /(?<foo>A)\k<foo>/ is valid. As stated in the proposal:
In this proposal, \k<foo> in non-Unicode RegExps will continue to match the literal string "k<foo>" unless the RegExp contains a named group, in which case it will match that group or be a syntax error, depending on whether or not the RegExp has a named group named foo.
Since the regex contains a named capturing group, \k<foo> has to be parsed as a backreference. Since Annex B doesn't say anything about named backreferences, regexpp should parse this regex even with strict: true.
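The three cases described in the proposal can be checked against the native RegExp engine. This is a minimal sketch, assuming a runtime that supports named capture groups (for example, a recent Node.js); the variable names are only illustrative.

```javascript
// No named group in the pattern: \k is treated as an identity escape,
// so the pattern matches the literal text "k<foo>".
const literalMatch = /\k<foo>/.test("k<foo>")

// A named group exists: \k<foo> is a backreference to the group "foo".
const backrefMatch = /(?<foo>A)\k<foo>/.test("AA")

// A named group exists, but \k<bar> refers to a group that does not:
// constructing the RegExp is a syntax error.
let danglingError = null
try {
    new RegExp("(?<foo>A)\\k<bar>")
} catch (error) {
    danglingError = error
}

console.log(literalMatch, backrefMatch, danglingError instanceof SyntaxError)
```

The second case is the one from this report: the native engine accepts it without the u flag, which is the behavior expected from regexpp as well.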
However, regexpp parses it as what appears to be an invalid escape and throws an error in strict mode. This is because validation is done in two passes (1, 2). The bug occurs because the n flag isn't set in the first pass, which causes the syntax error. This can be seen in the stack trace: the second-to-last line, at RegExpValidator.validatePattern ([...]\validator.ts:531:14), is the first parsing pass.
The fix for this bug is to determine whether the regex contains named groups ahead of time, similar to how the number of capturing groups is counted before parsing. I will make a PR.
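As a rough illustration of that idea, a pre-scan could look something like the sketch below. The helper name hasNamedGroup is hypothetical and not part of regexpp, and the eventual PR may well take a different approach. The sketch skips escaped characters and character classes, ignores lookbehind assertions, and does not check that the group name itself is valid.

```javascript
// Hypothetical helper, not part of regexpp: reports whether a pattern source
// appears to contain a named capturing group such as (?<name>...).
function hasNamedGroup(source) {
    let inClass = false // true while inside a [...] character class
    for (let i = 0; i < source.length; i++) {
        const ch = source[i]
        if (ch === "\\") {
            i++ // skip the escaped character, e.g. \( or \[
        } else if (inClass) {
            if (ch === "]") inClass = false
        } else if (ch === "[") {
            inClass = true
        } else if (
            ch === "(" &&
            source[i + 1] === "?" &&
            source[i + 2] === "<" &&
            source[i + 3] !== "=" && // (?<= is a lookbehind
            source[i + 3] !== "!" // (?<! is a negative lookbehind
        ) {
            return true
        }
    }
    return false
}

console.log(hasNamedGroup(/(?<foo>A)\k<foo>/.source)) // true
console.log(hasNamedGroup(/\k<foo>/.source)) // false
console.log(hasNamedGroup(/(?<=a)b/.source)) // false
```

With a result like this available before the first pass, the validator could decide up front whether \k has to be treated as a named backreference.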