Description
Java, Python, and PCRE all restrict what is allowed to appear in a lookbehind assertion. These restrictions are similar, but with some differences.
-
In Python, a lookbehind must have a fixed length and may only contain repetitions like
{n}
.\X
is forbidden in a lookbehind. -
In PCRE, a lookbehind may contain alternatives with different lengths, which can even be nested, as well as repetitions with an upper bound. As in Python,
\X
and unbounded repetitions are forbidden -
Java has the most complicated rules. It allows arbitrary repetition in a lookbehind; however, every repetition must have a fixed length, and may not include alternations or
?
even when they have the same length.?
/{0,1}
is not treated as a repetition by Java.Java treats repeated groups (
(a)+
) differently than other kinds of repetitions (a+
,[a]+
). Infinite repetitions of groups must satisfy the following constraints, where$n$ is the number of code points matched by the group:-
when
$n$ is odd:- it must be preceded by an odd number
$r$ of infinite repetitions, or - the part before the repeated group must match
$c$ code points (excluding infinite repetitions), so that$c - r < n$
- it must be preceded by an odd number
-
when
$n$ is even:- it must be preceded by an odd number
$r$ of infinite repetitions, and - the part before the repeated group must match
$c$ code points (excluding infinite repetitions), so that$c - r \in [0; n)$
- it must be preceded by an odd number
These rules might not be comprehensive, but I haven't found an exception so far.
-
Describe the bug
Pomsky allows anything in a lookbehind.
Expected behavior
A compatibility error should be produced.