Commit d213fba

Add support for $+ to pcre2_substitute (#817)
1 parent c742166 commit d213fba

8 files changed, +367 -231 lines changed

doc/html/pcre2api.html

Lines changed: 1 addition & 0 deletions
@@ -3957,6 +3957,7 @@ <h2><a name="SEC38" href="#TOC1">CREATING A NEW STRING WITH SUBSTITUTIONS</a></h
   $`                  insert the substring that precedes the match
   $'                  insert the substring that follows the match
   $_                  insert the entire input string
+  $+                  insert the highest-numbered capture group which matched
   $*MARK or ${*MARK}  insert a control verb name
 </pre>
 Either a group number or a group name can be given for <i>n</i>, for example $2

doc/html/pcre2syntax.html

Lines changed: 1 addition & 0 deletions
@@ -722,6 +722,7 @@ <h2><a name="SEC32" href="#TOC1">REPLACEMENT STRINGS</a></h2>
   $`                  insert the substring that precedes the match
   $'                  insert the substring that follows the match
   $_                  insert the entire input string
+  $+                  insert the highest-numbered capture group which matched
   $*MARK or ${*MARK}  insert a control verb name
 </pre>
 For ${n}, n can be a name or a number. If PCRE2_SUBSTITUTE_EXTENDED is set,

doc/pcre2.txt

Lines changed: 233 additions & 229 deletions
Large diffs are not rendered by default.

doc/pcre2api.3

Lines changed: 1 addition & 0 deletions
@@ -3956,6 +3956,7 @@ recognized:
   $`                  insert the substring that precedes the match
   $'                  insert the substring that follows the match
   $_                  insert the entire input string
+  $+                  insert the highest-numbered capture group which matched
   $*MARK or ${*MARK}  insert a control verb name
 .sp
 Either a group number or a group name can be given for \fIn\fP, for example $2

doc/pcre2syntax.3

Lines changed: 1 addition & 0 deletions
@@ -703,6 +703,7 @@ special character is the dollar character in one of the following forms:
   $`                  insert the substring that precedes the match
   $'                  insert the substring that follows the match
   $_                  insert the entire input string
+  $+                  insert the highest-numbered capture group which matched
   $*MARK or ${*MARK}  insert a control verb name
 .sp
 For ${n}, n can be a name or a number. If PCRE2_SUBSTITUTE_EXTENDED is set,

src/pcre2_substitute.c

Lines changed: 38 additions & 0 deletions
@@ -1143,6 +1143,44 @@ for (;;)
         subptrend = subject + length;
         goto SUBPTR_SUBSTITUTE;
         }
+      else if (next == CHAR_PLUS &&
+               !(ptr+1 < repend && ptr[1] == CHAR_LEFT_CURLY_BRACKET))
+        {
+        /* Perl supports $+ for "highest captured group" (not the same as $^N
+        which is mainly only useful inside Perl's match callbacks). We also
+        don't accept "$+{..." since that's Perl syntax for our ${name}. */
+        ++ptr;
+        if (code->top_bracket == 0)
+          {
+          /* Treat either as "no such group" or "all groups unset" based on the
+          PCRE2_SUBSTITUTE_UNKNOWN_UNSET option. */
+          if ((suboptions & PCRE2_SUBSTITUTE_UNKNOWN_UNSET) == 0)
+            {
+            rc = PCRE2_ERROR_NOSUBSTRING;
+            goto PTREXIT;
+            }
+          group = 0;
+          }
+        else
+          {
+          /* If we have any capture groups, then the ovector needs to be large
+          enough for all of them, or the result won't be accurate. */
+          if (match_data->oveccount < code->top_bracket + 1)
+            {
+            rc = PCRE2_ERROR_UNAVAILABLE;
+            goto PTREXIT;
+            }
+          for (group = code->top_bracket; group > 0; group--)
+            if (ovector[2*group] != PCRE2_UNSET) break;
+          }
+        if (group == 0)
+          {
+          if ((suboptions & PCRE2_SUBSTITUTE_UNSET_EMPTY) != 0) continue;
+          rc = PCRE2_ERROR_UNSET;
+          goto PTREXIT;
+          }
+        goto GROUP_SUBSTITUTE;
+        }
 
       if (next == CHAR_LEFT_CURLY_BRACKET)
         {
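
The backward scan in the hunk above can be tried in isolation. The following is a minimal sketch, not code from the commit: `sketch_highest_set_group` and `SKETCH_UNSET` are hypothetical names, and `SKETCH_UNSET` stands in for `PCRE2_UNSET` on the assumption that it is the all-ones size value. The ovector is modelled as a plain array of (start, end) offset pairs, with pair 0 holding the whole match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PCRE2_UNSET (assumed here to be the all-ones size value). */
#define SKETCH_UNSET (~(size_t)0)

/* Hypothetical helper, not part of PCRE2. Returns the highest-numbered group
   whose start offset is set, or 0 when none of groups 1..top_bracket took
   part in the match. Group g starts at ovector[2*g]. */
uint32_t sketch_highest_set_group(const size_t *ovector, uint32_t top_bracket)
{
  uint32_t group;
  assert(ovector != NULL);
  /* Walk down from the last group and stop at the first one that is set. */
  for (group = top_bracket; group > 0; group--)
    if (ovector[2 * group] != SKETCH_UNSET) break;
  return group;
}
```

For /A(.)|B(.)/ matched against "BMZ", group 1 is unset and group 2 holds "M", so the scan stops at 2. For /A|(B)/ matched against "A" it falls through to 0, which is the case the commit routes to PCRE2_ERROR_UNSET or to an empty insertion.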

testdata/testinput2

Lines changed: 28 additions & 0 deletions
@@ -5010,6 +5010,34 @@ B)x/alt_verbnames,mark
     LabcR\=replace=>$'<
     LabcR\=replace=>$_<
 
+/A(.)|B(.)/
+    AMZ\=replace=>$+<
+    BMZ\=replace=>$+<
+    AMZ\=replace=>$+
+    AMZ\=replace=>$+{name}
+    BMZ\=ovector=2
+    BMZ\=ovector=3
+    BMZ\=ovector=2,replace=>$+<
+    BMZ\=ovector=3,replace=>$+<
+    BMZ\=ovector=2,substitute_matched,replace=>$+<
+    BMZ\=ovector=3,substitute_matched,replace=>$+<
+
+/A|(B)/
+    B\=replace=>$1<
+    A\=replace=>$1<
+    B\=substitute_unset_empty,replace=>$1<
+    A\=substitute_unset_empty,replace=>$1<
+    B\=replace=>$+<
+    A\=replace=>$+<
+    B\=substitute_unset_empty,replace=>$+<
+    A\=substitute_unset_empty,replace=>$+<
+
+/ABC/
+    ABC\=replace=$+z
+    ABC\=substitute_unknown_unset,replace=$+z
+    ABC\=substitute_unset_empty,replace=$+z
+    ABC\=substitute_unknown_unset,substitute_unset_empty,replace=$+z
+
 /^(o(\1{72}{\"{\\{00000059079}\d*){74}}){19}/I
 
 /((p(?'K/

testdata/testoutput2

Lines changed: 64 additions & 2 deletions
@@ -14154,8 +14154,8 @@ Failed: error 119 at offset 3: parentheses are too deeply nested
 
 /abc/replace=a$++
     123abc
-Failed: error -35 at offset 2 in replacement: invalid replacement string
-        here: a$ |<--| ++
+Failed: error -49 at offset 3 in replacement: unknown substring
+        here: a$+ |<--| +
 
 /abc/replace=a$bad
     123abc
@@ -15980,6 +15980,68 @@ Failed: error -49 at offset 3 in replacement: unknown substring
     LabcR\=replace=>$_<
  1: L>LabcR<R
 
+/A(.)|B(.)/
+    AMZ\=replace=>$+<
+ 1: >M<Z
+    BMZ\=replace=>$+<
+ 1: >M<Z
+    AMZ\=replace=>$+
+ 1: >MZ
+    AMZ\=replace=>$+{name}
+Failed: error -35 at offset 2 in replacement: invalid replacement string
+        here: >$ |<--| +{name}
+    BMZ\=ovector=2
+Matched, but too many substrings
+ 0: BM
+ 1: <unset>
+    BMZ\=ovector=3
+ 0: BM
+ 1: <unset>
+ 2: M
+    BMZ\=ovector=2,replace=>$+<
+Failed: error -54 at offset 3 in replacement: requested value is not available
+        here: >$+ |<--| <
+    BMZ\=ovector=3,replace=>$+<
+ 1: >M<Z
+    BMZ\=ovector=2,substitute_matched,replace=>$+<
+Failed: error -54 at offset 3 in replacement: requested value is not available
+        here: >$+ |<--| <
+    BMZ\=ovector=3,substitute_matched,replace=>$+<
+ 1: >M<Z
+
+/A|(B)/
+    B\=replace=>$1<
+ 1: >B<
+    A\=replace=>$1<
+Failed: error -55 at offset 3 in replacement: requested value is not set
+        here: >$1 |<--| <
+    B\=substitute_unset_empty,replace=>$1<
+ 1: >B<
+    A\=substitute_unset_empty,replace=>$1<
+ 1: ><
+    B\=replace=>$+<
+ 1: >B<
+    A\=replace=>$+<
+Failed: error -55 at offset 3 in replacement: requested value is not set
+        here: >$+ |<--| <
+    B\=substitute_unset_empty,replace=>$+<
+ 1: >B<
+    A\=substitute_unset_empty,replace=>$+<
+ 1: ><
+
+/ABC/
+    ABC\=replace=$+z
+Failed: error -49 at offset 2 in replacement: unknown substring
+        here: $+ |<--| z
+    ABC\=substitute_unknown_unset,replace=$+z
+Failed: error -55 at offset 2 in replacement: requested value is not set
+        here: $+ |<--| z
+    ABC\=substitute_unset_empty,replace=$+z
+Failed: error -49 at offset 2 in replacement: unknown substring
+        here: $+ |<--| z
+    ABC\=substitute_unknown_unset,substitute_unset_empty,replace=$+z
+ 1: z
+
 /^(o(\1{72}{\"{\\{00000059079}\d*){74}}){19}/I
 Capture group count = 2
 Max back reference = 1
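
The error numbers in the expected output above follow from how the two substitute options interact. The sketch below is one way to model that decision, under stated assumptions: `sketch_plus_outcome` and the `SKETCH_*` constants are hypothetical names, not PCRE2 identifiers, and the numeric values are copied from the test output rather than from the PCRE2 headers. `highest_set` is the highest-numbered group that matched, or 0 if none did.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Outcome codes for the sketch. The negative values mirror the error numbers
   printed in the test output; the names are hypothetical. */
enum
{
  SKETCH_INSERT_GROUP = 0,    /* insert the text of group highest_set */
  SKETCH_INSERT_EMPTY = 1,    /* insert nothing and carry on */
  SKETCH_NOSUBSTRING  = -49,  /* "unknown substring" */
  SKETCH_UNAVAILABLE  = -54,  /* "requested value is not available" */
  SKETCH_UNSET_ERROR  = -55   /* "requested value is not set" */
};

/* Hypothetical model of what "$+" resolves to. top_bracket is the number of
   capture groups in the pattern; oveccount is the number of offset pairs. */
int sketch_plus_outcome(uint32_t top_bracket, uint32_t oveccount,
                        uint32_t highest_set, bool unknown_unset,
                        bool unset_empty)
{
  assert(highest_set <= top_bracket);
  if (top_bracket == 0)
    {
    /* No groups at all: "no such group" unless the caller asked for unknown
       groups to be treated as unset. */
    if (!unknown_unset) return SKETCH_NOSUBSTRING;
    highest_set = 0;
    }
  else if (oveccount < top_bracket + 1)
    {
    /* The ovector cannot describe every group, so the answer is unreliable. */
    return SKETCH_UNAVAILABLE;
    }
  if (highest_set == 0)
    return unset_empty ? SKETCH_INSERT_EMPTY : SKETCH_UNSET_ERROR;
  return SKETCH_INSERT_GROUP;
}
```

Read against the /ABC/ cases, this shows why substitute_unset_empty alone still fails with -49: the "no such group" check runs before the unset-group check, so only the combination of both options yields the plain `z`.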
