Commit 83a42a9

Normative: add RegExp.escape
1 parent f2b2d52 commit 83a42a9

1 file changed: +57 -0 lines changed

spec.html

@@ -37833,6 +37833,63 @@ <h1>RegExp.prototype</h1>
   <p>This property has the attributes { [[Writable]]: *false*, [[Enumerable]]: *false*, [[Configurable]]: *false* }.</p>
 </emu-clause>

+<emu-clause id="sec-regexp.escape" number="2">
+  <h1>RegExp.escape ( _S_ )</h1>
+  <p>This function returns a copy of _S_ in which characters that are potentially special in a regular expression |Pattern| have been replaced by equivalent escape sequences.</p>
+  <p>It performs the following steps when called:</p>
+
+  <emu-alg>
+    1. If _S_ is not a String, throw a TypeError exception.
+    1. Let _escaped_ be the empty String.
+    1. Let _cpList_ be StringToCodePoints(_S_).
+    1. For each code point _c_ in _cpList_, do
+      1. If _escaped_ is the empty String, and _c_ is matched by |DecimalDigit| or |AsciiLetter|, then
+        1. NOTE: Escaping a leading digit ensures that output corresponds with pattern text which may be used after a `\0` character escape or a |DecimalEscape| such as `\1` and still match _S_ rather than be interpreted as an extension of the preceding escape sequence. Escaping a leading ASCII letter does the same for the context after `\c`.
+        1. Let _numericValue_ be the numeric value of _c_.
+        1. Let _hex_ be Number::toString(𝔽(_numericValue_), 16).
+        1. Assert: The length of _hex_ is 2.
+        1. Set _escaped_ to the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), *"x"*, and _hex_.
+      1. Else,
+        1. Set _escaped_ to the string-concatenation of _escaped_ and EncodeForRegExpEscape(_c_).
+    1. Return _escaped_.
+  </emu-alg>
+
+  <emu-note>
+    <p>Despite having similar names, EscapeRegExpPattern and `RegExp.escape` do not perform similar actions. The former escapes a pattern for representation as a string, while this function escapes a string for representation inside a pattern.</p>
+  </emu-note>
+</emu-clause>
+
+<emu-clause id="sec-encodeforregexpescape" type="abstract operation">
+  <h1>
+    EncodeForRegExpEscape (
+      _c_: a code point,
+    ): a String
+  </h1>
+  <dl class="header">
+    <dt>description</dt>
+    <dd>It returns a string representing a |Pattern| for matching _c_. If _c_ is white space or an ASCII punctuator, the returned value is an escape sequence. Otherwise, the returned value is a string representation of _c_ itself.</dd>
+  </dl>
+
+  <emu-alg>
+    1. If _c_ is matched by |SyntaxCharacter| or _c_ is U+002F (SOLIDUS), then
+      1. Return the string-concatenation of 0x005C (REVERSE SOLIDUS) and UTF16EncodeCodePoint(_c_).
+    1. Else if _c_ is the code point listed in some cell of the “Code Point” column of <emu-xref href="#table-controlescape-code-point-values"></emu-xref>, then
+      1. Return the string-concatenation of 0x005C (REVERSE SOLIDUS) and the string in the “ControlEscape” column of the row whose “Code Point” column contains _c_.
+    1. Let _otherPunctuators_ be the string-concatenation of *",-=<>#&!%:;@~'`"* and the code unit 0x0022 (QUOTATION MARK).
+    1. Let _toEscape_ be StringToCodePoints(_otherPunctuators_).
+    1. If _toEscape_ contains _c_, or _c_ is matched by |WhiteSpace| or |LineTerminator|, or _c_ has the same numeric value as a leading surrogate or trailing surrogate, then
+      1. If _c_ ≤ 0xFF, then
+        1. Let _hex_ be Number::toString(𝔽(_c_), 16).
+        1. Return the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), *"x"*, and StringPad(_hex_, 2, *"0"*, ~start~).
+      1. Let _escaped_ be the empty String.
+      1. Let _codeUnits_ be UTF16EncodeCodePoint(_c_).
+      1. For each code unit _cu_ of _codeUnits_, do
+        1. Set _escaped_ to the string-concatenation of _escaped_ and UnicodeEscape(_cu_).
+      1. Return _escaped_.
+    1. Return UTF16EncodeCodePoint(_c_).
+  </emu-alg>
+</emu-clause>
+
 <emu-clause oldids="sec-get-regexp-@@species" id="sec-get-regexp-%symbol.species%">
   <h1>get RegExp [ %Symbol.species% ]</h1>
   <p>`RegExp[%Symbol.species%]` is an accessor property whose set accessor function is *undefined*. Its get accessor function performs the following steps when called:</p>

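The two algorithms added in this commit can be sketched in TypeScript roughly as follows. This is a non-normative illustration, not an engine implementation. The names `regExpEscape`, `encodeForRegExpEscape`, `stringToCodePoints`, `utf16EncodeCodePoint`, and `unicodeEscape` are hypothetical stand-ins for the spec operations, and the sketch assumes that the `\s` character class matches exactly the code points of |WhiteSpace| and |LineTerminator|.

```typescript
// Non-normative sketch of RegExp.escape and EncodeForRegExpEscape.
// All function and constant names here are hypothetical, chosen for illustration.

const SYNTAX_CHARACTERS = "^$\\.*+?()[]{}|"; // |SyntaxCharacter|
const OTHER_PUNCTUATORS = ",-=<>#&!%:;@~'`\""; // _otherPunctuators_
// ControlEscape table: code point -> escape letter.
const CONTROL_ESCAPES: { [cp: number]: string } = {
  0x09: "t", 0x0a: "n", 0x0b: "v", 0x0c: "f", 0x0d: "r",
};

// StringToCodePoints: combine surrogate pairs, keep lone surrogates as-is.
function stringToCodePoints(s: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < s.length; i++) {
    const lead = s.charCodeAt(i);
    if (lead >= 0xd800 && lead <= 0xdbff && i + 1 < s.length) {
      const trail = s.charCodeAt(i + 1);
      if (trail >= 0xdc00 && trail <= 0xdfff) {
        out.push((lead - 0xd800) * 0x400 + (trail - 0xdc00) + 0x10000);
        i++;
        continue;
      }
    }
    out.push(lead);
  }
  return out;
}

// UTF16EncodeCodePoint: one code unit for the BMP, a surrogate pair otherwise.
function utf16EncodeCodePoint(c: number): string {
  if (c <= 0xffff) return String.fromCharCode(c);
  const offset = c - 0x10000;
  return String.fromCharCode(0xd800 + (offset >> 10), 0xdc00 + (offset & 0x3ff));
}

// UnicodeEscape: "\u" followed by four lowercase hex digits.
function unicodeEscape(cu: number): string {
  return "\\u" + ("000" + cu.toString(16)).slice(-4);
}

function encodeForRegExpEscape(c: number): string {
  const ch = utf16EncodeCodePoint(c);
  if (SYNTAX_CHARACTERS.indexOf(ch) !== -1 || ch === "/") return "\\" + ch;
  const control = CONTROL_ESCAPES[c];
  if (control !== undefined) return "\\" + control;
  const isSurrogate = c >= 0xd800 && c <= 0xdfff;
  // Assumption: \s covers exactly WhiteSpace plus LineTerminator.
  if (OTHER_PUNCTUATORS.indexOf(ch) !== -1 || /^\s$/.test(ch) || isSurrogate) {
    if (c <= 0xff) return "\\x" + ("0" + c.toString(16)).slice(-2);
    let escaped = "";
    for (let i = 0; i < ch.length; i++) escaped += unicodeEscape(ch.charCodeAt(i));
    return escaped;
  }
  return ch;
}

function regExpEscape(s: unknown): string {
  if (typeof s !== "string") throw new TypeError("argument must be a string");
  let escaped = "";
  stringToCodePoints(s).forEach((c) => {
    const isDigit = c >= 0x30 && c <= 0x39;
    const isAsciiLetter = (c >= 0x41 && c <= 0x5a) || (c >= 0x61 && c <= 0x7a);
    if (escaped === "" && (isDigit || isAsciiLetter)) {
      // A leading digit or ASCII letter always yields exactly two hex digits.
      escaped = "\\x" + c.toString(16);
    } else {
      escaped += encodeForRegExpEscape(c);
    }
  });
  return escaped;
}

// Usage: the escaped text matches the original string literally.
const literalDot = new RegExp("^" + regExpEscape("a.b") + "$");
console.log(regExpEscape("a.b")); // \x61\.b
console.log(literalDot.test("a.b"), literalDot.test("axb")); // true false
```

The leading `\x61` in the output reflects the NOTE in the algorithm: only the first code point is hex-escaped when it is a digit or ASCII letter, so the result can safely follow `\0`, `\1`, or `\c` in a larger pattern.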