The table below compares the tokens that the tools and languages discussed on this website recognize in the replacement text during search-and-replace operations.
The list of replacement text flavors is not the same as the list of regular expression flavors in the regex features comparison. The reason is that replacements are made not by the regular expression engine, but by the tool or programming library providing the search-and-replace capability. As a result, tools or languages using the same regex engine may behave differently when making replacements. E.g. the PCRE library does not provide a search-and-replace function. All tools and languages implementing PCRE use their own search-and-replace feature, which may result in differences in the replacement text syntax. So these are listed separately.
To make the table easier to read, I grouped tools and languages that use the exact same replacement text syntax. The labels for the replacement text flavors are only relevant in the table below. E.g. the .NET framework does have a built-in search-and-replace function in its Regex class, which is used by all tools and languages based on the .NET framework. So these are listed together under ".NET".
Note that the escape rules below only refer to the replacement text syntax. If you type the replacement text in an input box in the application you're using, or if you retrieve the replacement text from user input in the software you're developing, these are the only escape rules that apply. If you pass the replacement text as a literal string in programming language source code, you'll need to apply the language's string escape rules on top of the replacement text escape rules. E.g. for languages that require backslashes in string literals to be escaped, you'll need to use "\\1" instead of "\1" to get the first backreference.
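As a small illustration of the two layers of escaping, here is a sketch using Python's `re.sub`. The pattern, subject string, and variable names are made up for this example. It shows the same replacement text written as a raw string, as a regular string literal with doubled backslashes, and (incorrectly) as a regular string literal without doubling.

```python
import re

# Hypothetical pattern and subject, chosen only for illustration.
pattern = r"(\w+)=(\w+)"
subject = "key=value"

# Raw string literal: the replacement function receives the characters \2=\1.
swapped_raw = re.sub(pattern, r"\2=\1", subject)

# Regular string literal: each backslash is doubled for the string syntax,
# so the replacement function again receives \2=\1.
swapped_escaped = re.sub(pattern, "\\2=\\1", subject)

# Regular string literal without doubling: Python's string syntax turns
# "\2" and "\1" into the control characters U+0002 and U+0001 before
# re.sub ever sees them, so no backreference is substituted.
unescaped = re.sub(pattern, "\2=\1", subject)

print(repr(swapped_raw))      # 'value=key'
print(repr(swapped_escaped))  # 'value=key'
print(repr(unescaped))        # '\x02=\x01'
```

Replacement text read from an input box or a file never passes through the string literal layer, so only the first form's rules apply to it.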
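A flavor can have four levels of support (or non-support) for a particular token, labeled in the table below as "YES" (the token is substituted), "no" (the token stays in the replacement as literal text), "string" (the syntax is handled by the language's string literals rather than by the replacement function), and "error" (the token triggers an error or exception).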
| Syntax Using Backslashes | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Feature | JGsoft | .NET | Java | Perl | ECMA | Python | Ruby | Tcl | PHP ereg | PHP preg | REALbasic | Oracle | Postgres | XPath | R |
| \& (whole regex match) | YES | no | no | no | no | no | YES | no | no | no | no | no | YES | error | no |
| \0 (whole regex match) | YES | no | no | no | no | no | YES | YES | YES | YES | YES | no | no | error | no |
| \1 through \9 (backreference) | YES | no | no | depr. | no | YES | YES | YES | YES | YES | YES | YES | YES | error | YES |
| \10 through \99 (backreference) | YES | no | no | no | no | YES | no | no | no | YES | YES | no | no | error | no |
| \10 through \99 treated as \1 through \9 (and a literal digit) if fewer than 10 groups | YES | n/a | n/a | n/a | n/a | no | n/a | n/a | n/a | no | no | n/a | n/a | error | n/a |
| \g<group> (named backreference) | YES | no | no | no | no | YES | no | no | no | no | no | no | no | error | no |
| \` (backtick; subject text to the left of the match) | YES | no | no | no | no | no | YES | no | no | no | no | no | no | error | no |
| \' (straight quote; subject text to the right of the match) | YES | no | no | no | no | no | YES | no | no | no | no | no | no | error | no |
| \+ (highest-numbered participating group) | YES | no | no | no | no | no | YES | no | no | no | no | no | no | error | no |
| Backslash escapes one backslash and/or dollar | YES | no | YES | YES | no | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
| Unescaped backslash as literal text | YES | YES | no | YES | YES | YES | YES | YES | YES | YES | no | YES | YES | error | no |
| Character Escapes | |||||||||||||||
| Feature | JGsoft | .NET | Java | Perl | ECMA | Python | Ruby | Tcl | PHP ereg | PHP preg | REALbasic | Oracle | Postgres | XPath | R |
| \u0000 through \uFFFF (Unicode character) | YES | no | string | no | string | u string | no | string | no | no | no | no | no | error | string |
| \x{0} through \x{FFFF} (Unicode character) | YES | no | no | string | no | no | no | no | no | no | no | no | no | error | no |
| \x00 through \xFF (ASCII character) | YES | no | no | string | string | non-r string | no | string | string | string | YES | no | no | error | string |
| Syntax Using Dollar Signs | |||||||||||||||
| Feature | JGsoft | .NET | Java | Perl | ECMA | Python | Ruby | Tcl | PHP ereg | PHP preg | REALbasic | Oracle | Postgres | XPath | R |
| $& (whole regex match) | YES | YES | error | YES | YES | no | no | no | no | no | YES | no | no | error | no |
| $0 (whole regex match) | YES | YES | YES | no | no | no | no | no | no | YES | YES | no | no | YES | no |
| $1 through $9 (backreference) | YES | YES | YES | YES | YES | no | no | no | no | YES | YES | no | no | YES | no |
| $10 through $99 (backreference) | YES | YES | YES | YES | YES | no | no | no | no | YES | YES | no | no | YES | no |
| $10 through $99 treated as $1 through $9 (and a literal digit) if fewer than 10 groups | YES | no | YES | no | YES | n/a | n/a | n/a | n/a | no | no | n/a | n/a | YES | n/a |
| ${1} through ${99} (backreference) | YES | YES | error | YES | no | no | no | no | no | YES | no | no | no | error | no |
| ${group} (named backreference) | YES | YES | error | no | no | no | no | no | no | no | no | no | no | error | no |
| $` (backtick; subject text to the left of the match) | YES | YES | error | YES | YES | no | no | no | no | no | YES | no | no | error | no |
| $' (straight quote; subject text to the right of the match) | YES | YES | error | YES | YES | no | no | no | no | no | YES | no | no | error | no |
| $_ (entire subject string) | YES | YES | error | YES | IE only | no | no | no | no | no | no | no | no | error | no |
| $+ (highest-numbered participating group) | YES | no | error | YES | no | no | no | no | no | no | no | no | no | error | no |
| $+ (highest-numbered group in the regex) | no | YES | error | no | IE and Firefox | no | no | no | no | no | no | no | no | error | no |
| $$ (escape dollar with another dollar) | YES | YES | error | no | YES | no | no | no | no | no | no | no | no | error | no |
| $ (unescaped dollar as literal text) | YES | YES | error | no | YES | YES | YES | YES | YES | YES | buggy | YES | YES | error | YES |
| Tokens Without a Backslash or Dollar | |||||||||||||||
| Feature | JGsoft | .NET | Java | Perl | ECMA | Python | Ruby | Tcl | PHP ereg | PHP preg | REALbasic | Oracle | Postgres | XPath | R |
| & (whole regex match) | no | no | no | no | no | no | no | YES | no | no | no | no | no | no | no |
| General Replacement Text Behavior | |||||||||||||||
| Feature | JGsoft | .NET | Java | Perl | ECMA | Python | Ruby | Tcl | PHP ereg | PHP preg | REALbasic | Oracle | Postgres | XPath | R |
| Backreferences to non-existent groups are silently removed | YES | no | error | YES | no | error | YES | YES | no | YES | YES | YES | YES | YES | error |
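To make one column of the table concrete, the following sketch exercises a few of the Python entries with `re.sub`. The date pattern, subject string, and variable names are invented for this example, and it only covers the rows it names: numbered and named backreferences, dollar tokens as literal text, an escaped backslash, and a backreference to a non-existent group.

```python
import re

# Hypothetical subject and pattern, chosen only for illustration.
subject = "2010-09-03"
pattern = r"(?P<year>\d{4})-(\d{2})-(\d{2})"

# \1 through \9 and \g<group> are substituted.
reordered = re.sub(pattern, r"\3/\2/\g<year>", subject)

# Dollar syntax is not recognized: $1 and $& remain literal text.
dollar = re.sub(pattern, r"$1 $&", subject)

# A backslash escapes a backslash: \\ in the replacement text yields one \.
backslash = re.sub(pattern, r"a\\b", subject)

# A backreference to a group that does not exist raises an error
# instead of being silently removed.
try:
    re.sub(pattern, r"\7", subject)
    missing_group_error = None
except re.error as exc:
    missing_group_error = exc

print(reordered)   # 03/09/2010
print(dollar)      # $1 $&
print(backslash)   # a\b
print(type(missing_group_error).__name__)
```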
The $+ token is listed twice, because it doesn't have the same meaning in the languages that support it. It was introduced in Perl, where the $+ variable holds the text matched by the highest-numbered capturing group that actually participated in the match. In several languages and libraries that intended to copy this feature, such as .NET and JavaScript, $+ is replaced with the highest-numbered capturing group, whether it participated in the match or not.
E.g. in the regex a(\d)|x(\w) the highest-numbered capturing group is the second one. When this regex matches a4, the first capturing group matches 4, while the second group doesn't participate in the match attempt at all. In Perl, $+ will hold the 4 matched by the first capturing group, which is the highest-numbered group that actually participated in the match. In .NET or JavaScript, $+ will be substituted with nothing, since the highest-numbered group in the regex didn't capture anything. When the same regex matches xy, Perl, .NET and JavaScript will all store y in $+.
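The difference between the two meanings can be emulated in Python, with the caveat that Python's replacement syntax has no $+ token at all. The two helper functions below are hypothetical and only mimic the semantics described above. They are not how Perl, .NET, or JavaScript implement the token.

```python
import re

regex = re.compile(r"a(\d)|x(\w)")

def perl_style_plus(match):
    """Emulate Perl's $+: the highest-numbered group that participated."""
    for number in range(match.re.groups, 0, -1):
        if match.group(number) is not None:
            return match.group(number)
    return ""

def dotnet_style_plus(match):
    """Emulate the .NET/JavaScript meaning: the highest-numbered group in
    the regex, which is empty when that group did not participate."""
    return match.group(match.re.groups) or ""

for subject in ("a4", "xy"):
    m = regex.fullmatch(subject)
    print(subject, repr(perl_style_plus(m)), repr(dotnet_style_plus(m)))
# a4 '4' ''
# xy 'y' 'y'
```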
Also note that .NET numbers named capturing groups after all non-named groups. This means that in .NET, $+ will always be substituted with the text matched by the last named group in the regex, whether it is followed by non-named groups or not, and whether it actually participated in the match or not.
Page URL: http://regular-expressions.mobi/refreplace.html
Page last updated: 03 September 2010
Site last updated: 02 December 2010
Copyright © 2003-2012 Jan Goyvaerts. All rights reserved.