Ruby supports regular expressions as a language feature. In Ruby, a regular expression is written in the form of /pattern/modifiers where "pattern" is the regular expression itself, and "modifiers" are a series of characters indicating various options. The "modifiers" part is optional. This syntax is borrowed from Perl. Ruby supports the following modifiers:
You can combine multiple modifiers by stringing them together as in /regex/is.
In Ruby, the caret and dollar always match before and after newlines. Ruby does not have a modifier to change this. Use \A and \Z to match at the start or the end of the string.
Since forward slashes delimit the regular expression, any forward slashes that appear in the regex need to be escaped. E.g. the regex 1/2 is written as /1\/2/ in Ruby.
/regex/ creates a new object of the class Regexp. You can assign it to a variable to repeatedly use the same regular expression, or use the literal regex directly. To test if a particular regex matches (part of) a string, you can either use the =~ operator, call the regexp object's match() method, e.g.: print "success" if subject =~ /regex/ or print "success" if /regex/.match(subject).
The =~ operator returns the character position in the string of the start of the match (which evaluates to true in a boolean test), or nil if no match was found (which evaluates to false). The match() method returns a MatchData object (which also evaluates to true), or nil if no matches was found. In a string context, the MatchData object evaluates to the text that was matched. So print(/\w+/.match("test")) prints "test", while print(/\w+/ =~ "test") prints "0". The first character in the string has index zero. Switching the order of the =~ operator's operands makes no difference.
The =~ operator and the match() method sets the special variables $~. This variables is thread-local and method-local. That means you can use this variable until your method exits, or until the next time you use the =~ operator in your method, without worrying that another thread or another method in your thread will overwrite them. $~ holds the same MatchData object returned by Regexp.match().
A number of other special variables are derived from the $~ variable. All of these are read-only. If you assign a new MatchData instance to $~, all of these variables will change too. $& holds the text matched by the whole regular expression. $1, $2, etc. hold the text matched by the first, second, and following capturing groups. $+ holds the text matched by the highest-numbered capturing group that actually participated in the match. $` and $' hold the text in the subject string to the left and to the right of the regex match.
Use the sub() and gsub() methods of the String class to search-and-replace the first regex match, or all regex matches, respectively, in the string. Specify the regular expression you want to search for as the first parameter, and the replacement string as the second parameter, e.g.: result = subject.gsub(/before/, "after").
To re-insert the regex match, use \0 in the replacement string. You can use the contents of capturing groups in the replacement string with backreferences \1, \2, \3, etc. Note that numbers escaped with a backslash are treated as octal escapes in double-quoted strings. Octal escapes are processed at the language level, before the sub() function sees the parameter. To prevent this, you need to escape the backslashes in double-quoted strings. So to use the first backreference as the replacement string, either pass '\1' or "\\1". '\\1' also works.
To collect all regex matches in a string into an array, pass the regexp object to the string's scan() method, e.g.: myarray = mystring.scan(/regex/). Sometimes, it is easier to create a regex to match the delimiters rather than the text you are interested in. In that case, use the split() method instead, e.g.: myarray = mystring.split(/delimiter/). The split() method discards all regex matches, returning the text between the matches. The scan() method does the opposite.
If your regular expression contains capturing groups, scan() returns an array of arrays. Each element in the overall array will contain an array consisting of the overall regex match, plus the text matched by all capturing groups.
Did this website just save you a trip to the bookstore? Please make a donation to support this site, and you'll get a lifetime of advertisement-free access to this site!
| Quick Start | Tutorial | Tools & Languages | Examples | Reference | Book Reviews |
| grep | PowerGREP | RegexBuddy | RegexMagic |
| EditPad Lite | EditPad Pro |
| MySQL | Oracle | PostgreSQL |
Page URL: http://regular-expressions.mobi/ruby.html
Page last updated: 08 December 2016
Site last updated: 06 March 2017
Copyright © 2003-2017 Jan Goyvaerts. All rights reserved.