In Perl:
$foo =~ m/^(F|f)oo\s*(B|b)ar$/
$foo =~ s/foo bar/bas bat/
$foo =~ tr/aeiou/AEIOU/
Does not match:
$bar !~ m/foobar/i
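The operators above can be tried in a small script. This is a minimal sketch: the sample strings and variable names ($text, $vowels, and so on) are made up for illustration and are not from the original examples.

```perl
use strict;
use warnings;

my $foo = "Foo   Bar";

# m// tests for a match: "foo", optional whitespace, "bar",
# with either case allowed for the first letter of each word.
my $matched = ($foo =~ m/^(F|f)oo\s*(B|b)ar$/) ? 1 : 0;

# s/// replaces the first occurrence of the pattern in place.
my $text = "foo bar and foo bar";
$text =~ s/foo bar/bas bat/;

# tr/// maps single characters one-to-one. Brackets and commas are not
# needed; they would simply be treated as literal characters.
my $vowels = "education";
(my $upper = $vowels) =~ tr/aeiou/AEIOU/;

# !~ is true when the pattern does NOT match.
my $bar = "FooBaz";
my $no_match = ($bar !~ m/foobar/i) ? 1 : 0;

print "$matched\n$text\n$upper\n$no_match\n";
```

Note that s/// without the g modifier changes only the first "foo bar" in $text.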
*       Zero or more
+       One or more
?       Zero or one
{7}     Exactly seven
{3,}    Three or more
{2,5}   Two, three, four, or five
Putting a question mark after the repetition (like x*?) makes it non-greedy.
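The difference between greedy and non-greedy repetition is easiest to see side by side. The sketch below uses a made-up HTML-like string; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .* grabs as much as it can, so the match runs to the last ">".
my ($greedy) = $html =~ m/(<.*>)/;

# Non-greedy: .*? stops at the first ">" that lets the match succeed.
my ($lazy) = $html =~ m/(<.*?>)/;

# Counted repetition: exactly four digits.
my ($year) = "Released in 2019." =~ m/(\d{4})/;

print "$greedy\n$lazy\n$year\n";
```

Here $greedy holds the whole string, while $lazy holds only "<b>".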
^     Match beginning of line or string
\A    Start of string
$     Match end of line or string
\Z    End of string (or before a final newline)
\b    Match word boundary
\B    Non-word boundary
\<    Start of word (GNU grep and Vim; Perl uses \b)
\>    End of word (GNU grep and Vim; Perl uses \b)
\w    Word (alphanumeric plus “_”)
\W    Non-word
\s    Whitespace
\S    Non-whitespace
\d    Digit
\D    Non-digit
\c    Control character (e.g. \cC for Ctrl-C)
\x    Hex character escape (e.g. \x41)
\0    Octal character escape (e.g. \012)
\n    Newline
\r    Carriage return
\t    Tab
\v    Vertical tab
\f    Formfeed
\a    Alarm (bell, beep)
\e    Escape
\     Escape next character (e.g. \^ for literal caret rather than line start)
\Q    Begin sequence of literals
\E    End literal sequence
Example: An i at the end of the expression makes it case insensitive:
$bar =~ m/foobar/i
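A few of the escapes listed above (\b, \d, \s, \w, and \Q...\E) can be exercised as follows. This is only a sketch; the sample strings and variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

# \b: match "cat" only as a whole word.
my $whole_word  = ("concatenate a cat" =~ m/\bcat\b/) ? 1 : 0;
my $inside_word = ("concatenate"       =~ m/\bcat\b/) ? 1 : 0;

# \d, \s and \w: pull a number and the word after it out of a string.
my ($count, $unit) = "Shipped 42 boxes" =~ m/(\d+)\s+(\w+)/;

# \Q...\E: treat an interpolated string literally, so "." and "+"
# lose their special meaning.
my $literal = "a.b+c";
my $quoted_hit  = ("xa.b+cx" =~ m/\Q$literal\E/) ? 1 : 0;
my $quoted_miss = ("xaXbbcx" =~ m/\Q$literal\E/) ? 1 : 0;

# Without \Q, "a.b+c" is a pattern: "a", any character, one or more "b", "c".
my $unquoted_hit = ("xaXbbcx" =~ m/$literal/) ? 1 : 0;

print "$whole_word $inside_word $count $unit ",
      "$quoted_hit $quoted_miss $unquoted_hit\n";
```

The \Q...\E pair is most useful when the text to match comes from user input and should not be interpreted as a pattern.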
g     Global (match all)
m     Multi-line (^ and $ match at the start and end of each line, not just at the edges of the string)
s     Single string (. matches anything, including newlines)
x     Extended: ignore whitespace in the pattern unless it’s backslashed or inside brackets, and permit comments (allows writing the regex itself in a more readable format, with line breaks)
a     ASCII-safe matching against Unicode (\d, \s, \w and the POSIX classes match only ASCII)
If you wanted to ignore case for only part of a regular expression:
/(?i)foobar(?-i)BaT/
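The effect of each modifier, and of the inline (?i) ... (?-i) form, can be checked with a short script. This is a sketch with made-up sample text, not a complete tour of the modifiers.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line";

# Without /m, ^ only matches at the very start of the string.
my $plain = ($text =~ /^second/) ? 1 : 0;
# With /m, ^ also matches right after each newline.
my $multi = ($text =~ /^second/m) ? 1 : 0;

# Without /s, "." will not cross the newline; with /s it will.
my $dot_plain  = ($text =~ /first.*second/)  ? 1 : 0;
my $dot_single = ($text =~ /first.*second/s) ? 1 : 0;

# /g in list context returns every match.
my @words = $text =~ /(\w+)/g;

# /x lets the pattern carry whitespace and comments.
my ($hours, $minutes) = "at 09:45" =~ /
    (\d{2})   # hours
    :
    (\d{2})   # minutes
/x;

# (?i) ... (?-i): only "foobar" is case-insensitive, "BaT" is not.
my $partial_ok  = ("FOOBARBaT" =~ /(?i)foobar(?-i)BaT/) ? 1 : 0;
my $partial_bad = ("FOOBARbat" =~ /(?i)foobar(?-i)BaT/) ? 1 : 0;

print "$plain $multi $dot_plain $dot_single ", scalar(@words),
      " ${hours}:${minutes} $partial_ok $partial_bad\n";
```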
if ($string =~ m/John (Smith|Smyth|Psmith)/) {
print "I found John!\n"
}
.           Any character except \n
(foo|bar)   foo or bar
(?:foo)     Non-capturing group
[xyz]       x or y or z (single character)
[^xyz]      NOT x or y or z
[a-f]       Single character in range a through f
Example: If we want to match “All the king's horses” but not the escaped “All the king''s horses” (doubled single quote), we combine negated character classes with a negative lookahead to match one single quote but not two:
^[^']*'(?!')[^']*$
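To see the single-quote pattern at work, the sketch below anchors it with ^ and $ so that the whole string is tested. Without anchors, a search can still succeed on the doubled-quote string by starting at the second quote, which the last check demonstrates. The variable names are illustrative.

```perl
use strict;
use warnings;

# Anchored form: non-quote characters, one single quote that is not
# followed by another quote, then more non-quote characters.
my $one_quote = qr/^[^']*'(?!')[^']*$/;

my $single  = "All the king's horses";
my $doubled = "All the king''s horses";

my $single_ok  = ($single  =~ $one_quote) ? 1 : 0;
my $doubled_ok = ($doubled =~ $one_quote) ? 1 : 0;

# Unanchored, the search can begin at the second quote and succeed.
my $unanchored = ($doubled =~ /[^']*'(?!')[^']*/) ? 1 : 0;

print "$single_ok $doubled_ok $unanchored\n";
```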
Grouping with parens is also the way to capture matches (group $1, $2, etc.). This can also be used for backreferences, like:
s/(November) 3rd/\1 4th/g
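A short sketch of captures and backreferences in Perl follows. One detail worth knowing: in Perl the replacement side of s/// is normally written with $1, while \1 is the form used inside the pattern itself (and in tools such as sed). The sample strings are made up for illustration.

```perl
use strict;
use warnings;

my $note = "Due November 3rd, not November 3rd of next year";

# Reuse the captured month in the replacement text via $1.
(my $fixed = $note) =~ s/(November) 3rd/$1 4th/g;

# \1 inside the pattern: find a word that is immediately repeated.
my ($dup) = "it was the the best" =~ /\b(\w+) \1\b/;

# After a successful match, $1 and $2 hold the groups and $& the whole match.
my ($month, $day, $whole);
if ("November 3rd" =~ /(\w+) (\d+)/) {
    ($month, $day, $whole) = ($1, $2, $&);
}

print "$fixed\n$dup\n$month $day ($whole)\n";
```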
$1, $2, $3   First, second, third matches
$+           Last/final match
$&           The entire match
$`           Before match
$'           After match
?=           Positive lookahead
?!           Negative lookahead
?<=          Positive lookbehind
?<!          Negative lookbehind
?>           Once-only sub-expression
?()          Conditional if-then
?()|         Conditional if-then-else
?#           Comment
A regex with positive lookahead matches something followed by something else.
foo(?=t).*
matches “football” but not “foobar”.
A regex with negative lookahead matches something not followed by something else.
foo(?!t).*
matches “foobar” but not “football”.
Lookbehind works the same way, with (?<=foot)ball (“ball” preceded by “foot”) and (?<!wrecking)ball (“ball” not preceded by “wrecking”).
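The lookahead and lookbehind examples above can be verified directly. The sketch below also shows that a lookaround is zero-width: the text it checks is not part of the match. The test words "basketball" and "wreckingball" are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Positive lookahead: "foo" must be followed by "t".
my $la_pos_hit  = ("football" =~ /foo(?=t).*/) ? 1 : 0;
my $la_pos_miss = ("foobar"   =~ /foo(?=t).*/) ? 1 : 0;

# Negative lookahead: "foo" must NOT be followed by "t".
my $la_neg_hit  = ("foobar"   =~ /foo(?!t).*/) ? 1 : 0;
my $la_neg_miss = ("football" =~ /foo(?!t).*/) ? 1 : 0;

# Lookbehind: what precedes "ball" is checked but not consumed.
my $lb_pos      = ("football"     =~ /(?<=foot)ball/)     ? 1 : 0;
my $lb_neg_hit  = ("basketball"   =~ /(?<!wrecking)ball/) ? 1 : 0;
my $lb_neg_miss = ("wreckingball" =~ /(?<!wrecking)ball/) ? 1 : 0;

# Zero-width: the "t" is checked by the lookahead but not captured.
my ($consumed) = "football" =~ /(foo(?=t))/;

print "$la_pos_hit $la_pos_miss $la_neg_hit $la_neg_miss ",
      "$lb_pos $lb_neg_hit $lb_neg_miss $consumed\n";
```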
[:upper:]    Like [A-Z]
[:lower:]    Like [a-z]
[:alpha:]    Like [a-zA-Z]
[:digit:]    Like [0-9]
[:alnum:]    Like [a-zA-Z0-9]
[:word:]     Like [a-zA-Z0-9_]
[:xdigit:]   Like [0-9A-Fa-f]
[:punct:]    Any punctuation
[:space:]    Like [ \t\r\n\f\v]
[:blank:]    Space or tab
POSIX regular expressions come in two types: Basic and Extended.
Extended POSIX regular expressions are more Perl-like and generally more powerful, although they lack back-references.
Basic POSIX regular expressions include back-references, like \1 and \2 for the first and second matches.
However, basic regular expressions lack support for alternation (either/or groups), like (foo|bar).
See re_format(7).
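The POSIX bracket classes listed earlier can also be used from Perl, with one catch: a class such as [:digit:] is only valid inside a bracket expression, so it is written [[:digit:]]. The sketch below assumes a made-up sample line and illustrative variable names.

```perl
use strict;
use warnings;

my $line = "Order #A1B2\tshipped, 3 items!";

# POSIX classes go inside a bracket expression: [[:digit:]], not [:digit:].
my @digits = $line =~ /([[:digit:]])/g;
my @upper  = $line =~ /([[:upper:]])/g;
my @punct  = $line =~ /([[:punct:]])/g;

# Classes can be combined with other members of the same bracket expression.
my ($code) = $line =~ /#([[:upper:][:digit:]]+)/;

# [[:blank:]] covers space and tab; collapse each run into one space.
(my $squeezed = $line) =~ s/[[:blank:]]+/ /g;

print scalar(@digits), " ", scalar(@upper), " ", scalar(@punct),
      " $code\n$squeezed\n";
```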