<?php include("HEADER.php"); ?>

<h1>Regular Expressions</h1>

<p>In Perl:</p>
<pre>$foo =~ m/^(F|f)oo\s*(B|b)ar$/
$foo =~ s/foo bar/bas bat/
$foo =~ tr/aeiou/AEIOU/</pre>
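<p>A minimal runnable sketch of the three operators above. The sample strings and variable names are made up for illustration; note that <code>tr</code> takes plain character lists, without brackets or commas.</p>

```perl
use strict;
use warnings;

# m// tests for a match; this pattern allows any amount of whitespace
# (including none) between the two words.
my $foo = "Foo   bar";
my $matched = ($foo =~ m/^(F|f)oo\s*(B|b)ar$/) ? 1 : 0;

# s/// replaces the first occurrence of the pattern.
my $text = "a foo bar here";
$text =~ s/foo bar/bas bat/;

# tr/// maps characters one-for-one and returns how many it translated.
my $word  = "education";
my $count = ($word =~ tr/aeiou/AEIOU/);

print "$matched\n";
print "$text\n";
print "$word ($count vowels)\n";
```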

<p>Does <em>not</em> match:</p>
<code class="prettyprint">$bar !~ m/foobar/i</code>

<h2>Quantifiers</h2>

<pre>
*     Zero or more
+     One or more
?     Zero or one
{7}   Exactly seven
{3,}  Three or more
{2,5} Two, three, four, or five
</pre>

<p>Putting a question mark after the quantifier (like <code>x*?</code>) makes it non-greedy: it matches as few characters as possible instead of as many as possible.</p>
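<p>A small sketch of the difference between greedy and non-greedy matching, using an invented sample string with two bracketed words:</p>

```perl
use strict;
use warnings;

my $s = "aaa [one] bbb [two]";

# Greedy: .+ grabs as much as it can, so the match runs to the LAST "]".
my ($greedy) = $s =~ /\[(.+)\]/;

# Non-greedy: .+? stops at the FIRST "]" that lets the pattern succeed.
my ($lazy) = $s =~ /\[(.+?)\]/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```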

<h2>Anchors</h2>

<pre>
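^     Match beginning of string (or of each line with the m modifier)
\A    Start of string
$     Match end of string (or of each line with the m modifier)
\Z    End of string, or just before a trailing newline (\z is the absolute end)
\b    Match word boundary
\B    Non-word boundary
\&lt;    Start of word (GNU regex syntax, e.g. grep; in Perl use \b)
\&gt;    End of word (GNU regex syntax, e.g. grep; in Perl use \b)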
</pre>
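<p>A short sketch of how the anchors behave in Perl, assuming a two-line sample string invented for this example:</p>

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# With /m, ^ matches at the start of every line.
my @line_starts = $text =~ /^(\w+)/mg;

# \A only ever matches at the very start of the string, even with /m.
my @string_start = $text =~ /\A(\w+)/mg;

# \b keeps "cat" from matching inside "concat".
my $cat_count = () = "cat concat cat." =~ /\bcat\b/g;

print "@line_starts\n";
print "@string_start\n";
print "$cat_count\n";
```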

<h2>Character classes</h2>

<pre>
\w    Word (alphanumeric plus "_")
\W    Non-word
\s    Whitespace
\S    Non-whitespace
\d    Digit
\D    Non-digit
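\cX   Control character (e.g. \cC is Ctrl-C)
\xHH  Character with hex code HH (e.g. \x41 is "A")
\0NN  Character with octal code (e.g. \033 is escape)

\n    Newline
\r    Carriage return
\t    Tab
\v    Vertical whitespace (Perl 5.10+; a vertical tab in some other flavors)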
\f    Formfeed
\a    Alarm (bell, beep)
\e    Escape
</pre>

<h2>Escapes</h2>

<pre>
\    Escape next character (e.g. \^ for a literal caret rather than line start)
\Q   Begin sequence of literals
\E   End literal sequence
</pre>
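<p>A sketch of why <code>\Q...\E</code> matters when interpolating a variable into a pattern. The variable <code>$needle</code> and the test strings are hypothetical:</p>

```perl
use strict;
use warnings;

# A search string that happens to contain regex metacharacters.
my $needle = "a.b*c";

# Unquoted, "." and "*" keep their special meaning, so this unrelated
# string matches: a, any character, zero or more b, c.
my $unquoted_hit = ("aXbbbc" =~ /\A$needle\z/) ? 1 : 0;

# Inside \Q...\E every metacharacter is treated as a literal.
my $quoted_hit  = ("aXbbbc" =~ /\A\Q$needle\E\z/) ? 1 : 0;
my $literal_hit = ("x a.b*c y" =~ /\Q$needle\E/) ? 1 : 0;

print "$unquoted_hit $quoted_hit $literal_hit\n";
```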

<h2>Pattern modifiers</h2>

<p><b>Example:</b> An "i" at the end of the expression makes it case insensitive: <code class="prettyprint">$bar =~ m/foobar/i</code></p>

<pre>
g   Global (match all occurrences)
i   Case-insensitive
m   Multi-line (^ and $ match at the start and end of every line, not just at the edges of the string)
s   Single line (. matches anything, including newlines)
x   Extended: ignore whitespace in the pattern unless it's backslashed or inside brackets, and allow comments (lets you write the regex in a more readable format, with line breaks)
a   ASCII-restricted: \d, \s, \w and the POSIX classes match only ASCII characters, even on Unicode strings
</pre>

<p>If you wanted to ignore case for only part of a regular expression:</p>
<pre class="prettyprint">/(?i)foobar(?-i)BaT/</pre>
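<p>A rough sketch exercising several modifiers, including the partial case-insensitivity shown above. The date string and variable names are assumptions made for the example:</p>

```perl
use strict;
use warnings;

# i: ignore case for the whole pattern.
my $ci = ("FooBar" =~ m/foobar/i) ? 1 : 0;

# x: whitespace and comments inside the pattern are ignored.
my ($year, $mon, $mday) = "2024-01-15" =~ m/
    (\d{4}) -   # year
    (\d{2}) -   # month
    (\d{2})     # day
/x;

# s: without it "." refuses to match a newline.
my $dot_plain = ("a\nb" =~ /a.b/)  ? 1 : 0;
my $dot_s     = ("a\nb" =~ /a.b/s) ? 1 : 0;

# (?i) ... (?-i): case is ignored only for "foobar", not for "BaT".
my $partial_yes = ("FOObarBaT" =~ /(?i)foobar(?-i)BaT/) ? 1 : 0;
my $partial_no  = ("foobarbat" =~ /(?i)foobar(?-i)BaT/) ? 1 : 0;

print "$ci | $year $mon $mday | $dot_plain $dot_s | $partial_yes $partial_no\n";
```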

<h2>Grouping, ranges, and backreferences</h2>

<code class="prettyprint">if($string =~ m/John (Smith|Smyth|Psmith)/) {print "I found John!\n"}</code>

<pre>
.         Any character except \n
(foo|bar) foo or bar
(?:foo)   Non-capturing group
[xyz]     x or y or z (single character)
[^xyz]    NOT x or y or z
[a-f]     Single character in range a through f
</pre>

<p><b>Example:</b> If we want to match "All the king's horses" but not the escaped "All the king''s horses" (doubled single quote), we combine negated character classes with a negative lookahead to match one single quote but not two. The pattern is anchored at both ends, because an unanchored search could still match starting at the second quote:</p>

<pre>^[^']*'(?!')[^']*$</pre>
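<p>A quick check of the single-quote pattern, as a sketch. It uses a version anchored with <code>^</code> and <code>$</code>; the unanchored form is included to show that it can still find a match that begins at the second quote of a doubled pair.</p>

```perl
use strict;
use warnings;

# Anchored: the whole string must contain exactly one, undoubled quote.
my $anchored = qr/^[^']*'(?!')[^']*$/;

my $single  = ("All the king's horses"  =~ $anchored) ? 1 : 0;
my $doubled = ("All the king''s horses" =~ $anchored) ? 1 : 0;

# Unanchored: the engine is free to start at the second quote,
# which is followed by "s" and so passes the lookahead.
my $unanchored = ("All the king''s horses" =~ /[^']*'(?!')[^']*/) ? 1 : 0;

print "$single $doubled $unanchored\n";
```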

<p>Grouping with parens is also the way to capture matches (group $1, $2, etc.). Within the same pattern, refer back to a group with <code>\1</code>; in the replacement part of a substitution, use <code>$1</code>, like: <code>s/(November) 3rd/$1 4th/g</code></p>

<pre>
$1, $2, $3  First, second, third captured groups
$+          Text of the last capture group that matched
$&amp;          The entire match
$`          Before match
$'          After match
</pre>
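<p>A sketch showing what each of these variables holds after one match. The sample sentence is invented, and the values are copied into ordinary variables straight away because the next successful match overwrites them.</p>

```perl
use strict;
use warnings;

my $s = "Meet on November 3rd at noon";

my ($month, $day, $last, $whole, $before, $after);
if ($s =~ /(November) (\d+)(st|nd|rd|th)/) {
    ($month, $day) = ($1, $2);   # first and second groups
    $last   = $+;                # last group that matched (here, group 3)
    $whole  = $&;                # the entire match
    $before = $`;                # text before the match
    $after  = $';                # text after the match
}

# In the replacement part of s///, refer to captures with $1.
(my $fixed = $s) =~ s/(November) 3rd/$1 4th/g;

print "$month $day | $last | $whole | [$before] [$after]\n";
print "$fixed\n";
```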

<h2>Assertions, lookahead and lookbehind</h2>

<pre>
?=     Positive lookahead
?!     Negative lookahead
?&lt;=    Positive lookbehind
?&lt;!    Negative lookbehind
?&gt;     Once-only sub-expression
?()    Conditional if-then
?()|   Conditional if-then-else
?#     Comment
</pre>

<p>A regex with positive lookahead matches something followed by something else. <code>foo(?=t).*</code> matches "football" but not "foobar".</p>

<p>A regex with negative lookahead matches something <em>not</em> followed by something else. <code>foo(?!t).*</code> matches "foobar" but not "football".</p>

<p>Lookbehind works the same way, with <code>(?&lt;=foot)ball</code> ("ball" preceded by "foot") and <code>(?&lt;!wrecking)ball</code> ("ball" <em>not</em> preceded by "wrecking").</p>
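<p>The four lookaround examples above can be checked with a sketch like this. The string "wreckingball" is written without a space on purpose, since the lookbehind inspects the characters immediately before "ball".</p>

```perl
use strict;
use warnings;

# Lookahead: assert what follows "foo" without consuming it.
my $pos_football = ("football" =~ /foo(?=t).*/) ? 1 : 0;
my $pos_foobar   = ("foobar"   =~ /foo(?=t).*/) ? 1 : 0;
my $neg_football = ("football" =~ /foo(?!t).*/) ? 1 : 0;
my $neg_foobar   = ("foobar"   =~ /foo(?!t).*/) ? 1 : 0;

# The lookahead itself is zero-width: only "foo" is part of the match.
my ($consumed) = "football" =~ /(foo(?=t))/;

# Lookbehind: assert what comes immediately before "ball".
my $behind_foot     = ("football"     =~ /(?<=foot)ball/)     ? 1 : 0;
my $not_wrecking    = ("football"     =~ /(?<!wrecking)ball/) ? 1 : 0;
my $wrecking_reject = ("wreckingball" =~ /(?<!wrecking)ball/) ? 1 : 0;

print "$pos_football $pos_foobar $neg_football $neg_foobar\n";
print "$consumed\n";
print "$behind_foot $not_wrecking $wrecking_reject\n";
```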

<h2>POSIX classes</h2>

<pre>
[:upper:]    Like [A-Z]
[:lower:]    Like [a-z]
[:alpha:]    Like [a-zA-Z]
[:digit:]    Like [0-9]
[:alnum:]    Like [a-zA-Z0-9]
[:word:]     Like [a-zA-Z0-9_]
[:xdigit:]   Like [0-9A-Fa-f]
[:punct:]    Any punctuation
[:space:]    Like [ \t\r\n\f\v]
[:blank:]    Space or tab
</pre>
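<p>In Perl these classes are only valid inside a bracketed character class, so they are written with double brackets, e.g. <code>[[:upper:]]</code>. A minimal sketch with made-up sample strings:</p>

```perl
use strict;
use warnings;

my $s = "Perl 5, Regex!";

# Count matches with the "= () =" idiom (list assignment in scalar context).
my $upper  = () = $s =~ /[[:upper:]]/g;
my $digits = () = $s =~ /[[:digit:]]/g;
my $punct  = () = $s =~ /[[:punct:]]/g;

# POSIX classes combine with other items and can be negated with ^.
my ($prefix) = "abc123" =~ /^([^[:digit:]]+)/;

# Collapse runs of spaces and tabs into a single space.
(my $squeezed = "a \t b") =~ s/[[:blank:]]+/ /g;

print "$upper $digits $punct\n";
print "$prefix\n";
print "$squeezed\n";
```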

<h2>Links</h2>
<ul>
    <li><a href="http://perldoc.perl.org/perlre.html">Perldoc for perlre</a></li>
    <li><a href="http://www.gnu.org/software/findutils/manual/html_mono/find.html#Regular-Expressions">Regular Expressions for GNU find</a></li>
    <li><a href="http://www.troubleshooters.com/codecorn/littperl/perlreg.htm">Steve Litt's Perls of Wisdom on Perl Regular Expressions</a></li>
    <li><a href="http://code.google.com/p/kiki-re/">Kiki</a> is a nice regex writing/testing utility</li>
</ul>

<?php include("../FOOTER.php"); ?>
