Appendix A. Regular Expressions
The following tables summarize the regular-expression grammar and syntax supported by the regular-expression classes in System.Text.RegularExpressions. Each of the modifiers and qualifiers in the tables can substantially change the behavior of the matching and searching patterns. For further information on regular expressions, we recommend the definitive Mastering Regular Expressions by Jeffrey E. F. Friedl (O'Reilly & Associates, 1997).
All the syntax described in the tables should match the Perl5 syntax,
with specific exceptions noted.
Table A-1. Character escapes
\a | Bell | \u0007
\b | Backspace | \u0008
\t | Tab | \u0009
\r | Carriage return | \u000D
\v | Vertical tab | \u000B
\f | Form feed | \u000C
\n | Newline | \u000A
\e | Escape | \u001B
\040 | ASCII character as octal
\x20 | ASCII character as hex
\cC | ASCII control character
\u0020 | Unicode character as hex
\non-escape | A nonescape character
Special case: within a regular expression, \b
means word boundary, except in a [ ] set, in which
\b means the backspace character.
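The two meanings of \b can be seen side by side in a short sketch. The class and method names here (EscapeDemo, CountWholeWord, ContainsBackspace) are hypothetical and chosen only for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeDemo
{
    // Outside a [ ] set, \b is a word boundary, so only whole words match.
    public static int CountWholeWord(string input)
    {
        return Regex.Matches(input, @"\bcat\b").Count;
    }

    // Inside a [ ] set, \b is the backspace character (\u0008).
    public static bool ContainsBackspace(string input)
    {
        return Regex.IsMatch(input, @"[\b]");
    }

    static void Main()
    {
        Console.WriteLine(CountWholeWord("cat concat cat"));  // 2
        Console.WriteLine(ContainsBackspace("a\u0008b"));     // True
        Console.WriteLine(ContainsBackspace("a b"));          // False
    }
}
```

The "cat" inside "concat" is not counted, because no word boundary precedes it.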
Table A-2. Substitutions
$group-number | Substitutes last substring matched by group-number
${group-name} | Substitutes last substring matched by (?<group-name>)
Substitutions are specified only within a replacement pattern.
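As a minimal sketch, both substitution forms can be passed as the replacement pattern of Regex.Replace. The class name, method names, and sample strings are assumptions made for this example:

```csharp
using System;
using System.Text.RegularExpressions;

static class SubstitutionDemo
{
    // $1 and $2 refer to the first and second numbered groups.
    public static string SwapNames(string input)
    {
        return Regex.Replace(input, @"(\w+), (\w+)", "$2 $1");
    }

    // ${name} refers to the group captured by (?<name>...).
    public static string ReorderDate(string input)
    {
        return Regex.Replace(
            input,
            @"(?<month>\d{1,2})/(?<day>\d{1,2})/(?<year>\d{4})",
            "${year}-${month}-${day}");
    }

    static void Main()
    {
        Console.WriteLine(SwapNames("Friedl, Jeffrey"));  // Jeffrey Friedl
        Console.WriteLine(ReorderDate("12/25/2001"));     // 2001-12-25
    }
}
```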
Table A-3. Character sets
. | Matches any character except \n
[characterlist] | Matches a single character in the list
[^characterlist] | Matches a single character not in the list
[char0-char1] | Matches a single character in a range
\w | Matches a word character; for ASCII text, same as [a-zA-Z_0-9]
\W | Matches a nonword character
\s | Matches a whitespace character; for ASCII text, same as [ \n\r\t\f\v]
\S | Matches a nonspace character
\d | Matches a decimal digit; for ASCII text, same as [0-9]
\D | Matches a nondigit
Table A-4. Positioning assertions
^ | Beginning of line
$ | End of line
\A | Beginning of string
\Z | End of string, or just before a newline at the end of the string
\z | Exactly the end of string
\G | Where search started
\b | On a word boundary
\B | Not on a word boundary
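The following sketch illustrates two distinctions among the positioning assertions: how ^ behaves with and without multiline mode, and how \Z differs from \z when the input ends in a newline. The class and method names are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorDemo
{
    // Counts the words that begin where ^ matches.
    public static int CountLineStarts(string text, RegexOptions options)
    {
        return Regex.Matches(text, @"^\w+", options).Count;
    }

    // \Z tolerates one trailing newline; \z does not.
    public static bool EndsWithDoneUpperZ(string text)
    {
        return Regex.IsMatch(text, @"done\Z");
    }

    public static bool EndsWithDoneLowerZ(string text)
    {
        return Regex.IsMatch(text, @"done\z");
    }

    static void Main()
    {
        string text = "one\ntwo\nthree";
        Console.WriteLine(CountLineStarts(text, RegexOptions.None));       // 1
        Console.WriteLine(CountLineStarts(text, RegexOptions.Multiline));  // 3

        Console.WriteLine(EndsWithDoneUpperZ("done\n"));  // True
        Console.WriteLine(EndsWithDoneLowerZ("done\n"));  // False
        Console.WriteLine(EndsWithDoneLowerZ("done"));    // True
    }
}
```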
Table A-5. Quantifiers
* | 0 or more matches
+ | 1 or more matches
? | 0 or 1 matches
{n} | Exactly n matches
{n,} | At least n matches
{n,m} | At least n, but no more than m matches
*? | Lazy *, finds first match that has minimum repeats
+? | Lazy +, minimum repeats, but at least 1
?? | Lazy ?, zero or minimum repeats
{n}? | Lazy {n}, exactly n matches
{n,}? | Lazy {n,}, minimum repeats, but at least n
{n,m}? | Lazy {n,m}, minimum repeats, but at least n, and no more than m
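The difference between a greedy quantifier and its lazy counterpart is easiest to see on the same input. This is a small sketch; the class name, helper method, and sample strings are assumptions for illustration only:

```csharp
using System;
using System.Text.RegularExpressions;

static class QuantifierDemo
{
    // Returns the text of the first match, or an empty string if none.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    static void Main()
    {
        string html = "<b>bold</b>";
        Console.WriteLine(FirstMatch(html, @"<.+>"));       // <b>bold</b>
        Console.WriteLine(FirstMatch(html, @"<.+?>"));      // <b>

        Console.WriteLine(FirstMatch("aaaa", @"a{2,3}"));   // aaa
        Console.WriteLine(FirstMatch("aaaa", @"a{2,3}?"));  // aa
    }
}
```

The greedy forms take as many repeats as the rest of the pattern allows, while the lazy forms stop at the first point where the rest of the pattern can succeed.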
Table A-6. Grouping constructs
( ) | Capture matched substring
(?<name>) | Capture matched substring into group name
(?<number>) | Capture matched substring into group number*
(?<name1-name2>) | Undefine name2, and store interval and current group into name1; if name2 is undefined, matching backtracks; name1 is optional*
(?: ) | Noncapturing group
(?imnsx-imnsx: ) | Apply or disable matching options
(?= ) | Continue matching only if subexpression matches on right
(?! ) | Continue matching only if subexpression doesn't match on right
(?<= ) | Continue matching only if subexpression matches on left
(?<! ) | Continue matching only if subexpression doesn't match on left
(?> ) | Subexpression is matched once, but isn't backtracked
* The named capturing group syntax follows a suggestion made by Friedl in Mastering Regular Expressions. All other grouping constructs use the Perl5 syntax.
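A few of the grouping constructs are sketched below: a named capture, a lookahead, and the (?<name1-name2>) form with name1 omitted, used here to check that parentheses balance. The balance check also relies on the (?(name)yes|no) conditional listed under alternation. All class names, method names, and sample inputs are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupingDemo
{
    // Named captures are read back through Match.Groups by name.
    public static string ValueOf(string pair, string groupName)
    {
        return Regex.Match(pair, @"(?<key>\w+)=(?<value>\w+)").Groups[groupName].Value;
    }

    // (?= ) requires " dollars" to follow but leaves it out of the match.
    public static string AmountBeforeDollars(string input)
    {
        return Regex.Match(input, @"\d+(?= dollars)").Value;
    }

    // Each "(" is captured into "open"; each ")" undefines one such capture.
    // If "open" still holds a capture at the end, (?!) forces a failure.
    public static bool IsBalanced(string input)
    {
        return Regex.IsMatch(
            input,
            @"^(?:[^()]|(?<open>\()|(?<-open>\)))*(?(open)(?!))$");
    }

    static void Main()
    {
        Console.WriteLine(ValueOf("color=red", "key"));                      // color
        Console.WriteLine(ValueOf("color=red", "value"));                    // red
        Console.WriteLine(AmountBeforeDollars("pay 100 dollars or 90 euros")); // 100
        Console.WriteLine(IsBalanced("(a(b)c)"));                            // True
        Console.WriteLine(IsBalanced("(a(b)c"));                             // False
        Console.WriteLine(IsBalanced(")("));                                 // False
    }
}
```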
Table A-7. Back references
\count | Back reference to the group numbered count
\k<name> | Named back reference
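A common use of back references is finding a word that is accidentally repeated. The sketch below shows both the numbered and the named form; the class and method names are assumptions for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

static class BackReferenceDemo
{
    // \1 must match the same text that group 1 captured.
    public static int CountDoubledWords(string input)
    {
        return Regex.Matches(input, @"\b(\w+)\s+\1\b").Count;
    }

    // \k<word> does the same for the group named "word".
    public static string FirstDoubledWord(string input)
    {
        Match m = Regex.Match(input, @"\b(?<word>\w+)\s+\k<word>\b");
        return m.Groups["word"].Value;
    }

    static void Main()
    {
        string text = "the the quick brown fox fox";
        Console.WriteLine(CountDoubledWords(text));  // 2
        Console.WriteLine(FirstDoubledWord(text));   // the
    }
}
```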
Table A-8. Alternation
| (vertical bar) | Logical OR
(?(expression)yes|no) | Matches yes if expression matches, else no; the no is optional
(?(name)yes|no) | Matches yes if named string has a match, else no; the no is optional
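As a brief sketch of alternation, the first method below uses plain | and the second uses the (?(name)yes|no) form with the no branch omitted: a closing parenthesis is required only if an opening one was captured. The class name, method names, and the group name "open" are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

static class AlternationDemo
{
    // Plain alternation: the whole input must be "cat" or "dog".
    public static bool IsPet(string input)
    {
        return Regex.IsMatch(input, @"^(cat|dog)$");
    }

    // Conditional: if group "open" matched "(", then ")" must follow the word.
    public static bool IsWordOrParenthesized(string input)
    {
        return Regex.IsMatch(input, @"^(?<open>\()?\w+(?(open)\))$");
    }

    static void Main()
    {
        Console.WriteLine(IsPet("dog"));                     // True
        Console.WriteLine(IsPet("bird"));                    // False
        Console.WriteLine(IsWordOrParenthesized("abc"));     // True
        Console.WriteLine(IsWordOrParenthesized("(abc)"));   // True
        Console.WriteLine(IsWordOrParenthesized("(abc"));    // False
        Console.WriteLine(IsWordOrParenthesized("abc)"));    // False
    }
}
```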
Table A-9. Miscellaneous constructs
(?imnsx-imnsx) | Set or disable options in midpattern
(?# ) | Inline comment
# [to end of line] | X-mode comment
Table A-10. Regular expression options
i | Case-insensitive match
m | Multiline mode; changes ^ and $ so they match beginning and ending of any line
n | Capture only explicitly named or numbered groups
c | Compile to MSIL
s | Single-line mode; changes meaning of "." so it matches every character
x | Eliminates unescaped whitespace from the pattern
r | Search from right to left; can't be specified in midstream
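The options can be tried out with a short sketch. The i, s, and x options are set inline here with (?imnsx-imnsx), while right-to-left search is requested through the RegexOptions enumeration, since it can't be specified in midstream. The class name, method names, and sample strings are assumptions for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

static class OptionsDemo
{
    public static bool IsMatch(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    // With RightToLeft, the first match found is the rightmost one.
    public static string LastNumber(string input)
    {
        return Regex.Match(input, @"\d+", RegexOptions.RightToLeft).Value;
    }

    // x mode: unescaped whitespace is ignored and # starts a comment.
    public const string PhonePattern = @"(?x)
        ^\d{3}   # exchange
        -        # separator
        \d{4}$   # line number";

    static void Main()
    {
        Console.WriteLine(IsMatch("HELLO", "hello"));         // False
        Console.WriteLine(IsMatch("HELLO", "(?i)hello"));     // True
        Console.WriteLine(IsMatch("a\nb", "a.b"));            // False
        Console.WriteLine(IsMatch("a\nb", "(?s)a.b"));        // True
        Console.WriteLine(IsMatch("555-1234", PhonePattern)); // True
        Console.WriteLine(LastNumber("abc123def456"));        // 456
    }
}
```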