
6.6 Cookbook Regular Expressions

To wrap up this overview of regular expressions in C# applications, here is a set of useful expressions adapted from other environments.[1]

[1] These expressions were taken from the Perl Cookbook by Tom Christiansen and Nathan Torkington (O'Reilly), and updated for the C# environment by Brad Merrill of Microsoft.

  • Matching roman numerals:

    string p1 = "^m*(d?c{0,3}|c[dm])"
      + "(l?x{0,3}|x[lc])(v?i{0,3}|i[vx])$";
    string t1 = "vii";
    Match m1 = Regex.Match(t1, p1);
  • Swapping first two words:

    string t2 = "the quick brown fox";
    string p2 = @"(\S+)(\s+)(\S+)";
    Regex x2 = new Regex(p2);
    string r2 = x2.Replace(t2, "$3$2$1", 1);
  • Matching "keyword = value" patterns:

    string t3 = "myval = 3";
    // lazy (.*?) keeps trailing whitespace out of the value group
    string p3 = @"(\w+)\s*=\s*(.*?)\s*$";
    Match m3 = Regex.Match(t3, p3);
  • Matching lines of at least 80 characters:

    string t4 = "********************"
      + "******************************"
      + "******************************";
    string p4 = ".{80,}";
    Match m4 = Regex.Match(t4, p4);
  • Extracting date/time values (MM/DD/YY HH:MM:SS):

    string t5 = "01/01/01 16:10:01";
    string p5 =
      @"(\d+)/(\d+)/(\d+) (\d+):(\d+):(\d+)";
    Match m5 = Regex.Match(t5, p5);
  • Changing directories (for Windows):

    string t6 =
      @"C:\Documents and Settings\user1\Desktop\";
    string r6 = Regex.Replace(t6,
      @"\\user1\\",
      @"\user2\");
  • Expanding (%nn) hex escapes:

    string t7 = "%41"; // capital A
    string p7 = "%([0-9A-Fa-f][0-9A-Fa-f])";
    // HexConvert must be a MatchEvaluator delegate, i.e. a method
    // that takes a Match and returns the replacement string
    string r7 = Regex.Replace(t7, p7,
      HexConvert);
  • Deleting C comments (imperfectly):

    string t8 = @"
    /*
     * this is an old cstyle comment block
     */
    ";
    string p8 = @"
      /\*  # match the opening delimiter
      .*? # match a minimal number of characters
      \*/ # match the closing delimiter
    ";
    string r8 = Regex.Replace(t8, p8, "", RegexOptions.Singleline
                 | RegexOptions.IgnorePatternWhitespace);
  • Removing leading and trailing whitespace:

    string t9a = "   leading";
    string p9a = @"^\s+";
    string r9a = Regex.Replace(t9a, p9a, "");
      
    string t9b = "trailing  ";
    string p9b = @"\s+$";
    string r9b = Regex.Replace(t9b, p9b, "");
  • Turning "\" followed by "n" into a real newline:

    string t10 = @"\ntest\n";
    string r10 = Regex.Replace(t10, @"\\n", "\n");
  • Detecting IP addresses:

    string t11 = "55.54.53.52";
    // Each octet: 0-199 (one to three digits), 200-249, or 250-255.
    string p11 = "^" +
      @"([01]?\d\d?|2[0-4]\d|25[0-5])\." +
      @"([01]?\d\d?|2[0-4]\d|25[0-5])\." +
      @"([01]?\d\d?|2[0-4]\d|25[0-5])\." +
      @"([01]?\d\d?|2[0-4]\d|25[0-5])" +
      "$";
    Match m11 = Regex.Match(t11, p11);
  • Removing leading path from filename:

    string t12 = @"c:\file.txt";
    string p12 = @"^.*\\";
    string r12 = Regex.Replace(t12, p12, "");
  • Joining lines in multiline strings:

    string t13 = @"this is 
    a split line";
    string p13 = @"\s*\r?\n\s*";
    string r13 = Regex.Replace(t13, p13, " ");
  • Extracting all numbers from a string:

    string t14 = @"
    test 1
    test 2.3
    test 47
    ";
    string p14 = @"(\d+\.?\d*|\.\d+)";
    MatchCollection mc14 = Regex.Matches(t14, p14);
  • Finding all caps words:

    string t15 = "This IS a Test OF ALL Caps";
    string p15 = @"(\b[^\Wa-z0-9_]+\b)";
    MatchCollection mc15 = Regex.Matches(t15, p15);
  • Finding all lowercase words:

    string t16 = "This is A Test of lowercase";
    string p16 = @"(\b[^\WA-Z0-9_]+\b)";
    MatchCollection mc16 = Regex.Matches(t16, p16);
  • Finding all initial caps words:

    string t17 = "This is A Test of Initial Caps";
    string p17 = @"(\b[^\Wa-z0-9_][^\WA-Z0-9_]*\b)";
    MatchCollection mc17 = Regex.Matches(t17, p17);
  • Finding links in simple HTML:

    string t18 = @"
    <html>
    <a href=""http://windows.oreilly.com/news/first.htm"">first tag text</a>
    <a href=""http://windows.oreilly.com/news/next.htm"">next tag text</a>
    </html>
    ";
    string p18 = @"<A[^>]*?HREF\s*=\s*[""']?"
      + @"([^'"" >]+?)[ '""]?>";
    MatchCollection mc18 = Regex.Matches(t18, p18, RegexOptions.IgnoreCase
              | RegexOptions.Singleline);
  • Finding middle initials:

    string t19 = "Hanley A. Strappman";
    string p19 = @"^\S+\s+(\S)\S*\s+\S";
    Match m19 = Regex.Match(t19, p19);
  • Changing inch marks to quotation marks:

    string t20 = @"2' 2"" ";
    string p20 = "\"([^\"]*)";
    string r20 = Regex.Replace(t20, p20, "``$1''");
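
The hex-escape recipe above passes a HexConvert method to Regex.Replace without defining it. The following is a minimal sketch of what such a method might look like; the class name HexEscapes and the Expand wrapper are hypothetical names introduced for illustration, and the sketch assumes each %nn escape stands for a single character code rather than one byte of a multi-byte encoding.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class HexEscapes
{
    // Hypothetical helper with the MatchEvaluator signature:
    // it receives the current Match and returns its replacement text.
    public static string HexConvert(Match m)
    {
        // Group 1 holds the two hex digits captured after '%'.
        int code = int.Parse(m.Groups[1].Value,
                             NumberStyles.HexNumber,
                             CultureInfo.InvariantCulture);
        // Assumption: the value is used directly as a character code.
        return ((char)code).ToString();
    }

    // Hypothetical wrapper: expands every %nn escape in the input.
    public static string Expand(string input)
    {
        string p7 = "%([0-9A-Fa-f][0-9A-Fa-f])";
        return Regex.Replace(input, p7, new MatchEvaluator(HexConvert));
    }
}
```

Calling HexEscapes.Expand on the recipe's input "%41" should yield "A". Text that contains no well-formed escape is returned unchanged, because the evaluator runs only for matches.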
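
The recipes that call Regex.Match or Regex.Matches stop at the call itself, so it may help to see one way the results could be read back. The sketch below reuses the date/time and number patterns from this section; the class name MatchResults and its two methods are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchResults
{
    // Returns the six captured fields of the date/time recipe,
    // or an empty array when the input does not match.
    public static string[] DateTimeFields(string input)
    {
        string p5 = @"(\d+)/(\d+)/(\d+) (\d+):(\d+):(\d+)";
        Match m5 = Regex.Match(input, p5);
        if (!m5.Success)
            return new string[0];

        string[] fields = new string[6];
        for (int i = 0; i < 6; i++)
            fields[i] = m5.Groups[i + 1].Value; // Groups[0] is the whole match
        return fields;
    }

    // Collects the text of every match of the number recipe.
    public static List<string> Numbers(string input)
    {
        string p14 = @"(\d+\.?\d*|\.\d+)";
        List<string> found = new List<string>();
        foreach (Match m in Regex.Matches(input, p14))
            found.Add(m.Value);
        return found;
    }
}
```

Checking Match.Success before touching Groups avoids reading empty captures from a failed match. Iterating the MatchCollection with an explicitly typed loop variable works on older framework versions as well as current ones.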