
Recipe 8.9 Counting Lines of Text

Problem

You need to count lines of text within a string or within a file.

Solution

Read the entire file into a string, or use the string as given, and count the linefeeds with a regular expression, as shown in the following method:

using System;
using System.Text.RegularExpressions;
using System.IO;

public static long LineCount(string source, bool isFileName)
{
    if (source != null)
    {
        string text = source;

        if (isFileName)
        {
            // The using statements close the reader and the stream 
            //   even if ReadToEnd throws an exception
            using (FileStream FS = new FileStream(source, FileMode.Open, 
                                         FileAccess.Read, FileShare.Read))
            using (StreamReader SR = new StreamReader(FS))
            {
                text = SR.ReadToEnd( );
            }
        }

        Regex RE = new Regex("\n", RegexOptions.Multiline);
        MatchCollection theMatches = RE.Matches(text);

        // A file is assumed to terminate every line, so the match 
        //   count is the line count (0 for a zero-length file).
        //   A string is assumed to have no terminator after its 
        //   last line, so one is added to the count.
        if (isFileName)
        {
            return (theMatches.Count);
        }
        else
        {
            return (theMatches.Count) + 1;
        }
    }
    else
    {
        // Handle a null source here
        return (0);
    }
}

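An alternative version of this method uses the StreamReader.ReadLine method to count lines in a file and a regular expression to count lines in a string. ReadLine treats \n, \r, and \r\n as line terminators and also returns a final line that has no terminator, so the file branch of this version does not depend on the platform that produced the file: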

public static long LineCount2(string source, bool isFileName)
{
    if (source != null)
    {
        string text = source;
        long numOfLines = 0;

        if (isFileName)
        {
            // The using statements close the reader and the stream 
            //   even if ReadLine throws an exception
            using (FileStream FS = new FileStream(source, FileMode.Open, 
                                         FileAccess.Read, FileShare.Read))
            using (StreamReader SR = new StreamReader(FS))
            {
                // ReadLine returns null once the end of the file is reached
                while (SR.ReadLine( ) != null)
                {
                    ++numOfLines;
                }
            }

            return (numOfLines);
        }
        else
        {
            Regex RE = new Regex("\n", RegexOptions.Multiline);
            MatchCollection theMatches = RE.Matches(text);

            return (theMatches.Count + 1);
        }
    }
    else
    {
        // Handle a null source here
        return (0);
    }
}

The following method counts the lines within a specified text file and a specified string:

public static void TestLineCount( )
{
    // Count the lines within the file TestFile.txt
    Console.WriteLine(LineCount(@"C:\TestFile.txt", true));

    // Count the lines within a string
    // Notice that the \r\n sequence starts a new line, 
    //    as does the \n character alone
    Console.WriteLine(LineCount("Line1\r\nLine2\r\nLine3\nLine4", false));
}

Discussion

Lines of text are separated by special terminating characters. In Windows files, the terminator is a carriage return followed by a linefeed; this sequence is described by the regular expression pattern \r\n. Unix files terminate their lines with just the linefeed character (\n). The regular expression "\n" is the lowest common denominator for both conventions, because each terminator contains exactly one linefeed. Consequently, this method runs a regular expression that looks for the pattern "\n" in a string or file.

Files from older Macintosh systems (the classic Mac OS) usually terminate each line with a carriage-return character (\r) alone. To count the number of lines in this type of file, the regular expression should be changed to the following in the constructor of the Regex object:

Regex RE = new Regex("\r", RegexOptions.Multiline);
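
When the origin of the text is not known in advance, a single pattern can cover all three conventions. The following is a minimal sketch rather than part of the recipe's own methods; the class name LineTerminators and the method name CountTerminators are hypothetical. The alternation lists \r\n first so that a Windows terminator is consumed as one match instead of two.

```csharp
using System.Text.RegularExpressions;

public static class LineTerminators
{
    // Matches a Windows (\r\n), classic Macintosh (\r), or Unix (\n)
    // terminator. \r\n comes first so the pair counts as a single match.
    private static readonly Regex Terminator = new Regex("\r\n|\r|\n");

    // Hypothetical helper: returns the number of line terminators in text.
    public static int CountTerminators(string text)
    {
        if (text == null)
        {
            return 0;
        }

        return Terminator.Matches(text).Count;
    }
}
```

For example, LineTerminators.CountTerminators("Line1\r\nLine2\rLine3\nLine4") finds three terminators, which separate four lines.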

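Simply running this regular expression against a string returns the number of lines minus one, because the last line does not have a line-terminating character. To account for this, one is added to the final count of linefeeds in the string. For a file, LineCount adds nothing, on the assumption that every line, including the last, ends with a terminator; a file whose final line lacks one is therefore reported one line short.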
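
The adjustment depends on whether the final line is terminated. One way to check that assumption instead of relying on it is sketched below. This is a variant for illustration, not the recipe's method: the names TrailingLineCheck and CountLinesInString are hypothetical, and treating an empty string as zero lines is an assumption that differs from LineCount, which returns 1 in that case. Like the recipe, the sketch looks only for the linefeed character.

```csharp
using System.Text.RegularExpressions;

public static class TrailingLineCheck
{
    // Hypothetical helper: counts lines in text, adding one for the last
    // line only when it is not already closed by a linefeed.
    public static long CountLinesInString(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            // Assumption: no text means no lines
            return 0;
        }

        long numOfLines = Regex.Matches(text, "\n").Count;

        // Text after the last linefeed is one more (unterminated) line
        if (text[text.Length - 1] != '\n')
        {
            ++numOfLines;
        }

        return numOfLines;
    }
}
```

With this sketch, "Line1\r\nLine2\r\nLine3\nLine4" yields 4 and "Line1\nLine2\n" yields 2.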

The LineCount method accepts two parameters. The first, source, is a string that contains either the actual text whose lines are to be counted or the path and name of a text file whose lines are to be counted. The second parameter, isFileName, tells the method how to interpret source. If isFileName is true, source is treated as a file path; otherwise, it is treated as the text itself. If source is null, the method returns zero.
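
Because the Boolean flag changes what the first argument means, a caller can easily pass the wrong combination. One possible alternative, sketched here under assumptions and not taken from the recipe, is to expose two methods that share a single counting loop written against TextReader, the base class of both StringReader and StreamReader. The names LineCounter, CountInText, CountInFile, and CountLines are hypothetical. Since the loop uses ReadLine, its results differ from LineCount for an empty string (0 rather than 1) and for text that ends with a terminator.

```csharp
using System.IO;

public static class LineCounter
{
    // Hypothetical method: counts the lines in the text itself.
    public static long CountInText(string text)
    {
        if (text == null)
        {
            return 0;
        }

        using (StringReader reader = new StringReader(text))
        {
            return CountLines(reader);
        }
    }

    // Hypothetical method: counts the lines in the file at the given path.
    public static long CountInFile(string path)
    {
        using (StreamReader reader = new StreamReader(path))
        {
            return CountLines(reader);
        }
    }

    // Shared loop: ReadLine returns null when the reader is exhausted.
    private static long CountLines(TextReader reader)
    {
        long numOfLines = 0;

        while (reader.ReadLine() != null)
        {
            ++numOfLines;
        }

        return numOfLines;
    }
}
```

A call such as LineCounter.CountInText("Line1\r\nLine2\r\nLine3\nLine4") returns 4, and the method name at the call site states how the argument is interpreted.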

See Also

See the ".NET Framework Regular Expressions," "FileStream Class," and "StreamReader Class" topics in the MSDN documentation.
