Recipe 8.9 Counting Lines of Text
Problem
You need to count lines of text within a
string or within a file.
Solution
Read in the entire file and count the number of linefeeds, as shown
in the following method:
using System;
using System.IO;
using System.Text.RegularExpressions;

public static long LineCount(string source, bool isFileName)
{
    if (source != null)
    {
        string text = source;
        if (isFileName)
        {
            // The using blocks close the reader and the file
            // even if ReadToEnd throws an exception
            using (FileStream FS = new FileStream(source, FileMode.Open,
                       FileAccess.Read, FileShare.Read))
            using (StreamReader SR = new StreamReader(FS))
            {
                text = SR.ReadToEnd( );
            }
        }

        Regex RE = new Regex("\n", RegexOptions.Multiline);
        MatchCollection theMatches = RE.Matches(text);

        if (isFileName)
        {
            // Every line in a file is assumed to end with a linefeed,
            // so a zero-length file reports zero lines
            return (theMatches.Count);
        }
        else
        {
            // The last line of a string is assumed to have no
            // terminator, so it is counted by adding one
            return (theMatches.Count + 1);
        }
    }
    else
    {
        // Handle a null source here
        return (0);
    }
}
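An alternative version of this method uses the
StreamReader.ReadLine method to count lines in a
file and a regular expression to count lines in a
string. Because ReadLine treats \n, \r, and \r\n as line
terminators and also returns a final line that has no terminator,
the two versions can report different counts for the same file: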
public static long LineCount2(string source, bool isFileName)
{
    if (source != null)
    {
        if (isFileName)
        {
            long numOfLines = 0;
            using (FileStream FS = new FileStream(source, FileMode.Open,
                       FileAccess.Read, FileShare.Read))
            using (StreamReader SR = new StreamReader(FS))
            {
                // ReadLine returns null once the end of the file is reached
                while (SR.ReadLine( ) != null)
                {
                    ++numOfLines;
                }
            }
            return (numOfLines);
        }
        else
        {
            Regex RE = new Regex("\n", RegexOptions.Multiline);
            MatchCollection theMatches = RE.Matches(source);
            // Add one for the last line, which has no terminator
            return (theMatches.Count + 1);
        }
    }
    else
    {
        // Handle a null source here
        return (0);
    }
}
The following method counts the lines within a specified text file
and a specified string:
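public static void TestLineCount( )
{
    // Count the lines within the file TestFile.txt
    long fileLines = LineCount(@"C:\TestFile.txt", true);
    Console.WriteLine("Lines in file: " + fileLines);

    // Count the lines within a string
    // Notice that the \r\n sequence starts a new line,
    // as does the \n character on its own
    long stringLines = LineCount("Line1\r\nLine2\r\nLine3\nLine4", false);
    Console.WriteLine("Lines in string: " + stringLines);   // 4
}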
Discussion
Every line ends with one or more special characters. In Windows
files, each line is terminated by a carriage return followed by a
linefeed, a sequence described by the regular expression pattern
\r\n. Unix files terminate their lines with just the linefeed
character (\n). The regular expression "\n" is the lowest common
denominator for both sets of line-terminating characters.
Consequently, this method runs a regular expression that looks for
the pattern "\n" in a string or file.
Files created on the classic Mac OS (before Mac OS X) usually
terminate lines with a carriage-return character (\r). To count the
number of lines in this type of file, change the pattern passed to
the constructor of the Regex object:
Regex RE = new Regex("\r", RegexOptions.Multiline);
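When the origin of the text is not known in advance, a single pattern can recognize all three terminator styles described above. The following is a minimal sketch rather than part of the recipe; the class name LineTerminatorDemo and the method CountTerminators are hypothetical names chosen for illustration. The \r\n alternative is listed first so that a Windows terminator is matched as one unit instead of as two separate characters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineTerminatorDemo
{
    // Hypothetical helper (not from the recipe): counts line terminators
    // of the Windows (\r\n), classic Mac OS (\r), and Unix (\n) styles.
    public static int CountTerminators(string text)
    {
        if (text == null)
        {
            return 0;
        }

        // Alternatives are tried left to right, so \r\n is consumed as a
        // single terminator before the lone \r or \n cases are considered.
        return Regex.Matches(text, "\r\n|\r|\n").Count;
    }
}
```

For example, LineTerminatorDemo.CountTerminators("Line1\r\nLine2\rLine3\nLine4") returns 3, one match for each of the three styles.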
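Running this regular expression against a string returns the number
of lines minus one when the last line has no line-terminating
character. To account for this, one is added to the count of
linefeeds in a string. For a file, the method assumes that every
line, including the last, ends with a linefeed, so nothing is added
and a zero-length file yields a count of zero.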
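The adjustment for the last line can also be made conditional, so that the final line is counted only when it actually lacks a terminator. The sketch below illustrates that idea under the assumption that lines end with \n or \r\n; it is an alternative for comparison, not the recipe's method, and the names TrailingLineDemo and CountLinesTolerant are hypothetical.

```csharp
using System;

public static class TrailingLineDemo
{
    // Hypothetical helper (not from the recipe): counts lines in text,
    // adding one only when the last line has no terminating linefeed.
    public static long CountLinesTolerant(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return 0;
        }

        long count = 0;
        foreach (char c in text)
        {
            if (c == '\n')
            {
                ++count;
            }
        }

        // An unterminated final line has not been counted yet
        if (text[text.Length - 1] != '\n')
        {
            ++count;
        }

        return count;
    }
}
```

With this sketch, "Line1\nLine2" and "Line1\nLine2\n" both yield 2 and an empty string yields 0, whereas LineCount called with isFileName set to false returns 2, 3, and 1 for the same inputs.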
The LineCount method accepts two parameters. The
first is a string that either contains the actual
text that will have its lines counted or the path and name of a text
file whose lines are to be counted. The second parameter,
isFileName, determines whether the first parameter
(source) is the text itself or a file path. If this
parameter is true, the source parameter is treated as a file
path; otherwise, it is treated as the text to be counted.
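One possible alternative to a Boolean flag is to expose one method per kind of input and share the counting logic through TextReader, the common base class of StringReader and StreamReader. The following is a sketch under that assumption, not the recipe's design; ReaderLineCounter and its method names are hypothetical. Because it relies on ReadLine, it treats \n, \r, and \r\n as terminators and counts an unterminated final line.

```csharp
using System;
using System.IO;

public static class ReaderLineCounter
{
    // Hypothetical helper: counts the lines that ReadLine can return
    // from any TextReader (a string or a file behind it).
    public static long CountLines(TextReader reader)
    {
        long numOfLines = 0;
        // ReadLine returns null when no more lines are available
        while (reader.ReadLine() != null)
        {
            ++numOfLines;
        }
        return numOfLines;
    }

    // Counts the lines in the text itself
    public static long CountLinesInString(string text)
    {
        using (StringReader reader = new StringReader(text))
        {
            return CountLines(reader);
        }
    }

    // Counts the lines in the file at the given path
    public static long CountLinesInFile(string path)
    {
        using (StreamReader reader = new StreamReader(path))
        {
            return CountLines(reader);
        }
    }
}
```

For example, ReaderLineCounter.CountLinesInString("Line1\r\nLine2\r\nLine3\nLine4") returns 4, and an empty string returns 0. Unlike LineCount, these methods throw an ArgumentNullException when passed null rather than returning 0.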
See Also
See the ".NET Framework Regular
Expressions," "FileStream
Class," and "StreamReader
Class" topics in the MSDN documentation.