Recipe 2.21 Improving String Comparison Performance
Problem
Your application uses many strings that are compared frequently.
You have been tasked with improving performance and making more
efficient use of resources.
Solution
Use the intern pool to improve resource usage and, in turn, improve
performance. The static Intern and IsInterned methods of the
string class give you access to the intern pool. The following
static helper methods place strings in the intern pool:
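using System;
using System.Text;

public class InternedStrCls
{
    // Each overload adds the string to the intern pool (if it is not
    // already there) and returns the pooled reference, which is the
    // reference callers should keep and compare against.
    public static string CreateInternedStr(char[] characters)
    {
        string nonInternedStr = new string(characters);
        return String.Intern(nonInternedStr);
    }

    public static string CreateInternedStr(StringBuilder strBldr)
    {
        return String.Intern(strBldr.ToString());
    }

    public static string CreateInternedStr(string str)
    {
        return String.Intern(str);
    }

    // Replaces each element of the array with its pooled reference.
    public static void CreateInternedStr(string[] strArray)
    {
        for (int i = 0; i < strArray.Length; i++)
        {
            strArray[i] = String.Intern(strArray[i]);
        }
    }
}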
Discussion
The CLR automatically stores all string literals declared in an
application in an area of memory called the intern pool. The intern
pool contains a unique instance of each string literal found in your
code, which saves memory by not storing multiple copies of the same
literal. Another benefit is speed. When two strings are compared
using either the == operator or the Equals instance method of the
string class, a test is first done to determine whether both
variables refer to the same object; if they do not, the lengths of
the two strings are compared; if the lengths are equal, the
characters are compared one by one. If equal strings were guaranteed
to share a single reference, the first test alone would settle the
comparison. String interning provides that guarantee for strings in
the pool: equal interned strings always have the same reference, so
the length and character-by-character checks are skipped. The gain is
largest where equal strings would otherwise live in different objects
and every comparison would fall through to the character-by-character
check. Both strings being compared must be interned for this to
apply.
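The difference between a value comparison and a reference comparison
can be seen in a minimal sketch. The class and method names below are
hypothetical and are not part of the recipe's code; the strings are
built from character arrays at run time so that they start out as
separate objects.

```csharp
using System;

// Hypothetical demo class; not part of the recipe's InternedStrCls.
public static class InternComparisonDemo
{
    // Builds a new string object at run time; new string(char[])
    // allocates a fresh object for a non-empty array.
    private static string Build()
    {
        return new string(new char[] { 'f', 'o', 'o' });
    }

    // Two separately built strings are equal in value but are
    // distinct objects, so the reference test fails.
    public static bool EqualValuesShareReferenceBeforeInterning()
    {
        string a = Build();
        string b = Build();
        return a == b && Object.ReferenceEquals(a, b);
    }

    // After interning, both variables hold the single pooled
    // reference, so the reference test succeeds.
    public static bool EqualValuesShareReferenceAfterInterning()
    {
        string a = String.Intern(Build());
        string b = String.Intern(Build());
        return a == b && Object.ReferenceEquals(a, b);
    }
}
```

The first method returns false and the second returns true. Note that
== on strings still performs a value comparison; it can simply stop at
the reference check when both operands are the same pooled object.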
Note that the only strings placed in the intern pool automatically
are string literals (strings surrounded by double quotes) found in
your code. Each of the following lines places the string "foo" into
the intern pool, because each one contains the literal "foo":
string s = "foo";
StringBuilder sb1 = new StringBuilder("foo");
StringBuilder sb2 = new StringBuilder( ).Append("foo");
The following lines of code will not place the string
"foo" into the intern pool:
char[] ca = new char[3] {'f','o','o'};
StringBuilder sb = new StringBuilder( ).Append("f").Append("oo");
string s1 = "f";
string s2 = "oo";
string s3 = s1 + s2;
You can programmatically store a new string created by your
application in the intern pool using the static string.Intern
method. If an equal string is already in the pool, the method returns
a reference to that pooled string; otherwise, it adds the string to
the pool and returns a reference to the newly pooled string.
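Because Intern hands back the pooled reference rather than changing
the variable passed to it, the return value is what should be kept.
The following is a small sketch of that point; the class and method
names are hypothetical and chosen only for illustration.

```csharp
using System;

// Hypothetical demo class illustrating why the return value of
// String.Intern matters.
public static class InternReturnValueDemo
{
    // Builds "bar" at run time so each call yields a separate object.
    private static string Build()
    {
        return new string(new char[] { 'b', 'a', 'r' });
    }

    public static bool DiscardedResultLeavesVariableUnpooled()
    {
        string pooled = String.Intern(Build());
        string other = Build();

        // Result discarded: 'other' still refers to its own object.
        String.Intern(other);
        return !Object.ReferenceEquals(pooled, other);
    }

    public static bool AssignedResultIsPooledReference()
    {
        string pooled = String.Intern(Build());
        string other = Build();

        // Keep the pooled reference by assigning the return value.
        other = String.Intern(other);
        return Object.ReferenceEquals(pooled, other);
    }
}
```

Both methods return true: discarding the result leaves the variable
pointing at the non-pooled object, while assigning it makes the
variable refer to the pooled one.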
A second static method, IsInterned, is also used in string
interning. It operates similarly to the Intern method, except that
it returns null if the string is not in the intern pool, rather than
adding it to the pool. If the string is in the pool, it returns a
reference to the pooled string.
An example of using this method is shown here. It prints NULL,
provided the literal "foo" appears nowhere in the application:
string s1 = "f";
string s2 = "oo";
string s3 = s1 + s2;
if (String.IsInterned(s3) == null)
{
    Console.WriteLine("NULL");
}
However, if we add a call to InternedStrCls.CreateInternedStr
before the test, IsInterned returns a non-null string object:
string s1 = "f";
string s2 = "oo";
string s3 = s1 + s2;
InternedStrCls.CreateInternedStr(s3);
if (String.IsInterned(s3) == null)
{
    Console.WriteLine("NULL");
}
The Intern method is useful when you always need a pooled reference
back, whether or not the string was already in the intern pool.
The IsInterned method can optimize the comparison of a single string
to any string literal or manually interned string. Suppose you need
to determine whether a string variable contains any string literal
that has been defined in the application. Call the string.IsInterned
method with the string variable as the parameter. If null is
returned, there is no match in the intern pool, and thus there is no
match between the string variable's value and any string literal:
string s1 = "f";
string s2 = "oo";
string s3 = s1 + s2;
if (String.IsInterned(s3) != null)
{
    // If the string "foo" has been defined in the app and placed
    // into the intern pool, this block of code executes.
}
else
{
    // If the string "foo" has NOT been defined in the app NOR been placed
    // into the intern pool, this block of code executes.
}
Exercise caution when using the string interning methods. Calling the
Intern method for every possible string that could be created by your
application can slow the application considerably, since each call
must search the intern pool for the string and, if it is not found,
add it to the pool before returning the pooled reference. Strings
added to the pool also tend to remain in memory until the CLR shuts
down, so interning strings that are used only briefly wastes memory
rather than saving it.
Another potential problem with the IsInterned method in particular
stems from the fact that the intern pool is shared by the whole
application: it holds the application's string literals as well as
any strings you have explicitly interned. If you are using IsInterned
to determine whether a string exists, you are comparing that string
against all of those strings, not just the ones in the scope in which
IsInterned is used.
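This scope issue can be sketched as follows. The class and method
names are hypothetical, and the GUID-based string is an assumption
used only to obtain a value that is not already in the pool.

```csharp
using System;

// Hypothetical demo class showing that the intern pool is shared
// across the whole application rather than scoped to a class or method.
public static class InternScopeDemo
{
    // Stands in for unrelated code elsewhere in the application.
    public static void InternSomewhereElse(string value)
    {
        String.Intern(value);
    }

    // Returns true if an equal string is in the pool, regardless of
    // which part of the application put it there.
    public static bool IsKnown(string candidate)
    {
        return String.IsInterned(candidate) != null;
    }

    public static bool[] Run()
    {
        // Built at run time from a fresh GUID, so it is assumed not
        // to be in the pool yet.
        string unique = "key-" + Guid.NewGuid().ToString();
        bool before = IsKnown(unique);

        InternSomewhereElse(unique);

        // A separately built copy with the same characters.
        string copy = new string(unique.ToCharArray());
        bool after = IsKnown(copy);

        return new bool[] { before, after };
    }
}
```

Run returns false for the first check and true for the second, even
though IsKnown never interned anything itself. A non-null result from
IsInterned therefore says only that some code in the application has
pooled an equal string.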
See Also
See the "String.Intern Method" and
"String.IsInterned Method" topics
in the MSDN documentation.