Recipe 2.21 Improving String Comparison Performance

Problem

Your application contains many strings that are compared frequently. You have been tasked with improving performance and making more efficient use of resources.

Solution

Use the intern pool to improve resource usage and, in turn, improve performance. The static Intern and IsInterned methods of the string class give you access to the intern pool. The following static helper methods add strings to the intern pool:

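using System;
using System.Text;

public class InternedStrCls
{
    // Builds a string from the characters and adds it to the intern pool.
    public static void CreateInternedStr(char[] characters)
    {
        string NonInternedStr = new string(characters);
        String.Intern(NonInternedStr);
    }

    // Adds the StringBuilder's current contents to the intern pool.
    public static void CreateInternedStr(StringBuilder strBldr)
    {
        String.Intern(strBldr.ToString( ));
    }

    // Adds the string to the intern pool if an equal string is not already there.
    public static void CreateInternedStr(string str)
    {
        String.Intern(str);
    }

    // Adds every string in the array to the intern pool.
    public static void CreateInternedStr(string[] strArray)
    {
        foreach(string s in strArray)
        {
            String.Intern(s);
        }
    }
}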

Discussion

The CLR automatically stores all string literals declared in an application in an area of memory called the intern pool. The intern pool contains a unique instance of each string literal found in your code, which allows for more efficient use of resources by not storing multiple copies of the same string literal.

Another benefit is speed. When two strings are compared using either the == operator or the Equals instance method of the string class, a test is first done to determine whether the two references are the same; if they are not, each string's length is checked; if both strings' lengths are equal, each character is compared individually. If we could guarantee that equal strings always share a reference, a reference comparison alone would be enough, and much faster string comparisons could be made. String interning does just that: it guarantees that references to equivalent interned strings are the same, so a comparison of two equal interned strings succeeds on the reference test and never reaches the length and character-by-character checks. The gain is greatest in code that would otherwise repeatedly compare equal strings held in different objects, since those comparisons must examine every character.
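The effect on references can be seen in a short sketch. The class name InternReferenceDemo and the method SharesReferenceAfterIntern are hypothetical names chosen for this illustration; only String.Intern and Object.ReferenceEquals come from the framework:

```csharp
using System;

public static class InternReferenceDemo
{
    // Hypothetical helper: builds two separate string objects from the same
    // characters, interns both, and reports whether the interned references
    // are identical.
    public static bool SharesReferenceAfterIntern(char[] characters)
    {
        string a = new string(characters);
        string b = new string(characters);
        return Object.ReferenceEquals(String.Intern(a), String.Intern(b));
    }

    public static void Main()
    {
        char[] chars = new char[] { 'p', 'o', 'o', 'l', 'e', 'd' };
        string a = new string(chars);
        string b = new string(chars);

        Console.WriteLine(a == b);                            // True: same contents
        Console.WriteLine(Object.ReferenceEquals(a, b));      // False: two separate objects
        Console.WriteLine(SharesReferenceAfterIntern(chars)); // True: one pooled instance
    }
}
```

Before interning, the two strings are equal only by content, so == has to fall back to the length and character checks. After interning, both references point at the same pooled object.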

Note that the only strings automatically placed in this intern pool are string literals (strings surrounded by double quotes) found in your code. In each of the following lines of code, the literal "foo" is placed into the intern pool:

string s = "foo";
StringBuilder sb1 = new StringBuilder("foo");
StringBuilder sb2 = new StringBuilder( ).Append("foo");

The following lines of code will not place the string "foo" into the intern pool:

char[] ca = new char[3] {'f','o','o'};
StringBuilder sb = new StringBuilder( ).Append("f").Append("oo");

string s1 = "f";
string s2 = "oo";
string s3 = s1 + s2;

You can programmatically store a new string created by your application in the intern pool using the static string.Intern method. If an equal string is already in the intern pool, this method returns a reference to the pooled string; if not, the string is entered into the intern pool and a reference to this newly pooled string is returned.
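Because string.Intern returns the pooled reference, callers that want to benefit from reference comparisons would normally keep that return value. The following is a minimal sketch; InternReturnDemo and ToInterned are hypothetical names used only for this illustration:

```csharp
using System;
using System.Text;

public static class InternReturnDemo
{
    // Hypothetical helper: interns the builder's contents and hands back
    // the pooled reference instead of discarding it.
    public static string ToInterned(StringBuilder strBldr)
    {
        return String.Intern(strBldr.ToString());
    }

    public static void Main()
    {
        // Two equal strings assembled at run time from different pieces.
        string first = ToInterned(new StringBuilder().Append("al").Append("pha"));
        string second = ToInterned(new StringBuilder().Append("alp").Append("ha"));

        // Both variables hold the same pooled instance.
        Console.WriteLine(Object.ReferenceEquals(first, second)); // True
    }
}
```

Storing the returned reference, rather than the string that was passed in, is what lets later comparisons succeed on the reference test.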

There is also another method used in string interning called IsInterned. This method operates similarly to the Intern method, except that if the string is not found in the intern pool, it returns null rather than adding the string to the pool.

An example of using this method is shown here:

string s1 = "f";
string s2 = "oo";
string s3 = s1 + s2;
if (String.IsInterned(s3) == null)
{
    Console.WriteLine("NULL");
}

However, if we add a call to InternedStrCls.CreateInternedStr before the test, the IsInterned test returns a non-null string object:

string s1 = "f";
string s2 = "oo";
string s3 = s1 + s2;
InternedStrCls.CreateInternedStr(s3);
if (String.IsInterned(s3) == null)
{
    Console.WriteLine("NULL");
}

The Intern method is useful when you need a reference to the pooled copy of a string, whether or not the string already exists in the intern pool.

The IsInterned method can optimize the comparison of a single string to any string literal or manually interned string. Suppose that you need to determine whether a string variable contains any string literal that has been defined in the application. Call the string.IsInterned method with the string variable as the argument. If null is returned, there is no match in the intern pool, and thus there is no match between the string variable's value and any string literal or manually interned string:

string s1 = "f";
string s2 = "oo";
string s3 = s1 + s2;

if (String.IsInterned(s3) != null)
{
    // If the string "foo" has been defined in the app and placed
    //   into the intern pool, this block of code executes.
}
else
{
    // If the string "foo" has NOT been defined in the app NOR been placed
    //   into the intern pool, this block of code executes.
}
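The same test can be wrapped in a small helper that also hands back the pooled reference when one exists. This is only a sketch; PoolLookup and TryGetPooled are hypothetical names, not part of the framework:

```csharp
using System;

public static class PoolLookup
{
    // Hypothetical helper: returns true if an equal string is already in the
    // intern pool; 'pooled' then holds the pool's reference. Unlike
    // String.Intern, this never adds anything to the pool.
    public static bool TryGetPooled(string value, out string pooled)
    {
        pooled = String.IsInterned(value);
        return pooled != null;
    }

    public static void Main()
    {
        // Built at run time so that this line does not itself add a literal
        // with the same value to the pool.
        string candidate = new string(new char[] { 'f', 'o', 'o' });

        string pooled;
        if (TryGetPooled(candidate, out pooled))
        {
            Console.WriteLine("Found in pool");
        }
        else
        {
            Console.WriteLine("Not in pool");
        }
    }
}
```

Which branch runs depends on whether an equal string has been interned anywhere else in the running application.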

Exercise caution when using the string interning methods. Calling the Intern method for every possible string that could be created by your application would actually cause the application's performance to slow considerably, since each call must search the intern pool for the string and, if it is not found, add it to the pool before returning the pooled reference.
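One way to judge this cost in your own application is to time the interning step itself. The following is a rough sketch rather than a rigorous benchmark; the class name, the method name, the "key" prefix, and the count of 100,000 are all arbitrary choices made for this illustration:

```csharp
using System;
using System.Diagnostics;

public static class InternCostSketch
{
    // Interns 'count' distinct run-time strings and returns the elapsed
    // time in milliseconds for the interning loop only.
    public static long MeasureInternMilliseconds(int count)
    {
        string[] values = new string[count];
        for (int i = 0; i < count; i++)
        {
            values[i] = "key" + i.ToString();
        }

        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            // Each call searches the pool and adds the string if it is missing.
            values[i] = String.Intern(values[i]);
        }
        timer.Stop();
        return timer.ElapsedMilliseconds;
    }

    public static void Main()
    {
        Console.WriteLine("Interning took {0} ms", MeasureInternMilliseconds(100000));
    }
}
```

The figure varies by machine and runtime, so compare it against the time your application actually spends comparing those strings before deciding whether interning pays for itself.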

Another potential problem with the IsInterned method in particular stems from the fact that every string literal in the application is stored in the same intern pool. If you are using IsInterned to determine whether a string has a match, you are comparing that string against all string literals that exist in the application, as well as any strings you might have explicitly interned, not just the ones in the scope in which IsInterned is used.

See Also

See the "String.Intern Method" and "String.IsInterned Method" topics in the MSDN documentation.
