DekGenius.com
[ Team LiB ] Previous Section Next Section

Recipe 10.4 Determining Where Characters or Strings Do Not Balance

Problem

It is not uncommon to accidentally create strings that contain unbalanced parentheses. For example, a user might enter the following equation in your calculator application:

(((a) + (b)) + c * d

This equation contains four ( characters while only matching them with three ) characters. You cannot solve this equation, since the user did not supply the fourth ) character. Likewise, if a user enters a regular expression, you might want to do a simple check to see that all the (, {, [, and < characters match up to every other ), }, ], and > character.

In addition to determining whether the characters/strings/tags match, you should also know where the unbalanced character/string/tag exists in the string.

Solution

Use the various Check methods of the Balance class to determine whether and where the character/string is unbalanced:

using System;
using System.Collections;

public class Balance
{
    public Balance( ) {}

    private Stack bookMarks = new Stack ( );

    public int Check(string source, char openChar, char closeChar)
    {
        return (Check(source.ToCharArray( ), openChar, closeChar));
    }

    public int Check(char[] source, char openChar, char closeChar)
    {
        bookMarks.Clear( );

        for (int index = 0; index < source.Length; index++)
        {
            if (source[index] == openChar)
            {
                bookMarks.Push(Index);
            }
            else if (source[index] == closeChar)
            {
                if (bookMarks.Count <= 0)
                {
                    return (index);
                }
                else
                {
                    bookMarks.Pop( );
                }
            }
        }

        if (bookMarks.Count > 0)
        {
            return ((int)bookMarks.Pop( ));
        }
        else
        {
            return (-1);
        }
    }        

    public int Check(string source, string openChars, string closeChars)
    {
        return (Check(source.ToCharArray( ), openChars.ToCharArray( ), 
                closeChars.ToCharArray( )));
    }

    public int Check(char[] source, char[] openChars, char[] closeChars)
    {
        bookMarks.Clear( );

        for (int index = 0; index < source.Length; index++)
        {
            if (source[index] == openChars[0])
            {
                if (CompareArrays(source, openChars, index))
                {
                    bookMarks.Push(index);
                }
            }

            if (source[index] == closeChars[0])
            {
                if (CompareArrays(source, closeChars, index))
                {
                    if (bookMarks.Count <= 0)
                    {
                        return (index);
                    }
                    else
                    {
                        bookMarks.Pop( );
                    }
                }
            }
        }

        if (bookMarks.Count > 0)
        {
            return ((int)bookMarks.Pop( ));
        }
        else
        {
            return (-1);
        }
    }

    public bool CompareArrays(char[] source, char[] targetChars, int startPos)
    {
        bool isEqual = true;

        for (int index = 0; index < targetChars.Length; index++)
        {
            if (targetChars[index] != source[startPos + index])
            {
                isEqual = false;
                break;
            }
        }

        return (isEqual);
    }    
}

The Check method determines whether there is one closing element for every opening element. There are four overloaded Check methods, and each takes three parameters of varying types. These methods return an integer indicating where the offending character is located, or a negative number if each openChar has a matching closeChar.

These methods return an integer indicating where the offending string is located, or a negative number if each openChars has a matching closeChars.

The code to exercise the Balance class is shown here:

class CTest
{
   static void Main( )
   {
        Balance balanceUtil = new Balance( );

        // A string with an unbalanced } char. This unbalanced char is the final
        //    } char in the string.
        string unbalanced = @"{namespace Unbalanced
                            {
                                public class Tipsy
                                {
                                    public Tipsy( )
                                    {
                                }}}}}
                            ";

        // Use the various overloaded Check methods 
        //    to check for unbalanced } chars
        Console.WriteLine("Balance {}: " + 
                        balanceUtil.Check(unbalanced, '{', '}'));
        Console.WriteLine("Balance {}: " + 
                        balanceUtil.Check(unbalanced.ToCharArray( ), '{', '}'));

        Console.WriteLine("Balance {}: " + 
                       balanceUtil.Check(unbalanced.ToCharArray( ), 
                                new char[1] {'{'}, new char[1] {'}'}));
        Console.WriteLine("Balance {}: " + 
                        balanceUtil.Check(unbalanced.ToCharArray( ), 
                                new char[1] {'{'}, new char[1] {'}'}));
    }
}

This code produces the following output:

Balance {}: 136
Balance {}: 136
Balance {}: 136
Balance {}: 136
Balance {}: -1

where a -1 means that the items are balanced and a number greater than -1 indicates the character position that contains the unbalanced character.

Discussion

Determining whether characters have a matching character is actually quite easy when a Stack object is used. A stack works on a first-in, last-out (FILO) principle. The first item added to a stack is always the last one to be removed; conversely, the last item added to a stack is always the first removed.

To see how the stack is used in matching characters, let's see how we'd use it to handle the following equation:

((a + (b)) + c) * d

The algorithm works like this: we iterate through all characters in the equation, then any time we come upon a left or right parenthesis, we push or pop an item from the stack. If we see a left parenthesis, we know to push it onto the stack. If we see a right parenthesis, we know to pop a left parenthesis from the stack. In fact, the left parenthesis that was popped off of the stack is the matching left parenthesis to the current right parenthesis. If all parentheses are balanced, the stack will be empty after iterating through all characters in the equation. If the stack is not empty, the top left parenthesis on the stack is the one that does not have a matching right parenthesis. If there are two or more items in the stack, there is more than one unbalanced parenthesis in the equation.

For the previous equation, starting at the lefthand side, we would push one left parenthesis on the stack and then immediately push a second one. We consume the a and + characters and then come upon a third left parenthesis; our stack now contains three left parentheses. We consume the b character and come upon two right parentheses in a row. For each right parenthesis, we will pop one matching left parenthesis off of the stack. Our stack now contains only one left parenthesis. We consume the + and c characters and come upon the last right parenthesis in the equation. We pop the final left parenthesis off of the stack and then check the rest of the equation for any other parenthesis. Since the stack is empty and we are at the end of the equation, we know that each left parenthesis has a matching right parenthesis.

For our Check methods in this recipe, the location in the string where each left parenthesis is located is pushed onto the stack. This allows us to immediately locate the offending parenthesis.

See Also

See the "Stack Class" topic in the MSDN documentation.

    [ Team LiB ] Previous Section Next Section