One of the common features of most modern programming languages is the ability to do recursion. Recursion is the ability of functions or procedures to call themselves. Though not always mentioned, this ability is also available in SQL Query.
One of the classic programming problems is sorting. Sorting of rows is simple with SQL, but suppose you have a string field that contains comma separated values (CSV) that you need to sort; then SELECT with ORDER BY is not much use.
The requirement to sort the value in a file might arise if you are maintaining metadata. This is especially common for web back-end databases. Suppose you need to maintain a list of column numbers that have been changed by the front end. The first step is to find a way to number your columns. The best way is to use the same order as defined in the database.
DECLARE @TableName varchar(50) SET @TableName = 'Employees' SELECT name FROM syscolumns WHERE id = OBJECT_ID(@TableName) ORDER BY colorder
The script in Listing D-7 returns the column names for the Employees table. The result is shown in Figure D-2.
Going back to our original CSV problem, suppose that a user changes the value in City. Our CSV will contain one value, “9.” If the user then changes Notes, HomePhone, Title, and Address, the final CSV will be “9, 16, 13, 4, 8.” Suppose now that CSV needs to be sorted so that it becomes “4, 8, 9, 13, 16,” and for the purpose of this exercise, let’s assume that you need this done in a stored procedure. You will have to implement some sort of storing routine. One of the most common sorting algorithms is called Quick Sort.
The pseudocode for Quick Sort is fairly simple.
Start Sort Get First Value in List If there is other value then put all other values less than current value in <Left List> and all values more than current value in <Right List>; otherwise end the sort Sort <Left List> /* this is a recursive call*/ Sort <Right List> /* this is a recursive call*/ Sorted list = Left List, Value, Right List END Sort
The challenge now is in implementing the algorithm in SQL. We will call the stored procedure to do a Quick Sort of a CSV list sp_QuickSort.
The first step is to make sure that if the stored procedure exists, we delete it so that we can create a new one. We also create the stored procedure but with no logic inside. This is not required, but it prevents getting warning messages when we apply the script with recursive calls.
IF EXISTS (SELECT name FROM sysobjects WHERE name = 'sp_QuickSort' AND type = 'P') DROP PROCEDURE sp_QuickSort GO create procedure sp_QuickSort @@CSVList varchar(256) = '' OUTPUT AS /*Create PROCEDURE first to allow recursion call to compile without warning*/ GO
Now that we have the blank stored procedure, the next step is to alter it and add the logic. We begin by declaring the variable we will need.
ALTER procedure sp_QuickSort /****************************************** Procedure: sp_sp_QuickSort Date: 15-July-2002 Author: Ryan N. Payet Description: Quicksort a CSV list, also remove duplicate entry Version History --------------- 0.1, 15-July-2002, Ryan N. Payet Procedure Created ******************************************/ @@CSVList varchar(1000) = '' OUTPUT AS BEGIN IF @@CSVList = '' GOTO END_SORT /*Declare string variables to store list*/ DECLARE @Left_List varchar(1000) DECLARE @Right_List varchar(1000) /*Declare string variables to store a single item in list*/ DECLARE @strItem varchar(20) DECLARE @StrOneCSV varchar(20) /*Since items in list are numbers we need to covert string to numbers before we can do comparison First we declare integer variable to hold number for comparison*/ DECLARE @intItem int /*Declare variables used to process list*/ DECLARE @start int DECLARE @end int DECLARE @length int /*Set both lists to empty string*/ SET @Left_List = '' SET @Right_List = ''
Now that we have declared our variable, we move on to getting the first item in the list. If we do not find a “,” (comma) character in the list, we can safely assume that there is no value in the list (empty list) or there is only one value in the list. In either case, there is no need to sort the list. If we find that the list has more than one item, then we get the first item in the list.
/*Start processing CSV so as to get first item*/ SET @start = 1 /*Find position of first "," */ SET @end = CHARINDEX(',', @@CSVList, @start) /*if there is no "," then no need to sort.*/ IF @end = 0 GOTO END_SORT /*Get string length of first number*/ SET @length = @end - @start IF @length = 0 BEGIN SET @length = 1 SET @end = 0 END /*Get first Value*/ SET @strItem = substring( @@CSVList , @start, @length ) SET @intItem = cast(@strItem as integer) SET @start = @end + 1 SET @end = CHARINDEX(',', @@CSVList , @start) /*Get other values in list*/
Now that we know the value of the first item, we can find the other items and build our right list of items greater than current value and left list for items less than current value.
Note |
Using only “greater than” (>) and “less than” (<) also removes duplicates in the list. To keep duplicates you must either use “greater than or equal to” (>=) on right list or use “less than or equal to” (<=) on left list. |
WHILE (@end <> 0) BEGIN SET @length = @end - @start SET @StrOneCSV = substring( @@CSVList , @start, @length ) SET @start = @end + 1 SET @end = CHARINDEX(',', @@CSVList , @start) /*If greater than first value, then add to Right_List If less than first value, add to Left_List*/ IF CAST(@StrOneCSV as integer) > @intItem SET @Right_List = @StrOneCSV + ',' + @Right_List ELSE IF CAST(@StrOneCSV as integer) < @intItem SET @Left_List = @StrOneCSV + ',' + @Left_List END /*Process last item in list*/ SET @length = len( @@CSVList ) SET @StrOneCSV = substring( @@CSVList , @start, @length ) IF CAST(@StrOneCSV as integer) > @intItem SET @Right_List = @StrOneCSV + ',' + @Right_List ELSE IF CAST(@StrOneCSV as integer) < @intItem SET @Left_List = @StrOneCSV + ',' + @Left_List /*Clean lists of trailing "," chracter*/ SET @length = len(@Left_List) - 1 if @length > 0 SET @Left_List = substring( @Left_List , 1, @length) SET @length = len(@Right_List) - 1 if @length > 0 SET @Right_List = substring( @Right_List , 1, @length)
All that remains is for us to sort the right list and left list using recursive calls to sp_QuickSort. Once we are done, we can rebuild the list.
/*quicksort front of list: <Left_List>*/ EXECUTE sp_QuickSort @Left_List OUTPUT /*quicksort back of list: <Right_List>*/ EXECUTE sp_QuickSort @Right_List OUTPUT /*Rebuild sorted list*/ if len(@Left_List) > 0 SET @@CSVList = @Left_List + ',' + @strItem ELSE SET @@CSVList = @strItem if len(@Right_List) > 0 SET @@CSVList = @@CSVList + ',' + @Right_List END_SORT: END GO
Recursion is not really that difficult, once you master the basic algorithms. To simplify your SQL scripts, you should avoid having to do recursion logic. However, in certain circumstances where you do not have a choice, recursion can be very useful.
Let’s now move to something more fun that can save a lot of time when used properly.