ADO NET Programming-Server Cross Tabulations with SQL

Server Cross Tabulations with SQL

Cross tabulation (cross-tab or X-tab for short) is a statistical report where data is displayed in a way that allows for easy comparison. The displayed data is usually aggregated, denormalized, and displayed within a matrix table where the headings of certain rows are data themselves.

Consider this table, which shows some data about certain items from a small electronics store.

Name	Status	Amount
VCR player	Damaged	1
VCR player	Sold	10
Monitor	Damaged	3
Monitor	Sold	3
Mouse	Damaged	0
Mouse	Sold	1
Keyboard	Damaged	2
Keyboard	Sold	1

The table on the following page shows another way to display the same information.

Name	Sold	Damaged	Total
VCR player	10	1	11
Monitor	3	3	6
Mouse	1	0	1
Keyboard	1	2	3
Total	15	6	21

The new table formatted as a cross-tab now displays the data in a more compact manner that allows for easier comparison. Notice that the Status column, which contains two discrete values (Sold, Damaged), has been replaced by the Sold and Damaged columns. In other words, the value from Status has been converted to columns. If Status had more values, the cross-tab would generate a column for each additional value. There is also a column and a row to show the total for each.

Many front-end tools provide advanced cross-tab capabilities. PivotTable in Excel is a good example. A problem, however, arises when you have a huge amount of data that you need to transfer. This tends to slow down the front end. The way around this is to do the cross-tab, or some of it, on the server. Very complex cross-tabs and statistical analysis can be achieved through the use of expensive statistical tools, such as online analytical processing (OLAP) tools available with the Enterprise Edition of Microsoft SQL Server 2000. However, this might be too expensive or overkill for what you need.

Let’s start with a simple example. If you look at the Employees table, you will notice that there are two distinct values for the Country field. These are “USA” and “UK.” Suppose that for each job title (Employees.Title), we want to know how many employees are from each country. This would be a simple matter of using COUNT and GROUP BY.

Listing D-14a

SELECT Employees.Title,
       Employees.Country,
       COUNT(1) as total
 FROM Employees
GROUP BY Employees.Title, Employees.Country
ORDER BY Employees.Title

The result is shown in Figure D-8:

Figure D-8: Result of counting employees for each Title and Country

Though the result list is small, imagine for a minute that there are many more employees and many more titles; then the result set would be bigger and difficult to read. To overcome this, you can put USA and UK under their own column heading so that you end up with:

Title	USA	UK	Total
Inside Sales Coordinator	1	0	1
Sales Manager	0	1	1
Sales Representative	3	3	6
Vice President, Sales	1	0	1
<-- ALL -->	5	4	9

To work the query out, it helps if you consider how you would do it manually.

For USA: You will only count a row if the country is USA.
For UK: You will only count a row if the country is UK.

Let’s now put this into a SQL query.

Listing D-14b

SELECT Employees.Title,
       SUM (CASE WHEN Employees.Country = 'USA' THEN 1
            ELSE 0 END) as USA,
       SUM (CASE WHEN Employees.Country = 'UK' THEN 1 
            ELSE 0 END) as UK,
       COUNT(1) as total
  FROM Employees
GROUP BY Employees.Title
ORDER BY Employees.Title

Since Country is no longer a column in the SELECT but has been split into two columns, USA and UK, we must remove it from the GROUP BY clause. This query, however, does not have our grand total (<-- ALL -->). To generate it, we need to make use of ROLLUP, like we learned in the previous section. We also have to generate an extra column [Group No] to help with the ordering.

SELECT GROUPING(Employees.Title) as [Group No],
       CASE WHEN
         (GROUPING(Employees.Title) = 1)
       THEN '<-- ALL -->'
       ELSE
         ISNULL(Employees.Title, 'UNKNOWN')
       END AS Title,
       SUM (CASE WHEN Employees.Country = 'USA' THEN 1
            ELSE 0 END) as USA,
       SUM (CASE WHEN Employees.Country = 'UK' THEN 1 
            ELSE 0 END) as UK,
       COUNT(1) as Total
 FROM Employees
GROUP BY Employees.Title WITH ROLLUP
ORDER BY [Group no], Employees.Title

The result is shown in Figure D-9.

Figure D-9: Simple cross-tab for employees

There is one problem, however. If we get an employee from another country, say for example, Seychelles (SEZ), we will have to recreate the script with an extra column. What if there are 20 more countries and 50 more different titles? It is not viable to create the columns one by one. There is something that we learned earlier that we can use: dynamic queries.

Cross-tab with Dynamic Queries

To demonstrate the power of dynamic queries, GROUP BY, and cross-tabs, let’s suppose we want to compare the sales value of every product against every employee to generate a product-employee cross-tab. We will do this through a stored procedure called [sp_Product_Employee_xtab].

The first design consideration is to choose which of the Product and Employee columns you want to split into separate columns. Since there are fewer employees than there are products, it is better to choose Employee so that we have fewer columns to build.

The SELECT Templates

The general template of the SELECT in this case is:

Listing D-15a

SELECT GROUPING(ProductName) as [Group No],
  CASE WHEN (GROUPING(Products.ProductName) = 1)
  THEN '<-- ALL -->'
  ELSE
    ISNULL(Products.ProductName, 'UNKNOWN ')
  END AS [Product Name],
    <CASE WHEN for each Employee Name>
  SUM( [Order Details].UnitPrice *
    [Order Details].Quantity *
    (1 - [Order Details].Discount)
  ) as [Total Sale Value] 
FROM Employees
  INNER JOIN Orders
    ON Employees.EmployeeID = Orders.EmployeeID
  INNER JOIN [Order Details]
    ON Orders.OrderID = [Order Details].OrderID
  INNER JOIN Products
    ON [Order Details].ProductID = Products.ProductID
  GROUP BY Products.ProductName WITH ROLLUP
  ORDER BY [Group No], Products.ProductName

The template for <CASE WHEN for each Employee Name> is shown in Listing D-15b.

Listing D-15b

SUM ( CASE WHEN Employees.EmployeeID = <@EmployeeID>
THEN ( [Order Details].UnitPrice *
       [Order Details].Quantity *
       (1 - [Order Details].Discount)
     )
ELSE
  0
END )as [<@EmployeeName>]

The templates can be generated into a string and the string executed using EXCUTE.

The sp_Product_Employee_xtab

Now that we know the template for the SELECT statement, let’s go ahead and build sp_Product_Employee_xtab. First, we do the standard declarations.

Listing D-16a

/******************************************
Drop stored procedure if already exist    *
******************************************/
if exists
  (select * from sysobjects
    where id = object_id(N'sp_Product_Employee_xtab') and
    OBJECTPROPERTY(id, N'IsProcedure') = 1
  )
DROP procedure sp_Product_Employee_xtab
GO

CREATE procedure sp_Product_Employee_xtab
/******************************************
 Procedure:  sp_OrderAddProduct
 Date:       17 July 2002
 Author:     Ryan N. Payet
 Description:
        Generate Cross-tab to compare Sale Value
        of every product against every Employee

 Version History
 ---------------
 0.1, 17 July 2002, Ryan N. Payet:
      Procedure Created
******************************************/
AS
BEGIN
  
  /*Declare Variable for script*/
  DECLARE @@SQLTOPScript as nvarchar(4000)
  DECLARE @@SQLCOLScript  as nvarchar(4000)
  DECLARE @@SQLBOTTOMScript as nvarchar(4000)
  
  /*Declare Variable for employee*/
  DECLARE @@ID as integer
  DECLARE @@NAME as nvarchar(100)

Now we need to generate the <CASE WHEN for each Employee Name> part. To do this, we need to use a cursor that accesses the Employee table. Using the cursor, we can then loop through each row in the Employees table and generate the script.

Listing D-16b

  /*Declare cursor for Loop though Employees table */
  DECLARE Empoyee_Cursor CURSOR FOR
    SELECT EmployeeID,
        Employees.FirstName + ' ' + Employees.LastName as
        [Employee Name]
    FROM Employees ORDER BY [Employee Name]

  OPEN Employee_Cursor
  
  /*Get the first Employee*/
  FETCH NEXT FROM Employee_Cursor
  INTO @@ID,@@NAME
  SET @@SQLCOLScript = ''
  /*LOOP until end of cursor*/
  WHILE @@FETCH_STATUS = 0
  BEGIN
    /*Build Column list*/
    SET @@SQLCOLScript = @@SQLCOLScript + ',' + Char(13)
      + ' SUM ( CASE WHEN Employees.EmployeeID = ' 
      + CONVERT(nvarchar(10),@@ID)
      + ' THEN ( [Order Details].UnitPrice '
      + ' * [Order Details].Quantity * '
      + ' (1 - [Order Details].Discount) )'
      + ' ELSE 0 END)as [' + @@NAME + ']'
    /*Get next table name*/
    FETCH NEXT FROM Employee_Cursor
    INTO @@ID,@@NAME
  END

  /*CLEAN up CURSOR*/
  CLOSE Employee_Cursor
  DEALLOCATE Employee_Cursor

The final part is simple, and all that needs to be done is to generate the rest of the scripts, join them, and execute.

Listing D-16c

  /*Declare top part of query*/
  SET @@SQLTOPScript = ''
    + 'SELECT '
    + 'GROUPING(ProductName) as [Group No],' + Char(13)
    + ' CASE WHEN '
    + '(GROUPING(Products.ProductName) = 1)'
    + ' THEN ' + CHAR(39) + '<-- ALL -->' + CHAR(39)
    + ' ELSE'
    + ' ISNULL(Products.ProductName, '
    + CHAR(39) + 'UNKNOWN '+ CHAR(39) + ')'
    + 'END AS [Product Name]'
  /*Declare bottom part of query*/
  SET @@SQLBOTTOMScript = ',' + Char(13)
    + ' SUM( [Order Details].UnitPrice *'
    + ' [Order Details].Quantity *'
    + ' (1 - [Order Details].Discount)'
    + ' ) as [Total Sale Value] ' + Char(13)
    + 'FROM   Employees ' + Char(13)
    + '  INNER JOIN Orders ' + Char(13)
    + '    ON Employees.EmployeeID = Orders.EmployeeID ' 
      + Char(13)
    + '  INNER JOIN [Order Details] ' + Char(13)
    + '    ON Orders.OrderID = [Order Details].OrderID ' 
      + Char(13)
    + '  INNER JOIN Products' + Char(13)
    + '    ON [Order Details].ProductID 
           = Products.ProductID' + Char(13)
    + ' GROUP BY Products.ProductName WITH ROLLUP ' 
      + Char(13)
    + ' ORDER BY [Group No], Products.ProductName  ' 
      + Char(13)

   EXECUTE ( @@SQLTOPScript + @@SQLCOLScript 
           + @@SQLBOTTOMScript)

END

GO

Now that the stored procedure is done, all we have to do is excute it to get our cross-tab.

EXECUTE sp_Product_Employee_xtab

A sample of the result is shown in Figure D-10.

Figure D-10: Sample result of running sp_Product_Employee_xtab

The stored procedure returns a result that is formatted as a cross-tab. If the number of employees increases, the number of columns will automatically be adjusted to accommodate the new employees.