CTE’s…What are they and why would I ever want to use one?

March 16, 2008

CTE’s were a new introduction in SQL 2005 and even now we’re looking down the throat on SQL 2008 there are still many people who are unclear as to what CTE’s are, how they would use them and when.

So what are they?

CTE is short for Common Table Expression and is basically and expression that returns a temporary result set inside of another statement. In this way they are very similar to using a declared temporary table which you then reference. They are part of the ANSI SQL 99 standard and they are particularly useful for doing recursive queries.

A recursive query is one which can be said to call itself many times over. An example would be a query which allows you to find the Manager and staff of every employee in a company. In the Adventure Works database the Employee table holds details for each employee and the ManagerId of the staff member they report to. These managerid are self joined to the respective employeeid’s in the same table. A CTE to find the information could look something like:

USE AdventureWorks ;
GO

WITH DirectReports(LoginID, ManagerID, EmployeeID) AS
(

SELECT LoginID, ManagerID, EmployeeID
FROM HumanResources.Employee
WHERE ManagerID IS NULL

UNION ALL

SELECT e.LoginID, e.ManagerID, e.EmployeeID
FROM HumanResources.Employee e
INNER JOIN DirectReports d
ON e.ManagerID = d.EmployeeID
)

SELECT *
FROM
DirectReports ;
GO

This gives us the login ID, Manager Id and the EmployeeID for each user.

This statement can be broken down into three parts. The first is the ‘Root’ or ‘Anchor Member’ :

SELECT LoginID, ManagerID, EmployeeID
FROM HumanResources.Employee
WHERE ManagerID IS NULL

This tells us our top, starting level in the hierarchy.

The second part is what makes the CTE recursive as it references itself using the join : INNER JOIN DirectReports d
ON e.ManagerID = d.EmployeeID

The statements are linked together by a UNION

UNION ALL

SELECT e.LoginID, e.ManagerID, e.EmployeeID
FROM HumanResources.Employee e
INNER JOIN DirectReports d
ON e.ManagerID = d.EmployeeID

Finally we want to see our result set and this is done using the select statement:

SELECT *
FROM
DirectReports ;

 

It is important to remember that the select statement must follow the CTE syntax immediatley. Try running the following as an example of how this fails when not called immediately after the CTE:

 

USE AdventureWorks ;
GO

WITH DirectReports(LoginID, ManagerID, EmployeeID) AS
(
SELECT LoginID, ManagerID, EmployeeID
FROM HumanResources.Employee
WHERE ManagerID IS NULL

UNION ALL

SELECT e.LoginID, e.ManagerID, e.EmployeeID
FROM HumanResources.Employee e
INNER JOIN DirectReports d
ON e.ManagerID = d.EmployeeID
)

 

SELECT * FROM HumanResources.Employee

SELECT *
FROM
DirectReports ;
GO

The following error is thrown:

 

Msg 208, Level 16, State 1, Line 15
Invalid object name ‘DirectReports’.

 

CTE’s are just another weapon in the developers arsenal. There will be times when it is appropriate to use them and times when there is a better solution. I recently read a great article by Peter He about the performance implications of using CTE against general looping constructs. Its worth a read and can be found here.

 


SQL 2008 MERGE Functionality

March 11, 2008

One of the new SQL 2008 T-SQL features is MERGE.

The great thing about MERGE is it allows you to perform simultaneous UPDATE, INSERT and or DELETE operations on one table. Its supported by new physical operators that combine these operations so that they can be performed in a single scan rather than multiple scans.

This means that for the first time you can assign the contents of one table or query to another in a single operation

Here’s a small example which uses a stored proc:


/********Lets create the table object*******/

IF OBJECT_ID(‘TeamMember’) IS NOT NULL

BEGIN
DROP
TABLE TeamMember
END

GO

CREATE TABLE TeamMember

(

EmpId INT PRIMARY KEY IDENTITY (1,1),

FirstName VARCHAR(200),

LastName VARCHAR(200),

Department VARCHAR(200)

)

GO

/********Insert some Team Members*******/
/********here using the new insert multi row functionality available in 2008*******/

INSERT INTO TeamMember (FirstName,LastName,Department)
VALUES

(‘James’,‘Hill’,‘IT’),

(‘Sarah’,‘Parker’,‘Sales’),

(‘Nick’,‘Thompson’,‘Marketing’)

GO

/********Create a procedure that might do some Team Member admin in ‘the REAL world’ !*******/

IF OBJECT_ID(‘up_InsertTeamMember’) IS NOT NULL
BEGIN
DROP
PROCEDURE up_InsertTeamMember
END

GO

CREATE PROCEDURE up_InsertTeamMember
(

@FirstName VARCHAR(200),
@LastName VARCHAR(200),
@Department VARCHAR(200),
@EmpId INT OUTPUT

)

AS

BEGIN

/********Lets do the MERGE here*******/
MERGE
TeamMember
USING (SELECT @FirstName,@LastName)
tm (FirstName,LastName)
ON
TeamMember.FirstName = @FirstName AND
TeamMember.LastName = @LastName

/********If its not there lets create a new Team Member*******/

WHEN NOT MATCHED THEN
INSERT
(FirstName,LastName,Department)
VALUES
(@FirstName,@LastName,@Department)

/********If it is there lets just set the existing id to the ouput variable*******/

WHEN MATCHED THEN
UPDATE
SET @EmpId=EmpId,Department = @Department;

/********Grab the new ID*******/
SET
@EmpID = COALESCE(SCOPE_IDENTITY(),@EmpID);

END

RETURN;

go

/********Heres the procedure, try adding a new memberand then a team member with the same FirstName and LastName but different department*******/

DECLARE @EmpId INT,@Message VARCHAR(100)

EXEC up_InsertTeamMember @FirstName = ‘James’,@LastName = ‘Smith’,@Department = ‘Sales’,@EmpId = @EmpId OUTPUT

PRINT @EmpId

GO

SELECT * FROM TeamMember

GO