CTE’s were a new introduction in SQL 2005 and even now we’re looking down the throat on SQL 2008 there are still many people who are unclear as to what CTE’s are, how they would use them and when.
So what are they?
CTE is short for Common Table Expression and is basically and expression that returns a temporary result set inside of another statement. In this way they are very similar to using a declared temporary table which you then reference. They are part of the ANSI SQL 99 standard and they are particularly useful for doing recursive queries.
A recursive query is one which can be said to call itself many times over. An example would be a query which allows you to find the Manager and staff of every employee in a company. In the Adventure Works database the Employee table holds details for each employee and the ManagerId of the staff member they report to. These managerid are self joined to the respective employeeid’s in the same table. A CTE to find the information could look something like:
USE AdventureWorks ;
GO
WITH DirectReports(LoginID, ManagerID, EmployeeID) AS
(
SELECT LoginID, ManagerID, EmployeeID
FROM HumanResources.Employee
WHERE ManagerID IS NULL
UNION ALL
SELECT e.LoginID, e.ManagerID, e.EmployeeID
FROM HumanResources.Employee e
INNER JOIN DirectReports d
ON e.ManagerID = d.EmployeeID
)
SELECT *
FROM DirectReports ;
GO
This gives us the login ID, Manager Id and the EmployeeID for each user.
This statement can be broken down into three parts. The first is the ‘Root’ or ‘Anchor Member’ :
SELECT LoginID, ManagerID, EmployeeID
FROM HumanResources.Employee
WHERE ManagerID IS NULL
This tells us our top, starting level in the hierarchy.
The second part is what makes the CTE recursive as it references itself using the join : INNER JOIN DirectReports d
ON e.ManagerID = d.EmployeeID
The statements are linked together by a UNION
UNION ALL
SELECT e.LoginID, e.ManagerID, e.EmployeeID
FROM HumanResources.Employee e
INNER JOIN DirectReports d
ON e.ManagerID = d.EmployeeID
Finally we want to see our result set and this is done using the select statement:
SELECT *
FROM DirectReports ;
It is important to remember that the select statement must follow the CTE syntax immediatley. Try running the following as an example of how this fails when not called immediately after the CTE:
USE AdventureWorks ;
GO
WITH DirectReports(LoginID, ManagerID, EmployeeID) AS
(
SELECT LoginID, ManagerID, EmployeeID
FROM HumanResources.Employee
WHERE ManagerID IS NULL
UNION ALL
SELECT e.LoginID, e.ManagerID, e.EmployeeID
FROM HumanResources.Employee e
INNER JOIN DirectReports d
ON e.ManagerID = d.EmployeeID
)
SELECT * FROM HumanResources.Employee
SELECT *
FROM DirectReports ;
GO
The following error is thrown:
Msg 208, Level 16, State 1, Line 15
Invalid object name ‘DirectReports’.
CTE’s are just another weapon in the developers arsenal. There will be times when it is appropriate to use them and times when there is a better solution. I recently read a great article by Peter He about the performance implications of using CTE against general looping constructs. Its worth a read and can be found here.