Recursive Queries using Common Table Expressions (CTE)

SQL Server 2005 uses a new T-SQL Entity called a  Common Table Expression(CTE), which can be thought of a resultant dataset that can be used within a query(SELECT,INSERT,UPDATE,DELETE or even a View). In many respects a CTE is similar to a derived table variable because it lingers around only for the duration of the statement. What is often unused in these instances,is the ability of the CTE to self-reference and therefore provide a way we can use it to perform recursive queries. In order to do this you would structure the CTE like this…


WITH [CTE_Name](<Column names>)
AS
(
--Put the anchor statement here
--This would give you the root of the recursive resultset
UNION ALL
--Put the recursive part of the query here
--This will handle the recursions and should reference [CTE_Name]
)

If we take for an example a table of Employees with the following structure

EmployeeID
ManagerID
FirstName.
LastName.
JobTitle,
Phone

In which the ManagerID references back to the EmployeeID of the manager whom the Employee reports to. Now, if we needed to create a list of all of the Employees whom reported to a particular manager and to make that list recursive …we could use the following in order to get the full recursive list.


DECLARE @EmployeeTable TABLE
(
-- Add the column definitions for the TABLE variable here
FirstName nvarchar(50)
,LastName nvarchar(50)
,JobTitle nvarchar(50)
,Phone nvarchar(25)
,ManagerID int
,EmployeeID int
)

-- INSERT the initial level of employees
INSERT INTO @EmployeeTable(FirstName,LastName,JobTitle,Phone,ManagerID,EmployeeID)
SELECT E.FirstName,E.LastName,E.JobTitle,E.Phone,Emp.ManagerID,Emp.EmployeeID
FROM HumanResources.vEmployee E INNER JOIN HumanResources.Employee Emp
ON E.EmployeeID=Emp.EmployeeID
WHERE Emp.ManagerID=5

DECLARE @EmpCount int
SELECT @EmpCount=COUNT(*) FROM HumanResources.Employee WHERE ManagerID IN
(SELECT DISTINCT EmployeeID FROM @EmployeeTable) AND ManagerID IS NOT NULL
AND EmployeeID NOT IN (SELECT DISTINCT EmployeeID FROM @EmployeeTable)

WHILE @EmpCount>0
BEGIN
INSERT INTO @EmployeeTable(FirstName,LastName,JobTitle,Phone,ManagerID,EmployeeID)
SELECT E.FirstName,E.LastName,E.JobTitle,E.Phone,Emp.ManagerID,Emp.EmployeeID
FROM HumanResources.vEmployee E INNER JOIN HumanResources.Employee Emp
ON E.EmployeeID=Emp.EmployeeID AND Emp.ManagerID IS NOT NULL
WHERE Emp.ManagerID IN (SELECT DISTINCT EmployeeID FROM @EmployeeTable)
AND Emp.EmployeeID NOT IN (SELECT DISTINCT EmployeeID FROM @EmployeeTable)

--Now refresh the numbers again
SELECT @EmpCount=COUNT(*) FROM HumanResources.Employee WHERE ManagerID IN
(SELECT DISTINCT EmployeeID FROM @EmployeeTable) AND ManagerID IS NOT NULL
AND EmployeeID NOT IN (SELECT DISTINCT EmployeeID FROM @EmployeeTable)
END

However, we could simplify the above code if we were to use a Common Table expression as the following


With EmployeeTable (EmployeeID, ManagerID, FirstName, LastName, JobTitle, Phone)
AS
(
--Anchor Query comes first
SELECT E.FirstName,E.LastName,E.JobTitle,E.Phone,Emp.ManagerID,Emp.EmployeeID
FROM HumanResources.vEmployee E INNER JOIN HumanResources.Employee Emp
ON E.EmployeeID=Emp.EmployeeID
WHERE Emp.ManagerID=5
--Then the recursive part
SELECT E.FirstName,E.LastName,E.JobTitle,E.Phone,Emp.ManagerID,Emp.EmployeeID
FROM HumanResources.vEmployee E INNER JOIN HumanResources.Employee Emp
ON E.EmployeeID=Emp.EmployeeID
INNER JOIN EmployeeTable as ET
ON Emp.ManagerID=ET.EmployeeID
)

--Now select from the CTE
SELECT * FROM EmployeeTable

As you can see a much simpler piece of code to understand. Add this to the fact that you could couple this inside of a user-defined function and it creates a powerful tool to use within your applications.