Another Difference Among CTE, Table Variable and Temp Table


There are many articles online talking about this topic, and I kind of know why – how many of us have been asked the question ‘tell me the difference between CTE, table variable or temp table’ in a job interview regarding SQL programming? Well I can tell you that I encountered it many times in many of my previous interviews. Like what most people would probably do, I’ve searched this topic online multiple times and read many articles about it. Below are some of the key points:
  •          Both table variable and temp tables are created in the TempDB, whereas CTE is processed in memory.
  •          Temp tables and table variables are quite similar in many occasions and can be used as a substitute to each other, except when you’ll need to create an index – in this occasion you can only use a temp table.

·         This is the one I haven’t proved -  that when your records are less than a certain amount (about tens of thousands) , then table variable is slightly better in performance; while your data is more, temp table will be a better choice (that’s also when we start to benefit from indexes).

Of course, there are other minor points such as you could create a global temp table with double number signs (##) and invoke it in any session, and we’ll not cover all of them. However, today let’s have a look at this topic from another point of view – the execution plan. Let’s look at the below T-SQL scripts along with their execution plans as I think they are self-explained pretty well:

SELECT
       BEA.BusinessEntityID
       ,BEA.AddressID
       ,BEA.AddressTypeID
       ,BE.ModifiedDate
       ,AD.AddressLine1
       ,AD.AddressLine2
       ,AD.City
       ,Ad.PostalCode
       ,AD.StateProvinceID
       ,ADT.Name
FROM [Person].[BusinessEntity] BE
INNER JOIN [Person].[BusinessEntityAddress] BEA ON BE.BusinessEntityID = BEA.BusinessEntityID
INNER JOIN [Person].[Address] AD ON BEA.AddressID = AD.AddressID
INNER JOIN [Person].[AddressType] ADT ON BEA.AddressTypeID = ADT.AddressTypeID



;WITH CET_Sample AS
       (
              SELECT
                     BEA.BusinessEntityID
                     ,BEA.AddressID
                     ,BEA.AddressTypeID
                     ,ADT.Name
              FROM [Person].[BusinessEntityAddress] BEA
              INNER JOIN [Person].[AddressType] ADT ON BEA.AddressTypeID = ADT.AddressTypeID
       )

SELECT
       CS.BusinessEntityID
       ,CS.AddressID
       ,CS.AddressTypeID
       ,BE.ModifiedDate
       ,AD.AddressLine1
       ,AD.AddressLine2
       ,AD.City
       ,Ad.PostalCode
       ,AD.StateProvinceID
       ,CS.Name
FROM CET_Sample CS
INNER JOIn [Person].[BusinessEntity] BE ON CS.BusinessEntityID = BE.BusinessEntityID
INNER JOIN [Person].[Address] AD ON CS.AddressID = AD.AddressID



SELECT
       BEA.BusinessEntityID
       ,BEA.AddressID
       ,BEA.AddressTypeID
       ,ADT.Name
INTO #CTE
FROM [Person].[BusinessEntityAddress] BEA
INNER JOIN [Person].[AddressType] ADT ON BEA.AddressTypeID = ADT.AddressTypeID

SELECT
       CS.BusinessEntityID
       ,CS.AddressID
       ,CS.AddressTypeID
       ,BE.ModifiedDate
       ,AD.AddressLine1
       ,AD.AddressLine2
       ,AD.City
       ,Ad.PostalCode
       ,AD.StateProvinceID
       ,CS.Name
FROM #CTE CS
INNER JOIn [Person].[BusinessEntity] BE ON CS.BusinessEntityID = BE.BusinessEntityID
INNER JOIN [Person].[Address] AD ON CS.AddressID = AD.AddressID



DECLARE @CET TABLE
(
       [BusinessEntityID] int
       ,[AddressID] int
       ,[AddressTypeID] int
       ,[Name] [nvarchar](50)
)

INSERT INTO @CET
SELECT
       BEA.BusinessEntityID
       ,BEA.AddressID
       ,BEA.AddressTypeID
       ,ADT.Name
FROM [Person].[BusinessEntityAddress] BEA
INNER JOIN [Person].[AddressType] ADT ON BEA.AddressTypeID = ADT.AddressTypeID

SELECT
       CS.BusinessEntityID
       ,CS.AddressID
       ,CS.AddressTypeID
       ,BE.ModifiedDate
       ,AD.AddressLine1
       ,AD.AddressLine2
       ,AD.City
       ,Ad.PostalCode
       ,AD.StateProvinceID
       ,CS.Name
FROM @CET CS
INNER JOIn [Person].[BusinessEntity] BE ON CS.BusinessEntityID = BE.BusinessEntityID
INNER JOIN [Person].[Address] AD ON CS.AddressID = AD.AddressID



After comparing the above T-SQL scripts along with their execution plans, we can get the below conclusions:
·         Compared to a base join statement, CTE will not change the execution plan. For example, suppose you have
1) a base join statement with syntax as ‘FROM A; JOIN B; JOIN C’, when you
2) create a CTE with the join results of ‘FROM A; JOIN C’ and then use that CTE to join B, 1) and 2) will create the same execution plan.
·         The patterns of execution plans by a table variable and a temp table are quite similar, especially in the definition and data population part. However, in the second part when we use the table variable or temp table to join the rest of the tables, the execution plans start to get different. We’ll dig deeper in the next post.
There’s another discovery which I didn’t put here yet – as it was made in my previous employment and the data was very sensitive, hence any disclosure of the script or the execution plan was against the company’s security policy, but the conclusion was quite interesting – although CTE will not change the execution plan, it does improve the performance in some occasions, and hence the following performance tuning tips (especially for Views, and you can’t use table variable or temp table):
  •          If you have multiple ‘JOIN’ statements, try to break them into small chunks and put each small chunk in a CTE, and then JOIN the CTEs instead. Usually this will increase the performance quite well.
  •          If the same statement is appearing multiple times, declare it as a CTE and then apply it in the rest of the script.



Comments

Popular posts from this blog

SQL Server - Using SQL Agent JOBs to Schedule and Execute Multiple Stored Procedures in Parallel

sp_executesql – Let's learn How to Use It

Installing Azure PowerShell Module in Your Local Windows PowerShell