Another Difference Among CTE, Table Variable and Temp Table
There are many
articles online talking about this topic, and I kind of know why – how many of
us have been asked the question ‘tell me the difference between CTE, table
variable or temp table’ in a job interview regarding SQL programming? Well I
can tell you that I encountered it many times in many of my previous
interviews. Like what most people would probably do, I’ve searched this topic
online multiple times and read many articles about it. Below are some of the key
points:
- Both
table variable and temp tables are created in the TempDB, whereas CTE is
processed in memory.
- Temp
tables and table variables are quite similar in many occasions and can be used
as a substitute to each other, except when you’ll need to create an index – in
this occasion you can only use a temp table.
·
This
is the one I haven’t proved - that when
your records are less than a certain amount (about tens of thousands) , then
table variable is slightly better in performance; while your data is more, temp
table will be a better choice (that’s also when we start to benefit from indexes).
Of course,
there are other minor points such as you could create a global temp table with
double number signs (##) and invoke it in any session, and we’ll not cover all
of them. However, today let’s have a look at this topic from another point of
view – the execution plan. Let’s look at the below T-SQL scripts along with
their execution plans as I think they are self-explained pretty well:
SELECT
BEA.BusinessEntityID
,BEA.AddressID
,BEA.AddressTypeID
,BE.ModifiedDate
,AD.AddressLine1
,AD.AddressLine2
,AD.City
,Ad.PostalCode
,AD.StateProvinceID
,ADT.Name
FROM [Person].[BusinessEntity]
BE
INNER JOIN [Person].[BusinessEntityAddress] BEA ON BE.BusinessEntityID = BEA.BusinessEntityID
INNER JOIN [Person].[Address] AD ON BEA.AddressID = AD.AddressID
INNER JOIN [Person].[AddressType] ADT ON BEA.AddressTypeID = ADT.AddressTypeID
;WITH CET_Sample AS
(
SELECT
BEA.BusinessEntityID
,BEA.AddressID
,BEA.AddressTypeID
,ADT.Name
FROM [Person].[BusinessEntityAddress]
BEA
INNER JOIN [Person].[AddressType] ADT ON BEA.AddressTypeID = ADT.AddressTypeID
)
SELECT
CS.BusinessEntityID
,CS.AddressID
,CS.AddressTypeID
,BE.ModifiedDate
,AD.AddressLine1
,AD.AddressLine2
,AD.City
,Ad.PostalCode
,AD.StateProvinceID
,CS.Name
FROM CET_Sample CS
INNER JOIn [Person].[BusinessEntity] BE ON CS.BusinessEntityID = BE.BusinessEntityID
INNER JOIN [Person].[Address] AD ON CS.AddressID = AD.AddressID
SELECT
BEA.BusinessEntityID
,BEA.AddressID
,BEA.AddressTypeID
,ADT.Name
INTO #CTE
FROM [Person].[BusinessEntityAddress]
BEA
INNER JOIN [Person].[AddressType] ADT ON BEA.AddressTypeID = ADT.AddressTypeID
SELECT
CS.BusinessEntityID
,CS.AddressID
,CS.AddressTypeID
,BE.ModifiedDate
,AD.AddressLine1
,AD.AddressLine2
,AD.City
,Ad.PostalCode
,AD.StateProvinceID
,CS.Name
FROM #CTE CS
INNER JOIn [Person].[BusinessEntity] BE ON CS.BusinessEntityID = BE.BusinessEntityID
INNER JOIN [Person].[Address] AD ON CS.AddressID = AD.AddressID
DECLARE @CET TABLE
(
[BusinessEntityID] int
,[AddressID] int
,[AddressTypeID] int
,[Name] [nvarchar](50)
)
INSERT INTO @CET
SELECT
BEA.BusinessEntityID
,BEA.AddressID
,BEA.AddressTypeID
,ADT.Name
FROM [Person].[BusinessEntityAddress]
BEA
INNER JOIN [Person].[AddressType] ADT ON BEA.AddressTypeID = ADT.AddressTypeID
SELECT
CS.BusinessEntityID
,CS.AddressID
,CS.AddressTypeID
,BE.ModifiedDate
,AD.AddressLine1
,AD.AddressLine2
,AD.City
,Ad.PostalCode
,AD.StateProvinceID
,CS.Name
FROM @CET CS
INNER JOIn [Person].[BusinessEntity] BE ON CS.BusinessEntityID = BE.BusinessEntityID
INNER JOIN [Person].[Address] AD ON CS.AddressID = AD.AddressID
After
comparing the above T-SQL scripts along with their execution plans, we can get
the below conclusions:
·
Compared
to a base join statement, CTE will not change the execution plan. For example,
suppose you have
1) a base join statement with
syntax as ‘FROM A; JOIN B; JOIN C’, when you
2) create a CTE with the join
results of ‘FROM A; JOIN C’ and then use that CTE to join B, 1) and 2) will
create the same execution plan.
·
The
patterns of execution plans by a table variable and a temp table are quite
similar, especially in the definition and data population part. However, in the
second part when we use the table variable or temp table to join the rest of
the tables, the execution plans start to get different. We’ll dig deeper in the
next post.
There’s
another discovery which I didn’t put here yet – as it was made in my previous employment
and the data was very sensitive, hence any disclosure of the script or the
execution plan was against the company’s security policy, but the conclusion
was quite interesting – although CTE will not change the execution plan, it
does improve the performance in some occasions, and hence the following
performance tuning tips (especially for Views, and you can’t use table variable
or temp table):
- If
you have multiple ‘JOIN’ statements, try to break them into small chunks and
put each small chunk in a CTE, and then JOIN the CTEs instead. Usually this
will increase the performance quite well.
- If
the same statement is appearing multiple times, declare it as a CTE and then
apply it in the rest of the script.
Comments
Post a Comment