Inside the SQL Server Query Optimizer - part 5 The cost based optimization process and the Rules

Hi Guys,
Welcome back!

I hope you liked the last post of this series when we speak about trivial plan.
Today we continue our topic looking at what happens next...

Introduction

In the last post we saw that some queries are admissible for a trivial plan while others are not.
Queries that are not admissible for a trivial plan need to go through the cost based optimization process.
Are you ready to enter inside the heart of the optimizer?
Yes I suppose!

The cost based optimization process

If you remember, we have talked about this topic before.
The goal of this step is to find quickly a good execution plan and not to find the best execution plan at all.

During this step many important activities will be performed by the optimizer:

Reading the logical tree, logical alternatives that produce the same results are explored.

For each alternative:

A Physical implementation is generated
An estimated cost is assigned.Finally the cheapest physical solution is choosed.

Wow! Really it's not an easy task to do!
So let’s take as example the query below based on the our tables OrdTes and OrdRig.

       
SELECT
   T.ANNDOC, T.NUMDOC, SUM(R.QTA1) AS SUM_QTA
FROM dbo.ORDTES T
   JOIN dbo.ORDRIG R ON R.IDORDTES = T.ID
WHERE
   T.ANNDOC = 2018
GROUP BY T.ANNDOC, T.NUMDOC
ORDER BY T.ANNDOC, T.NUMDOC

Now visualize logical tree after the project normalization by adding at the end of the Query the hint: OPTION (QUERYTRACEON 3604, QUERYTRACEON 8606, QUERYTRACEON 8608)

       
*** Tree After Project Normalization ***

LogOp_GbAgg OUT(QCOL: [T].ANNDOC,QCOL: [T].NUMDOC,COL: Expr1004 ,) BY(QCOL: [T].NUMDOC,)
  LogOp_Join
    LogOp_Select
      LogOp_Get TBL: ORDTES(alias TBL: T) ORDTES ..
      ScaOp_Comp x_cmpEq
        ScaOp_Identifier QCOL: [T].ANNDOC
        ScaOp_Const TI(smallint,ML=2) XVAR(smallint,Not Owned,Value=2018)
    LogOp_Get TBL: ORDRIG(alias TBL: R) ORDRIG ..
    ScaOp_Comp x_cmpEq
      ScaOp_Identifier QCOL: [T].ID
      ScaOp_Identifier QCOL: [R].IDORDTES
AncOp_PrjList
   AncOp_PrjEl COL: Expr1004
     ScaOp_AggFunc stopAccum
       ScaOp_Identifier QCOL: [R].QTA1

Graphically our tree will look like:

Moreover a certain number of properties associated to each node are added by the previous steps.
For example the cardinality
Some associated properties are logical: for example the Nullability of a columns.
Others are physical for example the sort order used in a node.

The cost based optimize process starts in this moment reading these information and applying rules. So let's talk about Rules!

Rules

Starting from the tree saw above the cost based optimization apply some types of rules called Exploration, Implementation and physical property enforcement.

Exploration

An exploration rule generates a new logical equivalent for a part of the existing logical tree.
For example the SELonJN is an exploration rule that transform a cross product into a JOIN.

The JoinCommute is instead a rule that try various different logical order taking advantage of the fact that in a inner join “A JOIN B” is equivalent to “B JOIN A”

So we can write also:

(A JOIN B) JOIN C => (B JOIN C) JOIN A

The GbAggBeforeJoin is a rule that explores the possibility of pushing an aggregation operation under a join. Today we don't talk about it in mode deatail.

Implementation

Implementation rules are rules that produce a physical operation for a logical operation.
For example trasform a logical join into a hash or a merge join.

JNtoNL rule transform a JOIN into a nested loop.
JNtoHS rule transform a JOIN into Hash join.
JNtoSM rule trasform a JOIN into a Sort-Merge join.

GBAggToStrm rule transform a GBAGG into a stream aggregate
GBAggToHS rule transform a GBAGG into a Hash aggregate.

Other rules transform a logical GET operation into a SCAN or into a SEEK
For example, GetToScan, GetToTrivialScan, GetIdxToRng.

Maybe we well look to these rules in more detail in future..

How to view all the rules available

Note that rules described here are only a very little part of all the rules available.
Infact SQL Server 2017 have about 405 rules while SQL Server 2008R2 have 377 rules.

For you information, you can view all rules through the DMV below:

       
SELECT * FROM  sys.dm_exec_query_transformation_stats
order by succeeded desc

Et voilà:

But how to view rules used by our test query?

In order to view rules used to optimize our query you could use the DMV listed above.


SELECT *
INTO before_query_transformation_stats
FROM sys . dm_exec_query_transformation_stats
 
-- THIS IS OUR QUERY!
GO
SELECT
ANNDOC, NUMDOC, SUM(R.QTA1) AS SUM_QTA
FROM ORDTES T
JOIN ORDRIG R ON R.IDORDTES = T.ID
WHERE
ANNDOC = 2018
GROUP BY ANNDOC, NUMDOC
ORDER BY ANNDOC, NUMDOC
OPTION ( RECOMPILE )

GO
SELECT *
INTO after_query_transformation_stats
FROM sys.dm_exec_query_transformation_stats

GO
SELECT a.name , ( a.promised - b.promised ) as promised
FROM before_query_transformation_stats b
JOIN after_query_transformation_stats a
ON b.name = a.name
WHERE b.succeeded <> a.succeeded

DROP TABLE before_query_transformation_stats
DROP TABLE after_query_transformation_stats

These are the rules applied:

OK there is much to say but i think its enough for today
I hope you enjoyed this post.
If so, continue to follow my posts and subscribe to this blog!

See you soon!

Luca Biondi @ SQLServerPerformance blog!

Previous post: Inside the SQL Server Query Optimizer - part 4 Trivial Plan

A song with the mood of the day: Stop the rock

Search This Blog

SQL Server Performance & Troubleshooting – Where Milliseconds Matter 🚀