Inside the SQL Server Query Optimizer - part 2 All about the Simplification
After the introduction done in the previous article, today we talk about the simplication step of the Query Optimizer.
In my opinion, knowing what types of optimization SQL Server might apply is very important in order to write a Query that run Quickly!
So, are you ready for the second part of this exciting journey to discover the Query Optimizer?
Ready? Go!
The Simplification
Let now start speaking about the Simplification phase.Simplification is the first step of the optimization pipeline.
It's an important step because during this phase the optimizer try to modify the logical tree in order to remove redundancies.
Optimizer try also to change the order of the logical operation in order to facilitate later step.
We can split the simplification phase into various sub phases that are:
- Constant Folding
- Domain simplification
- Predicate push-down
- Join simplification
- Contradiction detection
The constant folding
During this phase SQL server evaluates one or more expression before the Query is compiled.
This happens in limited cases:
1) Arithmetic expressions that contain only constant and without variables or parameters.
SELECT ID FROM TABLE WHERE IDFIELD = 5 + 7
And so after the Folding we have:
SELECT ID FROM TABLE WHERE IDFIELD = 12
SELECT ID FROM TABLE WHERE IDFIELD = 5 + @a
Because @a is a parameter2) Logical expressions that contains only constants
SELECT ID FROM TABLE WHERE 7 > 5
Is foldable: since 7 > 5 after the folding we have TRUE.
SELECT ID FROM TABLE WHERE 7 > 5
3) Built-in deterministic functions whose input rely solely on provided values. Non deterministic functions and scalar user-defined functions (UDF) are not foldable.
Note: we can verify using the traceflag 8605 and then watching at the execution plan: Example n° 1
SELECT UnitPrice from Product where UnitPrice = 18+12
OPTION
(
RECOMPILE,
QUERYTRACEON 3604,
QUERYTRACEON 8605
);
Example n° 2
SELECT UnitPrice from Product where UnitPrice + 1+2 = 2+2
OPTION
(
RECOMPILE,
QUERYTRACEON 3604,
QUERYTRACEON 8605
);
Note:
- Only the part at the right of the “=” 2 + 2 is folded.
- The part at the left of the “=” is not folded because a column
- SQL doesn’t effect this calculation: UnitPrice + 1+2 = 2+2 => Unitprice = 2+2 – (1+2) => UnitPrice = 1
Note that is a best practice write T-SQL code in order to permit the use of the constant folding feature. In this manner we alleviate the need to repeatedly evaluate the expression at runtime.
The Domain simplification
Example:
SELECT * from Product
where (UnitPrice between 1 and 2) OR
(UnitPrice between 2 and 5)
OPTION
(
RECOMPILE,
QUERYTRACEON 3604,
QUERYTRACEON 8605
);
Predicate push-down
Note that this operation should have always a positive effect.
This is because we try to reduce the number of rows that the later logical operations must do.
In other words, the predicate push-down is the action of using the predicate directly in the seek or in the scan and so reading only the rows that match the give predicate.
Example:
We have a table “Product”
SELECT P.SupplierId , P.UnitPrice
FROM Product P
WHERE
P.SupplierId = 1 and
P.UnitPrice = 18
If we look at the execution plan we can see we read only 3 rows from the table products.
The Join Simplification is another great feature!
It’s goal is to remove JOIN when they are not necessary.
Convert also a left outer join into an inner join when the NULLs introduced by the left outer join are later by another feature in the Query.
Lastly remove the empty subexpression.
Example 1
Join Simplification
It’s goal is to remove JOIN when they are not necessary.
Convert also a left outer join into an inner join when the NULLs introduced by the left outer join are later by another feature in the Query.
Lastly remove the empty subexpression.
Example 1
SELECT P.SupplierId, P.UnitPrice
From Product P
LEFT OUTER JOIN Supplier S on S.Id = P.SupplierId
Where
P.SupplierId = 1 and
P.UnitPrice = 18
SELECT P.SupplierId , P.UnitPrice,
(SELECT Id FROM Supplier S where S.Id = P.SupplierId AND S.Id = 9999)
from Product P
Where
P.SupplierId = 1 and
P.UnitPrice = 18
These two Queries produce the same execution plan
SELECT P.SupplierId , P.UnitPrice, S.id
from Product P
LEFT OUTER JOIN Supplier S ON S.Id = P.SupplierId
Where
P.SupplierId = 1 and
P.UnitPrice = 18 and
S.Id = 1
SELECT P.SupplierId , P.UnitPrice, S.id
from Product P
INNER JOIN Supplier S ON S.Id = P.SupplierId
Where
P.SupplierId = 1 and
P.UnitPrice = 18 and
S.Id = 1
Example 4
Select
O.CustomerId,
OI.Id,
P.id
from [Order] O
join [OrderItem] OI on OI.OrderId = O.id
join [Product] P on OI.ProductId = P.Id
But to implement the INNER JOIN shouldn't table product still be needed?
Actually NOT!
The Optimizer use the foreign key and so there is no reason to access to the table Product
CREATE NONCLUSTERED INDEX [IndexOrderItemProductId] ON [dbo].[OrderItem]([ProductId] ASC)
Contradiction detection
The optimizer is able to resolve contradictions.
Explicit contradiction inside the Query:
SELECT * FROM Product where UnitPrice > 18.0 and UnitPrice < 18.0
Can use the constraint.
For example with this constraint
ALTER TABLE [dbo].[Product] WITH CHECK ADD CONSTRAINT [CK_Product] CHECK (([UnitPrice]<=(100000)))
Running this select
SELECT * FROM Product where UnitPrice > 100000
We obtain a constant scan
That's all for today! If you have found this Article AWE then wait for the continuation where we will talk about the next step of the Optimizer pipeline: the cardinality estimation!
See you soon!
Luca Biondi @ SQLServerPerformance blog!
Comments
Post a Comment