SQL Server, the story of a ROUND function inside a query
Hi Friends,
Today is the day of the light post since tomorrow it’s Saturday: Light but interesting!
Today I would show you a little example in order to see how much the execution plan of a Query may change by adding a simple ROUND function!
We have only a rule: a post and a example!
So, In order to do the simplest example I will use only very simple Queries: one Table and no join at all.
Enjoy the reading!
Round & Statistics
For the example we will use the OrdRig table (do you remember.. is one of the tables we usually use for our examples). It contains the n rows of order
So, take a look to this two Queries.
select id from OrdRig where qta1 > 10000000
select id from OrdRig where qta1 > round(10000000,0)
What do you expect to happen? Do they have the same result?
Yes if course the result it is the same but take a look to the execution plan:
If we apply the ROUND function then the execution plan has a decidedly different cost .. even 99 times greater!
What's up?
Let's understand what happen looking at the estimated number of lines to read properties:
Without the ROUND function the number of pages to be read is just 10 (ten pages of 8KB each)
But if we look at the same property by running the query with the ROUND we see that read pages are many many more! 179K!!!
We can see it clearly there is an incorrect number of pages estimation and this is a the problem.
REMEMBER: Incorrect estimation of the number of pages is a problem because it can cause SQL Server to choose a suboptimal execution plan.
Now, in order to inspecting the problem in a more deep level we should ask us: how does the optimizer estimate the number of lines to read?
We must remember that the optimizer can take advantage of the statistical informations that it keeps for each index.
We can see statistics associates to an index through the DBCC SHOW_STATISTICS T-SQL command ('nometab', <index name>);
Today we
will not explain in detail how statistics works but for our example
just looking at the picture below we can see that using the index <xxx>
there are only 10 lines to extract where qta is great
than 10000000.
But what if I apply the ROUND function?
The SQL Server optimizer simply does not use the detail of the statistics but proposes a fixed value regardless of the value specified as a parameter of the round.
Doing so, the value that comes out is not absolutely optimal.
This is a well known behaviour, and in fact with the new versions, from SQL2014 inclusive onwards, this problem has been fixed!
In our example ORDRIG table has 599845 row which divided by 1/3 makes our 179950 value proposed by the execution plan!
What other functions have this behavior besides the ROUND?
Enough…. FLOOR, CEIL, ABS, SQUARE
Instead who is saved? Fortunately the CAST and the CONVERT!
NOTES:
All of the above is valid with a compatibility level (CL) lower than SQL2014. Starting from SQL2014 inclusive onwards with the new Cardinality estimator this behaviour has been solved.
Comments
Post a Comment