SQL Server, A little about composite indices and a little about the relative statistics ..

Hi Guys,

Today a short midweek post!
 
We will talk about the composite index and statistics.
Nothing particularly profound, but any information we can learn, I'm sure will be useful in the future.

Enjoy!

 

Composite index

What is a composite index? 
Well, a composite index is simply an index that defined on more than one column.

For example.

Upon the following table ORDTES


CREATE TABLE ORDTES
(Id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
DOC_YEAR int,
DOC_NUMBER int,
CUSTOMER_NAME varchar(80)
);
     

 

We can define an index on both the columns DOC_YEAR, DOC_NUMBER:


CREATE INDEX IDX_ORDTES_DOC_YEAR_DOC_NUMBER ON ORDTES (DOC_YEAR, DOC_NUMBER)
     
 

 

This means that if I search for a specific pair of DOC_YEAR, DOC_NUMBER values I should arrive directly with a seek of the searched record.

Is this true?

Yes, but let's see more in detail by taking a look at the statistics ... we will find out interesting things ..


Statistics

Statistics are critical because SQL Server uses their data to get the most accurate execution plan possible.

Let’s populate the table with some data:

WITH CTE AS ( 
SELECT 1 AS Numb
UNION ALL
SELECT a.Numb + 1
FROM CTE a
WHERE a.Numb < 10
),
CTE2 AS (
SELECT 1 AS Numb
UNION ALL
SELECT a.Numb + 1
FROM CTE2 a
WHERE a.Numb < 100000
)
INSERT INTO ORDTES (DOC_YEAR, DOC_NUMBER, CUSTOMER_NAME)
SELECT
1999 + CTE.Numb,
CTE2.Numb,
'customer'
FROM CTE, CTE2
OPTION (MAXRECURSION 0)

 

Now let's see how the statistics for the previously created index are made.

  • Open the Statistics Menu of the OrdTes table.
  • Right-click the statistic that has the index name.
  • Choose the Properties item.

We can see the statistics is made by thw two columns DOC_YEAR and DOC_NUMBER.


If now we click on the Detail item at the left of the windows we can see that...


Statistics are built with data from the first column of the index only.

So, from the pics above:

  • We can see that each DOC_YEAR value is present 100,000 times.
  • We have no details on the DOC_NUMBER field

 

Did you expect it? No I suppose. 

So where is the trick?
Where are the statistics of the second column hidden?
 
SQL Server uses the first key for the histogram and if it needs records on [DOC_NUMBER] column, it creates a new statistic for this column
Auto-created statistics starting from _WA_sys.

 

Examples on composite index

 
Let's analyze these three T-SQL statements how they behave with the composite index we had created:


SELECT DOC_YEAR, DOC_NUMBER FROM ORDTES WHERE DOC_YEAR = 2000

SELECT DOC_YEAR, DOC_NUMBER FROM ORDTES
WHERE DOC_YEAR = 2000 and DOC_NUMBER = 3

SELECT DOC_YEAR, DOC_NUMBER FROM ORDTES
WHERE DOC_NUMBER = 3  

Run all the Queries, with the command: SET STATISTICS IO ON



 


Tabella 'ORDTES'. Conteggio analisi 1, letture logiche 226, letture fisiche 0, letture server di pagine 0, letture read-ahead 0, letture read-ahead server di pagine 0, letture logiche LOB 0, letture fisiche LOB 0, letture LOB server di pagine 0, letture LOB read-ahead 0, letture read-ahead LOB server di pagine 0.

Tabella 'ORDTES'. Conteggio analisi 1, letture logiche 3, letture fisiche 0, letture server di pagine 0, letture read-ahead 0, letture read-ahead server di pagine 0, letture logiche LOB 0, letture fisiche LOB 0, letture LOB server di pagine 0, letture LOB read-ahead 0, letture read-ahead LOB server di pagine 0.

Tabella 'ORDTES'. Conteggio analisi 1, letture logiche 2237, letture fisiche 0, letture server di pagine 0, letture read-ahead 0, letture read-ahead server di pagine 0, letture logiche LOB 0, letture fisiche LOB 0, letture LOB server di pagine 0, letture LOB read-ahead 0, letture read-ahead LOB server di pagine 0.

 

The first Query uses the composite index in seek even if the where predicate has only the DOC_YEAR predicate.

SELECT DOC_YEAR, DOC_NUMBER FROM ORDTES WHERE DOC_YEAR = 2000

 

 The second Query uses the composite index in seek too.

SELECT DOC_YEAR, DOC_NUMBER FROM ORDTES WHERE DOC_YEAR = 2000 and DOC_NUMBER = 3

 

The third Query use the composite index but in scan. All the index is scanned and 2227 pages are read.

SELECT DOC_YEAR, DOC_NUMBER FROM ORDTES WHERE DOC_NUMBER = 3

 

 

Thats all for today but stay tuned for the next post!
Luca
 

If you are genuinely and truly interested in one of the products that appear in advertising ... click the adv from my blog to support my site! 

Or click the (    ) button


 

 





Previous post: SQL Server, the Predicate and the Residual Predicate property of the Execution Plan

Comments

I Post più popolari

SQL Server, datetime vs. datetime2

SQL Server, execution plan and the lazy spool (clearly explained)

La clausola NOLOCK. Approfondiamo e facciamo chiarezza!