SQL Server, How to choose a clustered key value and the Last Page Insert Latch Contention problem
Hi Guys,
Today you are the data architect! your first task is to design a table from scratch.
But...how do you choose the clustered key value?
It sounds easy but rember that your decisions will have a lot of consequences for the long life of the project ... and you can't go wrong!
How to choose a clustered key value
Today let's review some theory ..
A good Clustered key must be narrow. It is not only a question of data occupation (today storage devices are spacious) but a lot of data takes time to be read.
We always consider that the value used will be used not only in the clustered key. It will in fact be used in any non-clustered index.
So for example fields Id and page_id, category_id must have the same data type.
Normally you should use a surrogate key value (like an ID identity, a progressive numerber) instead of a natural key (whose values are formed by attributes that exist and are used in the external world, for example surname+name).
A surrogate key value cannot change and this fact is very important for the next gold rule Which says that the clustered Key value must be static.
The clustered Key value should not change as its value is replicated in every non clustered index
Finally you should choose an Ever increasing Clustered key value.
The reason depends on how SQL Server stores the information.
Using an ever increasing clustered key value means that new records will be always be stored at the end of our clustered index (within the last actual page)
This avoids page split which are expensive and avoids index fragmentation.
Here, however, there is a danger problem that occurs when many users enter data in our table at the same time. In jargon it is said that there is a problem of scalability or that a procedure does not scale.
The Last Page Insert Latch Contention problem
Why is there a problem when many users enter data into our table at the same time?
I will try to explain it in the simplest way possible.
In order to manage data concurrency SQL Server engine need to guarantee consistency of its in-memory structures like index, data pages and internal structures, such as non-leaf pages in a B-Tree.
Latches are synchronization primitives that are used to do this job!
There are actually different types of latches:
- Buffer (pool) latches are used to protect pages in the buffer pool.
- I/O Latches are used to protect pages not yet loaded in the buffer pool.
In our case we are talking about buffer latches.
Let's pay attention to this logic:
- When you need to read a page, a worker thread must acquire a shared latch (PAGELATCH_SH)
- When you need to write a page, a worker thread must acquire an exclusive Latch (PAGELATCH_EX).
Now what happens when many users execute an insert statemement?
We are trying to insert data on the same page so there are many worker thread that are contending about an exclusive latch on the same last page in the clustered index.
This is the Last Page Insert Latch Contention problem.
How to resolve?
- Adopt a random clustered key value
- Reverse index technique
- Enable OPTIMIZE_FOR_SEQUENTIAL_KEY option in sql 2019
- In memory technique
- Partitioning technique
Comments
Post a Comment