SQL server, more on the Ghost cleanup task

Hi Guys,


During the last post we saw that data are not physically deleted from the table where you execute a delete statement. Data are marked as deleted only.

It is a scheduled task called Ghost cleanup task that physically removes data from the table.

We have also seen that we can disable this process via global traceflag number 611.

So, if you enable flag 611, the records will no longer be automatically removed from the table. (In reality this is not true  for the heap tables where the traceflag 661 has no effect)

Sure, they are not of those configurations to try in production!
But what would that entail?

The positive thing is that in the presence of many cancellations the workload of the server should be less.
The bad thing is that the disk space is not freed up and that as the tables are physically larger, the table scan should take longer.

But let's see if that's true or not with an example!

 

What happens if you disable the Ghost cleanup task?

Before running the example i will show you how to visualize when the Ghost cleanup starts.
 
Well, SQL Server export an extended event called ghost_cleanup so we can create an event session to trace this event.
This will help us to understand how often this event is performed.

Execute the following script:


CREATE EVENT SESSION TrackGhostCleanup
ON SERVER
ADD EVENT sqlserver.ghost_cleanup
( ACTION(sqlserver.database_id))
ADD TARGET package0.asynchronous_file_target(
     SET filename='C:\TrackGhostCleanup.xel',
         metadatafile='C:\TrackGhostCleanup.xem')
WITH (MAX_MEMORY = 4MB, EVENT_RETENTION_MODE = NO_EVENT_LOSS )
GO
ALTER EVENT SESSION TrackGhostCleanup
ON SERVER
STATE=START

From now everytime the Ghost cleanup start we will record this information inside some files.

We can retrieve the stored infos by execution the following query:


SELECT 
event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
event_data.value('(event/@package)[1]', 'varchar(50)') AS package_name,
event_data.value('(event/@id)[1]', 'int') AS id,
event_data.value('(event/@version)[1]', 'int') AS version,
DATEADD(hh,
DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP),
event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
event_data.value('(event/action[@name="database_id"]/value)[1]', 'int') as database_id,
event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') as file_id,
event_data.value('(event/data[@name="page_id"]/value)[1]', 'int') as page_id
FROM
(SELECT
CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\TrackGhostCleanup*.xel', 'C:\TrackGhostCleanup*xem', null, null)
) as tab

Ok we will execute this statement later when we run our example.

For the example we will create a loop in which we will insert and delete the same row.

This is the Query:


DECLARE @j int
SET @j = 100000;
WHILE @j > 0
BEGIN
INSERT INTO DataTable (code) VALUES ('Test Ghostrow')
DELETE FROM DataTable WHERE id = (SELECT MAX(id) FROM DataTable)
SET @j = @j -1;
END

After running this query 3 times, we will run the same query 3 times again but with the ghost cleanup task disabled by enabling the traceflag:


DBCC TRACEON(3604,-1)
GO
DBCC TRACEON(661,-1)

 

Results and considerations.

 

First time i run the Query i saw these execution times: 0:28 , 0:28 , 0:29 seconds

I can see that the Ghost cleanup process start every 5 seconds

When the ghost cleanup process start we can observe only barely visible peaks in the disk activity.


Now i repeat the same test disabling the cleanup process.

This time i saw these execution times: 0:29 , 0:29 , 0:31 seconds

Obviousbly the ghost process never starts.

 

Execution times are almost identical with or without the cleanup process. The cleanup process requires few disk or processor resources.

 

That's all for today mates and stay tuned for the next post! 
Luca


 




 

Help me to share knowledge on my blog  

Previous post: SQL server, How to recover just deleted data (...ops i did it again!)

Comments

I Post più popolari

SQL Server, execution plan and the lazy spool (clearly explained)

SQL Server, datetime vs. datetime2

La clausola NOLOCK. Approfondiamo e facciamo chiarezza!