SQL SERVER CPU Starvation: When 40% CPU Behaves Like 100%
SQL SERVER. A deep analysis on CPU Starvation
Why a server at 40% CPU can still behave like a completely saturated system 💣
Deep dive into SOS Scheduler fairness, Hekaton garbage collection, hash index scans, and why SQL Server CU5 matters far more than most DBAs think.
Before we dive into today's topic, if you missed my previous post you can take a look at SQL SERVER CU5: Fixes CPU starvation issues
👉 If you found this deep-dive helpful, feel free to check out the ads—your support helps me keep creating high-quality SQL Server content for the community.
🧠 TL;DR
✔️ SQL Server uses cooperative scheduling through SOS Scheduler and workers must voluntarily yield CPU
✔️ SQL Server CU5 improves scheduler fairness during In-Memory OLTP hash index garbage collection scans 🚀
✔️ CPU starvation can occur even when total CPU usage looks moderate 💣
✔️ Bad hash index bucket sizing and large version chains can dramatically increase GC traversal costs ⚡
✔️ Persistent runnable_tasks_count growth is often more dangerous than raw CPU percentage 🔍
🚀 The Hook
In this post, I’ll show you why CPU starvation is one of the most misunderstood performance problems in SQL Server.
⚡ More importantly, you will understand why SQL Server CU5 quietly fixes a potentially catastrophic scheduler fairness problem inside In-Memory OLTP garbage collection scans.
💣 CPU starvation is one of the few SQL Server problems where average CPU usage can completely hide catastrophic scheduler unfairness.
⚡ A server at 40% CPU can still behave like a server at 100% CPU when scheduler fairness collapses.
Hi SQL SERVER Guys and Gals,
Many DBAs monitor CPU percentage.
Very few monitor scheduler fairness.
And that is exactly where some of the nastiest SQL Server production problems hide.
The CU5 fix around In-Memory OLTP garbage collection is important because it touches the heart of SQL Server execution internals:
- ⚡ SOS Scheduler
- ⚡ Cooperative scheduling
- ⚡ Worker yielding
- ⚡ Runnable queues
- ⚡ Scheduler monopolization
🧠 What Cooperative Scheduling Really Means
SQL Server does not rely on Windows preemptive scheduling alone.
Internally, SQL Server uses cooperative scheduling through SOS Scheduler.
Workers are expected to voluntarily yield execution periodically.
If an internal task runs too long without yielding, runnable queues grow and other workers experience starvation.
⚡ Key concepts every serious DBA should understand:
- SOS Scheduler → SQL Server internal scheduler abstraction
- Worker → execution entity assigned to tasks
- Quantum → execution time slice
- Yield → voluntary CPU release
- Runnable Queue → waiting workers ready to execute
- Scheduler Fairness → balanced CPU access across workers
💣 The problem starts when some workers fail to yield frequently enough.
💣 Why Hash Indexes Can Make Things Worse
In-Memory OLTP hash indexes are extremely fast for point lookups.
But internally they rely on buckets and hash chains.
Under bad conditions, these chains become expensive to traverse.
- ⚡ Incorrect bucket_count
- ⚡ Heavy delete workloads
- ⚡ Skewed key distribution
- ⚡ High collision rates
- ⚡ Massive row version churn
Large hash chains increase garbage collection traversal costs.
Under high churn workloads, GC workers may spend excessive time scanning version chains and hash buckets.
⚡ That is where scheduler monopolization begins.
🧠 Garbage Collection Internals
This is where things become very interesting under the hood.
In-Memory OLTP uses:
- ⚡ Row versioning
- ⚡ Timestamp visibility
- ⚡ Version chains
- ⚡ Epoch-based garbage collection
- ⚡ Memory residency
In-Memory OLTP does not immediately remove deleted rows.
Older row versions remain visible for active transactions until epoch advancement allows garbage collection cleanup.
💣 This means garbage collection is not optional.
It is a core architectural requirement of Hekaton.
And under heavy workloads, garbage collection can become extremely CPU-intensive.
🔍 DIAGNOSIS
Start by checking scheduler pressure.
-- 🔍 Scheduler Pressure Diagnostic Query
SELECT
scheduler_id,
cpu_id,
current_tasks_count,
runnable_tasks_count,
active_workers_count,
pending_disk_io_count,
load_factor
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';
⚡ Persistent runnable_tasks_count greater than zero is suspicious.
💣 Especially on systems using Hekaton workloads.
🧪 QUERY
Now inspect SOS_SCHEDULER_YIELD waits.
-- 🔍 Top Scheduler Waits
SELECT
wait_type,
waiting_tasks_count,
wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'SOS_SCHEDULER_YIELD';
⚡ Important:
- Healthy yielding is NORMAL
- SOS_SCHEDULER_YIELD alone is NOT automatically bad
- Runnable queue growth is the real danger signal
💣 The issue appears when workers fail to yield fairly while runnable queues continue growing.
🧪 Hash Index Statistics (THIS DMV IS GOLD)
-- 🧪 Hash Index Statistics
SELECT
OBJECT_NAME(object_id) AS table_name,
name,
total_bucket_count,
empty_bucket_count,
avg_chain_length,
max_chain_length
FROM sys.dm_db_xtp_hash_index_stats;
⚡ This DMV can reveal:
- Bad bucket sizing
- Excessive collisions
- Long traversal chains
- Potential GC traversal amplification
🧠 CPU Saturation vs CPU Starvation
| CPU Saturation | CPU Starvation |
|---|---|
| CPU genuinely full | Scheduler fairness collapse |
| High query CPU cost | Worker monopolization |
| Parallelism pressure | Yield imbalance |
| System busy | System unfair |
💣 Real Symptoms DBAs Actually See
- ⚡ CPU not at 100% but system extremely slow
- ⚡ Random latency spikes
- ⚡ Login delays
- ⚡ AG synchronization lag
- ⚡ THREADPOOL waits
- ⚡ Query timeouts
- ⚡ Runnable queue spikes
- ⚡ Intermittent stalls
💣 Many DBAs misdiagnose these as generic CPU issues.
In reality, the problem may be scheduler fairness collapse.
⚡ NUMA and Scheduler Imbalance
Starvation often does NOT affect all schedulers equally.
This becomes especially visible on:
- NUMA systems
- Soft-NUMA environments
- Large core-count servers
⚡ Some schedulers may become “hot” while others remain underutilized.
This creates:
- Uneven worker distribution
- Localized scheduler pressure
- Artificial latency spikes
🚀 My REAL Strategy
In real production systems, I NEVER evaluate only raw CPU percentage.
⚡ I correlate:
- SOS_SCHEDULER_YIELD waits
- Runnable queue growth
- THREADPOOL waits
- Hekaton workload intensity
- Hash index chain statistics
- NUMA imbalance
- Latency spikes
💣 If runnable queues stay elevated for sustained periods, I immediately suspect scheduler unfairness.
🚀 My recommendations:
- Apply CU5 if you use In-Memory OLTP heavily
- Continuously monitor runnable_tasks_count
- Validate hash index bucket sizing
- Avoid oversized bucket_count configurations
- Monitor version chain growth
- Watch for scheduler imbalance across NUMA nodes
📊 TAKEAWAY
SQL Server CU5 is not “just another cumulative update.”
⚡ It directly improves scheduler fairness inside one of the most CPU-sensitive areas of the SQL Server Engine.
For latency-sensitive systems such as:
- Trading platforms
- Telemetry ingestion
- IoT pipelines
- Real-time APIs
- Microservices backends
💣 Even small starvation bursts can destroy P99 latency.
And that is exactly why understanding SOS Scheduler internals is no longer optional for serious SQL Server performance troubleshooting.
📚 Official References
- Microsoft Learn - In-Memory OLTP
- Microsoft Learn - sys.dm_db_xtp_hash_index_stats
- Microsoft Learn - sys.dm_os_schedulers
- Microsoft Learn - sys.dm_os_wait_stats
- Microsoft Learn - Hash Indexes for Memory-Optimized Tables
📢 Support the Blog: Did you find this deep-dive helpful? The ads you see here are selected to reflect your interests. If a partner's offer catches your eye, give it a look! Your engagement helps me continue publishing premium SQL Server content for the community.
Performance Tuning Knowledge Hub
Everything you need to master SQL Server, all in one place.
Comments
Post a Comment