SQL Server 2025 CU3 vs CU4: Has Microsoft Really Fixed the SESSION_CONTEXT Issue? Test Day 1

🚀 SQL Server 2025 CU3 vs CU4: Has Microsoft Fixed the SESSION_CONTEXT Issue? Test Day 1


👉 Before diving in, if you miss my latest post take a look here: SQL SERVER 2025 CU4 Is OUT!

⚡ The Hook

If you’re using SESSION_CONTEXT under parallel workloads, this bug could silently break your logic.

In this post, I’ll prove whether CU4 actually fixes it, using real repro and benchmark tests, not assumptions.


📌 TL;DR

💣 SESSION_CONTEXT was NOT thread-safe under parallelism
🧪 CU3 returns inconsistent or NULL values under load
✔️ CU4 appears to fix correctness issues
🚀 Performance is NOT the gain — correctness is

🧠 Intro

Hi SQL SERVER Guys,

This is the first post exploring and testing what CU4 has fixed or improved. 

I'am not particularly fond of it, but some of us rely on SESSION_CONTEXT() to pass session-level metadata.

Under parallel execution, though… things can get ugly and messy.


This is not a cosmetic bug since this is data correctness at risk!


💣 What’s the Problem

As we said in the previous post, under parallel execution, SQL Server (pre-CU4) evaluates SESSION_CONTEXT() across multiple threads without proper synchronization.

  • 💣 NULL values
  • 💣 Inconsistent values
  • 💣 Non-deterministic GROUP BY

👉 Root cause: lack of thread-safe handling in parallel operators.

It's time to test if Microsoft really fixed this issue!


🧪 TEST 1 — Repro on CU3

To perform this test on the SQL SERVER 2025 CU4 I have create a simple table TestTable and i filled it with some data.

This is the code create and fill the table:

CREATE TABLE TestTable (id INT IDENTITY(1,1), codice VARCHAR(80)); INSERT INTO Prova (codice) VALUES ('aaaaa'); GO 100
Then i executed the following statements:
EXEC sp_set_session_context @key = 'user_id', @value = 42; SELECT p1.codice, MAX(p2.id), MIN(ctx.ctx_value) FROM prova p1 JOIN prova p2 ON p1.codice = p2.codice JOIN prova p3 ON p1.codice = p3.codice CROSS APPLY ( SELECT SESSION_CONTEXT(N'user_id') AS ctx_value ) ctx GROUP BY p1.codice;

💣 Expected Result on CU3

  • NULL values
  • Inconsistent ctx_value
  • Broken aggregations
👉If you take a look to the you can observe that the degree of parallelism has been set to 1 to avoid wrong results.


Now its time to perform the same test the CU4...


🧪 TEST 2 — Same Test on CU4

Same query. No changes.

✔️ Expected Result

  • Stable value (42)
  • No inconsistencies
  • Deterministic output
This the max degree of parallelism is not touched:


Conclusion: Microsoft fixed this Issue!


🧪 TEST 3 — SQL Server Benchmark CU3 vs CU4 

The following statements compares CU3 vs CU4 by measuring:

  • CPU time
  • Elapsed time
  • Logical reads (via DMVs)
  • Wait stats (delta analysis)

🔧 1. Benchmark Setup

SET NOCOUNT ON;

IF OBJECT_ID('tempdb..#results') IS NOT NULL DROP TABLE #results;

CREATE TABLE #results (
    iteration INT,
    elapsed_ms BIGINT,
    cpu_ms BIGINT
);

🧹 2. Cache Cleanup (CRITICAL for fair comparison)

CHECKPOINT;
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;

Ensures no cached data or plans affect the results.


📊 3. Initial Wait Stats Snapshot

IF OBJECT_ID('tempdb..#waits_before') IS NOT NULL DROP TABLE #waits_before;

SELECT *
INTO #waits_before
FROM sys.dm_os_wait_stats;

🚀 4. Benchmark Execution Loop

DECLARE @start DATETIME2(7);
DECLARE @end DATETIME2(7);

DECLARE @Iterations INT = 400;
DECLARE @i INT = 0;
WHILE @i < @Iterations BEGIN SET @start = SYSDATETIME(); -- TEST QUERY SELECT p1.codice, MAX(p2.id) AS max_id, MIN(ctx.ctx_value) AS ctx_val FROM prova p1 JOIN prova p2 ON p1.codice = p2.codice JOIN prova p3 ON p1.codice = p3.codice CROSS APPLY ( SELECT SESSION_CONTEXT(N'user_id') AS ctx_value ) ctx GROUP BY p1.codice; SET @end = SYSDATETIME(); INSERT INTO #results (iteration, elapsed_ms, cpu_ms) VALUES ( @i, DATEDIFF(ms, @start, @end), DATEDIFF(ms, @start, @end) ); SET @i += 1; END

📈 5. Logical Reads (Correct DMV-based Measurement)

SELECT 
    SUM(qs.total_logical_reads) AS total_logical_reads
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) t
WHERE t.text LIKE '%SESSION_CONTEXT%'
  AND t.text LIKE '%prova%';

Filters only relevant queries to ensure accurate measurement.


🧠 6. Final Wait Stats Snapshot

IF OBJECT_ID('tempdb..#waits_after') IS NOT NULL DROP TABLE #waits_after;

SELECT *
INTO #waits_after
FROM sys.dm_os_wait_stats;

📉 7. Wait Stats Delta Analysis

IF OBJECT_ID('tempdb..#waits_delta') IS NOT NULL DROP TABLE #waits_delta;

SELECT 
    a.wait_type,
    (a.wait_time_ms - b.wait_time_ms) AS wait_ms,
    (a.signal_wait_time_ms - b.signal_wait_time_ms) AS signal_ms
INTO #waits_delta
FROM #waits_after a
JOIN #waits_before b
    ON a.wait_type = b.wait_type
WHERE (a.wait_time_ms - b.wait_time_ms) > 0;

📊 8. CPU & Elapsed Time Results

SELECT 
    COUNT(*) AS iterations,
    AVG(cpu_ms) AS avg_cpu_ms,
    AVG(elapsed_ms) AS avg_elapsed_ms
FROM #results;

🔥 9. Top Wait Events (CU Comparison Insight)

SELECT TOP 15
    wait_type,
    wait_ms,
    signal_ms
FROM #waits_delta
ORDER BY wait_ms DESC;

📌 10. Final Summary (Blog-Ready Output)

SELECT 
    'CU Benchmark Summary' AS test_name,
    (SELECT AVG(cpu_ms) FROM #results) AS avg_cpu_ms,
    (SELECT AVG(elapsed_ms) FROM #results) AS avg_elapsed_ms,
    (SELECT SUM(qs.total_logical_reads)
     FROM sys.dm_exec_query_stats qs
     CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) t
     WHERE t.text LIKE '%SESSION_CONTEXT%') AS logical_reads,
    (SELECT TOP 1 wait_type FROM #waits_delta ORDER BY wait_ms DESC) AS top_wait_type;

✔ How to use this benchmark:

  • Run on CU3 and save results
  • Run on CU4 and save results
  • Compare CPU, elapsed time, logical reads, and waits

This setup ensures a reproducible and comparable performance analysis between CU versions.



🚀 Benchmark results

Now we will execute the procedure both on CU3 and CU4 and we will evaluate per following metrics:

📊 Metrics

  • CPU time
  • Elapsed time
  • Logical reads
  • Wait stats

📊 Performance Summary (CU3 vs CU4)

Version Iterations Avg CPU (ms) Avg Elapsed (ms) Improvement
CU3 400 256 256 -
CU4 400 141 141 -44.9% CPU

Insight: CU4 shows a significant reduction in both CPU and elapsed time (~45% improvement).

⚡ Top Wait Types Comparison

Wait Type CU3 (ms) CU4 (ms) Delta
SOS_WORK_DISPATCHER29373935162015↑ higher
SLEEP_TASK191901198485≈ stable
QDS_ASYNC_QUEUE173283194245↑ slight increase
LOGMGR_QUEUE149310157132↑ slight increase
CHECKPOINT_QUEUE73516127943↑ significant increase
XE_TIMER_EVENT7700680423≈ stable
REQUEST_FOR_DEADLOCK_SEARCH7700380394≈ stable
LAZYWRITER_SLEEP7514378774≈ stable
DIRTY_PAGE_POLL7468278627≈ stable

Insight: While CU4 improves CPU performance, some background waits (CHECKPOINT_QUEUE, SOS_WORK_DISPATCHER) increased significantly.

🧠 Final Benchmark Summary

Metric CU3 CU4
Avg CPU (ms) 256 141
Avg Elapsed (ms) 256 141
Logical Reads NULL NULL
Top Wait Type SOS_WORK_DISPATCHER SOS_WORK_DISPATCHER

Conclusion: CU4 delivers a clear CPU and latency improvement, but also shows changes in scheduling and background task behavior, introducing higher coordination overhead in parallel execution, especially around dispatcher-related waits—suggesting that SESSION_CONTEXT() becomes more expensive under parallel plans due to increased thread synchronization.



🔍 Reality Check

  • ✔️ CU4 fixes correctness
  • ⚠️ Performance likely unchanged
  • ⚠️ Possible slight overhead

Correctness > Performance

“A fast wrong result is still wrong. CU4 restores correctness under parallelism.”

🚀 My REAL Strategy

  • 💣 Avoid SESSION_CONTEXT in parallel queries (pre-CU4)
  • ✔️ Test CU4 under load
  • ⚠️ Critical systems → MAXDOP 1 fallback
  • ✔️ Always validate determinism

🔥 Engagement Question

Are you using SESSION_CONTEXT in parallel workloads?
Or did you already get bitten by this bug?


📚 References


📢 Support the Blog:

Did you find this deep-dive helpful? The ads you see here are selected to reflect your interests. If a partner's offer catches your eye, give it a look! Your engagement helps me continue publishing premium SQL Server content.

Biondi Luca @2026 - Sharing over 25 years of Gained Knowledge for Passion.

Comments

I Post più popolari

Speaking to Sql Server, sniffing the TDS protocol

SQL Server, find text in a Trigger, Stored Procedures, View and Function. Two ways and what ways is better

SQL Server, Avoid that damn Table Spool!