Why REGEX functions in SQL SERVER 2025 can melt you CPU? Try yourself!
SQL Server 2025 REGEXP_LIKE Can Melt Your CPU 😱
Hi SQL SERVER Guys,
With the introduction of REGEX functions in SQL Server 2025, we finally have powerful pattern matching inside the engine. But… there is a very interesting (and dangerous) behavior that many developers don’t expect.
It is called catastrophic backtracking.
PS. if you missed my previous post: SQL Server 2025 LocalDB Crash – REGEXP Functions Break Due to Missing RegExpr.dll
What is Catastrophic Backtracking?
In the world of Regular Expressions, catastrophic backtracking happens when:
- the pattern contains nested repetitions
- the regex engine tries millions of combinations
The result?
- CPU spikes to 100%
- queries look stuck
- execution time grows exponentially
A Simple Test Query
This is a safe but very effective demo:
DECLARE @s NVARCHAR(MAX) =
REPLICATE('aaaaaaaaaaaaaaaaaaaaaaaaaaaa', 10) + 'b';
SELECT REGEXP_LIKE(@s,'(a+)+b');
At first glance, this looks harmless.
But the pattern:
(a+)+b
is exactly what creates the problem.
It tells the engine:
- match many 'a'
- inside another repetition
- then find 'b'
The engine will try all possible combinations.
Watch the CPU 😅
If you increase the input size:
DECLARE @s NVARCHAR(MAX) =
REPLICATE('a',10000) + 'b';
SELECT REGEXP_LIKE(@s,'(a+)+b');
You may observe:
sqlservr.exe → 100% CPU
The query can take seconds… or even minutes.
Even More Interesting Case
This version makes the issue very clear:
DECLARE @s NVARCHAR(MAX) =
REPLICATE('a',20000) + 'X';
SELECT REGEXP_LIKE(@s,'(a+)+$');
Here the engine tries to verify that the string contains only 'a'.
But it doesn’t… because of the final 'X'.
So SQL Server explores thousands (or millions) of backtracking paths before failing.
Why This Happens
Most regex engines are backtracking engines.
This means they:
- try one possibility
- fail
- go back and try another
Some patterns are linear:
O(n)
Others become:
O(2^n)
That’s where things explode.
How to Avoid It
Write safer patterns.
❌ Dangerous
(a+)+ (.*)+ (.+)+
✔ Better
a+
a{1,}
.*
Avoid nested quantifiers.
Real World Impact
Imagine a query like this:
SELECT * FROM Logs WHERE REGEXP_LIKE(Message,'(.*error)+')
On a large table, this can lead to:
- massive CPU spikes
- very slow queries
- server performance degradation
Final Thought
Regex in SQL Server 2025 is powerful… but with great power comes great responsibility 😉
A small pattern can:
- run instantly
- or bring your server to its knees
Understanding how the engine works is more important than ever.
See you in the next deep dive 👌
Luca Biondi @2026
Comments
Post a Comment