Why Most Developers Should Stop Using XML in SQL Server!
Why Most Developers Should Stop Using XML in SQL Server
SQL Server Performance Series – Understanding the Cost of Modern Data Formats
Previous article: JSON vs XML Indexing – Which One Really Performs Better?
Hi SQL SERVER guys, today we talk about something that may sound controversial.
XML in SQL Server.
For years it was considered the "enterprise" way to store structured documents.
And it was true but modern workloads are changing.
And due to this reason in many cases XML is now the wrong choice.
Of course, this does not mean XML is useless.
But it does mean that most developers still use it without understanding the real performance cost.
So let's break it down.
Why Developers Loved XML
When SQL Server introduced native XML support, it solved many real problems.
- Hierarchical data representation
- Schema validation
- Powerful XQuery language
- Integration with enterprise systems
For complex documents it was a big step forward.
But there was a price.
The Hidden Cost of XML
XML storage is expensive in several ways.
- Large storage footprint
- Expensive parsing
- Complex execution plans
- Heavy indexing structures
Even when indexed, XML queries can consume significantly more CPU than equivalent relational or JSON queries.
A Simple Example
Imagine storing order information inside XML.
DECLARE @x XML =
'
5000
250
';
SELECT
@x.value('(/Order/CustomerID)[1]', 'INT');
Looks clean.
But under the hood SQL Server must parse the XML structure and evaluate the XQuery expression.
This is CPU intensive.
Benchmark Scenario
Let's simulate a realistic workload.
- 1 million rows
- Customer data stored inside XML
- Frequent filtering by CustomerID
Test query:
SELECT *
FROM Orders_XML
WHERE OrderData.value('(/Order/CustomerID)[1]', 'INT') = 5000;
Results
- Execution Time: 4200 ms
- CPU Time: High
- Execution Plan: XML operators + internal shredding
Now Compare With JSON
The same data stored in JSON.
SELECT *
FROM Orders_JSON
WHERE JSON_VALUE(OrderJson, '$.CustomerID') = 5000;
Add a computed column index:
ALTER TABLE Orders_JSON
ADD CustomerID_JSON AS JSON_VALUE(OrderJson, '$.CustomerID');
CREATE INDEX IX_Orders_JSON_Customer
ON Orders_JSON(CustomerID_JSON);
Results
- Execution Time: 120 ms
- CPU Time: Much lower
- Execution Plan: Simple index seek
That's more than 30x faster.
Storage Impact
Another hidden problem is storage overhead.
- XML Primary Index can increase storage by 50–100%
- JSON computed column indexes add minimal overhead
On large systems this becomes extremely important.
When XML Still Makes Sense
To be fair, XML still has strong use cases:
- Very complex hierarchical documents
- Strict schema validation
- XQuery-heavy workloads
- Integration with legacy enterprise systems
But these scenarios are much rarer than developers think.
The Real Problem
The real issue is not XML itself.
It's using XML when you don't actually need it.
Many systems store simple key-value attributes inside XML documents.
That is the worst possible scenario:
- high parsing cost
- complex queries
- unnecessary storage overhead
The Real Lesson
Performance tuning is about choosing the right structure.
Not the most "enterprise-looking" one.
XML is powerful.
But power always comes with cost.
Final Takeaways
- XML queries are CPU intensive.
- XML indexes are heavy.
- JSON indexing is often faster and simpler.
- Use XML only when its features are truly required.
- Always benchmark before choosing a storage format.
And remember…
No SQL Server was harmed during these benchmarks 😉
See you in the next deep dive — where we keep pushing SQL Server to its limits

Comments
Post a Comment