Why Most Developers Should Stop Using XML in SQL Server!

Why Most Developers Should Stop Using XML in SQL Server

SQL Server Performance Series – Understanding the Cost of Modern Data Formats

Previous article: JSON vs XML Indexing – Which One Really Performs Better?

 

Hi SQL SERVER guys, today we talk about something that may sound controversial.

XML in SQL Server.

For years it was considered the "enterprise" way to store structured documents.

And it was true but modern workloads are changing.

And due to this reason in many cases XML is now the wrong choice.

Of course, this does not mean XML is useless.

But it does mean that most developers still use it without understanding the real performance cost.

So let's break it down.


Why Developers Loved XML

When SQL Server introduced native XML support, it solved many real problems.

  • Hierarchical data representation
  • Schema validation
  • Powerful XQuery language
  • Integration with enterprise systems

For complex documents it was a big step forward.

But there was a price.


The Hidden Cost of XML

XML storage is expensive in several ways.

  • Large storage footprint
  • Expensive parsing
  • Complex execution plans
  • Heavy indexing structures

Even when indexed, XML queries can consume significantly more CPU than equivalent relational or JSON queries.


A Simple Example

Imagine storing order information inside XML.


DECLARE @x XML =
'
    5000
    250
';

SELECT
    @x.value('(/Order/CustomerID)[1]', 'INT');

Looks clean.

But under the hood SQL Server must parse the XML structure and evaluate the XQuery expression.

This is CPU intensive.


Benchmark Scenario

Let's simulate a realistic workload.

  • 1 million rows
  • Customer data stored inside XML
  • Frequent filtering by CustomerID

Test query:


SELECT *
FROM Orders_XML
WHERE OrderData.value('(/Order/CustomerID)[1]', 'INT') = 5000;

Results

  • Execution Time: 4200 ms
  • CPU Time: High
  • Execution Plan: XML operators + internal shredding

Now Compare With JSON

The same data stored in JSON.


SELECT *
FROM Orders_JSON
WHERE JSON_VALUE(OrderJson, '$.CustomerID') = 5000;

Add a computed column index:


ALTER TABLE Orders_JSON
ADD CustomerID_JSON AS JSON_VALUE(OrderJson, '$.CustomerID');

CREATE INDEX IX_Orders_JSON_Customer
ON Orders_JSON(CustomerID_JSON);

Results

  • Execution Time: 120 ms
  • CPU Time: Much lower
  • Execution Plan: Simple index seek

That's more than 30x faster.


Storage Impact

Another hidden problem is storage overhead.

  • XML Primary Index can increase storage by 50–100%
  • JSON computed column indexes add minimal overhead

On large systems this becomes extremely important.


When XML Still Makes Sense

To be fair, XML still has strong use cases:

  • Very complex hierarchical documents
  • Strict schema validation
  • XQuery-heavy workloads
  • Integration with legacy enterprise systems

But these scenarios are much rarer than developers think.


The Real Problem

The real issue is not XML itself.

It's using XML when you don't actually need it.

Many systems store simple key-value attributes inside XML documents.

That is the worst possible scenario:

  • high parsing cost
  • complex queries
  • unnecessary storage overhead

The Real Lesson

Performance tuning is about choosing the right structure.

Not the most "enterprise-looking" one.

XML is powerful.

But power always comes with cost.


Final Takeaways

  1. XML queries are CPU intensive.
  2. XML indexes are heavy.
  3. JSON indexing is often faster and simpler.
  4. Use XML only when its features are truly required.
  5. Always benchmark before choosing a storage format.

And remember…

No SQL Server was harmed during these benchmarks 😉

See you in the next deep dive — where we keep pushing SQL Server to its limits

Comments

I Post più popolari

Speaking to Sql Server, sniffing the TDS protocol

SQL Server, find text in a Trigger, Stored Procedures, View and Function. Two ways and what ways is better

SQL Server, Avoid that damn Table Spool!