How to read XML file from SQL Server. The OPENXML statement and a little bit of Datascience
Hi Guys,
Welcome back!
Would you like to learn how to read XML file from SQL Server?
Would you like to learn how to use the OPENXML statement?
Today we will do also some datascience.
So what are you waiting for? …Follow me!
As already mentioned many data are today memorized into an XML file.
For this example and in order to do some datascience in this post I will use data from Arpae
At address https://www.arpae.it/v2_rete_di_monitoraggio.asp?p=BO&s=7000027&idlivello=134&w=8#param you will find "air quality" data from the SAN PIETRO CAPOFIUME station.
SAN PIETRO CAPOFIUME station measures continuously (every hour) these parameters: NO2 (Nitrogen dioxide); O3 (Ozone); PM10; PM2.5
For this example we are interested to check the progress of the NO2 before and after the lockdown due to the coronavirus!
Inside XML file data are stored in this way:
So, how to read them?
Now we can read data in a clear way:
for example we can group by day through this Query:
And plot data:
From the graph you can see clearly that NO2 from January 2020 it has been decreasing.
In February and in March NO2 has halved again.
This, could be, due to minor vehicles running on the streets due to the lockdown.
And we also did a little datascience!
I hope you found this post useful and interesting.
If so don't forget to subscribe to this blog.
we are preparing many other interesting topics
That all for today
See you soon!
Luca
Luca Biondi @ SQLServerPerformance blog 2020!
Previuos post: From SQLServer to ML, a first little step into machine learning
Welcome back!
Would you like to learn how to read XML file from SQL Server?
Would you like to learn how to use the OPENXML statement?
Today we will do also some datascience.
So what are you waiting for? …Follow me!
Introduction
XML files are very popular nowadays.
They are nothing more than text files with a structure established by means of tags.
An example of an XML file could be this:
<ROOT>
<ROW
STAZIONE="san
pietro capofiume"
DATAINIZIO="31/12/2018 23.00.00"
DATAFINE="01/01/2019"
STRUMENTO="api 200e"
NO2="28"
UM="ug/m3"
F1=""
F2="1"
F3="1"
F4="1"
/>
<ROW
STAZIONE="san
pietro capofiume"
DATAINIZIO="01/01/2019"
DATAFINE="01/01/2019 1.00.00"
STRUMENTO="api 200e"
NO2="25"
UM="ug/m3"
F1=""
F2="1"
F3="1"
F4="1"
/>
</ROOT>
As already mentioned many data are today memorized into an XML file.
For this example and in order to do some datascience in this post I will use data from Arpae
At address https://www.arpae.it/v2_rete_di_monitoraggio.asp?p=BO&s=7000027&idlivello=134&w=8#param you will find "air quality" data from the SAN PIETRO CAPOFIUME station.
SAN PIETRO CAPOFIUME station measures continuously (every hour) these parameters: NO2 (Nitrogen dioxide); O3 (Ozone); PM10; PM2.5
For this example we are interested to check the progress of the NO2 before and after the lockdown due to the coronavirus!
Inside XML file data are stored in this way:
<ROOT>
<ROW
STAZIONE="san
pietro capofiume"
DATAINIZIO="31/12/2018 23.00.00"
DATAFINE="01/01/2019"
STRUMENTO="api 200e"
NO2="28"
UM="ug/m3"
F1=""
F2="1"
F3="1"
F4="1"
/>
<ROW
STAZIONE="san
pietro capofiume"
DATAINIZIO="01/01/2019"
DATAFINE="01/01/2019 1.00.00"
STRUMENTO="api 200e"
NO2="25"
UM="ug/m3"
F1=""
F2="1"
F3="1"
F4="1"
/>
</ROOT>So, how to read them?
Read an XML file from SQL Server
The instruction that read a XML is the OPENXML.
The basic sintax is pretty simple:
SELECT * FROM OPENXML(@docHandle, N'/ROOT')
So, how to use the OPENXML command?
I wrote this simply procedure to read XML data and write it into a SQL Server table called TABLEDATA.
DECLARE @docHandle
int;
DECLARE @xmlDocument
nvarchar(max);
SET @xmlDocument
=
N'<ROOT>
<ROW
STAZIONE="san
pietro capofiume"
DATAINIZIO="31/12/2018 23.00.00"
DATAFINE="01/01/2019"
STRUMENTO="api 200e"
NO2="28"
UM="ug/m3"
F1=""
F2="1"
F3="1"
F4="1"
/>
<ROW
STAZIONE="san
pietro capofiume"
DATAINIZIO="01/01/2019"
DATAFINE="01/01/2019 1.00.00"
STRUMENTO="api 200e"
NO2="25"
UM="ug/m3"
F1=""
F2="1"
F3="1"
F4="1"
/>
</ROOT>'
EXEC
sp_xml_preparedocument
@docHandle
OUTPUT,
@xmlDocument;
-- Use OPENXML to provide
rowset consisting of customer data.
SELECT
*
INTO TABLEDATA
FROM
OPENXML(@docHandle,
N'/ROOT')
Once data are inserted into a SQL Server table we can analyze it.
Watching at the content of TABLEDATA it is clear that we need some query to reorganize data:
For example:
Select cast(tt1.text as varchar) as DataInizio, cast(tt2.text as varchar) as NO2
into dati2
From
(
(select t2.parentid,t2.localname,t1.text from TABLEDATA t1
join TABLEDATA t2 on t1.parentid = t2.id
where t1.text is not null and t2.localname in ('DATAINIZIO')) tt1
join
(select t2.parentid,t2.localname,t1.text from TABLEDATA t1
join TABLEDATA t2 on t1.parentid = t2.id
where t1.text is not null and t2.localname in ('NO2')) tt2 on tt1.parentid = tt2.parentid
)
Now we can read data in a clear way:
for example we can group by day through this Query:
Select Left(DataInizio,10) as Giorno, sum(cast(NO2 as float)) as NO2
From dati2
Group by Left(DataInizio,10)
Order by cast(Left(DataInizio,10) as datetime)
And plot data:
From the graph you can see clearly that NO2 from January 2020 it has been decreasing.
In February and in March NO2 has halved again.
This, could be, due to minor vehicles running on the streets due to the lockdown.
Conclusion
This time we have seen how read an XML throgh SQL Server.And we also did a little datascience!
I hope you found this post useful and interesting.
If so don't forget to subscribe to this blog.
we are preparing many other interesting topics
That all for today
See you soon!
Luca
Luca Biondi @ SQLServerPerformance blog 2020!
Previuos post: From SQLServer to ML, a first little step into machine learning
thank you for info
ReplyDeletestay safe #covid-19
see more: http://coronavirusnpl.com
Thank you to you glad you find useful the post!
DeleteLuca