Learn SQL Server Window functions and Running Totals in five minutes
Hello friends,
Welcome back!
Someone asked me about this topic so today we have a little bit of T-SQL theory.
As usually I will do my best to explain concepts in the clearest way.
We will speak about Window functions (the over clause) and running totals!
I hope you enjoy it!
Ready? Let’s go!
What are window functions?
To understand the logic of window functions, we need to take a step back by noticing that the functions, the ones we have always seen, perform their calculations for each row of the result set.
Window functions, on the other hand, have the particularity of operating on a subset of the results (commonly called "window" or “partition”). The returned value is relative to the subset of the data in the sense that it is calculated using values from the rowset extracted from that window.
It is important to note that unlike a GROUP BY operator in which the rows of results are aggregated, using the window functions all the rows are kept.
The Syntax
A window is defined, through the OVER clause, according to the following syntax:
[function]
OVER (
PARTITION BY [window partitioning parameters]
ORDER BY [window sorting parameters]
[window limits]
)
Which includes 3 parameters:
- The PARTITION BY clause defines how the data is partitioned within the window.
- The ORDER BY clause defines how the data are sorted within each window.
- The ROWS specification further restricts the data contained in the window.
Window functions are the last set of operations performed in a Query except for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the window functions are processed. Therefore, window functions can only appear in the select list or the ORDER BY clause.
The functions we can apply to window functions
There are 3 types of functions we can apply:
Aggregation functions:
- APPROX_COUNT_DISTINCT
- AVG
- CHECKSUM_AGG
- COUNT
- COUNT_BIG
- GROUPING
- GROUPING_ID
- MAX
- MIN
- STDEV
- STDEVP
- STRING_AGG
- SUM
- VAR
- VARP
Ranking functions:
- DENSE_RANK
- NTILE
- RANK
- ROW_NUMBER
Analytical functions:
- CUME_DIST
- FIRST_VALUE
- LAG
- LAST_VALUE
- LEAD PERCENT_RANK
- PERCENTILE_CONT PERCENTILE_DISC
Examples
For the example we will use our master/detail OrdTes and OrdRig tables
Create table OrdTes (id integer identity(1,1), numdoc int)
Create table OrdRig (id integer identity(1,1), idordtes int, qta1 float)
Insert into OrdTes select 1
Insert into OrdRig (idordtes,Qta1) values (1,10)
Insert into OrdRig (idordtes,Qta1) values (1,20)
Insert into OrdRig (idordtes,Qta1) values (1,30)
1) Obtain the minimum and maximum value for each IDORDTES
Important: let's not lose the detail of the values as it would happen with a GROUP BY
SELECT
IDORDTES,
QTA1,
MIN (QTA1) OVER (PARTITION BY IDORDTES) AS MIN_QTA_PER_IDORDTES,
MAX (QTA1) OVER (PARTITION BY IDORDTES) AS MAX_QTA_PER_IDORDTES
FROM ORDRIG
ORDER BY IDORDTES
2) Current line number for the window
SELECT
IDORDTES,
QTA1,
ROW_NUMBER () OVER (PARTITION BY IDORDTES ORDER BY IDORDTES) AS CURRENT_ROW_NUMBER
FROM ORDRIG
ORDER BY IDORDTES
The ROW_NUMER function returns the current row number within the partition. It needs an ordering and therefore the ORDER BY clause
3) Value of the previous and next row
Through the two analytical functions LAG and LEAD it is possible to obtain the previous or subsequent values within our partition -> this can avoid having us write sub queries.
SELECT
E.QTA1,
LEAD(QTA1, 1,0) OVER (ORDER BY IDORDTES) AS VAL_RIGA_SUCCESSIVA,
LAG(QTA1, 1,0) OVER (ORDER BY IDORDTES) AS VAL_RIGA_PRECEDENTE
FROM ORDRIG AS E ORDER BY ID
Running totals
We can easily guess what running totals are from the photo below.
Within our window (or partition, as they are synonymous) as we scroll through the rows we add the value of a column.
This is accomplished simply by using the SUM function in a window function.
SELECT
IDORDTES,
QTA1,
SUM(QTA1) OVER (PARTITION BY IDORDTES ORDER BY ID) AS RUNNING_TOTALS
FROM ORDRIG
ORDER BY IDORDTES
Comments
Post a Comment