# SQL Windowing

- By Itzik Ben-Gan
- 11/4/2019

## Elements of Window Functions

The specification of a window function appears in the function’s *OVER* clause and involves multiple elements. The three core elements are window partitioning, ordering, and framing. Not all window functions support all elements. As I describe each element, I’ll also indicate which functions support it.

### Window Partitioning

The optional window partitioning element is implemented with a *PARTITION BY* clause and is supported by all window functions. It restricts the window of the current calculation to only those rows from the result set of the query that have the same values in the partitioning columns as in the current row. For example, if your function uses *PARTITION BY custid* and the *custid* value in the current row is *1*, the window partition with respect to the current row is all rows from the result set of the query that have a *custid* value of *1*. If the *custid* value of the current row is *2*, the window partition with respect to the current row is all rows with a *custid* of *2*.

If a *PARTITION BY* clause is not specified, the window is not restricted. Another way to look at it is that in case explicit partitioning wasn’t specified, the default partitioning is to consider the entire result set of the query as one partition.

If it wasn’t obvious, let me point out that different functions in the same query can have different partitioning specifications. Consider the query in Listing 1-5 as an example.

#### Listing 1-5 Query with Two RANK Calculations

SELECT custid, orderid, val, RANK() OVER(ORDER BY val DESC) AS rnkall, RANK() OVER(PARTITION BY custid ORDER BY val DESC) AS rnkcust FROM Sales.OrderValues;

Observe that the first *RANK* function (which generates the attribute *rnkall*) relies on the implied partitioning, and the second *RANK* function (which generates *rnkcust*) uses explicit partitioning by *custid*. Figure 1-5 illustrates the partitions defined for a sample of three results of calculations in the query: one *rnkall* value and two *rnkcust* values.

**FIGURE 1-5** Window partitioning.

The arrows point from the result values of the functions to the window partitions that were used to compute them.

### Window Ordering

The window ordering clause defines the ordering for the calculation, if relevant, within the partition. Interestingly, this clause has a slightly different meaning for different function categories. With ranking functions, ordering is intuitive. For example, when using descending ordering, the *RANK* function returns one more than the number of rows in your respective partition that have a greater ordering value than yours. When using ascending ordering, the function returns one more than the number of rows in the partition with a lower ordering value than yours. Figure 1-6 illustrates the rank calculations from Listing 1-5 shown earlier—this time including the interpretation of the ordering element.

**FIGURE 1-6** Window ordering.

Figure 1-6 depicts the windows of only three of the rank calculations. Of course, there are many more—1,660, to be precise. That’s because there are 830 rows involved, and for each row, two rank calculations are made. What’s interesting to note here is that conceptually it’s as if all those windows coexist simultaneously.

Aggregate window functions have a slightly different meaning for *window ordering*, when specified, compared to *ranking* window functions. With aggregates, contrary to what some might think, ordering has nothing to do with the order in which the aggregate is applied; rather, the ordering element gives meaning to the framing options that I will describe next. In other words, the ordering element is an aid as part of defining which rows to restrict in the window.

### Window Framing

Earlier, I referred to window partitioning as serving a filtering role. Window framing is essentially another filter that further restricts the rows in the window partition. It is applicable to aggregate window functions as well as to three of the offset functions: *FIRST_VALUE*, *LAST_VALUE*, and *NTH_VALUE*. As a reminder, the last is unsupported in SQL Server. Think of this windowing element as defining two end points, or delimiters, in the current row’s partition based on the given ordering, framing the rows that the calculation will apply to.

The framing specification in the standard includes a *ROWS*, *GROUPS* or *RANGE* option that defines the starting row and ending row of the frame, as well as a window frame-exclusion option. SQL Server supports framing, with full implementation of the *ROWS* option, partial implementation of the RANGE option, and no implementation of the *GROUPS* and window frame-exclusion options.

The *ROWS* option allows you to indicate the points in the frame as an offset in terms of the number of rows with respect to the current row, based on the window ordering. The *GROUPS* option is similar to *ROWS*, but you specify an offset in terms of the number of distinct groups with respect to the current group, based on the window ordering. The *RANGE* option is more dynamic, defining the offsets in terms of a difference between the ordering value of the frame point and the current row’s ordering value. The window frame-exclusion option specifies what to do with the current row and its peers in case of ties. This explanation might seem far from clear or sufficient, but I don’t want to get into the details just yet. There will be plenty of that later. For now, I just want to introduce the concept and provide a simple example. Following is a query against the *EmpOrders* view, calculating the running total quantity for each employee and order month:

SELECT empid, ordermonth, qty, SUM(qty) OVER(PARTITION BY empid ORDER BY ordermonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS runqty FROM Sales.EmpOrders;

Observe that the window function applies the *SUM* aggregate function to the *qty* attribute, partitions the window by *empid*, orders the partition rows by *ordermonth*, and frames the partition rows based on the given ordering between unbounded preceding (no low boundary point) and the current row. In other words, the result will be the sum of all prior rows in the frame, inclusive of the current row. This query generates the following output, shown here in abbreviated form:

empid ordermonth qty runqty ------ ---------- ---- ------- 1 2017-07-01 121 121 1 2017-08-01 247 368 1 2017-09-01 255 623 1 2017-10-01 143 766 1 2017-11-01 318 1084 ... 2 2017-07-01 50 50 2 2017-08-01 94 144 2 2017-09-01 137 281 2 2017-10-01 248 529 2 2017-11-01 237 766 ...

Observe how the window specification is as easy to read as plain English. I will provide much more detail about the framing options in Chapter 2.