Creating a row context with iterators
You learned that DAX automatically creates a row context when you define a calculated column. In that case, the engine evaluates the DAX expression on a row-by-row basis. Now, it is time to learn how to create a row context inside a DAX expression by using iterators.
You might recall from Chapter 2, “Introducing DAX,” that all the X-ending functions are iterators, that is, they iterate over a table and evaluate an expression for each row, finally aggregating the results using different algorithms. For example, look at the following DAX expression:
[IncreasedSales] := SUMX ( Sales, Sales[SalesAmount] * 1.1 )
SUMX is an iterator, it iterates the Sales table and, for each row of the table, it evaluates the sales amount adding 10 percent to its value, finally returning the sum of all these values. In order to evaluate the expression for each row, SUMX creates a row context on the Sales table and uses it during the iteration. DAX evaluates the inner expression (the second parameter of SUMX) in a row context containing the currently iterated row.
It is important to note that different parameters of SUMX use different contexts during the full evaluation flow. Let’s look closer at the same expression:
= SUMX ( Sales, ← External contexts Sales[SalesAmount] * 1.1 ← External contexts + new Row Context )
The first parameter, Sales, is evaluated using the context coming from the caller (for example, it might be a pivot table cell, another measure, or part of a query), whereas the second parameter (the expression) is evaluated using both the external context plus the newly created row context.
All iterators behave in the same way:
- Create a new row context for each row of the table received as the first parameter.
- Evaluate the second parameter inside the newly created row context (plus any other context which existed before the iteration started), for each row of the table.
- Aggregate the values computed during step 2.
It is important to remember that the original contexts are still valid inside the expression: Iterators only add a new row context; they do not modify existing ones in any way. This rule is usually valid, but there is an important exception: If the previous contexts already contained a row context for the same table, then the newly created row context hides the previously existing row context. We are going to discuss this in more detail in the next section.
Using the EARLIER function
The scenario of having many nested row contexts on the same table might seem very rare, but, in reality, it happens quite often. Let’s see the concept with an example. Imagine you want to count, for each product, the number of other products with a higher price. This will produce a sort of ranking of the product based on price.
To solve this exercise, we use the FILTER function, which you learned in the previous chapter. As you might recall, FILTER is an iterator that loops through all the rows of a table and returns a new table containing only the ones that satisfy the condition defined by the second parameter. For example, if you want to retrieve the table of products with a price higher than US$100, you can use:
= FILTER ( Product, Product[UnitPrice] > 100 )
Now, let’s go back to our original example: creating a calculated column that counts the number of products that have a higher price than the current one. If you would name the price of the current product PriceOfCurrentProduct, then it is easy to see that this pseudo-DAX formula would do what is needed:
Product[UnitPriceRank] = COUNTROWS ( FILTER ( Product, Product[UnitPrice] > PriceOfCurrentProduct ) )
FILTER returns only the products with a price higher than the current one and COUNTROWS counts those products. The only remaining issue is a way to express the price of the current product, replacing PriceOfCurrentProduct with a valid DAX syntax. With “current,” we mean the value of the column in the current row when DAX computes the column. It is harder than you might expect.
You define this new calculated column inside the Product table. Thus, DAX evaluates the expression inside a row context. However, the expression uses a FILTER that creates a new row context on the same table. In fact, Product[UnitPrice] used in the fifth row of the previous expression is the value of the unit price for the current row iterated by FILTER - our inner iteration. Therefore, this new row context hides the original row context introduced by the calculated column. Do you see the issue? You want to access the current value of the unit price but not use the last introduced row context. Instead, you want to use the previous row context, that is, the one of the calculated column.
DAX provides a function that makes it possible: EARLIER. EARLIER retrieves the value of a column by using the previous row context instead of the last one. So you can express the value of PriceOfCurrentProduct using EARLIER(Product[UnitPrice]).
EARLIER is one of the strangest functions in DAX. Many users feel intimidated by EARLIER, because they do not think in terms of row contexts and they do not take into account the fact that you can nest row contexts by creating multiple iterations over the same table. In reality, EARLIER is a very simple function that will be useful many times. The following code finally solves the scenario:
Product[UnitPriceRank] = COUNTROWS ( FILTER ( Product, Product[UnitPrice] > EARLIER ( Product[UnitPrice] ) ) ) + 1
In Figure 4-8 you can see the calculated column defined in the Product table, which has been sorted using Unit Price in a descending order.
FIGURE 4-8 UnitPriceRank is a useful example of how EARLIER is useful to navigate in nested row contexts.
Because there are fourteen products with the same unit price, their rank is always one; the fifteenth product has a rank of 15, shared with other products with the same price. We suggest you study and understand this small example closely, because it is a very good test to check your ability to use and understand row contexts, how to create them using iterators (FILTER, in this case), and how to access values outside of them through the usage of EARLIER.
Before leaving this example, it is worth noting that, if you want to transform this value into a better ranking (that is, a value that starts with 1 and grows of one, creating a sequence 1, 2, 3...) then counting the prices instead of counting the products is sufficient. Here, the VALUES function, which you learned in the previous chapter, comes to help:
Product[UnitPriceRankDense] = COUNTROWS ( FILTER ( VALUES ( Product[UnitPrice] ), Product[UnitPrice] > EARLIER ( Product[UnitPrice] ) ) ) + 1
In Figure 4-9 you can see the new calculated column.
FIGURE 4-9 UnitPriceRankDense shows a better ranking, because it counts the prices, not the products.
We strongly suggest you learn and understand EARLIER thoroughly, because you will use it very often. Nevertheless, it is important to note that variables can be used—in many scenarios—to avoid the use of EARLIER. Moreover, a careful use of variables makes the code much easier to read. For example, you can compute the previous calculated column using this expression:
Product[UnitPriceRankDense] = VAR CurrentPrice = Product[UnitPrice] RETURN COUNTROWS ( FILTER ( VALUES ( Product[UnitPrice] ), Product[UnitPrice] > CurrentPrice ) ) + 1
In this final example, using a variable, you store the current unit price in the CurrentPrice variable, which you use later to perform the comparison. Giving a name to the variable, you make the code easier to read, without having to traverse the stack of row contexts every time you read the expression to make sense of the evaluation flow.