# Model the data

- By Daniil Maslyuk
- 11/5/2022

- Skill 2.1: Design a data model
- Skill 2.2: Develop a data model
- Skill 2.3: Create model calculations by using DAX
- Skill 2.4: Optimize model performance
- Chapter summary
- Thought experiment
- Thought experiment answers

## Skill 2.3: Create model calculations by using DAX

You used some DAX earlier in the chapter to create calculated tables and calculated columns as well as configure row-level security. In practice, DAX is most often used to create measures in Power BI.

Writing your own formulas is an important skill that allows you to perform much more sophisticated analysis based on your data compared to not using DAX.

In this section, we start by reviewing DAX fundamentals; then we look at `CALCULATE`, one of the most important functions in DAX, specifically in Time Intelligence or time-related calculations, which we review separately.

DAX can help you replace some columns with measures, allowing you to reduce the data model size. Not all DAX formulas need to be complex, and we review some basic statistical functions in this section as well.

### Create basic measures by using DAX

Although many things can be computed by using calculated columns, in most cases it’s preferable to write measures, because they don’t increase the model size. Additionally, some calculations are simply not possible with calculated columns. For example, to calculate a ratio dynamically, you need to write a measure.

As you saw earlier, quick measures already allow you to perform basic calculations without writing DAX yourself. In this section, you start using DAX to build complex measures.

It’s important to understand that Power BI allows you to aggregate columns in visuals without using measures, a practice sometimes called *implicit measures*. These can be useful when you want to quickly test how a visual might look or to perform a quick analysis on a column. However, it’s always best practice to create *explicit measures* by using DAX—even with trivial calculations such as `SUM`. Here are some reasons it’s preferable to create measures yourself:

Implicit measures may provide unexpected results in some cases due to the Summarize by column property. For example, if you have a column that contains product prices and Power BI sets the summarization to

`SUM`, then dragging the column in a visual will not produce meaningful results. Although you can change the summarization in the visual, following this approach means that you need to pay attention to this property every time you use implicit measures.Explicit measures can be reused in other measures. This is beneficial because you can write less code, which saves time and improves the maintainability of your data model.

Implicit measures cannot leverage inactive relationships.

Implicit measures are not supported by calculation groups.

Measures are different from calculated columns in a few ways. The main difference is that you can see the results of a calculated column immediately after defining the calculation, whereas you can’t see the results of a measure until you use it in a visual. This behavior allows measures to return different results depending on filters and where they’re used.

Another difference between calculated columns and measures is that calculated column formulas apply to each row of a table, whereas measures work on columns and tables, not specific rows. Therefore, measures most often use aggregation functions in DAX.

There are a few ways to create a measure in Power BI Desktop. Here’s one way:

Go to the

**Report**view.In the

**Fields**pane, right-click a table in which you want to create a new measure.Select

**New measure**.Enter the measure formula and press

**Enter**.

You can also create a measure by selecting **New measure** on the **Home** ribbon, but you have to make sure you’ve got the right table selected in the **Fields** pane; otherwise, your measure may not be created in the correct table. If you do create a measure in the wrong table, instead of re-creating the measure you can move it by performing the following steps:

Go to the

**Report**view.In the

**Fields**pane, select the measure you want to move.On the

**Measure tools**ribbon, select the table your measure should be stored in from the**Home table**dropdown list.

For example, to compute the total profit of Wide World Importers, use the following measure formula:

Total Profit = SUM(Sale[Profit])

You can compute total sales, excluding tax, by using the following measure formula:

Total Sales Excluding Tax = SUM(Sale[Total Excluding Tax])

If you want to compute the profit margin percentage, there are two ways of doing it. You could use this:

Profit % = DIVIDE( SUM(Sale[Profit]), SUM(Sale[Total Excluding Tax]) )

However, this approach involves repeating your own code, which is undesirable because formulas become more difficult to maintain. You can avoid this issue if you reference the measures you created previously:

Profit % = DIVIDE( [Total Profit], [Total Sales Excluding Tax] )

When you’re referencing measures, it’s best practice to not use table names in front of them. Unlike column names, measure names are unique; different tables may have the same column names, but it’s not possible to have measures that share the same name.

Another feature of DAX that allows you to avoid repeating yourself is variables. Think of a variable as a calculation within a measure. For instance, if you want to avoid showing zeros in your visuals, you could write a measure as follows:

Total Dry Items Units Sold = IF( SUM(Sale[Total Dry Items]) <> 0, SUM(Sale[Total Dry Items]) )

By using a variable, you can avoid calling `SUM` twice:

Total Dry Items Units Sold = VAR TotalDryItems = SUM(Sale[Total Dry Items]) VAR Result = IF( TotalDryItems <> 0, TotalDryItems ) RETURN Result

Variables are especially useful when you want to store computationally expensive values, because variables are evaluated no more than once. As you’ll see later in this chapter, you can use many variables within the same formula.

### Use CALCULATE to manipulate filters

Earlier in this chapter, you saw that the `CALCULATE` function can be used to alter relationships when paired with other DAX measures. The `USERELATIONSHIP` function with `CALCULATE` can activate inactive relationships, and `CROSSFILTER` with `CALCULATE` can change the filter direction.

The `CALCULATE` function also allows you to alter the filter context under which measures are evaluated; you can add, remove, or update filters, or you can trigger context transition. We cover row context, filter context, and context transition in more detail later in this chapter.

`CALCULATE` accepts a scalar expression as its first parameter, and subsequent parameters are filter arguments. Using `CALCULATE` with no filter arguments is only useful for context transition.

#### Adding filters

`CALCULATE` allows you to add filters in several formats. To calculate profit for the New England sales territory, you can write a measure that you can read as “Calculate the Total Profit where the Sales Territory is New England”:

New England Profit = CALCULATE( [Total Profit], City[Sales Territory] = "New England" )

Importantly, you’re not limited to using one value per filter. You can calculate profit for New England, Far West, and Plains:

New England, Far West, and Plains Profit = CALCULATE( [Total Profit], City[Sales Territory] IN {"New England", "Far West", "Plains"} )

You can specify filters for different columns at once too, which are combined by using the `AND` DAX function. For example, you can calculate profit in New England in 2020 that reads as “Calculate the Total Profit where the Sales Territory is New England and the Year is 2020”:

New England Profit 2020 = CALCULATE( [Total Profit], City[Sales Territory] = "New England", 'Date'[Year] = 2020 )

#### Removing filters

There are several DAX functions that you can use as `CALCULATE` modifiers to ignore filters, one of which is `ALL`. `ALL` can remove filters from:

One or more columns from the same table

An entire table

The whole data model (when

`ALL`is used with no parameters)

For example, you can show profit for all sales territories regardless of any filters on the City[Sales Territory] column:

Profit All Sales Territories = CALCULATE( [Total Profit], ALL(City[Sales Territory]) )

If you create a table that shows the new measure alongside Total Profit by Sales Territory, you get the results shown in Figure 2-20.

**FIGURE 2-20** Total Profit and Profit All Sales Territories by Sales Territory

Note that the new measure displays the same value for any sales territory, which is the total of all sales territories combined regardless of sales territory.

#### Updating filters

When you specify a filter such as `City[Sales Territory] = "New England"`, it’s an abbreviated way that corresponds to the following filter:

FILTER( ALL(City[Sales Territory]), City[Sales Territory] = "New England" )

By adding this filter, you are ignoring a filter by using `ALL`, and you’re adding a filter at the same time. This allows you to filter for New England regardless of the selected sales territory.

If you create a table that shows Total Profit and New England Profit by Sales Territory, the result should look like Figure 2-21.

**FIGURE 2-21** Total Profit and New England Profit by Sales Territory

When you have Sales Territory on rows, each row from the Total Profit column is filtered for a single sales territory and the Total row shows values for all sales territories. In contrast, by using the measure above in the New England Profit column, you are filtering regardless of the current sales territory, showing only the New England Profit.

#### Context transition

Another important function of `CALCULATE` is context transition, which refers to transitioning from row context to filter context.

In DAX, there are two evaluation contexts:

**Row context**This context can be understood as “the current row.” Row context is present in calculated columns and iterators. Iterators are functions that take a table and go row by row, evaluating an expression for each row. For example,`FILTER`is an iterator; it takes a table, and for each row, it evaluates a filter condition. Those rows that satisfy the condition are included in the result of`FILTER`.**Filter context**This context can be understood as “all applied filters.” Filters can come from slicers, from the Filter pane, or by selecting a visual element. Filters can also be applied programmatically by using DAX.

To review context transition, let’s create a sample table in the data model:

On the

**Home**ribbon, select**Enter data**.Enter

**Sample**in the**Name**box.Enter the data shown in Figure 2-22.

**FIGURE 2-22**Entering dataSelect

**Load**.

Now that you have the table, you can add two calculated columns to it to see the effect of context transition:

Go to the

**Data**view.Select the

**Sample**table in the**Fields**pane.Create a calculated column with the following formula:

`Sum Number = SUM('Sample'[Number])`Create another calculated column with the following formula:

`Calculate Sum Number = CALCULATE(SUM('Sample'[Number]))`

The result should look like Figure 2-23.

**FIGURE 2-23** Calculated columns in the Sample table

`SUM`, as an aggregation function, uses filter context. Because there are no filters in the data model—there are no visuals, and you’re not adding any filters by using DAX—`SUM` aggregates the whole Number column, so the result in the Sum Number column is 6 regardless of the row.

On the other hand, the Calculate Sum Number column uses the same formula as Sum Number, but importantly has been wrapped in `CALCULATE`. `CALCULATE` automatically performs context transition, so the result is different from using the `SUM` function alone. Context transition takes all values from all other columns and uses them as filters. Therefore, for the first row, you aggregate the Number column, where:

**Sample[Letter]**is**A****Sample[Number]**is**1****Sample[Sum Number]**is**6**

Where the sum of 1 is equal to 1, since there’s only one such row that meets these filters, you get 1. Separately for row 2, the sum of 2 equals 2, and for row 3, the sum of 3 equals 3. Context transition can be made even clearer by modifying the Sample table slightly as follows:

On the

**Home**ribbon, select**Transform data**.Select the

**Sample**query.Select the cog wheel in the

**Source**step.Change the third row to match the second row, as shown in Figure 2-24.

**FIGURE 2-24**Modified Sample tableSelect

**OK**.On the

**Home**ribbon of Power Query Editor, select**Close & Apply**.If you now look at the Sample table in the

**Data**view, the result will look like Figure 2-25.

**FIGURE 2-25** Sample table after update

Although the first row is calculated as you saw in the previous example, the second and third rows are now both showing 4. Intuitively, you could expect to see 2 and 2 in each row, though you’re getting 4 and 4. This is because for each row, due to context transition triggered by `CALCULATE`, you’re summing the Number column, where

**Sample[Letter]**is**B****Sample[Number]**is**2****Sample[Sum Number]**is**5**

Because there are two such rows, you get 2 + 2 = 4 in both rows.

### Implement Time Intelligence using DAX

It is common for business users to want to aggregate metrics—for example, revenue—across time, such as year-to-date revenue for a certain date, or prior-year revenue for the comparable period. Fortunately, DAX has a family of functions, referred to as Time Intelligence, that facilitate such calculations.

All Time Intelligence functions require a calendar table that has a date type column with unique values. If the date column is not part of a relationship, the calendar table must be marked as a date table, which can be done as follows:

Go to the

**Report**or**Data**view.Select the calendar table in the

**Fields**pane.On the

**Table tools**ribbon, select**Mark as date table**>**Mark as date table**.Select the date column from the

**Date column**dropdown list.Select

**OK**.

Most Time Intelligence functions return tables that can be used as filters in `CALCULATE`. For example, you can use the `DATESYTD` function to calculate a year-to-date amount as follows:

Profit YTD = CALCULATE( [Total Profit], DATESYTD('Date'[Date]) )

You can also combine Time Intelligence functions. For example, to calculate year-to-date profit for the previous year, use the following formula:

Profit PYTD = CALCULATE( [Profit YTD], DATEADD('Date'[Date], -1, YEAR) )

Some Time Intelligence functions, such as `DATESYTD`, can accommodate fiscal years. For example, if you had a fiscal year ending on June 30, you could calculate profit year-to-date for the fiscal year as follows:

Profit FYTD = CALCULATE( [Total Profit], DATESYTD('Date'[Date], "30-6") )

The Total Profit, Profit YTD, Profit PYTD, and Profit FYTD measures can be seen together in Figure 2-26.

**FIGURE 2-26** Time Intelligence calculations

Notice how the Profit YTD measure shows the cumulative total profit within each year. The Profit PYTD measure shows the same values as Profit YTD one year before. Profit FYTD shows the cumulative total profit for fiscal years, resetting on July 1 of each year.

### Replace implicit measures with explicit measures

It is sometimes possible to replace some numeric columns with measures, which can reduce the size of the data model. In our Wide World Importers example, there are several columns that could be replaced with measures.

For example, the Total Chiller Items and Total Dry Items columns in the Sale table show quantity of chiller and dry items, respectively. Essentially, these columns show filtered quantities depending on whether an item is a chiller or a dry item.

Before you replace the two columns with measures, create the following measure, which you’ll reference and build upon later:

Total Quantity = SUM(Sale[Quantity])

You can now create the following two measures and use them instead of columns:

Total Chiller Items (Measure)= CALCULATE( [Total Quantity], 'Stock Item'[Is Chiller Stock] = TRUE )Total Dry Items (Measure)= CALCULATE( [Total Quantity], 'Stock Item'[Is Chiller Stock] = FALSE )

If you remove the Total Chiller Items and Total Dry Items columns from the model, you’ll make it smaller and more efficient.

Another example of a column that can be replaced by a measure is Total Including Tax from the Sale table. Since Total Excluding Tax and Tax Amount added together equals Total Including Tax, you can use the following measure instead:

Total Including Tax (Measure) = SUMX( Sale, Sale[Total Excluding Tax] + Sale[Tax Amount] )

Again, once you have the measure, removing the Total Including Tax column would reduce the size of the data model.

### Use basic statistical functions

As mentioned previously, it’s best practice to create explicit measures even for basic calculations such as `SUM`, because you can build upon them to create more complex measures. You’ve already used `SUM` in our previous examples; here are several other basic statistical measures that are frequently used:

`AVERAGE``MEDIAN``COUNT``DISTINCTCOUNT``MIN``MAX`

All these functions take a column as a reference and produce a scalar value. In addition, every function except `DISTINCTCOUNT` has an equivalent iterator function with the X suffix—for instance, `SUMX` is the iterator counterpart of `SUM`. Iterators take two parameters: a table to iterate through, and an expression to evaluate for each row. The evaluated results are then aggregated according to the base function; for example, `SUMX` will sum the results. When you’re learning the difference, it can be helpful to create sample tables similar to the examples shown earlier to visually compare the nuances of the different functions.

### Create semi-additive measures

In general, there are three kinds of measures:

**Additive**These measures are aggregated by using the`SUM`function across any dimensions. A typical example is revenue, which can be added across different product categories, cities, and dates, as well as other dimensions. Revenue of all months within a year, when added together, equals the total year revenue.**Semi-additive**These measures can be added across some but not all dimensions. For example, inventory counts can be added across different product categories and cities, but not dates; if you had five units yesterday and two units today, that doesn’t mean you’ll have seven units tomorrow. On the other hand, if you have five units in Sydney and two units in Melbourne, this means you’ve got seven units in the two cities in total.**Non-additive**These measures cannot be added across any dimensions. For instance, you cannot add up the average price across any dimension, because the result would not make any practical sense. If the average sale price in Sydney was $4.50, and it was $3.50 in Melbourne, you cannot say that across both cities, the average price was $8.00 or even $4.00 because the number of units sold could be very different.

In this section, we focus on semi-additive measures. There are several ways to write a semi-additive measure, and the correct way for you depends on your business requirements. Let’s say your business is interested in inventory counts, and you have the data model shown in Figure 2-27.

**FIGURE 2.27** Inventory data model

If you have inventory figures for all dates of interest in your data, you can write the following measure:

Inventory Balance = CALCULATE( SUM(Inventory[Balance]), LASTDATE('Date'[Date]) )

In addition to `LASTDATE` and its sister function `FIRSTDATE`, there are some DAX functions that can help you retrieve the opening or closing balance for different time periods:

`OPENINGBALANCEMONTH``OPENINGBALANCEQUARTER``OPENINGBALANCEYEAR``CLOSINGBALANCEMONTH``CLOSINGBALANCEQUARTER``CLOSINGBALANCEYEAR`

The functions that start with `CLOSING` evaluate an expression for the last date in the period, and the functions that start with `OPENING` evaluate an expression for one day before the first date in the period. This means that the opening balance for February 1 is the same as the closing balance for January 31.

For example, you can calculate the opening month balance for inventory as follows:

Inventory Opening Balance Month = OPENINGBALANCEMONTH( SUM(Inventory[Balance]), 'Date'[Date] )

The date-based functions listed here only work if you have data for all dates of interest. For example, if you’d chosen to use `CLOSINGBALANCEMONTH` but your data ends on May 23, 2022, as is the case for sample data, you’d get a blank value for May 2022. For cases such as this, you can use `LASTNONBLANKVALUE` or `FIRSTNONBLANKVALUE` as shown here:

Inventory Last Nonblank = LASTNONBLANKVALUE( 'Date'[Date], SUM(Inventory[Balance]) )

This measure will show the latest available balance in the current context.

The Inventory Balance, Inventory Opening Balance Month, and Inventory Last Nonblank measures can be seen in Figure 2-28.

**FIGURE 2.28** Inventory measures

Determining which calculation you should use depends on your business requirements—there is no single correct answer that applies to all scenarios. Missing data may mean there’s no inventory, or it may mean that data isn’t captured frequently enough, so the data modeler should understand the underlying data before writing the calculations to ensure the data isn’t represented incorrectly.

### Use quick measures

A measure in Power BI is a dynamic evaluation of a DAX query that will change in response to interactions with other visuals, enabling quick, meaningful exploration of your data. Creating efficient measures will be one of the smartest things you can do to build insightful reports. If you’re new to DAX and writing measures, or you’re wanting to perform quick analysis, you have the option of creating a quick measure. There are several ways to create a quick measure:

Select

**Quick measure**from the**Home**ribbon.Right-click or select the ellipsis next to a table or column in the

**Fields**pane and select**New quick measure**. This method may prefill the quick measure form shown next.If you already use a field in a visual, select the dropdown arrow next to the field in the Values section and select

**New quick measure**. This method also may prefill the quick measure form shown next. If possible, this will add the new quick measure to the existing visualization. You’ll be able to use this measure in other visuals too.

The following calculations are available as quick measures:

Aggregate per category

Average per category

Variance per category

Max per category

Min per category

Weighted average per category

Filters

Filtered value

Difference from filtered value

Percentage difference from filtered value

Sales from new customers

Time Intelligence

Year-to-date total

Quarter-to-date total

Month-to-date total

Year-over-year change

Quarter-over-quarter change

Month-over-month change

Rolling average

Totals

Running total

Total for category (filters applied)

Total for category (filters not applied)

Mathematical operations

Addition

Subtraction

Multiplication

Division

Percentage difference

Correlation coefficient

Text

Star rating

Concatenated list of values

Each calculation has its own description and a list of field wells—you can see an example in Figure 2-29.

**FIGURE 2.29** Quick Measures dialog box

For example, by using quick measures, you can calculate average profit per employee for Wide World Importers as follows:

Ensure the

**Sale**table is selected in the**Fields**pane.Select

**Quick measure**on the**Home**ribbon.From the

**Calculation**dropdown list, select**Average per category**.Drag the

**Profit**column from the**Sale**table to the**Base value**field well.Drag the

**Employee**column from the**Employee**table to the**Category**field well.Select

**OK**.

Once done, you can find the new measure called **Profit average per Employee** in the **Fields** pane.

If you select the new measure, you’ll see its DAX formula:

Profit average per Employee = AVERAGEX( KEEPFILTERS(VALUES('Employee'[Employee])), CALCULATE(SUM('Sale'[Profit])) )

You can modify the formula, if needed. Reading the DAX can be a great way to learn how measures can be written.