Introducing DAX

By Alberto Ferrari and Marco Russo
7/20/2019

Contents

Understanding DAX calculations
Understanding calculated columns and measures
Introducing variables
Handling errors in DAX expressions
Formatting DAX code
Introducing aggregators and iterators
Using common DAX functions
Conclusions

In this sample chapter from The Definitive Guide to DAX, authors Russo and Ferrari start talking about the DAX language. Here you learn the syntax of the language, the difference between a calculated column and a measure (also called calculated field, in certain old Excel versions), and the most commonly used functions in DAX.

In this chapter, we start talking about the DAX language. Here you learn the syntax of the language, the difference between a calculated column and a measure (also called calculated field, in certain old Excel versions), and the most commonly used functions in DAX.

Because this is an introductory chapter, it does not cover many functions in depth. In later chapters, we explain them in more detail. For now, introducing the functions and starting to look at the DAX language in general are enough. When we reference features of the data model in Power BI, Power Pivot, or Analysis Services, we use the term Tabular even when the feature is not present in all the products. For example, “DirectQuery in Tabular” refers to the DirectQuery mode feature available in Power BI and Analysis Services but not in Excel.

Understanding DAX calculations

Before working on more complex formulas, you need to learn the basics of DAX. This includes DAX syntax, the different data types that DAX can handle, the basic operators, and how to refer to columns and tables. These concepts are discussed in the next few sections.

We use DAX to compute values over columns in tables. We can aggregate, calculate, and search for numbers, but in the end, all the calculations involve tables and columns. Thus, the first syntax to learn is how to reference a column in a table.

The general format is to write the table name enclosed in single quotation marks, followed by the column name enclosed in square brackets, as follows:

'Sales'[Quantity]

We can omit the single quotation marks if the table name does not start with a number, does not contain spaces, and is not a reserved word (like Date or Sum).

The table name is also optional in case we are referencing a column or a measure within the table where we define the formula. Thus, [Quantity] is a valid column reference, if written in a calculated column or in a measure defined in the Sales table. Although this option is available, we strongly discourage you from omitting the table name. At this point, we do not explain why this is so important, but the reason will become clear when you read Chapter 5, “Understanding CALCULATE and CALCULATETABLE.” Nevertheless, it is of paramount importance to be able to distinguish between measures (discussed later) and columns when you read DAX code. The de facto standard is to always use the table name in column references and always avoid it in measure references. The earlier you start adopting this standard, the easier your life with DAX will be. Therefore, you should get used to this way of referencing columns and measures:

Sales[Quantity] * 2            -- This is a column reference
[Sales Amount] * 2             -- This is a measure reference

You will learn the rationale behind this standard after learning about context transition, which comes in Chapter 5. For now, just trust us and adhere to this standard.

COMMENTS IN DAX

The preceding code example shows comments in DAX for the first time. DAX supports single-line comments and multiline comments. Single-line comments start with either --or //, and the remaining part of the line is considered a comment.

= Sales[Quantity] * Sales[Net Price] -- Single-line comment
= Sales[Quantity] * Sales[Unit Cost] // Another example of single-line comment

A multiline comment starts with /* and ends with */. The DAX parser ignores everything included between these markers and considers them a comment.

= IF (
    Sales[Quantity] > 1,
    /* First example of a multiline comment
       Anything can be written here and is ignored by DAX
    */
    "Multi",
    /* A common use case of multiline comments  is to comment-out a part of
       the existing code
       The next IF statement is ignored because it falls within a multiline comment
        IF (
            Sales[Quantity] = 1,
            "Single",
            "Special note"
        )
    */
    "Single"
)

It is better to avoid comments at the end of a DAX expression in a measure, calculated column, or calculated table definition. These comments might be not visible at first, and they might not be supported by tools such as DAX Formatter, which is discussed later in this chapter.

DAX data types

DAX can perform computations with different numeric types, of which there are seven. Over time, Microsoft introduced different names for the same data types, creating some sort of confusion. Table 2-1 provides the different names under which you might find each DAX data type.

Table 2-1 Data Types

DAX Data Type	Power BI Data Type	Power Pivot and Analysis Services Data Type	Correspondent Conventional Data Type (e.g., SQL Server)	Tabular Object Model (TOM) Data Type
Integer	Whole Number	Whole Number	Integer / INT	int64
Decimal	Decimal Number	Decimal Number	Floating point / DOUBLE	double
Currency	Fixed Decimal Number	Currency	Currency / MONEY	decimal
DateTime	DateTime, Date, Time	Date	Date / DATETIME	dateTime
Boolean	True/False	True/False	Boolean / BIT	boolean
String	Text	Text	String / NVARCHAR(MAX)	string
Variant	-	-	-	variant
Binary	Binary	Binary	Blob / VARBINARY(MAX)	binary

In this book, we use the names in the first column of Table 2-1 adhering to the de facto standards in the database and Business Intelligence community. For example, in Power BI, a column containing either TRUE or FALSE would be called TRUE/FALSE, whereas in SQL Server, it would be called a BIT. Nevertheless, the historical and most common name for this type of value is Boolean.

DAX comes with a powerful type-handling system so that we do not have to worry about data types. In a DAX expression, the resulting type is based on the type of the term used in the expression. You need to be aware of this in case the type returned from a DAX expression is not the expected type; you would then have to investigate the data type of the terms used in the expression itself.

For example, if one of the terms of a sum is a date, the result also is a date; likewise, if the same operator is used with integers, the result is an integer. This behavior is known as operator overloading, and an example is shown in Figure 2-1, where the OrderDatePlusOneWeek column is calculated by adding 7 to the value of the Order Date column.

Sales[OrderDatePlusOneWeek] = Sales[Order Date] + 7

Figure 2-1 Adding an integer to a date results in a date increased by the corresponding number of days.

The result is a date.

In addition to operator overloading, DAX automatically converts strings into numbers and numbers into strings whenever required by the operator. For example, if we use the & operator, which concatenates strings, DAX converts its arguments into strings. The following formula returns “54” as a string:

= 5 & 4

On the other hand, this formula returns an integer result with the value of 9:

= "5" + "4"

The resulting value depends on the operator and not on the source columns, which are converted following the requirements of the operator. Although this behavior looks convenient, later in this chapter you see what kinds of errors might happen during these automatic conversions. Moreover, not all the operators follow this behavior. For example, comparison operators cannot compare strings with numbers. Consequently, you can add one number with a string, but you cannot compare a number with a string. You can find a complete reference here: https://docs.microsoft.com/en-us/power-bi/desktop-data-types. Because the rules are so complex, we suggest you avoid automatic conversions altogether. If a conversion needs to happen, we recommend that you control it and make the conversion explicit. To be more explicit, the previous example should be written like this:

= VALUE ( "5" ) + VALUE ( "4" )

People accustomed to working with Excel or other languages might be familiar with DAX data types. Some details about data types depend on the engine, and they might be different for Power BI, Power Pivot, or Analysis Services. You can find more detailed information about Analysis Services DAX data types at http://msdn.microsoft.com/en-us/library/gg492146.aspx, and Power BI information is available at https://docs.microsoft.com/en-us/power-bi/desktop-data-types. However, it is useful to share a few considerations about each of these data types.

Integer

DAX has only one Integer data type that can store a 64-bit value. All the internal calculations between integer values in DAX also use a 64-bit value.

Decimal

A Decimal number is always stored as a double-precision floating-point value. Do not confuse this DAX data type with the decimal and numeric data type of Transact-SQL. The corresponding data type of a DAX decimal number in SQL is Float.

Currency

The Currency data type, also known as Fixed Decimal Number in Power BI, stores a fixed decimal number. It can represent four decimal points and is internally stored as a 64-bit integer value divided by 10,000. Summing or subtracting Currency data types always ignores decimals beyond the fourth decimal point, whereas multiplication and division produce a floating-point value, thus increasing the precision of the result. In general, if we need more accuracy than the four digits provided, we must use a Decimal data type.

The default format of the Currency data type includes the currency symbol. We can also apply the currency formatting to Integer and decimal numbers, and we can use a format without the currency symbol for a Currency data type.

DateTime

DAX stores dates as a DateTime data type. This format uses a floating-point number internally, wherein the integer corresponds to the number of days since December 30, 1899, and the decimal part identifies the fraction of the day. Hours, minutes, and seconds are converted to decimal fractions of a day. Thus, the following expression returns the current date plus one day (exactly 24 hours):

= TODAY () + 1

The result is tomorrow’s date at the time of the evaluation. If you need to take only the date part of a DateTime, always remember to use TRUNC to get rid of the decimal part.

Power BI offers two additional data types: Date and Time. Internally, they are a simple variation of DateTime. Indeed, Date and Time store only the integer or the decimal part of the DateTime, respectively.

THE LEAP YEAR BUG

Lotus 1-2-3, a popular spreadsheet released in 1983, presented a bug in the handling of the DateTime data type. It considered 1900 as a leap year, even though it was not. The final year in a century is a leap year only if the first two digits can be divided by 4 without a remainder. At that time, the development team of the first version of Excel deliberately replicated the bug, to maintain compatibility with Lotus 1-2-3. Since then, each new version of Excel has maintained the bug for compatibility.

At the time of printing in 2019, the bug is still there in DAX, introduced for backward compatibility with Excel. The presence of the bug (should we call it a feature?) might lead to errors on time periods prior to March 1, 1900. Thus, by design, the first date officially supported by DAX is March 1, 1900. Date calculations executed on time periods prior to that date might lead to errors and should be considered as inaccurate.

Boolean

The Boolean data type is used to express logical conditions. For example, a calculated column defined by the following expression is of Boolean type:

= Sales[Unit Price] > Sales[Unit Cost]

You will also see Boolean data types as numbers where TRUE equals 1 and FALSE equals 0. This notation sometimes proves useful for sorting purposes because TRUE > FALSE.

String

Every string in DAX is stored as a Unicode string, where each character is stored in 16 bits. By default, the comparison between strings is not case sensitive, so the two strings “Power BI” and “POWER BI” are considered equal.

Variant

The Variant data type is used for expressions that might return different data types, depending on the conditions. For example, the following statement can return either an integer or a string, so it returns a variant type:

IF ( [measure] > 0, 1, "N/A" )

The Variant data type cannot be used as a data type for a column in a regular table. A DAX measure, and in general, a DAX expression can be Variant.

Binary

The Binary data type is used in the data model to store images or other nonstructured types of information. It is not available in DAX. It was mainly used by Power View, but it might not be available in other tools such as Power BI.

DAX operators

Now that you have seen the importance of operators in determining the type of an expression, see Table 2-2, which provides a list of the operators available in DAX.

Table 2-2 Operators

Operator Type	Symbol	Use	Example
Parenthesis	( )	Precedence order and grouping of arguments	(5 + 2) * 3
Arithmetic	+ − * /	Addition Subtraction/negation Multiplication Division	4 + 2 5 − 3 4 * 2 4 / 2
Comparison	= <> > >= < <=	Equal to Not equal to Greater than Greater than or equal to Less than Less than or equal to	[CountryRegion] = “USA” [CountryRegion] <> “USA” [Quantity] > 0 [Quantity] >= 100 [Quantity] < 0 [Quantity] <= 100
Text concatenation	&	Concatenation of strings	“Value is” & [Amount]
Logical	&& \|\| IN NOT	AND condition between two Boolean expressions OR condition between two Boolean expressions Inclusion of an element in a list Boolean negation	[CountryRegion] = “USA” && [Quantity]>0 [CountryRegion] = “USA” \|\| [Quantity] > 0 [CountryRegion] IN {“USA”, “Canada”} NOT [Quantity] > 0

Moreover, the logical operators are also available as DAX functions, with a syntax similar to Excel’s. For example, we can write expressions like these:

AND ( [CountryRegion] = "USA", [Quantity] > 0 )
OR ( [CountryRegion] = "USA", [Quantity] > 0 )

These examples are equivalent, respectively, to the following:

[CountryRegion] = "USA" && [Quantity] > 0
[CountryRegion] = "USA" || [Quantity] > 0

Using functions instead of operators for Boolean logic becomes helpful when writing complex conditions. In fact, when it comes to formatting large sections of code, functions are much easier to format and to read than operators are. However, a major drawback of functions is that we can pass in only two parameters at a time. Therefore, we must nest functions if we have more than two conditions to evaluate.

Table constructors

In DAX we can define anonymous tables directly in the code. If the table has a single column, the syntax requires only a list of values—one for each row—delimited by curly braces. We can delimit multiple rows by parentheses, which are optional if the table is made of a single column. The two following definitions, for example, are equivalent:

{ "Red", "Blue", "White" }
{ ( "Red" ), ( "Blue" ), ( "White" ) }

If the table has multiple columns, parentheses are mandatory. Every column should have the same data type throughout all its rows; otherwise, DAX will automatically convert the column to a data type that can accommodate all the data types provided in different rows for the same column.

{
    ( "A", 10, 1.5, DATE ( 2017, 1, 1 ), CURRENCY ( 199.99 ), TRUE ),
    ( "B", 20, 2.5, DATE ( 2017, 1, 2 ), CURRENCY ( 249.99 ), FALSE ),
    ( "C", 30, 3.5, DATE ( 2017, 1, 3 ), CURRENCY ( 299.99 ), FALSE )
}

The table constructor is commonly used with the IN operator. For example, the following are possible, valid syntaxes in a DAX predicate:

'Product'[Color] IN { "Red", "Blue", "White" }

( 'Date'[Year], 'Date'[MonthNumber] ) IN { ( 2017, 12 ), ( 2018, 1 ) }

This second example shows the syntax required to compare a set of columns (tuple) using the IN operator. Such syntax cannot be used with a comparison operator. In other words, the following syntax is not valid:

( 'Date'[Year], 'Date'[MonthNumber] ) = ( 2007, 12 )

However, we can rewrite it using the IN operator with a table constructor that has a single row, as in the following example:

( 'Date'[Year], 'Date'[MonthNumber] ) IN { ( 2007, 12 ) }

Conditional statements

In DAX we can write a conditional expression using the IF function. For example, we can write an expression returning MULTI or SINGLE depending on the quantity value being greater than one or not, respectively.

IF (
    Sales[Quantity] > 1,
    "MULTI",
    "SINGLE"
)

The IF function has three parameters, but only the first two are mandatory. The third is optional, and it defaults to BLANK. Consider the following code:

IF (
    Sales[Quantity] > 1,
    Sales[Quantity]
)

It corresponds to the following explicit version:

IF (
    Sales[Quantity] > 1,
    Sales[Quantity],
    BLANK ()
)

Save to your account

This chapter is from the book

Definitive Guide to DAX, The: Business intelligence for Microsoft Power BI, SQL Server Analysis Services, and Excel, 2nd Edition

Learn more Buy

Related resources

Microsoft Excel Pivot Table Data Crunching Including Dynamic Arrays, Power Query, and Copilot

By Bill Jelen
eBook $35.19

Collect, Combine, and Transform Data Using Power Query in Power BI and Excel, 2nd Edition

By Daniil Maslyuk, Gil Raviv
eBook $31.99

ChatGPT with Excel: Leverage ChatGPT for Data Success (Video)

By
Online video $119.99

See related titles