# Microsoft SQL Server 2012 Analysis Services: DAX Basics

- By Alberto Ferrari, Marco Russo, Chris Webb
- 7/15/2012

## Common DAX Functions

Now that you have seen the fundamentals of DAX and how to handle error conditions, take a brief tour through the most commonly used functions and expressions of DAX. In this section, we show the syntax and the meaning of various functions. In the next section, we show how to create a useful report by using these basic functions.

### Aggregate Functions

Almost every Tabular data model must operate on aggregated data.
DAX offers a set of functions that aggregate the values of a column in
a table and return a single value. We call this group of functions
*aggregate functions*. For example, the
expression

= SUM( Sales[Amount] )

calculates the sum of all the numbers in the
*Amount* column of the Sales table. This expression
aggregates all the rows of the Sales table if it is used in a
calculated column, but it considers only the rows that are filtered by
slicers, rows, columns, and filter conditions in a pivot table
whenever it is used in a measure.
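As a minimal sketch of that difference, the same expression can be defined in both ways. The names *AllSales* and *TotalAmount* are hypothetical, chosen only for this illustration.

```dax
-- Calculated column in the Sales table: no filter is active while the
-- column is computed, so every row stores the same grand total.
AllSales = SUM( Sales[Amount] )

-- Measure: evaluated for each cell of the pivot table, so it sums only
-- the Sales rows that survive the slicers, rows, columns, and filters.
TotalAmount := SUM( Sales[Amount] )
```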

Table A-1 of the Appendix lists all the aggregate functions available in DAX. The four main aggregation functions (*SUM, AVERAGE, MIN*, and *MAX*) work only if the column passed as an argument is of numeric or date type.

DAX offers an alternative syntax for these functions that works on columns that can contain both numeric and nonnumeric values, such as a text column. That syntax adds the suffix A to the name of the function, to get the same name and behavior as in Excel. However, these functions are useful only for columns containing *TRUE/FALSE* values because *TRUE* is evaluated as 1 and *FALSE* as 0. Any value in a text column is always considered 0, and empty cells are never considered in the calculation. So, even though these functions can be used on nonnumeric columns without returning an error, there is no automatic conversion to numbers for text columns. These functions are named *AVERAGEA, COUNTA, MINA*, and *MAXA*.

The only interesting function in the group of A-suffixed functions is *COUNTA*. It returns the number of cells that are not empty and works on any type of column. If you are interested in counting the cells in a column that contain an empty value, you can use the *COUNTBLANK* function. Finally, if you want to count all the cells of a column regardless of their content, what you really want is the number of rows of the table, which you can obtain by calling the *COUNTROWS* function. (It takes a table as a parameter, not a column.) In other words, the sum of *COUNTA* and *COUNTBLANK* for the same column of a table is always equal to the number of rows of the same table.

You have four functions by which to count the number of elements in a column or table:

- *COUNT* operates only on numeric columns.
- *COUNTA* operates on any type of column.
- *COUNTBLANK* returns the number of empty cells in a column.
- *COUNTROWS* returns the number of rows in a table.
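The four counting functions might be used side by side as in the following sketch. It assumes a Sales table with a numeric *Quantity* column and a text *Price* column; the measure names are hypothetical.

```dax
-- Rows in which Quantity (a numeric column) is not empty.
NumOfQuantities := COUNT( Sales[Quantity] )

-- Rows in which Price is not empty; works even though Price is a text column.
NumOfPrices := COUNTA( Sales[Price] )

-- Rows in which Price is empty.
NumOfEmptyPrices := COUNTBLANK( Sales[Price] )

-- All the rows of the table, regardless of their content.
NumOfSales := COUNTROWS( Sales )
```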

Finally, the last set of aggregation functions performs calculations at the row level before they are aggregated. This is essentially the same as creating a column calculation and a measure calculation in one formula. This set of functions is quite useful, especially when you want to make calculations by using columns of different related tables. For example, if a Sales table contains all the sales transactions and a related Product table contains all the information about a product, including its cost, you might calculate the total internal cost of a sales transaction by defining a measure with this expression.

Cost := SUMX( Sales, Sales[Quantity] * RELATED( Product[StandardCost] ) )

This function calculates the product of
*Quantity* (from the Sales table) and
*StandardCost* of the sold product (from the
related Product table) for each row in the Sales table, and it returns
the sum of all these calculated values.
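For comparison, the two-step equivalent (a column calculation followed by a measure calculation) might look like the following sketch. *LineCost* and *CostFromColumn* are hypothetical names used only for illustration; the *SUMX* version should return the same value without storing an extra column in the model.

```dax
-- Step 1: calculated column in the Sales table (row-level calculation).
LineCost = Sales[Quantity] * RELATED( Product[StandardCost] )

-- Step 2: measure that aggregates the stored column.
CostFromColumn := SUM( Sales[LineCost] )
```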

Generally speaking, all the aggregation functions ending with an X suffix behave this way: they calculate an expression (the second parameter) for each of the rows of a table (the first parameter) and return a result obtained by the corresponding aggregation function (*SUM, AVERAGE, MIN, MAX*, or *COUNT*) applied to the result of those calculations. We explain this behavior further in Chapter 5, because evaluation context is important for understanding how this calculation works. The X-suffixed functions available are *SUMX, AVERAGEX, COUNTX, COUNTAX, MINX*, and *MAXX*.

Among the counting functions, one of the most used is
*DISTINCTCOUNT*, which does exactly what its name
suggests: counts the distinct values of a column, which it takes as
its only parameter.
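A minimal sketch of such a measure follows. It assumes that the Sales table has a *ProductKey* column, which is a hypothetical name, as are the measure names. The second measure expresses the same idea in two explicit steps and is shown only for comparison.

```dax
-- Number of different products that appear in the filtered Sales rows.
DistinctProducts := DISTINCTCOUNT( Sales[ProductKey] )

-- Same idea in two steps: build the list of distinct values, then count its rows.
DistinctProductsAlt := COUNTROWS( DISTINCT( Sales[ProductKey] ) )
```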

*DISTINCTCOUNT* deserves a special mention
among the various counting functions because of its speed. If you have
some knowledge of counting distinct values in previous versions of
SSAS, which implemented Multidimensional only, you already know that
counting the number of distinct values of a column was problematic. If
your database was not small, you had to be very careful whenever you
wanted to add distinct counts to the solution and, for medium and big
databases, a careful and complex handling of partitioning was
necessary to implement distinct counts efficiently. However, in
Tabular, *DISTINCTCOUNT* is amazingly fast due to
the nature of the columnar database and the way it stores data in
memory. In addition, you can use *DISTINCTCOUNT* on
any column in your data model without worrying about creating new
structures, as in Multidimensional.

As you learned in Chapter 1, “Introducing the Tabular Model,” if you have an existing SSAS cube with many problematic *DISTINCTCOUNT* measures, measuring the performance of the same solution rewritten in Tabular is definitely worth a try; you might be pleasantly surprised and decide to migrate the cube solely because of *DISTINCTCOUNT*.

### Logical Functions

Sometimes you might need to build a logical condition in an
expression—for example, to implement different calculations depending
on the value of a column or to intercept an error condition. In these
cases, you can use one of the logical functions in DAX. You have
already seen in the previous section, “Handling Errors in DAX Expressions,” the two most
important functions of this group, which are *IF*
and *IFERROR*. In Table A-3 of the Appendix, you can see the
list of all these functions (which are *AND, FALSE, IF,
IFERROR, NOT, TRUE*, and *OR*) and their
syntax. If, for example, you want to compute the
*Amount* as *Quantity*
multiplied by *Price* only when the
*Price* column contains a correct numeric value,
you can use the following pattern.

Amount = IFERROR( Sales[Quantity] * Sales[Price], BLANK() )

If you did not use the *IFERROR* and the
*Price* column contains an invalid number, the
result for the calculated column would be an error because if a single
row generates a calculation error, the error is propagated to the
whole column. The usage of *IFERROR*, however,
intercepts the error and replaces it with a blank value.
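For comparison, the same protection can be sketched with *IF* and *ISERROR*, both of which are listed in this chapter. This longer form is shown only to illustrate what *IFERROR* saves you from writing.

```dax
-- Same result as the IFERROR version, but the product has to be written twice.
Amount =
IF(
    ISERROR( Sales[Quantity] * Sales[Price] ),
    BLANK(),
    Sales[Quantity] * Sales[Price]
)
```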

Another function you might put inside this category is
*SWITCH*, which is useful when you have a column
containing a low number of distinct values, and you want to get
different behaviors, depending on the value. For example, the column
*Size* in the DimProduct table contains L, M, S,
and XL, and you might want to decode this value in a more meaningful
column. You can obtain the result by using nested
*IF* calls.

    SizeDesc =
        IF ( DimProduct[Size] = "S", "Small",
        IF ( DimProduct[Size] = "M", "Medium",
        IF ( DimProduct[Size] = "L", "Large",
        IF ( DimProduct[Size] = "XL", "Extra Large", "Other" ) ) ) )

The following is a more convenient way to express the same
formula, by using *SWITCH*.

    SizeDesc =
        SWITCH ( DimProduct[Size],
            "S", "Small",
            "M", "Medium",
            "L", "Large",
            "XL", "Extra Large",
            "Other" )

The code in this latter expression is more readable, even if it is not faster, because, internally, *SWITCH* calls are translated into nested *IF* calls.

### Information Functions

Whenever you must analyze the data type of an expression, you
can use one of the information functions that are listed in Table A-4 of the Appendix. All these
functions return a TRUE/FALSE value and can be used in any logical expression. They are: *ISBLANK,
ISERROR, ISLOGICAL, ISNONTEXT, ISNUMBER*, and
*ISTEXT*.

Note that when a table column is passed as a parameter, the
*ISNUMBER, ISTEXT*, and
*ISNONTEXT* functions always return
*TRUE* or *FALSE*, depending on
the data type of the column and on the empty condition of each
cell.

You might be wondering whether *ISNUMBER* can
be used with a text column just to check whether a conversion to a number is possible. Unfortunately, you
cannot use this approach; if you want to test whether a text value can be converted to a number, you must try
the conversion and handle the error if it fails.

For example, to test whether the column *Price* (which is of type *String*) contains a valid number, you must write the following.

IsPriceCorrect = NOT( ISERROR( Sales[Price] + 0 ) )

To evaluate the *ISERROR* function, DAX tries to add a zero to the *Price* to force the conversion from a text value to a number. The conversion fails for a price value such as *N/A*, so *ISERROR* returns *TRUE* for that row and *IsPriceCorrect* is *FALSE*.

If, however, you try to use *ISNUMBER*, as in
the following expression

IsPriceCorrect = ISNUMBER( Sales[Price] )

you will always get *FALSE* as a result
because, based on metadata, the *Price* column is
not a number but a string.

### Mathematical Functions

The set of mathematical functions available in DAX is very similar to the set available in Excel, with the same syntax and behavior. You can see the complete list of these functions and their syntax in Table A-5 of the Appendix. The commonly used mathematical functions are *ABS, EXP, FACT, LN, LOG, LOG10, MOD, PI, POWER, QUOTIENT, SIGN*, and *SQRT*. The random functions are *RAND* and *RANDBETWEEN*.

There are many rounding functions, summarized here.

    FLOOR     = FLOOR( Tests[Value], 0.01 )
    TRUNC     = TRUNC( Tests[Value], 2 )
    ROUNDDOWN = ROUNDDOWN( Tests[Value], 2 )
    MROUND    = MROUND( Tests[Value], 0.01 )
    ROUND     = ROUND( Tests[Value], 2 )
    CEILING   = CEILING( Tests[Value], 0.01 )
    ROUNDUP   = ROUNDUP( Tests[Value], 2 )
    INT       = INT( Tests[Value] )
    FIXED     = FIXED( Tests[Value], 2, TRUE )
    ISO       = ISO.CEILING( Tests[Value], 0.01 )

In Figure 4-10, you can see the different results when applied to some test values.

**Figure 4-10** Different rounding functions lead to different
values.

As you can see, *FLOOR, TRUNC*, and
*ROUNDDOWN* are very similar, except in the way you
can specify the number of digits on which to round. In the opposite
direction, *CEILING* and
*ROUNDUP* are very similar in their results. You
can see a few differences in the way the rounding is done (see row B,
in which the 1.265 number is rounded in two ways on the second decimal
digit) between the *MROUND* and
*ROUND* functions. Finally, note that *FLOOR*
and *MROUND* functions do not operate on negative
numbers, whereas other functions do.
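As a hedged illustration of how some of the other functions treat a negative number, the following DAX query applies them to the literal -1.234. The literal and the result column names are assumptions made for this sketch, and the values in the comments follow the Excel-compatible definition of each function.

```dax
EVALUATE
ROW(
    "TRUNC",     TRUNC( -1.234, 2 ),      -- -1.23: drops the extra digits
    "ROUNDDOWN", ROUNDDOWN( -1.234, 2 ),  -- -1.23: rounds toward zero
    "ROUND",     ROUND( -1.234, 2 ),      -- -1.23: rounds to the nearest value
    "ROUNDUP",   ROUNDUP( -1.234, 2 ),    -- -1.24: rounds away from zero
    "INT",       INT( -1.234 )            -- -2: rounds down to the next lower integer
)
```

Note that *TRUNC* and *INT* agree on positive numbers but differ on negative ones, because *INT* always rounds down whereas *TRUNC* simply removes digits.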

### Text Functions

Table A-6 of the Appendix contains
a complete description of the text functions available in DAX: they
are *CONCATENATE, EXACT, FIND, FIXED, FORMAT, LEFT, LEN,
LOWER, MID, REPLACE, REPT, RIGHT, SEARCH, SUBSTITUTE, TRIM,
UPPER*, and *VALUE*.

These functions are useful for manipulating text and extracting data from strings that contain multiple values, and are often used in calculated columns to format strings or find specific patterns.
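The following query is a small sketch of a few of these functions applied to string literals. The product code "BK-M68B-38" and the result column names are made up for the example; in a real model, you would typically pass a column instead of a literal.

```dax
EVALUATE
ROW(
    "Prefix",  LEFT( "BK-M68B-38", 2 ),               -- "BK"
    "Model",   MID( "BK-M68B-38", 4, 4 ),             -- "M68B"
    "DashPos", FIND( "-", "BK-M68B-38" ),             -- 3
    "Slashed", SUBSTITUTE( "BK-M68B-38", "-", "/" ),  -- "BK/M68B/38"
    "Length",  LEN( "BK-M68B-38" ),                   -- 10
    "Upper",   UPPER( "bikes" )                       -- "BIKES"
)
```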

### Conversion Functions

You learned that DAX performs automatic conversion of data types to adjust them to the needs of the operators. Even though this happens automatically, a set of functions is still available to perform explicit type conversions.

*CURRENCY* can transform an expression into a currency type, whereas *INT* transforms an expression into an integer. *DATE* and *TIME* take the date and time parts as parameters and return a *datetime* value. *VALUE* transforms a string into a numeric format, whereas *FORMAT* takes a numeric value as its first parameter and a format string as its second parameter, and transforms numeric values into strings.
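The following query sketches each of these conversions on literal values. The literals and result column names are assumptions chosen for illustration, and the exact text returned by *FORMAT* depends on the locale settings.

```dax
EVALUATE
ROW(
    "AsInteger",  INT( 3.9 ),                -- 3
    "AsCurrency", CURRENCY( 3.14159265 ),    -- converted to the currency type
    "AsDate",     DATE( 2012, 7, 15 ),       -- July 15, 2012
    "AsTime",     TIME( 14, 30, 0 ),         -- 2:30 PM
    "AsNumber",   VALUE( "42" ) + 1,         -- 43
    "AsText",     FORMAT( 1234.5, "0.00" )   -- a string, no longer a number
)
```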

### Date and Time Functions

In almost every type of data analysis, handling time and date is
an important part of the job. DAX has a large number of functions that
operate on date and time. Some of them make simple transformations to
and from a *datetime* data type, such as the ones
described in Table A-7 of the
Appendix. These are *DATE, DATEVALUE, DAY, EDATE, EOMONTH,
HOUR, MINUTE, MONTH, NOW, SECOND, TIME, TIMEVALUE, TODAY, WEEKDAY,
WEEKNUM, YEAR*, and *YEARFRAC*. To make
more complex operations on dates, such as comparing aggregated values
year over year or calculating the year-to-date value of a measure,
there is another set of functions, called time intelligence functions,
which is described in Chapter 8, “Understanding Time Intelligence in DAX.”

As mentioned earlier in this chapter, a *datetime* data type internally uses a floating-point number in which the integer part corresponds to the number of days starting from December 30, 1899, and the decimal part indicates the fraction of the day in time. (Hours, minutes, and seconds are converted into decimal fractions of the day.) Thus, adding an integer number to a datetime value increments the value by a corresponding number of days. However, most of the time, the conversion functions are used to extract day, month, and year from a date.
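A short sketch of both ideas follows: it adds days (or a fraction of a day) to a date and then extracts parts of the result. The date literal and the result column names are assumptions made for this example.

```dax
EVALUATE
ROW(
    "Day",      DAY( DATE( 2012, 7, 15 ) + 7 ),    -- 22: one week later
    "Month",    MONTH( DATE( 2012, 7, 15 ) + 7 ),  -- 7
    "Year",     YEAR( DATE( 2012, 7, 15 ) + 7 ),   -- 2012
    "NoonHour", HOUR( DATE( 2012, 7, 15 ) + 0.5 )  -- 12: half a day after midnight
)
```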

### Relational Functions

Two useful functions that enable you to navigate through
relationships inside a DAX formula are *RELATED*
and *RELATEDTABLE*. In Chapter 5, you learn all the
details of how these functions work; because they are so useful, it is
worth describing them here.

You already know that a calculated column can reference column values of the table in which it is defined. Thus, a calculated column defined in FactResellerSales can reference any column of the same table. But what can you do if you must refer to a column in another table? In general, you cannot use columns in other tables unless a relationship is defined in the model between the two tables. However, if the two tables are related, the *RELATED* function enables you to access columns in the related table.

For example, you might want to compute a calculated column in
the FactResellerSales table that checks whether the product that has
been sold is in the Bikes category and, if it is, apply a reduction
factor to the standard cost. To compute such a column, you must write
an *IF* that checks the value of the product
category, which is not in the FactResellerSales table. Nevertheless, a
chain of relationships starts from FactResellerSales, reaching
DimProductCategory through DimProduct and DimProductSubcategory, as
you can see in Figure 4-11.

**Figure 4-11** FactResellerSales has a chained relationship with
DimProductCategory.

It does not matter how many steps are necessary to travel from the original table to the related one; DAX will follow the complete chain of relationships and return the related column value. Thus, the formula for the *AdjustedCost* column can be
=IF ( RELATED (DimProductCategory[EnglishProductCategoryName]) = "Bikes", [ProductStandardCost] * 0.95, [ProductStandardCost] )

In a one-to-many relationship, *RELATED* can
access the one side from the many side because, in that case, only one
row, if any, exists in the related table. If no row is related with
the current one, *RELATED* returns
*BLANK*.

If you are on the one side of the relationship and you want to access the many side, *RELATED* is not helpful because many rows from the other side are available for a single row in the current table. In that case, *RELATEDTABLE* will return a table containing all the related rows. For example, if you want to know how many products are in each category, you can create a column in DimProductCategory with this formula.

= COUNTROWS (RELATEDTABLE (DimProduct))

This calculated column will show, for each product category, the number of products related, as you can see in Figure 4-12.

**Figure 4-12** Count the number of products by using *RELATEDTABLE*.

As is the case for *RELATED*, *RELATEDTABLE* can follow a chain of relationships, always starting from the one side and going in the direction of the many side.
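As a sketch of such a chain, the following calculated columns could be defined in DimProductCategory to reach FactResellerSales through DimProductSubcategory and DimProduct. The column names *NumOfSalesRows* and *CategorySales* are hypothetical, and the example assumes that FactResellerSales contains a *SalesAmount* column.

```dax
-- Calculated columns in DimProductCategory (the one side of the chain).

-- Number of reseller sales rows related to each category.
NumOfSalesRows = COUNTROWS( RELATEDTABLE( FactResellerSales ) )

-- Total sales amount of each category, computed over the same related rows.
CategorySales = SUMX( RELATEDTABLE( FactResellerSales ), FactResellerSales[SalesAmount] )
```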