- Skill 1.1: Create a multidimensional database by using Microsoft SQL Server Analysis Services (SSAS)
- Skill 1.2: Design and implement dimensions in a cube
- Skill 1.3: Implement measures and measure groups in a cube
- Chapter summary
- Thought experiment
- Thought experiment answers
Skill 1.2: Design and implement dimensions in a cube
After using the Dimension Wizard to add a dimension object to your multidimensional database, you must then add it to one or more cubes to make it available to client applications. Additionally, there are some configuration steps to perform when you need to improve the user experience for browsing the cube, enable specific SSAS features, or optimize performance.
Select an appropriate dimension model, such as fact, parent-child, roleplaying, reference, data mining, many-to-many, and slowly changing dimension
There are several different types of dimension models that you can define for your multidimensional database. For some of the dimensional models, you define the type by configuring its dimension usage relationship type on the Dimension Usage page of the cube designer. For the other dimensional models, you configure dimension or cube properties. In this section, we review each type of dimension model and the steps required to configure it.
Regular dimension model
When a dimension does not meet the criteria for any of the other dimensional models described in this section, it is considered to be a regular dimension. When you add it to a cube, the existence of a foreign key relationship between its corresponding dimension table and a measure group’s fact table usually results in the addition of a regular relationship type. Sometimes, for various reasons, you might need to add this relationship type manually.
You can confirm the existence of a regular relationship by reviewing dimension usage. To do this, perform the following steps:
In Solution Explorer, double-click Wide World Importers DW.cube to open the cube designer, and then click the Dimension Usage tab. When you see text displayed without an icon in the cell intersection between a dimension and a measure group, as shown in Figure 1-23, the relationship type is Regular.
FIGURE 1-23 Cube dimension usage defining a relationship between a dimension and a measure group
Click the ellipsis button in the cell intersection to open the Define Relationship dialog box, shown in Figure 1-24.
FIGURE 1-24 Define Relationship dialog box showing the configuration of a regular relationship
When Regular is selected in the Select Relationship Type drop-down list, you must identify the granularity attribute in the dimension, and the corresponding column in the measure group table.
In the Granularity Attribute drop-down list, you can select any attribute that exists in the dimension object in your project. This list does not include attributes from the dimension table appearing in the DSV if those attributes are not added to the dimension object.
When you select a granularity attribute, the attribute’s corresponding key columns in the dimension table, as defined in the dimension object, display in the Dimension Columns list at the bottom of the dialog box. You then match each key column to a foreign key column in the fact table associated with the measure group.
Think of dimension usage for a regular relationship as the SSAS definition of a foreign key relationship. The existence of that relationship in the DSV is not sufficient because that is used whenever SSAS needs to generate SQL statements during processing, or when translating MDX to SQL. The DSV represents the physical modeling of the data source, even if it includes derived elements. The purpose of dimension usage is to define the logical relationships between dimension and measure groups objects. It represents the logical modeling of your multidimensional database.
Fact dimension model
A fact dimension model allows you to define a relationship between a measure group and dimension object that you create from a degenerate dimension. Remember from Skill 1.1 that a degenerate dimension is a type of column found in a fact table. Because both the measure group and the dimension come from the same table, there is no additional configuration required after you specify the Fact relationship type.
Let’s set up a new dimension, configure a fact dimension model, and check the results by following these steps:
In Solution Explorer, right-click the Dimensions folder, select New Dimension, and click Next twice to proceed to the Specify Source Information page of the wizard.
In the Main Table drop-down list, select Sale.
The two primary keys in the Sale table display in the Key Columns list, which requires you to select an item in the Name Column list. However, there is no column that adequately represents a value for each row in this table and the invoice date is not required to uniquely identify each row in the table. Therefore, you can clear this value.
Open the drop-down list containing Invoice Date Key, scroll to the top of the list, and then select the blank row at the top of the list to remove Invoice Date Key. Now you no longer need to provide a Name Column value, so click Next to continue.
On the Select Related Tables page, clear each check box, and click Next.
On the Select Dimension Attributes page, select the check box to the left of WWI Invoice ID, scroll down, and clear the check boxes for Delivery Date Key, Stock Item Key, Invoice Date Key, Salesperson Key, Customer Key, City Key, and Bill To Customer Key, and then click Next.
Type Invoice in the Name box, and click Finish to complete the wizard.
In the Attributes pane, click Sale Key to highlight it. Then, in the Properties window, select False in the AttributeHierarchyVisible drop-down list.
By setting this property to False, you render the Sale Key attribute inaccessible to business users when they browse the cube. As a surrogate key, it has no meaning for business analysis, but is required in the dimension to uniquely identify each row and to establish the fact relationship in dimension usage.
Rename the WWI Invoice ID attribute by right-clicking it in the Attributes pane, selecting Rename, and then typing Invoice.
Double-click the Wide World Importers DW.cube file in Solution Explorer, go to the Cube Structure page, right-click the white space in the Dimensions pane, select Add Cube Dimension, select Invoice, and click OK. Now the dimension in the project is included in the cube.
Click the Dimension Usage tab of the cube designer, click the cell labeled Sale Key, and click the ellipsis button in the cell.
In the Define Relationship dialog box, select Fact in the Select Relationship Type drop-down list, as shown in Figure 1-25. There is nothing else to configure because the column for the granularity attribute for the dimension is also a column in the fact table for the measure group.
FIGURE 1-25 Define Relationship dialog box showing the configuration of a fact relationship
Click OK to close the dialog box. An icon displays in the Sale Key cell to indicate the fact relationship between the Invoice dimension and the Sale measure group.
Right-click the project in Solution Explorer, and select Deploy.
After the Deployment Completed Successfully message displays, click the Browser tab in the cube designer, expand the Measures node in the cube metadata tree, expand the Sale folder, and drag Sale Count to the query window labeled Drag Levels Or Measures Here To Add To The Query.
Expand the Invoice node, and drag Invoice (next to the rectangular collection of blue squares) to the area next to Sale Count. The browser generates an MDX query behind the scenes and displays the results, as shown in Figure 1-26. Here you can see some invoices have only one item sold, while others have multiple items sold. As you add dimensions to the cube, you can create a query to see the specific stock items sold, the salesperson selling the item, and more.
FIGURE 1-26 Browser in the cube designer showing Sale Count by Invoice
Parent-child dimension model
You use a parent-child dimension model when you have a dimension table that includes a foreign-key relationship to itself. It is also referred to as a ragged hierarchy or self-referencing relationship. Unlike a standard hierarchical structure in which there are a fixed number of levels, such as Month, Quarter, and Year in a Date dimension, a ragged hierarchy can have varying numbers of levels for each node. Common examples of ragged hierarchies are organizational charts, financial charts of accounts, or bills of materials.
The WideWorldImportersDW database does not include a parent-child dimension, but you can artificially create one by converting the Employee table in the DSV to a named query in which you arbitrarily assign employees to managers. To do this, double-click Wide World Importers DW.dsv in Solution Explorer to open the DSV designer. In the Tables pane, right-click the Employee table, point to Replace Table, and select With New Named Query. In the Create Named Query dialog box, replace the SELECT statement with the statement shown in Listing 1-1, as shown in Figure 1-27, and then click OK.
LISTING 1-1 Named query statement to arbitrarily assign managers to employees
SELECT [Employee Key], CASE WHEN [WWI Employee ID] >= 3 AND [WWI Employee ID] <= 11 THEN 194 WHEN [WWI Employee ID] >= 12 AND [WWI Employee ID] <= 15 THEN 203 WHEN [WWI Employee ID] >= 16 AND [WWI Employee ID] <= 19 THEN 208 WHEN [WWI Employee ID] = 20 THEN 210 ELSE NULL END AS [Manager Key], [WWI Employee ID], [Employee], [Preferred Name], [Is Salesperson] FROM [Dimension].[Employee];
FIGURE 1-27 Create Named Query dialog box showing a SELECT statement to generate a DSV table
Next, create the Employee dimension in the project by first following the steps described in the “Develop dimension” section earlier in this chapter and then performing these steps:
In the Specify Source Information page of the Dimension Wizard, use Employee as the Main Table and set Employee Key as the Key Column, and Employee as the Name Column. Click Next.
In the Select Dimension Attributes page of the wizard, select the Manager Key check box. Then click Next, and click Finish.
In the Attributes pane of the dimension designer, rename Employee Key as Employee, and Manager Key as Employees.
With Employees selected, locate the Usage property in the Properties window, and then select Parent in the Usage drop-down list. After you set this property, an icon displays next to the Employees attribute to designate its status as a parent-child hierarchy.
To observe the effect of this change, right-click the project in Solution Explorer, and select Deploy to update the multidimensional database on the server with the new dimension.
When you see the message the deployment is complete, click the Browser page of the dimension designer, and select Employees in the Hierarchy drop-down list.
Expand the All member in the browser, and then expand each member with a plus symbol next to it to view the parent-child hierarchical structure, as shown in Figure 1-28. In the Employees table in the WideWorldImportersDW table, there are multiple rows for many employees due to its slowly changing dimension design, which results in the appearance of duplicates in the dimension browser. You can safely ignore these duplicates for the purposes of this example.
FIGURE 1-28 Browser in the dimension designer showing the Employees parent-child hierarchy
When you create a parent-child dimensional model and set an attribute’s Usage property to Parent, you can also optionally configure the following additional properties for that attribute:
MembersWithData The default setting for this property is NonLeafDataVisible. The assumption with this setting is that the fact table’s key column associated with this dimension contains values for both parent and child members. In the case of the WideWorldImportersDW data (after creating a contrived relationship for managers and employees in the named query), that means managers and employees are associated with rows in the fact table, such as Archer Lamble and Jack Potter. When you browse the cube, when you drill down to break out the total sales for Archer Lamble, you see it includes Archer’s sales reported as a separate value from Jack’s sales (which you can see if you drill down from Jack’s manager Katie Darwin).
If you change this property to NonLeafDataHidden, the totals for each manager are calculated correctly. However, you do not see the manager’s data in the query results when you drill down to view the employees’ sales for that manager. This behavior can be confusing to business users and is not recommended as a best practice.
MembersWithDataCaption If you use the NonLeafDataVisible setting for the MembersWithData property, you can set the caption for the attribute member that displays when you drill from the parent to the child attribute members. If you leave this property blank, the caption for both parent and child is identical. To more clearly differentiate the parent member as an aggregate value and its corresponding child member, you can use an asterisk as a wildcard and a string value to combine the actual member name with a static string.
As an example, you can type * (Manager) in the MembersWithData Caption. For the WideWorldmportersDW named query in this chapter’s sample project, Archer’s sales as a parent member is the aggregate value of sales made by Archer and Jack and displays as Archer Lamble. If you drill down, you see Archer’s sales as a child member displayed as Archer Lamble (Manager).
NamingTemplate You use this property when you want to assign level names to each level in a parent-child hierarchy, which is important when your users browse the cube by using Excel or Power BI. If you leave this property blank, SSAS assigns a level names like Level 01, Level 02, and so on.
In a standard hierarchy, you define the level names when you create the hierarchy, as explained later in this chapter in the “Define attribute relationships” section. If you want to provide a name for levels in a parent-child hierarchy, your only option is to use the naming template.
A parent-child hierarchy does not have a fixed number of levels, and each branch of the hierarchy tree can have zero to many different levels. The second level down one branch does not necessarily have the same context as the second level down another branch. For example, in the Employee dimension, Archer Lamble is a manager on the first level below the All member, and the second level below Archer includes Jai Shand, Katie Darwin, and Piper Koch. Let’s say hypothetically that each of these employees is a customer service representative. However, another member on the first level is Kayla Woodcock and the child members on the second level of this branch of the hierarchy are all sales staff. Therefore, it does not make sense to use a naming template in this scenario where the second level varies by branch.
When you have a parent-child hierarchy with a more consistent structure, you can use the NamingTemplate property to define a unique string for each level of the hierarchy. On the bottom level in the template, append an asterisk to the name to instruct SSAS to add a number that increments by 1 for each subsequent level when there are more levels in the hierarchy than named levels in the template.
RootMemberIf You use this property to help SSAS distinguish parent members from child members. You can choose from one of the following values in the property’s drop-down list.
ParentIsBlankSelfOrMissing This is the default setting, which covers any possible scenario to define a parent member. SSAS looks in the column defined as the parent key column, which is the attribute with the Usage property set to Parent, and if that column is blank, contains the same value as the dimension’s key attribute column, or is not also found in the dimension’s key attribute column on another row, the member is flagged as a parent member at the top level of the hierarchy. In other words, the member displays as the first level of members below the All member.
ParentIsBlank You use this setting when you want SSAS to consider a member as a parent only when the parent attribute column is blank.
ParentIsSelf You use this setting when you want SSAS to define a parent only when the parent attribute column matches the value in the key attribute column.
ParentIsMissing You use this setting when SSAS should identify a parent based on a parent attribute column value is not found anywhere in the table in the key attribute column.
UnaryOperatorColumn This property is used only for special design scenarios, such as a financial account dimension in which you need to change the default rollup behavior. Although many data warehouse developers opt to use positive and negative values in a financial fact table to manage rollup behavior for account balances, another less commonly used option is to include a column in the financial account dimension to contain operators that define this behavior. For example, when you want to compute a parent account member’s aggregate value, such as Income (Loss), you typically want to subtract one of its children, Expenses, from another of its children, Revenue. That is, Income (Loss) equals Revenue minus Expenses. You must include a value, such as a plus or minus symbol, in a column in your dimension table, and then map that column to the UnaryOperatorColumn property to define the rollup behavior.
Roleplaying dimension model
A roleplaying dimension is one for which you define one dimension in the project, but use that dimension multiple times with the same fact table. In the WideWorldImportersDW database, you can see two examples of a roleplaying dimension—Customer and Date. Specifically, the Sale fact table has two foreign key relationships to the Customer table and two foreign key relationships to the Date table, as shown in Figure 1-29.
FIGURE 1-29 Roleplaying dimensions in the WideWorldImportersDW database
When multiple foreign key relationships exist between a fact table and a dimension table, each relationship has a different business meaning or role. For the Customer dimension, the CustomerKey column in the Sale table represents the customer to whom the sale was made, whereas the BillToCustomerKey column represents the customer who is responsible for paying the invoice. For the Date dimension, the InvoiceDateKey column in the Sale table reflects the date of the sale while the DeliveryDateKey is the date the stock item was delivered to the customer.
First, add the Customer and Date dimensions to the project by following the steps described in the “Develop dimension” section earlier in this chapter and then perform the following steps:
Configure each dimension and its attributes by using the settings shown in Table 1-1.
TABLE 1-1 New dimensions for roleplaying
Attribute to add
Attribute Property Value
Calendar Month Label
Calendar Month Number
Calendar Month Label
Calendar Year Label
Calendar Year Label
ISO Week Number
ISO Week Number
Next, add the Customer and Date dimensions to the cube. Double-click the Wide World Importers DW.cube, and then, on the Cube Structure page, right-click the white space in the Dimensions pane, and select Add Cube Dimension. While pressing the CTRL key, select Customer and Date, and then click OK.
Notice that although you selected two new dimensions to add to the cube, there are four new dimensions: Customer, Bill To Customer, Delivery Date, and Invoice Date. The cube designer detected the roleplaying characteristics of the Customer and Date dimensions based on the relationships defined in the DSV. If you check the relationship definitions on the Dimension Usage tab, you can see that the granularity attributes for each new dimension is correctly matched to a separate column in the measure group table.
After you deploy the project to update the server with the changes (and click Yes to confirm overwriting the database if necessary), open the Browser tab in the cube designer.
Click Reconnect, the third button from the left on the cube designer toolbar, to restore the connection to the SSAS server after deployment.
Expand the Measures node and Sale folder, and then drag Sale Count to the query window.
Then expand Customer and drag Customer.Customer to the query window to view sale counts by individual customers, as shown in Figure 1-30.
FIGURE 1-30 Query results showing sales counts by customer on the Browser page of the cube designer
You can then change the query to test the Bill To Customer roleplaying dimension. Drag the Customer column from the query window to the left side of the screen.
Then expand Bill To Customer and drag Bill To Customer.Customer to the query window to view sale counts by Bill To Customer, as shown in Figure 1-31. Notice that each roleplaying cube dimension based on the Customer dimension has the same set of attributes: Buying Group, Category, and Customer. The difference in queries is a result of the join to a different column in the measure group table.
FIGURE 1-31 Metadata pane and query window on the Browser page of the cube designer showing sale counts by Bill To Customer
Reference dimension model
A reference dimension model is useful when you want to analyze measures by a dimension for which a relationship does not exist directly in the fact table. Instead, the fact table has a relationship with another dimension, which also has a relationship with the reference dimension. As an example, we need to consider how to model the multidimensional database if the WideWorldImportersDW database were missing the roleplaying dimension for Bill To Customer, and instead were to have a separate dimension table to reference it. Figure 1-32 shows a hypothetical structure in the sample database if it were designed to use a reference dimension for the Bill To Customer rather than include its relationship directly.
FIGURE 1-32 Source tables for a reference dimension model
You can simulate this structure in the project by creating a second DSV and then creating a cube based on the new DSV by following these steps:
Right-click the Data Source Views folder, select New Data Source View, click Next, select the Wide World Importers DW data source, click Next twice, type WWI DW Alt in the Name box, and click Finish.
Right-click the DSV window, select New Named Query, type Sale Alt in the Name box, and then replace the SELECT statement with the statement in Listing 1-2.
LISTING 1-2 Named query statement to remove Bill To Customer Key from the Sale table
SELECT [Sale Key], [City Key], [Customer Key], [Stock Item Key], [Invoice Date Key], [Delivery Date Key], [Salesperson Key], [WWI Invoice ID], [Description], [Package], [Quantity], [Unit Price], [Tax Rate], [Total Excluding Tax], [Tax Amount], [Profit], [Total Including Tax], [Total Dry Items], [Total Chiller Items] FROM [Fact].[Sale];
Right-click again in the DSV, select New Named Query, type Customer Alt in the Name box, and then replace the SELECT statement with the statement in Listing 1-3.
LISTING 1-3 Named query statement to add Bill To Customer Key to the Customer table
SELECT c1.[Customer Key], c2.[Customer Key] as [Bill To Customer Key], c1.[WWI Customer ID], c1.Customer, c1.Category, c1.[Buying Group] FROM [WideWorldImportersDW].[Dimension].[Customer] c1 LEFT OUTER JOIN [WideWorldImportersDW].[Dimension].[Customer] c2 ON c1.[Bill To Customer] = c2.[Customer];
Right-click once more in the DSV, select New Named Query, type Bill To Customer in the Name box, and then replace the SELECT statement with the statement in Listing 1-4.
LISTING 1-4 Named query statement to create Bill To Customer
SELECT [Customer Key] AS [Bill To Customer Key], [Customer] AS [Bill To Customer] FROM [Dimension].[Customer] WHERE [Customer Key] in (1, 202);
Next, define logical primary keys for each named query. To this, right-click Sale Key in the Sale Alt named query and select Set Logical Primary Key. Repeat this step for the Customer Key in the Customer Alt named query and the Bill To Customer Key in the Bill To Customer named query.
Then create a relationship between Sale Alt and Customer Alt by selecting Customer Key in Sale Alt and dragging it to Customer Key in Customer Alt. Repeat this process to create relationship between Bill To Customer Key in the Customer Alt named query and Bill To Customer Key in the Bill To Customer named query.
Add the two dimensions to the project by following the steps described in the “Develop dimension” section earlier in this chapter using the settings shown in Table 1-2. Be sure to change the DSV in the Dimension Wizard to WWI DW Alt. When you create the Customer Alt dimension, do not clear the Bill To Customer check box on the Select Related Tables page, and do not clear the Bill To Customer Key check box on the Select Dimension Attributes page of the wizard.
TABLE 1-2 New dimensions for a reference dimensional mode
Attribute to add
Attribute Property Value
Bill To Customer Key
Bill To Customer
Bill To Customer
Bill To Customer Key
Bill To Customer
Next create a cube based on the new DSV. Right-click the Cubes folder, select New Cube, click Next twice, select WWI DW Alt in the Data Source View drop-down list, select the Sale Alt check box, click Next, clear the Measure check box, select Sale Alt Count, click Next twice, clear the Sale Alt check box on the Select New Dimensions page, click Next, type Reference Dimension Cube in the Cube Name box, and click Finish.
Although the Bill To Customer dimension was listed on the Select Existing Dimensions, because it is in the same DSV as the Sale Alt measure group, it is not added as a cube dimension automatically. No direct relationship exists between the Sale Alt and Bill To Customer named queries in the DSV. Nonetheless, you can add the dimension to the cube manually as a reference dimension.
Right-click inside the Dimensions pane on the Cube Structure page of the cube designer, select Add Cube Dimensions, select Bill To Customer, and click OK.
Click the Dimension Usage tab, click the cell intersection between the Bill To Customer dimension and the Sale Alt measure group, click the ellipsis button that appears, select Referenced in the Select Relationship Type drop-down list, select Customer Alt in the Intermediate Dimension drop-down list, select Bill To Customer in the Reference Dimension Attribute drop-down list, and select Bill To Customer Key in the Intermediate Dimension Attribute drop-down list, as shown in Figure 1-33. Click OK.
FIGURE 1-33 Define Relationship dialog box showing the configuration of a referenced relationship
Deploy the project by right-clicking the project in Solution Explorer, and selecting Deploy.
Click the Browser tab in the cube designer, expand Measures, expand the Sale Alt folder, and drag Sale Alt Count to the query window. Then expand Bill To Customer, and drag Bill To Customer to the query window, as shown in Figure 1-34.
FIGURE 1-34 Metadata pane and query window on the Browser page of the cube designer showing sale counts by Bill To Customer
Data mining dimension model
You can take advantage of the data mining engine built into SSAS to build a new dimension by which you can analyze data. A common implementation approach is to use the clustering algorithm to use demographic and other data to group customers into clusters and then add the clusters as a dimension to your cube where you can slice and dice your data by clusters.
The source data in WideWorldImportersDW does not lend itself well to clustering or any other data mining algorithm available in SSAS. Nonetheless, you can perform the following steps to understand the mechanics of implementing a data mining model in a multidimensional database:
Your first step is to set up a data mining model in your project by right-clicking the Mining Structures folder in Solution Explorer and selecting New Mining Structure.
Click Next in the Data Mining Wizard, select From Existing Cube, and click Next. Select Microsoft Clustering in the Which Data Mining Technique Do You Want To Use drop-down list, and then click Next.
On the Select The Source Cube Dimension page, select Customer, and then click Next.
On the Select The Case Key page, select Customer. Click Next.
On the Select Case Level Columns page, select Profit, Quantity, and Sale Count. Click Next.
The case level identifies the granularity of the data mining structure. In other words, it identifies the level of detail to data mine. In this example, a case is a unique customer.
On the Specify Mining Model Column Usage page, select the check box for Profit in the Predictable column, as shown in Figure 1-35.
FIGURE 1-35 Specify Mining Model Column Usage page of the Data Mining Wizard
Click Next four times to complete the wizard.
On the Completing The Wizard page, keep the Create Mining Model Dimension check box selected, type Customer Clusters in the text box to the right of this check box, select the Create Cube Using Mining Model Dimension, as shown in Figure 1-36, and then click Finish.
FIGURE 1-36 Completing The Wizard page of the Data Mining Wizard
Right-click the project in Solution Explorer, and select Deploy to deploy the updates to the SSAS server and process.
Double-click the Wide World Importers DW_DM.cube file in Solution Explorer, click the Browser tab in the cube designer, expand the Measures folder, expand the Sale folder, and drag Profit to the query window. Then expand the Customer Clusters folder and then drag MiningDimensionContentNodes to the query window to view the breakdown of sales by cluster. When you use more realistic data than WideWorldImportersDW, more clusters are likely to be available for analysis.
Many-to-many dimension model
The use of a many-to-many dimensional model is necessary when you need to support a more complex type of analysis than we have explored to this point. Traditional analysis is a one-to-many model in which one dimension member can be associated with many fact rows. Put another way, a single fact row is associated with one member in a related dimension. In a many-to-many relationship, a single fact row can be associated with many dimension members.
This type of modeling is common in banking analysis in which a single bank account can be associated with multiple account holders, and therefore a transaction in a fact table has a many-to-many relationship with account holders. The medical industry also often requires many-to-many modeling. As an example, a patient can have multiple diagnoses, so an event such as a hospital stay can be associated with multiple members in a diagnosis dimension.
As a result of implementing this type of model, you can roll up aggregated values within a dimension by different members without overcounting values. Let’s say in a simple cube you have a bank account with two account holders and the account has a $100 balance. If you query balances for all account holders, SSAS lists each account holder separately with the $100 balance, but the aggregate value for all account holders is $100. SSAS knows how to associate the fact rows individually to each dimension member, but also correctly computes the aggregate values for the dimension.
Before you can implement a many-to-many dimension model in SSAS, you need to structure your source tables or DSV appropriately. Nothing in the WideWorldImportersDW source tables lends itself to this type of modeling, so let’s set up a simple scenario in which a new dimension contains sales reasons. Sales reasons describe why a customer made a purchase and are captured in a survey conducted at the time of the purchase. Any sale transaction can be associated with zero to many sales reasons. To model this scenario, you need to add a new Sales Reason dimension and a Sales Reason Bridge to associate sales transactions from the Sale table to the sales reasons in the new dimension, as shown in Figure 1-37.
FIGURE 1-37 Source tables for a many-to-many dimension model
To set up this scenario in the current project, perform the following steps:
Create a named query as a source for the Sales Reason dimension. To do this, double-click Wide World Importers DW.dsv in Solution Explorer to open the DSV designer.
In the diagram pane, right-click anywhere in the diagram pane, and select New Named Query.
In the Create Named Query dialog box, type Sales Reason in the Name box, replace the SELECT statement with the statement shown in Listing 1-5, and then click OK.
LISTING 1-5 Named query statement to create Sales Reason dimension
SELECT 1 AS SalesReasonKey, 'Value' AS SalesReason UNION SELECT 2 AS SalesReasonKey, 'Manufacturer' AS SalesReason UNION SELECT 3 AS SalesReasonKey, 'Sales Person' AS SalesReason;
Right-click SalesReasonKey in the diagram, and select Select Logical Primary Key.
Now create a bridge table to define the relationship between the fact table and the new dimension. In the diagram pane of the DSV, right-click anywhere, and select New Named Query.
In the Create Named Query dialog box, type Sales Reason Bridge in the Name box, replace the SELECT statement with the statement shown in Listing 1-6, and then click OK.
LISTING 1-6 Named query statement to create Sales Reason fact bridge
SELECT CAST(1 AS bigint) AS [Sale Key], 1 AS SalesReasonKey UNION SELECT CAST(1 AS bigint) AS [Sale Key], 2 AS SalesReasonKey UNION SELECT CAST(1 AS bigint) AS [Sale Key], 3 AS SalesReasonKey UNION SELECT CAST(2 AS bigint) AS [Sale Key], 1 AS SalesReasonKey;
Select the Sale Key column in the Sales Reason Bridge table and drag it to the Sale key column in the Sale table to define a new relationship.
Repeat the previous step to create a relationship between SalesReasonKey in the Sales Reason Bridge table and the SalesReasonKey column in the Sales Reason table.
Create the Sales Reason dimension in the project by following the steps described in the “Develop dimension” section earlier in this chapter, using Sales Reason as the Main Table and setting Sales Reason Key as the Key Column and Sales Reason as the Name Column. Click Next twice, and click Finish.
Rename the Sales Reason Key attribute to Sales Reason.
Double-click Wide World Importers DW.cube in Solution Explorer to open the cube designer.
Right-click in the Dimension pane on the Cube Structure page, select Add Cube Dimension, select Sales Reason, and click OK.
Right-click the Measures pane, select New Measure Group, and select Sales Reason Bridge to add the bridge table to the cube.
Expand the Sales Reason Bridge folder in the Measures pane, select Sales Reason Bridge Count, and then, in Properties window, select False in the Visible drop-down list.
Open the Dimension Usage page of the cube designer. Notice the Invoice and Sales Reason dimensions have a regular relationship with the Sales Reason Bridge measure group.
To set up the many-to-many relationship, click the intersection between Sales Reason and Sale, click the ellipsis button, select Many-To-Many in the Select Relationship Type drop-down list, select Sales Reason Bridge in the Intermediate Measure Group drop-down list, as shown in Figure 1-38, and click OK.
FIGURE 1-38 Define Relationship dialog box showing the configuration of a many-to-many relationship
Deploy the project, and then open the Browser page of the cube designer, click Reconnect in the toolbar.
On the Cube menu, select Analyze In Excel. Click Enable in the Microsoft Security Notice message box.
In the PivotTable Fields list, drag the Invoices check box to the Filters pane below the list.
In the pivot table, open the drop-down list, select the Select Multiple Items check box, clear the All check box, expand All, and select the 1 and 2 check boxes.
In the PivotTable Fields list, select the Sale Count check box, and notice the pivot table displays a total of 3.
In the PivotTable Fields list, select the Invoice check box and notice there are two invoices, 1 and 2. Invoice 1 has a sale count of 1 and Invoice 2 has a sale count of 2.
Now select Sales Reason in the PivotTable Fields list. A second level is added to the pivot table in which there are three sales reasons for Invoice 1, each with a sale count of 1: Value, Manufacturer, and Sales Person. Invoice 2 has one sales reason, Value, with a sale count of 1. Notice the total remains 3 in spite of the four instances of sales reasons in the pivot table having a sale count of 1. The sale counts are not overcounted even when a many-to-many relationship is present in the query’s dimensions.
Slowly changing dimension model
As explained in the “Source table design” section in Skill 1.1, “Create a multidimensional database by using Microsoft SQL Server Analysis Services (SSAS),” there are two types of SCD tables that you can design and then add to your DSV—Type 1 for which you do not track historical changes, and Type 2 for which you track historical changes. When you use the Type 1 SCD design, you do not need to perform any additional tasks in the DSV or dimension design, although you should be aware of the effect of data changes during processing as explained in Chapter 4, which in turn can influence how you configure attribute relationships as described later in this skill. On the other hand, when you use the Type 2 SCD design, there are some specific tasks you should perform in the dimension design to support the ability to browse both the current version and the historical version of dimension attributes.
The first task is to ensure your Type 2 SCD table design includes columns to support slowly changing dimensions as described in Skill 1.1. That is, you must have a surrogate key, a business key, SCD history columns, and at least one attribute for which values change over time for which you want to preserve history.
The Stock Item table in the Wide World Importers DW DSV meets these criteria; so let’s add it to the project as a slowly changing dimension by following the steps described in the “Develop dimension” section earlier in this chapter and then performing these steps:
In the Dimension Wizard, use Stock Item as the Main Table, set Stock Item Key as the Key Column, and set Stock Item as the Name Column.
In the Select Dimension Attributes page of the wizard, keep the Stock Item Key check box selected, and then select the WWI Stock Item ID check box to add the business key, the Color check box to add the SCD attribute, and the Size check box to add an unchanging attribute. Then click Next, and click Finish.
Your next set of tasks is to change properties for the surrogate key and business key attributes. First, rename the Stock Item Key as Stock Item History, and WWI Stock Item ID as Stock Item.
Next, set the NameColumn property for the business key. The NameColumn property is set automatically for the surrogate key by the wizard. In this case, you use the same column to define the Name Column property for the business key. To do this, select Stock Item (the renamed business key attribute) in the Attributes pane of the dimension designer, scroll down to locate the NameColumn property, click the box to the right of the property name, click the ellipsis button that displays, select Stock Item in the Name Column dialog box, and then click OK.
Last, add the dimension to a cube. In Solution Explorer, double-click Wide World Importers DW.cube, right-click inside the Dimensions pane on the Cube Structure page of the cube designer, select Add Cube Dimension, select Stock Item, and click OK. The dimension usage is set correctly by default with a regular relationship due to the relationship defined between the Sale fact table and the Stock Item dimension table, so no additional configuration is required.
There is no data in the World Wide Importers DW sales to properly illustrate the effect of SCD changes, so execute the statement shown in Listing 1-7 to add a new sale using the surrogate key for a current stock item.
LISTING 1-7 Insert a row into the sales fact table for a current stock item
INSERT INTO [Fact].[Sale] ([City Key], [Customer Key], [Bill To Customer Key], [Stock Item Key], [Invoice Date Key], [Delivery Date Key], [Salesperson Key], [WWI Invoice ID], [Description], [Package], [Quantity], [Unit Price], [Tax Rate], [Total Excluding Tax], [Tax Amount], [Profit], [Total Including Tax], [Total Dry Items], [Total Chiller Items], [Lineage Key]) VALUES (0, 0, 0, 604, '2016-05-31', '2016-05-31', 0, 70511, 'USB rocket launcher (Gray)', 'Each', 1000, 25, 15, 25000, 3750, 15500, 28750, 1000, 0, 11); GO
Deploy the project to see how the SCD design affects query results. Right-click the project in Solution Explorer, select Deploy, and click Yes to overwrite the database on the server.
After deployment is complete, click the Browser page of the cube designer, and click the Reconnect button in the toolbar.
Now set up the query. Expand the Measures folder in the metadata pane, expand the Sale folder, and then drag Sale Count to the query window.
Next, expand Stock Item in the metadata tree. This top-level Stock Item node represents the dimension as indicated by the icon containing a cube and arrows pointing in three directions. Below this is another Stock Item node, which is an attribute as indicated by the set of blue squares arranged in a rectangular shape.
Drag the Stock Item attribute to the query window, and then drag the right edge of the column header to the right to widen the column for better visibility of each stock item’s name, as shown in Figure 1-39. Notice that the current count for USB Rocket Launcher (Gray) is 1079.
FIGURE 1-39 Metadata pane and query window on the Browser page of the cube designer showing sale counts by stock item
If you drag the Color attribute into the query window, you see the Sale Count is 1078 for the historical color Gray and 1 for the current color Steel Gray. The Stock Item attribute is an aggregate of all historical information, but you can also view the historical information for the changing attribute Color when you add it to the query.
Now drag the Stock Item column header from the query window to the left to remove it from the query, and then drag Stock Item History from the metadata pane to the query window. Drop it to the left of the Color column.
Next, filter the query to more easily locate the stock item in the previous query. To do this, select Stock Item in the Dimension drop-down list above the query window, and select Stock Item History in the Hierarchy drop-down list.
In the Operator drop-down list, select Begins With. In the Filter Expression box, type USB rocket launcher and press Enter. Then remove the Color column. Now you see two stock items with the same name, as shown in Figure 1-40, because these two dimension members have separate surrogate keys, and each surrogate key appears in one or more rows in the fact table.
FIGURE 1-40 Metadata pane and query window on the Browser page of the cube designer showing sale counts by stock item history
When you include the Color attribute in the query, you can see that each Stock Item History member corresponds to a different color. However, when Color is not included in the query, the appearance of two separate rows in the query results is potentially confusing to business users. For this reason, a common design approach for modeling SCDs is to hide the attribute associated with the surrogate key so that business users cannot accidentally use it in a query when exploring the cube.
In this case, you can hide the Stock Item History attribute by returning to the Stock Item dimension designer, selecting Stock Item History in the Attributes pane, and changing the AttributeHierarchyVisible property in the Properties window to False.
After you deploy the project, you can reconnect to the cube on the Browser page of the cube designer and confirm that the Stock Item History attribute no longer appears in the metadata pane, as shown in Figure 1-41.
FIGURE 1-41 Metadata pane in the Browser page of the cube designer showing the Stock Item dimension’s attributes after hiding the key attribute
Implement a dimension type
Each dimension has a Type property that is set by default to Regular. In general, when the property is not set to Regular, the Type setting instructs either the SSAS server, the client application, or both the server and client application how to perform certain types of calculations to alleviate the need to create complex calculated members, custom rollup formulas, or MDX scripts. Typically, you must also define a Type property for one or more attributes in the dimension to configure the desired behavior.
You can choose from the following Type values to define the applicable calculation behavior for a dimension:
Accounts Sets specific server-side behaviors for aggregating financial accounts. When you use this dimension type, you must also configure an attribute’s Type property as AccountType for the attribute containing members such as Assets, Liabilities, Revenue, and so on. SSAS uses the account type to determine whether to aggregate values for dates within a calendar or fiscal year for income statement accounts (such as revenue and expenses), or to compute point-in-time values for balance sheet accounts (such as assets and liabilities).
BillOfMaterials (BOM) Organizes a dimension by attributes that define how parts comprise a unit and requires you to define attributes such as organizational unit, BOM resource, and quantitative. A client application controls how this information drives dimension behavior.
Channel Represents channel information and requires a client application to determine how this information is used.
Currency Defines the currency conversion behavior for a dimension.
Customers Organizes attributes for information about customers, such as name, address, demographic details, and more. You must use a client application that supports this dimension type to see behavior change.
Geography Requires a client application to define dimension behavior, such as displaying aggregated data projected onto a geographic visualization. This dimension type contains attributes for geographical areas, such as postal code, city, or country.
Time Sets the behavior for calculations that are dependent on time. By setting the Type property appropriately for each attribute, you identify the attributes containing years, quarters, months, days, and so on.
Organization Defines a dimension containing attributes about subsidiary organizations, ownership percentages, and voting rights percentages. A client application controls how this information affects calculations or dimension behavior in an application.
Products Organizes a dimension by attributes describing characteristics of parts, such as color, size, category, and so on. This dimension type requires a client application to use this information.
Promotion Describes a dimension containing information about marketing promotions. This information is used only within a client application that supports this dimension type.
Quantitative Requires a client application that can use attributes containing quantitative values.
Rates Defines the dimension containing currency rate information that the SSAS server uses for currency conversions in conjunction with the Currency dimension type. Refer to the additional resource recommended for the Currency dimension type to learn how to use this dimension type properly.
Scenario Identifies a dimension containing attributes to define a scenario, such as budgeting or forecasting, and a version.
Utility Flags a dimension for a purpose to be defined by a client application.
The most common implementation of a dimension type other than the default of Regular is to set a Date dimension as a Time dimension type. Let’s add some additional attributes to this dimension and then explore the additional configuration required in the dimension to achieve the desired results.
To set the Type property of the Date dimension, double-click the Date.dim file in Solution Explorer, click the dimension name (the top level node) in the Attributes pane of the dimension designer, and then select Time in the Type drop-down list in the Properties window. You must also configure each attribute with the applicable type by selecting the attribute in the Attributes pane, and selecting the correct value, shown in Table 1-3, in the Type drop-down list. When you navigate through the list of values in the Type drop-down list, you must expand nodes within a type category to find the type you want. For example, you can expand Date, and then Calendar to find Months (shown in the table as Date/Calendar/Months), or expand ISO to find Iso8601WeekOfYear. In addition, change the Name property of each attribute if it is included in Table 1-3.
TABLE 1-3 Attribute configuration in the Date dimension
Attribute Property Value
ISO Week Number
Define attribute relationships
A best practice in multidimensional database development is the addition of user-defined hierarchies. A user-defined hierarchy (in which the user is you, as the developer, and not the business user who queries the cube) is a pre-defined navigation path for moving from summary to detail information. For example, you use a hierarchy to make it easier to view sales by year, then by month, and then by day, as shown in Figure 1-42.
FIGURE 1-42 Pivot table displaying sale counts by day, month, and year
Each level of a hierarchy corresponds to a separate attribute in the dimension. For example, in the hierarchy shown in Figure 1-42, the top level of the hierarchy is the Calendar Year attribute, the middle level is the Calendar Month attribute, and the bottom level is the Date attribute.
After creating a hierarchy, you might also need to define attribute relationships between each level in the hierarchy. Attribute relationships determine how SSAS calculates aggregate values within a cube and how it stores data on disk. For this reason, the presence or absence of attribute relationships potentially affects query performance.
Before you can work with attribute relationships, you must first create a hierarchy. To do this for the Date dimension, perform the following steps:
Open the dimension designer for Date, drag the Calendar Year attribute from the Attributes pane to the Hierarchies pane of the designer. When you drop it, a new hierarchy is created.
You can then add levels to the new hierarchy by dragging the Calendar Month attribute into the hierarchy object below the Calendar Year attribute, and then repeating this process with the Date attribute.
Right-click the Hierarchy caption for the new hierarchy in the Hierarchies pane, select Rename, and type Calendar to rename it.
You can create as many hierarchies as you like. Technically speaking, you can create only 2,147,483,647 user-defined hierarchies, but you probably do not need that many!
For now, create another hierarchy renamed as ISO Calendar by using the Calendar Year, ISO Week Number, and Date attributes, as shown in Figure 1-43.
FIGURE 1-43 User-defined hierarchies in the Date dimension
Notice the warning icon and best practice warning indicated by the wavy underscore below each hierarchy name to indicate a potential problem. If you hover the cursor over each warning, the following message displays: “Attribute relationships do not exist between one or more levels of this hierarchy. This may result in decreased query performance.” This is a very common warning.
Behind the scenes, SSAS sets up attribute relationships between the hierarchy levels and the key attribute where possible. To review the current definition for attribute relationships, click the Attribute Relationships tab in the dimension designer. As shown in Figure 1-44, the relationships between the key attribute, Date, and the attributes on the levels of each hierarchy are visible as arrows connecting Date to ISO Week Number, Calendar Year, and Calendar Month.
FIGURE 1-44 Attribute Relationships page of the dimension designer for the Date dimension
The hierarchy warnings on the Dimension Structure page displays because Calendar Year is part of both hierarchies, but does not connect directly to the attribute in the child level in either hierarchy. The dimension can process successfully in this state, but the warning message alerts you that there could be an adverse impact on query performance. With this database, the volume of data is too small for you to notice slow query performance. However, once you start working with fact tables having millions or billions of rows, the performance problem becomes more noticeable.
Before you decide whether you need to fix this particular warning, you must know what type of hierarchy you have created. A natural hierarchy is a hierarchy for which each attribute has a one-to-many relationship between dimension members on a parent level and dimension members on a child level, as shown in Figure 1-45. In the Date dimension, the Calendar hierarchy is a natural hierarchy because each Calendar Month member corresponds to one, and only one, member on the Calendar Year level.
FIGURE 1-45 Natural hierarchy
By contrast, an unnatural hierarchy is one for which there is the potential to have a many-to-many relationship between members on a parent and child levels. To create an unnatural hierarchy, open the Stock Item dimension designer and create a hierarchy named Color-Size with Color as the top level, and Size as the bottom level. Figure 1-46 illustrates the partial results of these attribute combinations in a hierarchy.
FIGURE 1-46 Unnatural hierarchy
You use an unnatural hierarchy to enable users to report on combinations of dimension members that are not hierarchical. That way, you can build into your multidimensional database the ability to produce the same result as a relational query that aggregates the quantity sold and group the aggregate value by two different, non-hierarchical columns, as shown in Listing 1-8.
LISTING 1-8 Retrieve aggregate results from a relational database by using a GROUP BY clause
USE WideWorldImportersDW; GO SELECT Color, Size, SUM(Quantity) AS QuantitySold FROM Fact.Sale s INNER JOIN Dimension.[Stock Item] si ON s.[Stock Item Key] = si.[Stock Item Key] GROUP BY Color, Size;
If you review the attribute relationships of an unnatural hierarchy, as shown in Figure 1-47, you can see there is no arrow connecting the attributes of the hierarchy’s level. Instead, each level’s attribute relates back to the key attribute, Stock Item History. In the case of an unnatural hierarchy, this lack of attribute relationships between the hierarchy’s attributes is appropriate.
FIGURE 1-47 Attribute relationships for the Stock Item dimension
Whenever you see the warning about non-existent attribute relationships between hierarchy levels, you should evaluate whether the hierarchy is natural or unnatural. If it is natural, you must fix the attribute relationships, but you can ignore the warning if the hierarchy is unnatural. However, you should test whether the performance of queries using the unnatural hierarchy is acceptable. If it is not, consider modify the dimension structure to transform the unnatural hierarchy into a natural hierarchy.
Let’s say that you want to optimize the natural hierarchies, Calendar and ISO Calendar, by fixing the attribute relationships. To do this, perform the following steps:
Open the Attribute Relationships page of the Date dimension designer. Then drag the ISO Week Number object in the diagram to the Calendar Year object and drag the Calendar Month object to the Calendar Year object to create the relationships shown in Figure 1-48.
FIGURE 1-48 Corrected attribute relationships for the Date dimension
These adjustments clear the warnings from the Dimension Structure page of the dimension designer. However, a new problem is introduced because the data is not currently structured to support a natural hierarchy for ISO Calendar.
To see the results caused by this problem, deploy the project. Deployment fails due to the following warning that displays in the Output window: “Errors in the OLAP storage engine: A duplicate attribute key has been found when processing: Table: ‘Dimension_Date’, Column: ‘ISO_x0020_Week_x0020_Number’, Value: ‘15’. The attribute is ‘ISO Week Number’.”
The best way to see the problem with the duplicate attribute key is to process the Date dimension manually so that you can extract the SQL query that SSAS generates, and review the query results in SSMS. First, right-click Date.dim in Solution Explorer, select Process, select Process Full in the Process Options drop-down list if it is not already selected, and click Run.
In the Process Progress dialog box, expand Processing Dimension ‘Date’ Completed, expand Processing Dimension Attribute ‘ISO Week Number’ Completed, expand SQL Queries 1, click the SELECT statement, and click View Details to see the full query, as shown in Figure 1-49.
FIGURE 1-49 Processing query in the View Details dialog box
Highlight the entire SELECT statement, and press CTRL+C to copy the statement to the clipboard. Click Close three times to close all open dialog boxes.
Next, open SSMS, connect to the database engine hosting the WideWorldImportersDW database, click the New Query button in the toolbar, select WideWorldImportersDW in the Available Databases drop-down list in the toolbar, paste the statement into the Query window. Modify the query to add an ORDER BY clause as shown in Listing 1-9, and click Execute.
LISTING 1-9 Review results of query generated by SSAS for populating the ISO Week Number attribute
SELECT DISTINCT [Dimension_Date].[ISO Week Number] AS [Dimension_DateISO_x0020_Week_x0020_Number0_0], [Dimension_Date].[Calendar Year Label] AS [Dimension_DateCalendar_x0020_Year_x0020_Label0_1] FROM [Dimension].[Date] AS [Dimension_Date] ORDER BY [Dimension_Date].[ISO Week Number];
As you can see by the partial results shown in Figure 1-50, each ISO Week Number value in the first column corresponds to four different Calendar Year values in the second column. Because of the attribute relationship that exists between ISO Week Number and Calendar Year, there can be only one Calendar Year parent member for each ISO Week Number child member. Otherwise, dimension processing fails.
FIGURE 1-50 Results of query that populates the ISO Week Number attribute
An easy way to fix this problem is to modify the key column for the ISO Week Number attribute to create a unique value for each ISO Week Number-Calendar Year combination. To do this, perform the following steps:
Open the Dimension Structure tab of the dimension designer, select ISO Week Number in the Attributes pane, click the KeyColumns box in the Properties window, and then click the ellipsis button.
In the Key Columns dialog box, click Calendar Year in the Available Columns List, click the > button, select Calendar Year in the Key Columns list, click the Up button to place Calendar Year ahead of ISO Week Number in the list, and click OK.
Next, click the NameColumn box in the Properties window, click the ellipsis button, select ISO Week Number, and click OK.
Last, deploy the project to update the multidimensional database on the server with your changes.
Like attributes, attribute relationships have properties. You can select an arrow in the diagram or select a relationship in the Attribute Relationships pane to view the following properties in the Properties window.
RelationshipType Determines how SSAS manages changes to the key column for a related attribute. The default value is Flexible, which allows changes to occur without requiring you to perform a full process of the dimension. (The implications of relationship types on dimension processing is described more fully in Chapter 4.) For example, if you expect Color to change frequently for a stock item, you can set the relationship between Stock Item History and Color to Flexible.
Your other option is to set this property to Rigid. In that case, any change in the relationship between an attribute and a related attribute requires you to fully process the dimension. You use this option when you expect the relationships between attributes rarely or never to change, such as the assignment of a date to a month, or when you expect values to rarely change. The date 2013-01-01 always remains a child of CY2013-Jan and never gets reassigned to CY2013.
Attribute Specifies the attribute on which the attribute relationship is based. In other words, this property defines the parent attribute in the relationship.
Cardinality Describes the relationship has a many-to-one or one-to-one relationship between the child attribute and the parent attribute. The default is Many to indicate a many-to-one relationship, such as exists between Date and Calendar Month in which there are many dates related to a single month. This is the most common scenario that you find in attribute relationships.
You change the value to One when you want to specify a one-to-one relationship. For example, if you had an attribute for telephone number in the Customer dimension, you would define the relationship between that attribute and the Customer attribute by setting the Cardinality property to one.
Name Provides a unique name for the relationship. This value defaults to the name of the parent attribute.
Visible Determines whether a client application can access the parent attribute as a member property in an MDX query. Working with member properties is described in more detail in Chapter 3.