Design a multidimensional business intelligence (BI) semantic model
- 9/7/2017
Chapter summary
Before you start building a multidimensional database, you should spend time designing a star schema to support business requirements in an OLAP environment. The star schema consists of one or more fact tables and multiple dimension tables that you populate from OLTP systems by using ETL processes. In a multidimensional project that you create in SSDT, you connect to this star schema by creating a data source, and then reference tables in the star schema by creating a data source view. Then you can add dimension objects to your project by using the Dimension Wizard and add a cube by using the Cube Wizard. You can then fine-tune the dimension and cube objects by configuring properties in the dimension and cube designers, respectively. You must deploy your SSDT to create the multidimensional database and its objects on the SSAS server.
The storage model you choose determines not only how SSAS stores data, but also what type of data it stores. MOLAP is the default storage model that stores both detail and aggregate data for the optimal data retrieval. SSAS stores only metadata when you use the ROLAP storage model, creates aggregations in the relational data source, and retrieves requests for data from the data source by translating MDX queries into platform-specific SQL statements. ROLAP is beneficial when you need near real-time access to data. SSAS stores aggregate data for the HOLAP storage model and determines at query time whether to retrieve aggregate data from SSAS or from the relational source after converting the MDX query into a SQL statement. HOLAP is useful when most queries require aggregate data.
SSAS supports many different types of dimensional models. The exam tests your knowledge regarding the selection and configuration of the following types:
Fact Use this dimensional model when you want to include a degenerate dimension in your database. On the Dimension Usage page of the cube designer, you must relate the dimension a measure group by using the Fact relationship type.
Parent-child Use this dimensional model when a dimension table includes a foreign-key relationship to itself. Add the column containing the foreign key as an attribute in the dimension and set its Usage property to Parent. Optionally, you can configure properties applicable only to a parent attribute.
Roleplaying Use this dimensional model when you have a dimension represented multiple times in a fact table in different contexts. For example, you might have Invoice Date and Delivery Date based on the same Date dimension. You create one dimension object in your project, but add the dimension as many times as applicable to the cube. On the Dimension Usage page of the cube designer, you relate each cube dimension to the same measure group by using the Regular relationship type, but define different granularity attributes, which relate to different measure group columns.
Reference Use this dimensional model when you need to support slice and dice by a dimension for which no foreign key exists in the fact table although it is related to an intermediate dimension having a foreign key in the fact table. The key attribute for the reference dimension must exist as a non-key attribute in the intermediate dimension. On the Dimension Usage page of the cube designer, you define a Referenced relationship between the reference dimension and the measure group, and specify the intermediate dimension.
Data mining Use this dimensional model when you want to use a data mining algorithm to create a dimension useful for slicing and dicing, such as clusters of customers. You must first create a data mining model to define the data mining technique and columns to which the data mining algorithm is applied. The Data Mining Wizard allows you to create a dimension in which to store the results and optionally a cube.
Many-to-many Use this dimensional model when you need to rollup measures by different dimension attributes without overcounting the results in aggregations. Typically, you must add a bridge table as a measure group to the cube to support this structure and hide the measure group from the cube. On the Dimension Usage page of the cube designer, define a Regular relationship between the dimension and the bridge table and a Many-To-Many relationship between the dimension and the original measure group. You specify the bridge table as the Intermediate Measure group for the many-to-many relationship.
Slowly changing dimension Use this dimensional model when you need to track changes to an attribute over time for a Type 2 SCD design. In the dimension design, you must include the key attribute as well as the attribute containing the business key in the source table, and then hide the key attribute by setting its AttributeHierarchyVisible property to False.
Use the Type property for dimension and attribute objects when you need SSAS or client applications to invoke a specific behavior when that object is queried. The most common dimension type to implement is Time, although you should also be familiar with the concepts for implementing the Accounts and Currency dimension types.
After you add a user-defined hierarchy to a dimension, you should review and correct attribute relationships as necessary. A user-defined hierarchy provides a pre-defined navigation path that simplifies cube exploration in a client application. SSAS uses corresponding attribute relationships to optimize both data storage and data retrieval for natural hierarchies.
Use the cube designer to add and configure measures and measure groups after initially creating a cube by using the Cube Wizard. Importantly, configure the AggregateFunction and FormatString properties correctly for each measure to ensure values are aggregated appropriately and display legibly in client applications. You can also add calculated measures to include business logic for non-scalar values that are important to business analysis.
SSAS includes the following aggregate functions to support semiadditivity: Min, Max, ByAccount, AverageOfChildren, FirstChild, LastChild, FirstNonEmpty, and LastNonEmpty. A semiadditive aggregation adds values together across one dimension, but not across others. Common use cases for semiadditive behavior are financial account reporting and inventory analysis.