Managing Partitions and Database Processing in Microsoft SQL Server 2008 Analysis Services
In this chapter, you will learn how to:
Configure a dimension to use ROLAP or MOLAP storage.
Configure a measure group partition to use ROLAP, HOLAP, or MOLAP storage.
Configure rigid attribute relationships.
Process a dimension using the Process Update processing option.
Process a partition using the Process Incremental processing option.
Configure proactive caching.
Create, merge, and manage measure group partitions.
Import an Analysis Services design into a Business Intelligence Development Studio project.
In Chapter 11, “Retrieving Data from Analysis Services.” you learned how to use client tools and MDX queries to get data out of Analysis Services and into a PivotTable dynamic view or other report. In order to get data out of the Analysis Services database, you had to first put data into the Analysis Services database. In previous chapters, every time you were ready to apply your changes, you simply deployed your solution and let the Analysis Services server work its magic to determine how to go about loading the data from the relational data warehouse into the Analysis Services database. How Analysis Services actually processed and stored that data has been a conceptual black box up to this point. Knowledge is power, however, and that adage certainly holds true in the domain of Microsoft SQL Server 2008 Analysis Services. The more you know about what goes on within the black box of the Analysis Services server, the better you’ll be able to troubleshoot unusual situations and find solutions to difficult problems.
This chapter explains in broad terms what goes on inside that black box. You will learn about how Analysis Services physically processes and stores data. The explanations are simplified, but they encompass some of the most complex topics in this book. If nothing else, this chapter can give you an appreciation for the sophistication and elegance of the Analysis Services server design.
Working with Storage
An Analysis Services database uses four types of information: the design of database objects (dimensions, cubes, measure groups, and so on), dimension data, detail values, and aggregated values.
Analysis Services always stores the design of database objects in XML files on the Analysis Services server, but it allows you to decide where the dimension data, details values, and aggregated values are stored. You can choose from three storage modes that correspond to different physical storage locations:
ROLAP (relational OLAP) leaves dimension data and detail values in the relational database. If the relational database is SQL Server Database Engine, Analysis Services can also create indexed views in the relational database to store aggregated values.
HOLAP (hybrid OLAP) leaves the detail values in the relational fact table but stores aggregated values on the Analysis Services server. HOLAP is not an option for dimension storage.
MOLAP (multidimensional OLAP) stores dimension data, detail values, and aggregated values on the Analysis Services server.
No matter which storage mode you choose, the database design is stored on the Analysis Services server. It’s the database design that makes Analysis Services data appear to be in a cube to a person running a query. That means the storage mode is invisible to client applications—that is, applications that query the cube. The client application always sees the cube. Deciding which storage option to use is primarily based on processing and query performance and has no effect on how client applications will interact with the cube.
Because a client application can’t tell which storage mode you have chosen, you can change the storage mode without affecting any client applications. After you specify storage and start using the cube, you can still change your mind later and switch to a different storage type. Because a cube appears to the client application as a single, logical entity, you can use different storage modes for different portions of a cube. To avoid doing that, you must use multiple partitions. You will learn about creating partitions later in this chapter.
Understanding Dimension Storage Modes
You can choose to store dimension data using ROLAP or MOLAP storage mode. If you choose ROLAP storage, the dimension data will be stored in the source dimension tables. If you choose MOLAP storage, the dimension data will be stored on the Analysis Services server. You can use dimensions with ROLAP or MOLAP storage with partitions that have ROLAP, HOLAP, or MOLAP storage with one restriction: If the partition uses MOLAP storage, it must have one dimension that uses MOLAP storage. If you use a dimension with ROLAP storage with a partition that uses MOLAP storage, any time you execute a query that uses a leaf level member from an attribute in the ROLAP dimension, Analysis Services will have to retrieve the values from the relational database. For example, if Product was a ROLAP dimension and a user included the members of the Product Color attribute hierarchy in a query, Analysis Services would have to retrieve both the dimension data and the fact data directly from the relational database, and you would lose some of the query performance benefits that MOLAP storage provides. The only time you want to use the ROLAP storage mode for a dimension is when a dimension contains an extremely large number of members or when you want a measure group that uses only ROLAP storage for its dimensions and partitions.
In the next procedure you will learn how to change the storage mode to ROLAP for a dimension. Later in this chapter, you will work with cubes that contain partitions using ROLAP and MOLAP storage. You will then browse the cubes and see that to a client application, the cubes appear alike.
Modify dimension storage settings
Use Business Intelligence Development Studio (BIDS) to open the AdventureWorks BI solution contained in the C:\Microsoft Press\Analysis Services 2008 SBS\Chapter 14\AdventureWorks BI folder. The AdventureWorks SSAS database contains two dimensions, Geography MOLAP and Geography ROLAP, which are identical. You will change the storage mode of the Geography ROLAP dimension to Real-time ROLAP. The other three dimensions, Date, Employee, and Product, use MOLAP storage.
In Solution Explorer, expand the Dimensions folder, right-click Geography ROLAP.dim, and select View Designer.
In the Attributes pane of the Dimension Designer, right-click the Geography ROLAP dimension and select Properties.
In the Properties window, select the ProactiveCaching property and click the ellipsis button (...) that appears on the right.
In the Dimension Storage Settings dialog box, drag the slider to Real-time ROLAP and click OK. The Properties window will now show that the value of the ProactiveCaching property is Real-time ROLAP, and the value of the StorageMode property is Rolap.
Close the Dimension Designer and save the changes to the Geography ROLAP dimension.
You were able to change the storage mode of the dimension without making any other change to its design. Query performance is the only difference that a user should perceive between a dimension that uses MOLAP storage and one that uses ROLAP storage.
In the next section, you will learn how to change the storage mode of a measure group partition.
Understanding Partition Storage Modes
Choosing a storage mode is not as difficult as it might seem. Use ROLAP storage if your analytical solution requires real-time data or if you have insufficient disk storage or processing time to process a large volume of data into MOLAP storage. The ROLAP storage with aggregations option is rarely used because aggregations in a relational database can be bulky, and they usually only marginally improve query performance. You might choose the ROLAP storage with aggregations option if you’re learning about aggregations and want to physically look at the aggregation tables, but this storage mode is rarely used in practice.
Aggregations in both MOLAP and HOLAP are identical—the only difference is where the detail-level values are stored. If you count the space required by the original warehouse as well as the space needed for the OLAP cubes, MOLAP does consume more storage space than HOLAP because the MOLAP storage option duplicates the values from the fact table. Analysis Services, however, is very efficient in how it stores data. An Analysis Services database will often use less than half of the storage space of its source database. With a very large warehouse database, you could process the data into a MOLAP cube and then archive and remove the original warehouse. By using the MOLAP storage option, you could actually end up using a fraction of the original storage space.
If you have a large, permanent warehouse—and if using aggregations can satisfy most queries—you may want to consider HOLAP storage. Queries that must retrieve detail data are slower than if the cube used MOLAP storage, but if they’re infrequent, the performance gain might not be worth the incremental storage requirements. In addition, processing a MOLAP cube can take more time than processing a HOLAP cube. While developing an OLAP cube, you may want to use HOLAP storage simply to speed up processing during the time that you process frequently. When you have completed the database design, you can switch to MOLAP storage to maximize query performance.
The AdventureWorks SSAS database contains two cubes, ROLAP and MOLAP, that both use MOLAP storage. The cubes are identical except that the ROLAP cube contains the Geography ROLAP dimension, whereas the MOLAP cube contains the Geography MOLAP dimension. In the next procedure, you will change the storage mode of one of the partitions in the ROLAP cube. You will then browse the cubes and see that they appear to be identical.
Modify partition storage settings
In Solution Explorer, expand the Cubes folder, right-click ROLAP.cube, and select View Designer.
In the Cube Designer, click the Partitions tab. The Reseller Sales measure group contains one partition, Fact Reseller Sales. Notice that the partition’s storage mode is MOLAP. In the next three steps, you will change the storage mode to ROLAP.
Right-click the Fact Reseller Sales partition and click Storage Settings.
In the Partition Storage Settings dialog box, drag the slider to Real-time ROLAP and click OK.
Verify that the partition’s Storage Mode column displays ROLAP, like this:
Close the Cube Designer and save the changes to the ROLAP cube. In the next procedure, you will browse these two cubes, so you need to deploy and process the AdventureWorks SSAS database.
On the Build menu, select Deploy AdventureWorks SSAS. If the AdventureWorks SSAS database already exists on the server, a dialog box may appear warning that the database will be overwritten. If the warning appears, click Yes. The prior version of the database will be deleted and the current deployment will continue.
You are now ready to browse the cubes and see that they appear identical.
In the next procedure, you will browse the cubes using SQL Server Management Studio (SSMS).
Browse the ROLAP and MOLAP cubes
On the Microsoft Windows task bar, click Start, point to All Programs, expand the Microsoft SQL Server 2008 folder, and then select SQL Server Management Studio.
In the Connect To Server dialog box, select Analysis Services from the Server Type list. In the Server Name text box, type localhost and click Connect.
In Object Explorer, expand the Databases folder, expand the AdventureWorks SSAS database, and expand the Cubes folder.
In Object Explorer, right-click the ROLAP cube and select Browse.
In the Metadata pane, expand the Measures folder, expand the Reseller Sales measure group, and drag the Reseller Order Quantity measure to the totals area of the Report pane.
In the Metadata pane, expand the Date dimension and drag the Calendar Year attribute to the columns area of the Report pane.
Expand the Geography ROLAP dimension and drag the Geography hierarchy to the rows area of the Report pane. Click the Report pane and then right-click the Area attribute and select Expand Items. The report should look like this:
Repeat steps 4 through 7 to create one more report that uses the MOLAP cube. In step 4, select the MOLAP cube. In step 7, use the Geography MOLAP dimension.
You should now have a window in SSMS for the ROLAP cube and a window for the MOLAP cube. The values in both of the reports should be identical. Leave these reports open—you will use them in the next procedure.
Changing Data in a Warehouse
When you process an Analysis Services database, you update the information in the dimensions and measure groups based on the data stored in the data warehouse and the design of your database. If you change the design—for example, if you add a measure or a dimension to a cube—you must process the affected portions of the database. If the data in the data warehouse changes, as it inevitably will, you will also need to process the affected portions of your Analysis Services database.
The information in a data warehouse is almost always time-dependent. That means that at the very least, you’ll continually add new time periods to your data warehouse. In time, you might also add additional products or additional geographic regions. When the data warehouse changes, you need to process the database to resynchronize your Analysis Services database with the relational data warehouse.
The FactResellerSales table in the SSAS2008SBS database contains data for six countries through June 2011. In the SSAS2008SBS database, the DimSalesTerritory dimension table includes only the six countries that appear in the fact table. The DimDate dimension table, however, includes months through December 2011. It is not uncommon in a warehouse to include months in the date dimension through the end of the current quarter or year, but to add members to other dimensions only as they are needed.
Included in the C:\Microsoft Press\Analysis Services 2008 SBS\Chapter 14\SQL folder is a SQL script named Update Warehouse 1.sql. This script adds Mexico to the DimSalesTerritory table and an additional row in the fact table for a sale in Mexico for December 2011. Inserting these records into the source database for the cubes simulates, on a very small scale, the load operations that occur regularly in a production data warehouse.
In the next procedure, you’ll execute queries that insert data into the warehouse and then observe the effect of changed data on the cubes. You will see that the storage mode for a dimension or measure group partition affects what data is displayed by a cube when source data is changed.
Insert source data
On the SSMS File menu, point to Open and select File. In the Open File dialog box, browse to the C:\Microsoft Press\Microsoft Press\Analysis Services 2008 SBS\Chapter 14\SQL folder, select Update Warehouse 1.sql, and click Open.
On the Query menu, point to Connection and select Connect. In the Connect To Database Engine dialog box, change the Server Name to localhost and click Connect.
On the SSMS toolbar, click Execute. After the query executes successfully, close the Update Warehouse 1.sql Query window.
Select the ROLAP [Browse] window and on the Browser toolbar, click Reconnect. With Real-time ROLAP storage mode, changes to the source database immediately appear in the cube, because the dimension and partition data is stored only in the source database. The newly added country, Mexico, is displayed on rows in the Geography ROLAP dimension, and the number of items in the newly added order, 122, is displayed at the intersection of Mexico and CY 2011. Aggregate values have also been updated. The total for CY 2011 has increased by 122 to 45,252 and the total for all years and all countries has increased by 122 to 214,500.
Select the MOLAP [Browse] window and on the Browser toolbar, click Reconnect. The grand total for Order Quantity for all countries is still 214,378, and Mexico doesn’t appear in the list of countries. All the values are unchanged from before the warehouse changed.
The MOLAP cube uses MOLAP storage, and it behaves as if you had not changed the data source. When you use MOLAP storage, with or without aggregates, the cube is completely detached from the data warehouse. You can even delete the warehouse database without affecting the Analysis Services database. With MOLAP storage, you must process a dimension or partition to resynchronize with the data warehouse.