Creating a basic pivot table

In this chapter, you will:

  • Format your source data before creating a pivot table

  • Learn how to create a basic pivot table

  • Understand the Recommended PivotTable and the Analyze Data features

  • Use slicers to filter your report

  • Keep up with changes in the data source

  • Share the pivot cache

  • Save time with PivotTable tools

When you have a family portrait taken, the photographer takes time to make sure that the lighting is right, the poses are natural, and everyone smiles their best smile. This preparation ensures that the resulting photo is effective in its purpose.

When you create a pivot table report, you’re the photographer, taking a snapshot of your data. By taking time to make sure your data looks its best, you can ensure that your pivot table report is effective in accomplishing the task at hand.

One of the benefits of working in a spreadsheet is that you have the flexibility of laying out your data to suit your needs. Indeed, the layout you choose depends heavily on the task at hand. However, many of the data layouts used for presentations are not appropriate when used as the source data for a pivot table report.

Keep in mind that a pivot table report that is created successfully is not necessarily effective. A host of things can go wrong as a result of bad data preparation, from inaccurate reporting to problems with grouping and sorting.

Format your source data before creating a pivot table

Let’s look at a few of the steps you can take to ensure that you end up with a viable pivot table report.

Ensuring that data is in a Tabular layout

A perfect layout for the source data in a pivot table is a Tabular layout. In Tabular layout, there are no blank rows or columns. Every column has a heading. In most cases, every field has a value in every row. Columns do not contain repeating groups of data.

Figure 2-1 shows an example of data structured properly for a pivot table. There are headings for each column. Even though the values in D2:D6 are all the same model, the model number appears in each cell. Month data is organized down the page instead of across the columns.

FIGURE 2.1 This data is structured properly for use as a pivot table source.

Tabular layouts are database-centric, meaning you would most commonly find these types of layouts in databases. These layouts are designed to store and maintain large amounts of data in a well-structured, scalable format.
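The layout rules in this section can also be checked outside Excel. The following is a minimal sketch in Python, assuming the worksheet has already been read into a list of rows with the headings in the first row. The function names and the sample values are hypothetical and are used only for illustration.

```python
def is_blank(value):
    """Treat None and whitespace-only text as a blank cell."""
    return value is None or str(value).strip() == ""


def check_tabular(rows):
    """Return a list of layout problems; rows[0] is assumed to hold headings."""
    if not rows:
        return ["no data"]
    problems = []
    headings, data = rows[0], rows[1:]

    # Every column needs a heading.
    for col, heading in enumerate(headings, start=1):
        if is_blank(heading):
            problems.append(f"column {col} has no heading")

    # No blank rows inside the data (row 1 is the heading row).
    for row_number, row in enumerate(data, start=2):
        if all(is_blank(v) for v in row):
            problems.append(f"row {row_number} is blank")

    # No blank columns inside the data.
    for col in range(len(headings)):
        if all(col >= len(row) or is_blank(row[col]) for row in data):
            problems.append(f"column {col + 1} has no data")
    return problems


# Invented sample data, for illustration only.
good = [
    ["Region", "Model", "Month", "Sales"],
    ["North", "2500P", "Jan", 100],
    ["North", "2500P", "Feb", 120],
]
bad = [
    ["Region", "", "Sales"],
    ["North", "", 100],
    ["", "", ""],
    ["South", "", 90],
]
print(check_tabular(good))
print(check_tabular(bad))
```

An empty list means that none of the three problems was found. The check does not look for repeating groups, which depend on what the headings mean.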

Avoiding storing data in section headings

Examine the data in Figure 2-2. This spreadsheet shows a report of sales by month and model for the North region of a company. Because the data in rows 2 through 24 pertains to the North region, the author of the worksheet entered the title North as a single cell in C1. This approach is effective for displaying the data, but it’s not effective for a pivot table data source.

FIGURE 2.2 Region and model data are not formatted properly in this data set.

Also, in Figure 2-2, the author was very creative with the model information. The data in rows 2 through 6 applies to Model 2500P, so the author entered this value once in A2 and then applied a fancy vertical format combined with Merge Cells to create an interesting look for the report. Again, although this is a cool format, it is not useful for pivot table reporting.

In addition, the worksheet in Figure 2-2 is missing column headings. You can guess that column A is Model, column B is Month, and column C is Sales. However, for Excel to create a pivot table, this information must be included in the first row of the data.
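If a report like this one has to be repaired in bulk, the fix can be scripted. The sketch below is one possible approach in Python, not a feature of Excel. It assumes the sheet has been read into a list of rows, that a merged block shows its value only in the first cell and blanks in the rest, and that the region title is the only filled cell in the first row. The function name, the second model number, and the sales figures are invented for illustration.

```python
def normalize_report(rows, headings=("Region", "Model", "Month", "Sales")):
    """Rebuild a titled report with merged model cells as tabular records."""
    # The region title is the only filled cell in the first row (C1 in the text).
    region = next(v for v in rows[0] if v not in (None, ""))
    records = [list(headings)]  # supply the missing heading row
    model = None
    for raw_model, month, sales in rows[1:]:
        if raw_model not in (None, ""):
            model = raw_model  # the first cell of a merged block carries the value
        # Repeat the region and model on every row.
        records.append([region, model, month, sales])
    return records


# Invented sample data that mimics the layout described above.
report = [
    [None, None, "North"],
    ["2500P", "Jan", 100],
    [None, "Feb", 120],
    ["3002C", "Jan", 80],
    [None, "Feb", 95],
]
for record in normalize_report(report):
    print(record)
```

Each output row now carries its own region and model, and the first row holds the column headings.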

Avoiding repeating groups as columns

The format shown in Figure 2-3 is common. A time dimension is presented across several columns. Although it is possible to create a pivot table from this data, this format is not ideal.

FIGURE 2.3 This matrix format is common but not effective for pivot tables. The Month field is spread across several columns of the report.

The problem is that the headings spread across the top of the table pull double duty as column labels and actual data values. In a pivot table, this format would force you to manage and maintain six fields, each representing a different month.
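Converting a matrix like this into a single Month field and a single Sales field is often called unpivoting. The following Python sketch shows the idea under simple assumptions: the data is a list of rows, the first column identifies the model, and every remaining heading is a month. The function name, the parameter names, and the sample values are hypothetical.

```python
def unpivot(rows, id_columns=1, variable_name="Month", value_name="Sales"):
    """Turn one column per month into one row per model and month."""
    headings = rows[0]
    id_headings = list(headings[:id_columns])
    month_headings = headings[id_columns:]
    tall = [id_headings + [variable_name, value_name]]
    for row in rows[1:]:
        ids = list(row[:id_columns])
        # Each month heading becomes a data value in the Month field.
        for month, value in zip(month_headings, row[id_columns:]):
            tall.append(ids + [month, value])
    return tall


# Invented sample data, for illustration only.
matrix = [
    ["Model", "Jan", "Feb", "Mar"],
    ["2500P", 100, 120, 90],
    ["3002C", 80, 95, 70],
]
for record in unpivot(matrix):
    print(record)
```

The result has three fields no matter how many months the matrix contains, so adding a month adds rows rather than a new field to manage.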

Eliminating gaps and blank cells in the data source

Delete all empty columns within your data source. An empty column in the middle of your data source causes your pivot table to fail on creation because the blank column, in most cases, does not have a column name.

Delete all empty rows within your data source. Empty rows may cause you to inadvertently leave out a large portion of your data range, making your pivot table report incomplete.

Fill in as many blank cells in your data source as possible. Although filling in cells is not required to create a workable pivot table, blank cells are generally errors waiting to happen. A good practice is to represent missing values with a logical missing-value code wherever possible.
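The three cleanup steps in this section can be sketched in Python as follows. This is an illustration under stated assumptions rather than a prescribed method: the data is a list of rows with headings first, and the missing-value codes used here ("N/A" for text and 0 for the Sales field) are choices made for the example. The function names and sample values are hypothetical.

```python
def is_blank(value):
    """Treat None and whitespace-only text as a blank cell."""
    return value is None or str(value).strip() == ""


def close_gaps(rows, codes=None, default_code="N/A"):
    """Drop empty rows and columns, then fill remaining blanks with a code."""
    codes = codes or {}
    # Drop rows in which every cell is blank.
    kept = [list(row) for row in rows if not all(is_blank(v) for v in row)]
    if not kept:
        return []
    # Pad short rows so every row has the same number of cells.
    width = max(len(row) for row in kept)
    for row in kept:
        row.extend([None] * (width - len(row)))
    # Keep only columns that hold at least one non-blank cell.
    keep_cols = [c for c in range(width) if not all(is_blank(r[c]) for r in kept)]
    headings = [kept[0][c] for c in keep_cols]
    result = [headings]
    for row in kept[1:]:
        # Use the code assigned to the field, or the default code.
        result.append([
            codes.get(h, default_code) if is_blank(row[c]) else row[c]
            for h, c in zip(headings, keep_cols)
        ])
    return result


# Invented sample data with an empty column, an empty row, and blank cells.
raw = [
    ["Region", "", "Model", "Sales"],
    ["North", None, "2500P", 100],
    [None, None, None, None],
    ["South", None, "", None],
]
for record in close_gaps(raw, codes={"Sales": 0}):
    print(record)
```

A per-field code is used so that a numeric field such as Sales is not filled with text.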

Applying appropriate type formatting to fields

Formatting fields appropriately helps you avoid a whole host of possible issues, from inaccurate reporting to problems with grouping and sorting.

Make certain that any fields to be used in calculations are explicitly formatted as a number, currency, or any other format appropriate for use in mathematical functions. Fields containing dates should also be formatted as any one of the available date formats.
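When the source data arrives as text, for example from an exported file, the values may need to be converted before they behave as numbers and dates. The Python sketch below illustrates that conversion. It assumes numbers use a comma as the thousands separator and dates are written as month/day/year. Both are assumptions about the incoming data, and the function names and sample values are hypothetical.

```python
from datetime import datetime


def to_number(text):
    """Convert text such as '1,250.50' or ' 300 ' to a float."""
    return float(str(text).replace(",", "").strip())


def to_date(text, fmt="%m/%d/%Y"):
    """Convert text such as '01/15/2024' to a date (format is an assumption)."""
    return datetime.strptime(str(text).strip(), fmt).date()


def apply_types(rows, converters):
    """Apply a converter to each field named in converters; leave others as-is."""
    headings = rows[0]
    typed = [list(headings)]
    for row in rows[1:]:
        typed.append([
            converters[h](v) if h in converters else v
            for h, v in zip(headings, row)
        ])
    return typed


# Invented sample data in which every value is stored as text.
text_rows = [
    ["Model", "Date", "Sales"],
    ["2500P", "01/15/2024", "1,250.50"],
    ["2500P", "02/15/2024", " 300 "],
]
typed_rows = apply_types(text_rows, {"Date": to_date, "Sales": to_number})
total_sales = sum(row[2] for row in typed_rows[1:])
print(typed_rows[1][1].isoformat(), total_sales)
```

Once the Sales values are real numbers they can be summed, and once the Date values are real dates they can be sorted and grouped in calendar order.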

Summary of good data source design

The attributes of an effective tabular design are as follows:

  • The first row of your data source is made up of field labels or headings that describe the information in each column.

  • Each column in your data source represents a unique category of data.

  • Each row in your data source represents one record, with an individual item in each column.

  • None of the column names in your data source double as data items that will be used as filters or query criteria (for example, names of months, dates, years, names of locations, or names of employees).