top of page

Data Modeling with Power Pivot - Getting Started



MS Excel has been undoubtedly the most popular spreadsheet program around, and due to its popularity, it has been a go-to tool for data analysts. Modern Excel (especially Microsoft 365) is full of useful features that make data analysis and visualization super easy.


Since 2013, MS Excel comes with some powerful data transformation and modeling tools. These features are known as Power Query & Power Pivot. Both are quite feature-loaded and relatively easy to learn.


This article is about how to get started with these two features. We can begin to analyze data for just under two minutes!



What is Power Pivot?

Most of us are familiar with Pivot tables in Excel. It is a handy tool when it comes to summarizing and analyzing the data quickly.



The usual pivot tables can summarize data from a single table only. Most of the cases, our data is stored across multiple related tables. So, before summarizing and analyzing data using a pivot table, we first combine multiple tables (using VLOOKUP).


Power Pivot simplifies this step. We can connect multiple tables and then use them as our regular pivot table.


In this following example, we can quickly see the total orders(from Orders table) by the managers(from Users table) that have returned(from Returns table) by each customer segment(from Customer table)


Final Output:


And the best part is zero formulas.



The entire steps divided into three stages:

  • Connecting with data

  • Transforming the data

  • Modeling the data

But, first thing first:


Understanding Data

For this exercise, we are taking Sample Superstore Sales

We have four tables in the data model:

· Orders: Contains a list of all the orders

· Customer: Contains customer details

· Returns: Contains a list of all the orders returned

· User: Contains Region-wise Manager



Step 1: Connecting with data

Let us open a new workbook


Data > From File > From Workbook



Navigate to the file location > Select the file > Import


Since we need to bring all four sheets in the model, we check the option: Select Multiple items


Post the selection of the table, we have two options: Load and Transform


Load

Use this option when our data does not require any transformations.







o Load: It loads all the selected data tables in the form of Excel tables

o Load To: It only creates a connection with the data source and not add any data tables in the current Excel file.













Also, check the option: Add this data to the Data Model. It loads the table into the Power Pivot data model.


From a data modeling perspective, it is advisable to select Load To instead of Load unless we need data in the form of a table.


Transform Data


Use this option when our data requires transformation. This option takes us to the ETL tool: Power Query.

We should always load the data to Power Query (Transform Data) first and ensure all the tables and columns are appropriately structured.

For our activity, the data table requires certain transformations; hence we select Transform Data.

Step 2: Transforming Data

When we click on Transform Data, Power Query window open on a separate window:


1. Queries: Shows list of all the data tables loaded

2. Data Preview: Shows the top 1000 rows of data from the source file

3. Query Settings: Shows the name of the query and all the transformations applied on each table (Under Applied Steps)

Applied Steps are something like a macro recording. It lists down all the actions performed on the data table, and when we hit refresh, it runs the entire steps and presents the final output



Assigning the Right Data Type

One of the critical elements of analytics is to assign the right data types for each column.


Power Query is intelligent enough to detect and assign the appropriate data type. The symbol denotes the data type, located before the column name:


Following are the meanings of each data type symbol:














To change the existing data type, click on the symbol, and select the appropriate data type:















In the current data model, I have made the following changes:

  • Transform CustID as Text in Customer and Orders table

  • Transform OrderID as Text in Orders and Returns table

  • Transform Order Date and Ship Date as Date (from Date/Time) in the Orders table

  • Promoted the first row in the Users table as a header


You may check the articles on Power Query transformation.

Once we are sure with the transformations, it is time to load the data into the Power Pivot data model


Home > Click on the down arrow on Close & Load > Close & Load To










Select:

  • Only Create Connection: This will only connect with the source file and do not import the data table

  • check the option for Add this data to the Data Model to load the data tables into the Power Pivot








Note: We cannot access any Excel window if the Power Query window is open.


How to enable Power Pivot add-in?

Skip this step if you have already enabled the Power Pivot Add-in

File > Options > Add-ins > Manage > COM Add-ins > Go


Select the Power Pivot for Excel > Ok


Step 3: Data Modelling

It is time to open the Power Pivot window.


Power Pivot > Manage


Alternatively, we can access it through the Data ribbon also:



It opens the Power Pivot window


Establish the relationship between the tables

Switch to the Diagram View

Using mouse drag & drop, we establish the relationship between the tables:


Manage Relationship view:


Please note that Power Pivot does not support Many-to-Many relationships. It supports only a Many-to-One and One-to-One relationship.

That’s it! Data modeling is complete.


Now, click on Home > Pivot Table:


Click Ok


Summarizing & Analyzing Data

Now, all we need to do is drag and drop fields from multiple tables to summarize the data:


We have data from four tables, and Power Pivot enables us to connect and analyze data in a few simple steps.


What is next?

Power Pivot has some extensive data modeling capabilities, and when combined with Power Query, it packs some serious punch. The next article will build on this, where we learn writing DAX measures (Power Pivot formula engine).



Related articles



Comentários


bottom of page