Before you learn why and how to Keep Dimensions & Facts tables separate in a Power Pivot or Power BI model, it is important you understand the difference between Facts and Dimension tables.
A facts table contains data about something that happened that you want to analyse. An example of a facts table would be a sales table or a budget table. Facts tables often contain quantitative values that you can aggregate, and keys that can be used to create relationships with dimension tables
The table below is a Facts sales table. It contains values such as Units sold and Total sales which you can aggregate, and it contains keys such as Customer and Product ID which can be used to create relationships with facts tables.
A dimensions table is something that describes a fact. For Example, the name of a customer or product. Dimension tables are used to slice, dice and filter facts tables. They are mostly qualitative and non-numerical in nature.
The table below is a customer dimension table. It contains a customer ID, which in this case in the unique key for the table, and other descriptive attributes of customers.
Facts tables and Dimension tables are joined together in Power Pivot or Power BI by the use of a relationship. The most optimal relationship type in these tools is a one to many relationship. Where the dimensions table is the one side and the facts tables in the many side. Other relationships types, such as Many to many and the use of bidirectional filters are not optimal and if you don’t know what you are doing, like really really know, stay away from them.
Looking at our Dimension table, the Customer ID is unique for each customer and only appears in the table once. When in the facts sales tables, the customer ID can occur many times. A one to many relationship can be set up between the customer ID on the facts and dimensions tables.
It’s Not Excel
A typical Excel user would be to adding news columns of data to a table, using the likes of VLOOKUP, or even the new XLOOKUP. They do this so they can analyse data by different criteria. Criteria that are not found in the initial table.
Look at the two tables below, a sales table and a customer table. Let’s say in Excel we wanted to analyse our total sales by male or female customers. The customer sex can be found in the customer table, but the values we want to analyse are in the sales table.
To analyse this in Excel with a standard pivot table we would need to add a new column to one of the tables.
We could in the sales table add a new column looking up the customer ID and returning the sex. If you are using Excel 365 you could use XLOOKUP. And in other versions you can use VLOOKUP.
Then when you insert a pivot table, the fields list will include the Customer Sex and you can use this in your pivot table to analyse the sales. Why? Because you have added a dimension to a facts table.
Although this works well in Excel this is not best practice in Power Pivot and Power BI. You should where possible, keep dimensions and facts in separate tables.
Why keep Dimensions & Facts tables separate?
The table below has been loaded to our model. It’s a detailed sales table containing not only sales information but also the full customer information and product information.
We create 2 simple measures to calculate the average sales and the average age of the customer
Average Sales =Average(detailedsales[Totalsales]
Average Age:=AVERAGE(detailedsales[Customer Age])
And then we create a pivot table and drop in our Region and our measures for Average Sales and for Average Age.
What we will find is the average sales works just fine, but the average age is not correct.
If we filter the table we can get a better look at what is going on here. Lets filter the table to only show the region Northern Ireland. The average age for a customer from Northern Ireland is showing as 34.33
However, Dylan Lynch made purchased 2 Items and so his age is weighted twice. Compared to Jack Kelly which is weighted only once.
The correct calculation for this would be to have each customers age to be only weighted once. If we were to compute the average age taking the customers age only once, we would get a value of (27+49)/2= 38.
How to overcome this with DAX
We can overcome this problem using DAX. However, this required more detailed DAX knowledge and is by far an optimal model.
The function we would use to do this is
Correct Average Age:=AVERAGEX(SUMMARIZE(detailedsales,detailedsales[Customer ID],detailedsales[Customer Age]),detailedsales[Customer Age])
Now if we drop in our new measure to our pivot table, we have the correct average age.
The Better way to overcome this problem.
Don’t have dimensions in facts tables. It is not optimal for DAX and the solution shown above should only be used in a last resort. Have a separate table for the customers were each customer is listed only once identified by a unique key, the Customer ID.
You should be able to obtain this table from your database or date source as a separate table. But if not and the only table you have is the detailed sales table including the dimensions, then Power Query is your friend. We can use the detailed table and with Power Query reduce it down to just Customer attributes and load this to our model.
To do this, in the Power Query editor, duplicate the query for the detailed sales table. Rename this query to Customers.
Hold Ctrl and select the Customer ID, Customer Name, Customer Age, Region and Customer Sex columns.
Right click and select Delete other columns.
Select the Customer ID column and from the Home ribbon select Remove Rows and then Remove Duplicates. This will leave us with a table containing only unique Customer IDs.
The we can return to our original query, the detailedsales table and remove some of these columns.
Hold Ctrl and select the Customer Name, Customer Age, Region and Customer Sex columns. We need to leave the Customer ID column in place as we want to use this to create a relationship with our newly created dimensions table.
Right click and select Delete columns. By removing these columns, we can make sure they are not loaded into the model, keeping the model nice and tidy.
Fixing Our Pivot table.
Once these tables have been loaded to our model, we can create a relationship between the tables.
We can also add a new measure to the customer dimensions table to calculate the average of the customer age.
Returning to our pivot table, we will note that the region has disappeared. This is because we have now removed this field from the detailed sales table. We can add the region to the pivot table now from the Customer table. And we can add our Average age measure.
As we can see, we now have the correct value for Average age.
Online Training Courses
Explore Power Pivot
Learn Power Query
Master Excel Formulas
Best value – Access All Areas. All our courses – 12 months subscription
Do you want to start collecting rewards quickly for learning Excel? Then you should try:
SIGN UP FOR OUR NEWSLETTER TODAY – GET EXCEL TIPS TRICKS & LEARN AND EARN ACTIVITIES TO YOUR INBOX
We think you would like:
If you would like to start collecting rewards quickly for learning Excel then you should try: