Correlation is a statistical measure that shows how two variable might move together. For Example as the price goes down, the sales volume might go up, or as the weather gets warmer, sales volume might decrease. These types of correlation might be obvious in some organisations, but looking at the correlation between different data sets can revile interesting and actionable insights for a business.
Mapping these sort of trends has become common practice for businesses looking for key insights from their data. But for many of us, working with data in this way is new. For many of us, including me, analysing data in Excel has meant producing some sort of historical report, maybe a chart or two and a little guestimate forecasting.
But as the data trend explodes and terms like correlation coefficient get thrown about more often in business I thought it was time to explore this area in Excel. After all, Excel users work with data every day and if this is where things are moving then we can’t get left behind.
So in this week’s video we will look at how we can use Excel to calculate the correlation between sets of variables. This is not a lesson on correlation but how to carry out correlation using Excel.
If you want to learn more about correlation then I would suggest you read a little more on Wikipedia
Sign up for my newsletter. Don’t worry, I wont spam. Just useful Excel and Power BI tips and tricks to your inbox
Excels Data Analysis Toolpak add in
Correlation in Excels is calculated using the Data Analysis toolpak add in. To turn on the add in, got to file, then options then go to add in’s. In Excel Add in’s, select Analysis tool pack.
Once you have turned on the Analysis Toolpak in Excel, you will find a new option in your data ribbon.
How to Calculate Correlation Coefficient using Excel
On selecting Data Analysis from the data ribbon, a dialogue box will open with a suite of Data analysis statistical calculations that you can carry out on your data. As we are looking at the correlation between variables we will select Correlation.
Selecting Correlation will open a new dialogue box.
First you need to select your Input Range. This is the range that contains all the values for the set of variables in which you are testing for correlation. If your data is going down the rows, so you have columns of data, the select Grouped by Columns. If your data is going across the rows then select Grouped by Rows.
If your data has headers, then also select Labels in First Row. You then need to select where you want your output to be placed.
In the background Excel works its magic and returns a table with a correlation coefficient value between variables. The values returned are between -1 and 1. A strong positive correlation will show a value close to 1 whereas a strong negative correlation will show a value close to -1.
If you are working with multiple variables, the table results might not be so easy to spot. We can now add some conditional formatting to our correlation table so strong positive and negative values can be easily seen.