In a previous post we looked at DAX SUM and SUMX expressions and you were introduced to a lot. We mentioned table functions, iterators, row and filter context, measure, columns, and we also quickly introduced you to the RELATE function.
In this article we are going to slow things down a little and look at DAX COUNT and COUNTX – Impact of Using Measures and Columns. Measures and columns work very different and this is a perfect example of how using measures and columns can impact the results in a table or visualization.
COUNT comes directly from Excel and will count the number of cells in a column that contain a number. If a field contains a 0 it will be included in the count but if a field contains blank it will not be counted. Text & true/false are not counted. So if you want to count text or logical functions you need to use COUNTA.
COUNTX on the other hand cannot be found in Excel. It works very like SUMX. COUNTX takes two arguments, the first must return a table and the second must contain the column that is searched by COUNTX. Like COUNT, COUNTX counts numbers so if you want to count text or logical functions you need to use COUNTAX.
Let’s hop into an example and have further look at COUNT and COUNTX and the impact of creating a calculated column v’s a measure.
COUNT by Calculated Column
First we will add a calculated column to our products table, and we will enter the expression
=COUNT (products[Cost Price])
Now before we hit enter, think for a moment about what this formula will return? Will it return 1 in all of the rows because it’s looking at the cost price for that row and that row only; so can only count 1? Or will it return 12 because there are 12 cost prices in the table? Maybe is it going to return something else?
The function returns 12. Although this is a calculated column (which defines the row), by using the count function, we are telling the DAX engine to count all of the fields in this column that have a number. The rows are assumed, or implied to be all the rows in the cost price column by virtue of the fact we used an Aggregation function
COUNT by Measure
Let’s now create a measure using the same function
=COUNT (products[Cost Price])
Again we get 12 because the row context was defined by the fact COUNT is an aggregation function.
COUNT Measures and Columns in Pivot Tables
To look at this further let’s put this into a pivot table.
On the row labels we will drop in the products name. Then into the values field we will drop in our calculated column ‘count of cost price’.
The first thing you will note is that the counts are not correct. This is because in a calculated column the values are being aggregated by sum. If we change this from sum to count then we get the correct value.
But who want to go through this extra step? There is no real need for the calculated column. You could just drop the sales price into the value and change the aggregation from SUM to COUNT!
Next we will drop in our measure for the count of cost price. Immediately, without the need to change the aggregation, we have been given the correct value.
What We Learned So Far
What seems to be very evident from this is that if you use a calculated column the values are calculated within the column, row by row. Then these values are aggregated based on the filter placed on the pivot table which is ‘products name’.
Whereas when you create a measure the values are not created in the table. The rows of the calculation are defined by the measure and by the filters placed on the pivot table. It is also much easier to slice and dice a value created by a measure than a values created in a calculated column.
If you are coming from an Excel background, the logical thing to do would be produce an extra column in your data. But when you are using DAX, things are a little different.
Mastering the Fundamentals of DAX for Power BI, Power Pivot and Analysis Services
This online course will be coming soon on The Excel Club...
Sign up to our newsletter to get an exclusive launch discount
COUNTX By Calculated Column
Now let’s look at COUNTX. Remember we said COUNTX is an iterator. That means it will work its way through the table and evaluate an expression for each row and at the end aggregate the result.
What we will try to do next is count the number of a set of values in the products table filtered for just the product category Shoes.
First we will create a calculated column and we will enter the DAX function
=COUNTX (FILTER(products,products[Product Name]="Shoes"),products[Cost Price]) and we will name this column CCcountshoes.
What we can see is that for each row, the value of two is returned. You see by using a calculated column, the row context is defined. So in the first row DAX is saying, okay let’s filter the product name just to give me shoes. Then count the amount of product cost prices. The answer is 2. In the next row and the subsequent rows DAX does exactly the same thing. It’s says to its self, lets filter the products table to only give us shoes and then count the cost price. So for each row you will get the same.
COUNTX By Measure
So let’s now create a measure and we will use the same syntax that we use for the calculated column
=COUNTX (FILTER(products,products[Product Name]="Shoes"),products[Cost Price]) and we will name this measure Count Shoes.
The result we get is two and the calculation was carried out only once on the table. You see COUNTX is an iterator. So what happens is DAX say to its self, lets filter the product name to just shoes. Then using the tables that filter returns, lets then count the fields in the products cost price column that contains a value. So once the table is returned, COUNTX irritates over each row of the table and counts the values.
COUNTX Columns and Measures in Pivot Tables
Now let’s have a look at the difference in a pivot table of the use of both a calculated column and a calculated measure using an iterator.
We will use our products name in the rows and drop in the calculated column we just created to the value. What we would expect to see as our expression filters the table to only shoe is a count for the shoes cost price. But instead first we get a sum aggregation, like we did with the count calculated column.
Let’s try changing the aggregation to COUNT. We see we get 1 in Boots, but we don’t have any Boots that are Shoes. So these values don’t really mean too much.
Let’s drop in our measure. Now we can see that we have a value for Shoes and no value under any of the other product names.
In this pivot table, with the measure, DAX says to its self, let’s go to the products table and the first item is Boots. So our table is first filtered by Boots. Then when it has this table it says to its self, okay in this table of Boot, find all the Shoes and count the cost price. But there are no Shoes that are Boots and so there is no value to return. Do you get it!?
With the calculated column in the pivot table, the expression is first calculated in the table, row by row, where a count of 2 is returned. In the pivot table DAX is then told, to aggregate the column by the products name, well there is 1 product name Boots, and that does contain a value within the selected calculated column so that value is included.
DAX COUNT and COUNTX – Impact of Using Measures and Columns
Don’t worry if you are finding this a little complicated as we will cover it many times thought out the course.
What you need to understand now is that calculated columns and measure work different. This must be taken this into consideration when preparing your DAX calculations.
It is also important to understand the difference between aggregation functions and iterating functions. Aggregation functions such as SUM, COUNT and AVERAGE will take all the rows in the selection as implicated rows. Iteration functions will explicitly state the rows to be used.
We mentioned earlier that if you need to count text or logical functions you need to use COUNTA and COUNTAX and these count function are part of the DAX statistical functions. There are also a number of other count functions that can be used in the mix and we will see more of them as we progress but for the moment just so you know.
DISTINCTCOUNT will count the distinct values in the selected data.
COUNTROWS will count the rows of data in a table.
COUNTBLANKS will count all the blank rows in a table.