When you are analysing data, it is important to both understand the data that you have and make sure you prep your data before you load it into your model. Preparing your data involves making sure your data is in a suitable format for analysis and itis loaded to your model in the most optimal format. This can include cleansing, merging, and splitting. And there are many other transformation steps that one can carry out to create an optimal clean model. Many data modelers will spend longer preparing their data than analyzing it. The process of analysis is a lot quicker and easier if your data is set up in the best possible format first.
It is not optimal to start transforming data using DAX. DAX stands for Data analysis expressions. It’s not Data transformation expressions. Transforming data using DAX only takes up space in your model, slows it down, weakens the link between the data set and the data source. And at times can also be complex to understand.
Power Query is your friend. Use it to prep your data and don’t be afraid to invest the time doing so. The more you do before you import data to a Power Pivot or Power BI model, the less you must do after. You can also greatly reduce the size of your model by carrying out as many transformation steps prior to import than after. This will keep your model working fast and save on computer resources.
It would be common to see DAX users carry out TEXT transformation steps in DAX. Steps such as splitting or joining columns. DAX is empowered with the same TEXT functions as Excel. Functions like LEFT, RIGHT, MID, PROPER, TRIM. DAX also has concatenate functions. To clean text columns in DAX often requires a calculated column and we know we should avoid calculated columns when we can. All these types of transformations can be carried out in Power Query before you load the data to your model.
In Power Query we can find text transformations in two places. On the transform ribbon and on Add Column ribbon. Its important to note the distinction between the two. If you transform text from the transform ribbon, then you are changing the selected column. Whereas if you carry out text transformations from the Add column ribbon, the existing column will remain in place and the transformations will be shown in a new column.
From the transformation ribbon, (which will change the selected column) the Format options include
Lowercase – this will change all letters to lowercase
Uppercase – Will change all letters to uppercase
Capitalize Each Word – this will make the first letter of each work a capital letter
Trim – this is used to remove leading and trailing spaces
Clean – will remove non printable characters, often found on CSV downloads.
You can also use the format options to add either a Prefix or a suffix.
Looking at our example image, we would use both Trim and Capitalize Each Word to clean up this column of names.
The option for Extracting data from in Power query offers more options that either Excel or DAX
Length will count the number of characters in the string, just like the LEN function in Excel
First Characters will allow you extract characters starting from the left. It works just like the left function in Excel. You just need to select how many string characters you want to keep and anything after that will be removed.
Last Characters works like RIGHT in Excel or DAX and allows you extract data from the left of the string
Range is like MID where you specify a starting index and how many characters after that you want to keep.
Text before, after and between delimiters are also rather cool. In DAX you would have to use a combination of functions such as LEFT and FIND. For Example, suppose we wanted to extract the first names. Each name has a different number of characters so using left won’t do. Instead we can extract text before delimiter. Our delimiter is a space as this is what separates the first from the last name.
We can also use Power Query to Split columns. Splitting columns will create a new column and the new column will contain the same name with a prefix added as the original column. If you are using this option, don’t forget to rename your column.
Like extract, the options available when splitting a column are greater than the DAX functions. Splitting a column by Digits to non-digits and Non digits to digits are a treat. These can be used when there are leading, or trailing numbers combined with text.
Finally, if you are in doubt, you can always add a column by example. If you don’t know what option to choose or can’t find the option, adding column by example is a viable option for text transformations.
The examples I have shown were all from the Transform column ribbon in Power Query. Unless you are splitting a column, they do not create new columns. You need to consider the model and if you need to amend existing columns or add new ones. If you can just amend, I would suggest you go with that option rather than cluttering your model with data and columns you don’t need. So unless you are going to use the column as a filter or in a calculation, then you don’t need to add a new column but you can transform the existing column.
Become a Power Pivot Hero
GET and TRANSFORM DATA like a PRO
Power Query Excel 365