I’m starting to become a big fan of Power BI and the monthly releases are certainly helping the tool get closer to prime time (I think Power BI has come a long way, but still has some catching up to do). I love the Power BI-R integration which significantly enhances the charting capabilities. I’m yet to find out how to plot multiple series on the same axis without having to do all the reshaping of columns-to-rows. One would think these sort of charts should be supported out of the box – I’m not referring to plotting different values of a single column as series, rather, plotting the counts for two separate columns (e.g., 2 lines showing number of patients having hypertension as 1 line and #patients with diabetes as another over months – input data has columns: ID, hypertension yes/no, diabetes yes/no, date of visit).
I wanted to give Power BI-R integration a try and this is some sample code I was using in R which simply extracts month and year from a date field:
If we omit specifying the format of the raw data and try using date field directly, we may end up with the wrong month:
I pasted the code (with the format specified) into Power BI. In my dataset I have a date type column called CompletedDTTM and a column called Accession which is a unique identifier. CompletedDTTM was specified to be a ‘Date’ field and formatted to look like:
When I tried to plot the counts over months, I ended up with this:
Not quite the chart I had in mind (I have ignored grouping by year here). All the R code is trying to do is determine distinct accession counts by month.
It was a bit hard to debug since the code was running fine in RStudio but in Power BI I’m getting NAs. A few times I also ended up with the following while trying things out:
Above is not very helpful. You can click on ‘See details’ and try debugging, but it’s not the easiest. An easier approach is to print the dataframe values to a table to see what’s going on:
Now that we can see what’s going on, the next step is to fix the month extraction piece.
After playing around a bit, I managed to put the following (DAX) syntax together:
So I’m guessing that what’s getting passed to the underlying R script within the Power BI environment is not the same format the raw data file has, but rather, the format we have selected within Power BI after importing data.
This also makes the syntax a lot easier! We don’t have to worry about 2 and 4-digit years either since all of them are formatted the same way once we select the date display format 🙂
Finally, now that we have the right values loaded into R, all we have to do is plot using ggplot instead of showing the table.
So it seems like you can’t just copy-paste code from standard R environment to Power BI directly, at least when it comes to working with dates.
I haven’t shown how to plot the 2 series on the same x-axis, but this should be pretty straightforward with ggplot.