A common feature of many programming languages is User-Defined Functions (UDF). These are currently missing from Tableau, and I think adding such an option would be a real benefit to developers – what you might call a “Quality of Life” improvement.
What do I mean by UDF? I mean functions that can be created just like calculated fields, but with placeholders for parameters or fields so that they can be re-used several times, instead of being re-written with slight variations. We’ll start by defining an option for the syntax, and then look at a few examples.
What I can envisage as syntax for a UDF is similar to a normal calculated field, but with a placeholder for a referenced field or parameter. For example, let’s say I have a calculation for Sales in various periods:
Calculated fields:
Current Period Sales: SUM(IF [Current Period] THEN [Sales] END)
Previous Period Sales: SUM(IF [Previous Period] THEN [Sales] END)
Parallel Period Sales: SUM(IF [Parallel Period] THEN [Sales] END)
Implementation using a UDF:
Period Sales: SUM(IF <field1> THEN [Sales] END)
(This is a definition of a UDF accepting a parameter called <field1>)
Current Period Sales: [Period Sales]([Current Period])
Previous Period Sales: [Period Sales]([Previous Period])
Parallel Period Sales: [Period Sales]([Parallel Period])
This is just a simple example, of course. I’m sure that if Tableau decide to implement something like this they’ll find a good idea for the exact syntax.
UDFs have two great advantages. One is that you can re-use complicated expressions by simply changing the referenced field, as in my example, saving the effort of duplicating them and making small changes. The other is maintainability – if you need to modify the calculation, you only have to modify the UDF, and not multiple similar calculated fields.
I’ve encountered countless situations where this could be useful, but let’s just mention two of them. One is an expression commonly used (with lots of variations) to format numeric fields as text (originally from The Information Lab):
IF ABS(AVG([Population Total])) >= 1000000000 THEN
//round for billions
STR(ROUND(AVG([Population Total] / 1000000000), 1)) + 'b'
ELSEIF ABS(AVG([Population Total])) >= 1000000 THEN
//round for millions
STR(ROUND(AVG([Population Total] / 1000000), 1)) + 'm'
ELSEIF ABS(AVG([Population Total])) >= 1000 THEN
//round for thousands
STR(ROUND(AVG([Population Total] / 1000), 1)) + 'k'
ELSE
STR(ROUND(AVG([Population Total]),0))
END
Now assume that you need to format several numeric fields, not just [Population Total]. Today, you have to duplicate the calculated field for each one, and replace the [Population Total] with the numeric field name. Then, if you decide to modify the formatting logic, you have to go and edit each one, wasting valuable time and increasing the chance of errors. A UDF would make life much easier!
Another example is dimension switching, where we use a parameter to enable the user to switch between dimensions in a worksheet:
CASE [Select Dimension]
WHEN “State” THEN [State]
WHEN “Category” THEN [Category]
WHEN “Sub-Category” THEN [SubCategory]
WHEN “Customer” THEN [Customer]
END
I some scenarios I have four such fields, nested one after the other (or two on rows and two on columns), all with the same dimensions but based on different parameters (which have the same list of values, but that’s another issue). So I have four similar copies of the calculation above, each using a different parameter name, and if I decide to add a new dimension to the user’s options I have to modify all of them. A simple UDF would help.
There’s already an idea about this on the Salesforce Ideas list, but I know it’s not on the roadmap. Nowadays everyone is working on AI and more advanced ideas, but this should have been implemented years ago.
Tableau has many other pain points, of course, and maybe one day I’ll post a list of the ones that hurt the most – and I mean stuff that seems to be easy to fix, not anything requiring major modifications. This is just one example, and maybe we’ll see it one day…
Leave a comment