Linear regression line in Tableau calculated by external R script

Run code from an external .R file in Tableau

Tableau introduced the R integration in version 8.1 back in 2013. That’s awesome because it opens up to Tableau the whole range of analytical functionality R offers. Most of the time the R code being triggered from within Tableau is rather short, such as a regression, a call to a clustering algorithm or correlation measures. But what happens when the code you want to run out of Tableau is getting longer and more complicated? Are you still bound to the “Calculated Field” dialog window in Tableau? It’s nice but it’s tiny and has no syntax coloring or code completion for our precious R code.

Run R code inline in a Calculated Field in Tableau

Run R code inline in a Calculated Field in Tableau

But things will get arduous when trying to write longer code. So why not write the code externally in your favorite R IDE (something I recommend doing anyways – even with shorter scripts!), save it to a .R file and reference that code from Tableau’s calculated field? Sounds like a great idea, and it’s totally possible. Let’s try it out, shall we?

The simple and straightforward attempt

First attempt: Let’s just outsource the exact code we used in the calculated field to the .R file and run source() within the Tableau calculated field. Here’s the code which I then saved into a file called Rcode.R in my Windows machine’s Downloads folder:

model <- lm(.arg1 ~ .arg2)
print(summary(model))
predict(model, interval = 'prediction')[, 1]

And here’s the accompanying formula for Tableau – please note that the path within the setwd() function needs to be modified according to where you put your R code file:

SCRIPT_REAL("
    setwd('C:/Users/kgreger/Downloads')
    source('Rcode.R')
", 
SUM([Profit]), 
SUM([Sales]))

Did it work?

Tableau error message: The result returned by the SCRIPT function is of an unexpected type.

Tableau error message: The result returned by the SCRIPT function is of an unexpected type.

Nope, that’s not it.

The slightly more involved attempt

Next let’s try to call the external R code file and pass the input data from Tableau as command line parameters. This is the slighly modified R code – note that the first command line parameter commandArgs[1] is the name of the script itself, hence we use arguments 2 and 3 here:

model <- lm(commandArgs[2] ~ commandArgs[3])
print(summary(model))
predict(model, interval = 'prediction')[, 1]

And the accompanying formula for Tableau:

SCRIPT_REAL("
    setwd('C:/Users/kgreger/Downloads')
    system(paste('Rcode.R', .arg1, .arg2))
", 
SUM([Profit]), 
SUM([Sales]))

Did it work?

Tableau error message: R's system() expects character string as first argument.

Tableau error message: R’s system() expects character string as first argument.

Still not quite what we want. The problem here is that paste()ing together an atomic string ("Rcode.R") and two vectors with length greater than 1 results in a non-atomic vector – which system() can’t handle. You can check that yourself by running this R code snippet:

str(paste("text", c("a", "b")))

A possible solution would be to translate the two input vectors into strings such as "c(10, 20, 30)", pass these to the R code file and re-translate them into proper vectors. Sounds like a lot of work, both in terms of implementing it (which is why I didn’t even try that) and also in terms of what Tableau, RServe and R would have to deal with. Just imagine how long these two vectors would be with the 5,009 elements as shown in the workbook that accompanies this blog post… (Right click > Save target as...)

The not so straight-forward but pretty slick (and working!) attempt

OK, so let’s think about what’s actually happening here to see if we can find a better solution for this. What we want to do is run code that’s written in an external file (our Rcode.R), but use data from Tableau, which only lives in the Tableau fields or the meta-fields .argn. If you think about it, that sounds like function call. And that’s actually the solution to our problem!

So within the R file we wrap our code in a user defined function, like this – I decided to give it a prefix with my initials and the name of the main function it’s using, here lm, hence kg.lm:

kg.lm <- function(y, x) {
  model <- lm(y ~ x)
  print(summary(model))
  predict(model, interval = 'prediction')[, 1]
}

This is really neat, now we just have to modify the call from the calculated field in Tableau to use our newly created function:

SCRIPT_REAL("
    setwd('C:/Users/kgreger/Downloads')
    source('Rcode.R')
    kg.lm(y = .arg1, x = .arg2)
", 
SUM([Profit]), 
SUM([Sales]))

Did it work this time?

Linear regression line in Tableau calculated by external R script

Linear regression line in Tableau calculated by external R script

Now ain’t that pretty? It totally worked! And neither the R code nor the syntax in Tableau’s calculated field became significantly longer or more complex. There’s a few other nice aspects about externalizing your R code this way (this starts to sound like one of those home shopping shows on TV…):

  • You can re-use your R code in multiple calculated fields within one workbook or even across multiple workbooks and still have only one place to manage/edit/update the code. Imagine you developed a nice machine learning model with a few hundred lines of code and suddenly discover a bug – you wouldn’t want to have to browse through all your workbooks using this model to correct the bug there, would you?
  • You can put many of these user defined functions in one single R file, source() this in your calculated fields and use just the functions you need.
  • The previous fact is especially interesting regarding the fact that you can preload external R resources when firing up your RServe session. This needs to be done in a file called Rserv.cfg, which needs to be placed in the RServe folder: DRIVE:\INSTALLATION_PATH\R\R-VERSION_NUMBER\library\Rserve\libs\x64 (or \i386 if you’re on a 32-bit machine) – the parts in italics need to be set accordingly. There’s a lot of documentation about what can go in the Rserv.cfg file, but for our purposes here it is sufficient to put the following line for each external R code file you’d like to preload:
    source C:/MyFolder/sample.R
    

And there you go: Running code from an external .R file in Tableau’s calculated fields. Enjoy and let me know if you run into problems or know an even easier way of doing this. If you want to skip all the hard work and just look at the ready-made thing, feel free to download the workbook that accompanies this blog post… (Right click > Save target as...)

11 Comments

  1. Hi,

    Thanks for the article. I am facing an error even with the final version of the code.
    The error is as follows :-
    An error occured while communicating with the Rserve service.
    Error : ‘\U’ used without hex digits in character string starting “”C:\U”

    P.S. : I am able to run R codes internally from Tableau. the error is specific to calling external functions from .R files.

    Reply

  2. Hi,

    Great article.
    I would like to ask, is there any way to catch errors in SCRIPT_REAL function in Tableau?
    My script is below. It works in most of the cases but sometimes I get the following error:
    Error in x[1L:after] : only 0’s may be mixed with negative subscripts

    Script:

    SCRIPT_REAL(“library(forecast);
    time <- ts(.arg1,start=c(2013,1), frequency=52);
    fcast <- forecast(time, h=.arg2[1]);
    n<-length(.arg1);
    append(.arg1[(.arg2[1]+1):n],fcast$mean, after = n-.arg2[1]);"
    ,[RUNNING_RESOLVED],[Days Ahead])

    Regards,
    Mircea

    Reply

    1. Hi Mircea!

      Debugging in R is always challenging – especially with the R code embedded in a Tableau SCRIPT_ function. My only advice would be to check your code in standalone R (preferably a nice IDE like R Studio) for bugs and to make heavy use of outputting messages and temporary data dumps to the console while testing your code. Hopefully this will help you find the bug in your code.

      // Konstantin

      Reply

  3. Hi,
    Thanks a lot for the article.I am struggling to run my R code in Tableau.
    I have some R code and the output is data frame.I just want to display the output of r code as a data frame but in tableau. Please let me know how can i run my r code in Tableau as we do in R studio.

    Reply

    1. The biggest problem in your case is probably that your’re trying to return more than one column from your R script back into Tableau. That’s currently not supported, I’m afraid.

      There are a few (hacky, admittedly) ways to work around this limitation, though. I presented one of them in my Hands-On Training Session on R at Tableau Conference 2017 in Las Vegas, which involves pasting the contents of, say, column 1 and column 2 by a character you’re absolutely sure will not occur in either of those columns (I often use the tilde ~ for these), which will result in only one return vector. This can then be split up in Tableau into two columns and evaluated independently.

      Seeing it written here in just two sentences probably doesn’t make too much sense, so I’ll try to put together a full blog post on how to do that, soon. In the meantime feel free to google it – it’s been suggested and used by many people before me.

      Reply

  4. Hi Greger,

    Thanks for the blog.
    Is there a way we can publish R-Tableau integrated dashboard in Tableau Server?

    Reply

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.