FastRCluster

running FastR from GNU-R

Stepan Sindelar
Oracle Labs

Safe Harbor Statement

The following is intended to provide some insight into a line of research in Oracle Labs. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. Oracle reserves the right to alter its development plans and practices at any time, and the development, release, and timing of any features or functionality described in connection with any Oracle product or service remains at the sole discretion of Oracle. Any views expressed in this presentation are my own and do not necessarily reflect the views of Oracle.

Motivation

FastR

An alternative implementation of the R programming language built on top of the GraalVM platform.

Open source and licensed under GPLv3.

Terminology: "GNU-R" == the reference implementation of R

FastR aims to be fully GNU-R compatible including the native interface (R extensions). It is currently based on R 3.5.1 and reuses the base packages of GNU-R.


    $ ./graalvm/bin/R
    R version 3.5.1 (FastR)
    Copyright (c) 2013-19, Oracle and/or its affiliates
    Copyright (c) 1995-2018, The R Core Team
    Copyright (c) 2018 The R Foundation for Statistical Computing
    Copyright (c) 2012-4 Purdue University
    Copyright (c) 1997-2002, Makoto Matsumoto and Takuji Nishimura
    All rights reserved.

    FastR is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information.

    Type 'q()' to quit R.
    > c(1,10) + 1:2
    [1]  2 12

Why FastR

  • Significantly faster R code execution

Raytracing the Washington Monument in R
http://www.tylermw.com/throwing-shade/

Workhorse Fortran routine rewritten in R

Peak Performance after warm-up

i7-8750H CPU, 6x2.20GHz, 32GB RAM
GraalVM, FastR & GraalPython version 19.1.0

Why FastR

  • Significantly faster R code execution
  • High performance interop with Python, Scala, …

Raytracing the Washington Monument in R
http://www.tylermw.com/throwing-shade/

Fortran routine rewritten in Python

Peak Performance after warm-up

i7-8750H CPU, 6x2.20GHz, 32GB RAM
GraalVM, FastR & GraalPython version 19.1.0

Plotted with FastR and

Why FastR

  • Significantly faster R code execution
  • High performance interop with Python, Scala, …
  • Integration with advanced GraalVM dev tools
  • Simple embedding into Java applications

What's the catch?

“It’s a hard task to make an alternative implementation run all R code in the same way as GNU-R. Can you imagine having to reimplement every function in base R to be not only faster, but also to have exactly the same documented bugs?”

Advanced R by Hadley Wickham

http://adv-r.had.co.nz/Performance.html#faster-r

FastRCluster: parallel package

  • Well-known existing interface for communication between R processes
  • From the outside, a FastR process looks like just another R/Rscript instance
  • FastR supports XDR serialization
  • Let's use the PSOCK cluster, but with FastR
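
The two building blocks named above can be tried in plain GNU-R. The following is a minimal sketch using only base R and the parallel package: it round-trips an object through XDR serialization and then runs a small job on an ordinary PSOCK cluster. The object names (payload, results) are illustrative. How fastRCluster actually launches FastR workers is not shown here; the sketch only illustrates the interface that such workers have to speak.

```r
library(parallel)

# 1) XDR serialization: the binary format PSOCK workers exchange.
payload  <- list(id = 1L, values = c(1.5, 2.5), label = "demo")
bytes    <- serialize(payload, connection = NULL, xdr = TRUE)
restored <- unserialize(bytes)

# 2) A plain PSOCK cluster: each worker is a separate Rscript process
#    that talks to the master over a socket.
cl <- makePSOCKcluster(2)
results <- parLapply(cl, 1:4, function(i) i^2)
stopCluster(cl)
```

Any process that can read and write this serialization format over the socket can, in principle, take the place of a GNU-R worker.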

The fastRCluster package

installation


    devtools::install_github(
      'oracle/fastr/com.oracle.truffle.r.pkgs/fastRCluster')

    fastRCluster::installFastR()

    # Alternatively:
    options(graalvm.home = '/path/to/existing/installation')
					

The fastRCluster package: demo
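
A minimal sketch of what driving such a cluster from GNU-R might look like, assuming that the object returned by makeFastRCluster() can be passed to the usual parallel functions like any other PSOCK cluster. The helper make_workers() is hypothetical and not part of fastRCluster: it falls back to ordinary GNU-R workers when fastRCluster or a GraalVM installation is unavailable, so that the snippet stays runnable either way.

```r
library(parallel)

# Hypothetical helper (not part of fastRCluster): try to start FastR
# workers, otherwise fall back to regular GNU-R PSOCK workers.
make_workers <- function(n) {
  if (requireNamespace("fastRCluster", quietly = TRUE)) {
    cl <- tryCatch(fastRCluster::makeFastRCluster(n),
                   error = function(e) NULL)
    if (!is.null(cl)) return(cl)
  }
  makePSOCKcluster(n)
}

cl <- make_workers(2)

# Ask every worker which R it is running.
engines <- clusterEvalQ(cl, R.version.string)

# Run an ordinary parallel job on the workers.
squares <- parSapply(cl, 1:5, function(i) i * i)

stopCluster(cl)
```

Printing engines shows whether the work ran on FastR or on the GNU-R fallback.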

Shiny apps & future & promises

  • future: asynchronous computations
  • future: configurable backend including PSOCK
  • promises: make your Shiny apps more responsive: you should use them!

    library(future)

    f <- future({ 3+4 })
    print(value(f))

    # Braces are needed: %<-% binds tighter than +
    val %<-% { 3+4 }

    # Shiny:
    output$plot <- renderPlot({
      future({
        ...
      })
    })
						

Demo: Shiny & future & promises

  • Mandelbrot set computation and visualization
  • Original workhorse: C routine, rewritten in R
    • Takes some time to compute...

    # ...
    for (i in seq_along(xcoo)) {
      for (j in seq_along(ycoo)) {
        c[[1L]] = xcoo[i]; c[[2L]] = ycoo[[j]];
        z[[1L]] = 0;       z[[2L]] = 0;
        for (k in 1:iter) {
          oldz[[1L]] = z[[1L]]; oldz[[2L]] = z[[2L]];

          # the mandelbrot mapping z -> z^2 + c
          z[[1L]] = oldz[[1L]]*oldz[[1L]] - oldz[[2L]]*oldz[[2L]] + c[[1L]];
          z[[2L]] = 2 * oldz[[1L]]*oldz[[2L]] + c[[2L]];

          if((z[[1L]]*z[[1L]] + z[[2L]]*z[[2L]]) > 4) break;
        }
        set[[i, j]] <- k;
      }
    }
    # ...
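
The excerpt elides its setup. A self-contained sketch of the same loop, wrapped in a function so it can be run and timed, could look as follows. The function name mandelbrot_set, the variable c0 (called c in the excerpt) and the grid bounds are assumptions made for illustration; the demo's actual surrounding code is not shown in the slides.

```r
# Escape-time computation for the Mandelbrot set on a grid of points.
# Returns a matrix holding, for each (x, y), the iteration at which the
# orbit left the radius-2 disc, or `iter` if it never did.
mandelbrot_set <- function(xcoo, ycoo, iter) {
  set  <- matrix(0L, nrow = length(xcoo), ncol = length(ycoo))
  c0   <- numeric(2)   # the point c = (re, im)
  z    <- numeric(2)
  oldz <- numeric(2)
  for (i in seq_along(xcoo)) {
    for (j in seq_along(ycoo)) {
      c0[[1L]] <- xcoo[[i]]; c0[[2L]] <- ycoo[[j]]
      z[[1L]] <- 0; z[[2L]] <- 0
      for (k in seq_len(iter)) {
        oldz[[1L]] <- z[[1L]]; oldz[[2L]] <- z[[2L]]
        # the mandelbrot mapping z -> z^2 + c
        z[[1L]] <- oldz[[1L]] * oldz[[1L]] - oldz[[2L]] * oldz[[2L]] + c0[[1L]]
        z[[2L]] <- 2 * oldz[[1L]] * oldz[[2L]] + c0[[2L]]
        if ((z[[1L]] * z[[1L]] + z[[2L]] * z[[2L]]) > 4) break
      }
      set[[i, j]] <- k
    }
  }
  set
}

# Small illustrative grid; larger grids take noticeably longer.
xcoo   <- seq(-2, 1, length.out = 30)
ycoo   <- seq(-1.5, 1.5, length.out = 30)
counts <- mandelbrot_set(xcoo, ycoo, iter = 50L)
```

Scalar-heavy loops like this one are where interpreter overhead dominates, which makes them natural candidates for offloading to a faster R engine.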
						

     library(future)
     library(promises)
    -plan(multisession)
    +library(fastRCluster)
    +cl <- makeFastRCluster(4)
    +plan(cluster, workers = cl)
						

Summary

fastRCluster is a package that lets you try FastR from GNU-R/RStudio


    devtools::install_github(
      'oracle/fastr/com.oracle.truffle.r.pkgs/fastRCluster')
    ?fastRCluster
				

FastR is an alternative R implementation that can run your R code faster and offers many additional features.

http://graalvm.org

http://github.com/oracle/fastr