A primer in using Java from R - part 1

Introduction

This primer consists of two parts and aims to provide a walk-through of using resources developed in Java from R. It is structured more as a “note-to-future-self” than as a proper educational article, but I hope that some readers may still find it useful. It also lists a set of references that I found very helpful, for which I thank the respective authors.

The primer is split into 2 posts:

  1. In this first one we talk about calling Java code from R directly via execution of shell commands, and about using the rJava package to create objects, work with arrays and call Java methods in various ways.
  2. In the second one we discuss creating and using custom .jar archives within our R scripts and packages, handling Java exceptions from R, and take a quick look at a performance comparison between the low-level and high-level interfaces provided by rJava.

R <3 Java, or maybe not?

Calling Java from R directly

Calling Java resources from R directly can be achieved using R’s system() function, which invokes the specified OS command. We can either use an already compiled Java class, or invoke the compilation via a system() call from R as well. For real-world use, we will probably do the Java coding, compilation and packaging into .jar files in a Java IDE and provide R with just the final .jar file(s). I however find it helpful to have a small example of the simplest complete case, for which the following is sufficient. Integrating pre-prepared .jars into R packages will be covered in detail in the second part of this primer.

Let us show that by writing a very silly dummy class with just 2 methods:

  • main, which prints “Hello, World. ” followed by its first argument, or by a default suffix if no argument is provided
  • SayMyName, which returns a string constructed from “My name is ” and getClass().getName()

This HelloWorldDummy.java file can look as follows:

package DummyJavaClassJustForFun;

public class HelloWorldDummy {

  // Returns "My name is " followed by the fully qualified class name
  public String SayMyName() {
    return "My name is " + getClass().getName();
  }
  
  public static void main(String[] args) {
    String stringArg = "And that is it.";
    if (args.length > 0) {
      stringArg = args[0];
    }
    System.out.println("Hello, World. " + stringArg);
  }
}

Compilation and execution via bash commands

Now that we have our dummy class ready, we can put together the commands and test them by executing them in a shell or, for RStudio fans, in RStudio’s cool Terminal feature. First, the compilation command, which may look something like the following, assuming that we are in the correct working directory:

$ javac DummyJavaClassJustForFun/HelloWorldDummy.java

Now that we have the class compiled, we can execute the main method, with and without the argument provided:

$ java DummyJavaClassJustForFun/HelloWorldDummy
$ java DummyJavaClassJustForFun/HelloWorldDummy "I like winter"
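
As a quick sanity check of what the two commands above should print, here is a minimal sketch that copies the message logic of main into a helper. Note that GreetingSketch and greeting are hypothetical names I made up for illustration, they are not part of HelloWorldDummy:

```java
// Hypothetical helper class for illustration only, not part of the original example.
class GreetingSketch {

  // Mirrors the message built in HelloWorldDummy.main:
  // the first argument, if present, replaces the default suffix.
  static String greeting(String[] args) {
    String stringArg = "And that is it.";
    if (args.length > 0) {
      stringArg = args[0];
    }
    return "Hello, World. " + stringArg;
  }

  public static void main(String[] args) {
    System.out.println(greeting(new String[] {}));                 // prints "Hello, World. And that is it."
    System.out.println(greeting(new String[] {"I like winter"}));  // prints "Hello, World. I like winter"
  }
}
```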

In case we need to compile and run with additional .jars located in the folder jars/, we add that folder using -cp (class path):

$ javac -cp "jars/*" DummyJavaClassJustForFun/HelloWorldDummy.java
$ java -cp "jars/*:compile/src" DummyJavaClassJustForFun/HelloWorldDummy

Compilation and execution of Java code from R

Now that we have tested our commands, we can use R to do the compilation via the system function. Do not forget to cd into the correct directory within the same system call if needed, since each system() call runs its own shell and a cd does not carry over to the next call:

system('cd data/; javac DummyJavaClassJustForFun/HelloWorldDummy.java')

After that we can also execute the main method, and the main method with one argument specified, just like we did it outside of R, once again using cd to enter the proper working directory if needed:

system('cd data/; java DummyJavaClassJustForFun/HelloWorldDummy')
system('cd data/; java DummyJavaClassJustForFun/HelloWorldDummy "Also I like winter"')

The rJava package - an R to Java interface

The rJava package provides a low-level interface to the Java virtual machine. It allows creating objects, calling methods and accessing fields of the objects. It also provides functionality to easily include our Java resources in R packages.

We can install it with the classic:

install.packages("rJava")

Note the system requirements: Java JDK 1.2 or higher, and JDK 1.4 or higher for JRI/REngine. After attaching the package, we also need to initialize a Java Virtual Machine (JVM):

## Attach rJava and Init a JVM
library(rJava)
.jinit()

In case of issues with attaching the package using library, one can refer to this helpful StackOverflow thread.

Creating Java objects with rJava

We will now very quickly go through the basic uses of the package. The .jnew function is used to create a new Java object. Note that the class argument requires a fully qualified class name in Java Native Interface notation, that is with / instead of . as the separator, for example java/lang/String.

# Creating a new object of java.lang class String
sHello <- .jnew(class = "java/lang/String", "Hello World!")
# Creating a new object of java.lang class Integer
iOne <- .jnew(class = "java/lang/Integer", "1")
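
To see what these two calls correspond to on the Java side, here is a minimal sketch using plain Java reflection. This is only an analogy and not how rJava is implemented internally (rJava goes through JNI). JnewSketch and newFromString are names I made up, and the Integer(String) constructor it ends up calling is deprecated in recent JDKs:

```java
import java.lang.reflect.Constructor;

// Hypothetical illustration class, not part of rJava.
class JnewSketch {

  // Turns a JNI-style name such as "java/lang/String" into the dotted form
  // expected by Class.forName, then calls the constructor taking one String.
  static Object newFromString(String jniClassName, String arg)
      throws ReflectiveOperationException {
    Class<?> cls = Class.forName(jniClassName.replace('/', '.'));
    Constructor<?> ctor = cls.getConstructor(String.class);
    return ctor.newInstance(arg);
  }

  public static void main(String[] args) throws ReflectiveOperationException {
    Object sHello = newFromString("java/lang/String", "Hello World!");
    Object iOne = newFromString("java/lang/Integer", "1");
    System.out.println(sHello.getClass().getName() + ": " + sHello);  // prints "java.lang.String: Hello World!"
    System.out.println(iOne.getClass().getName() + ": " + iOne);      // prints "java.lang.Integer: 1"
  }
}
```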

Working with arrays via rJava

# Creating new arrays
iArray <- .jarray(1L:2L)
.jevalArray(iArray)
## [1] 1 2
# Using a list of 2 and lapply
# Integer Matrix int[2][2]
iMatrix <- .jarray(list(iArray, iArray), contents.class = "[I")
lapply(iMatrix, .jevalArray)
## [[1]]
## [1] 1 2
## 
## [[2]]
## [1] 1 2
# Integer Matrix int[2][2]
square <- array(1:4, dim = c(2, 2))
square
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
# Using dispatch = TRUE to create the array 
# Using simplify = TRUE to return a nice R array
dSquare <- .jarray(square, dispatch = TRUE)
.jevalArray(dSquare, simplify = TRUE)
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
# Integer Tesseract int[2][2][2][2]
tesseract <- array(1L:16L, dim = c(2, 2, 2, 2))
tesseract
## , , 1, 1
## 
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
## 
## , , 2, 1
## 
##      [,1] [,2]
## [1,]    5    7
## [2,]    6    8
## 
## , , 1, 2
## 
##      [,1] [,2]
## [1,]    9   11
## [2,]   10   12
## 
## , , 2, 2
## 
##      [,1] [,2]
## [1,]   13   15
## [2,]   14   16
# Use dispatch = TRUE to create the array 
# Use simplify = TRUE to return a nice R array
# Interestingly, the result does not match the input: most values come back as 0
dTesseract <- .jarray(tesseract, dispatch = TRUE)
.jevalArray(dTesseract, simplify = TRUE)
## , , 1, 1
## 
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    0
## 
## , , 2, 1
## 
##      [,1] [,2]
## [1,]    0    0
## [2,]    0    8
## 
## , , 1, 2
## 
##      [,1] [,2]
## [1,]    9    0
## [2,]    0    0
## 
## , , 2, 2
## 
##      [,1] [,2]
## [1,]    0    0
## [2,]    0   16
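
The contents.class = "[I" argument used for iMatrix above may look cryptic. A minimal Java-side sketch, assuming nothing beyond the standard library, shows where the name comes from: Java itself reports array classes in this notation. ArrayNamesSketch and arrayClassName are names I made up for illustration:

```java
import java.util.Arrays;

// Hypothetical illustration class.
class ArrayNamesSketch {

  // Returns the runtime class name of any Java array (or other object).
  static String arrayClassName(Object array) {
    return array.getClass().getName();
  }

  public static void main(String[] args) {
    int[] iArray = {1, 2};
    int[][] iMatrix = {iArray, iArray};

    // One "[" per array dimension, followed by the element type (I = int).
    System.out.println(arrayClassName(iArray));   // prints "[I"
    System.out.println(arrayClassName(iMatrix));  // prints "[[I"

    // Each element of the outer array is itself an int[].
    System.out.println(Arrays.deepToString(iMatrix));  // prints "[[1, 2], [1, 2]]"
  }
}
```

So an int[2][2] on the Java side is an array whose contents are of class "[I", which is what we told .jarray when passing the list of two int arrays.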

Calling Java methods using the rJava package

rJava provides two levels of API:

  • fast, but inflexible low-level JNI-API in the form of the .jcall function
  • convenient (at the cost of performance) high-level reflection API based on the $ operator.

In practice, the rJava package gives us three ways to call Java methods, each with its positives and negatives.

The low-level way - .jcall()

.jcall(obj, returnSig = "V", method, ...) calls a Java method with the supplied arguments the “low-level” way. A few important notes regarding its usage (for more, refer to the R help on .jcall):

  • requires exact match of argument and return types, doesn’t perform any lookup in the reflection tables
  • passing sub-classes of the classes present in the method definition requires explicit casting using .jcast
  • passing null arguments needs a proper class specification with .jnull
  • a vector of length 1 corresponding to a native Java type is considered a scalar, use .jarray to pass a vector as an array for safety

# Calling the Java method length on the object the low-level way
.jcall(sHello, returnSig = "I", "length")
## [1] 12
# Also we must be careful with the data types:

# This works
.jcall(sHello, returnSig = "C", "charAt", 5L)
## [1] 32
# This does not
.jcall(sHello, returnSig = "C", "charAt", 5)
## Error in .jcall(sHello, returnSig = "C", "charAt", 5): method charAt with signature (D)C not found
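
The error above says that no charAt with signature (D)C exists: the R numeric 5 is a double, and String only has charAt(int). A minimal sketch with Java reflection shows the same exact-match lookup from the Java side. This is an analogy under my own assumptions, not rJava's actual implementation, and ExactSignatureSketch and hasCharAt are made-up names:

```java
import java.lang.reflect.Method;

// Hypothetical illustration class.
class ExactSignatureSketch {

  // Returns true if String has a public method charAt that takes exactly
  // the given parameter type and returns a char.
  static boolean hasCharAt(Class<?> paramType) {
    try {
      Method m = String.class.getMethod("charAt", paramType);
      return m.getReturnType() == char.class;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(hasCharAt(int.class));     // prints "true"  -> signature (I)C
    System.out.println(hasCharAt(double.class));  // prints "false" -> signature (D)C

    // The character at index 5 is the space, whose numeric value is 32.
    System.out.println((int) "Hello World!".charAt(5));  // prints "32"
  }
}
```

The last line also explains the ## [1] 32 we got back in R: the char was returned as its numeric code.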

The high-level way - J()

J(class, method, ...) is the high-level API for accessing Java. It is slower than .jnew or .jcall, since it has to use reflection to find the most suitable method.

  • to call a method, the method argument must be present as a character vector of length 1
  • if method is missing, J creates a class name reference

# Calling the Java method length on the object the high-level way
J(sHello, "length")
## [1] 12
# However, the high-level API does not help us with charAt when used this way
J(sHello, "charAt", 5L)
## Error in .jcall(o, "I", "intValue"): method intValue with signature ()I not found
J(sHello, "charAt", 5)
## Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, : java.lang.NoSuchMethodException: No suitable method for the given parameters

The high-level way with convenience - $

Closely connected to the J function, the $ operator for jobjRef Java object references provides convenient access to object attributes and methods. rJava also implements the relevant methods for R’s completion generator.

  • $ returns either the value of the attribute or calls a method, depending on which name matches first
  • $<- assigns a value to the corresponding Java attribute

# And via the $ operator
sHello$length()
## [1] 12
# But these still do not work
sHello$charAt(5L)
## Error in .jcall(o, "I", "intValue"): method intValue with signature ()I not found
sHello$charAt(5)
## Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, : java.lang.NoSuchMethodException: No suitable method for the given parameters

Examining methods and fields

.DollarNames returns all fields and methods associated with the object. Method names are followed by () if the method takes no arguments and by ( if it takes arguments:

# vector of all fields and methods associated with sHello
.DollarNames(sHello)
##  [1] "CASE_INSENSITIVE_ORDER" "equals("               
##  [3] "toString()"             "hashCode()"            
##  [5] "compareTo("             "compareTo("            
##  [7] "indexOf("               "indexOf("              
##  [9] "indexOf("               "indexOf("              
## [11] "valueOf("               "valueOf("              
## [13] "valueOf("               "valueOf("              
## [15] "valueOf("               "valueOf("              
## [17] "valueOf("               "valueOf("              
## [19] "valueOf("               "length()"              
## [21] "isEmpty()"              "charAt("               
## [23] "codePointAt("           "codePointBefore("      
## [25] "codePointCount("        "offsetByCodePoints("   
## [27] "getChars("              "getBytes()"            
## [29] "getBytes("              "getBytes("             
## [31] "getBytes("              "contentEquals("        
## [33] "contentEquals("         "equalsIgnoreCase("     
## [35] "compareToIgnoreCase("   "regionMatches("        
## [37] "regionMatches("         "startsWith("           
## [39] "startsWith("            "endsWith("             
## [41] "lastIndexOf("           "lastIndexOf("          
## [43] "lastIndexOf("           "lastIndexOf("          
## [45] "substring("             "substring("            
## [47] "subSequence("           "concat("               
## [49] "replace("               "replace("              
## [51] "matches("               "contains("             
## [53] "replaceFirst("          "replaceAll("           
## [55] "split("                 "split("                
## [57] "join("                  "join("                 
## [59] "toLowerCase("           "toLowerCase()"         
## [61] "toUpperCase()"          "toUpperCase("          
## [63] "trim()"                 "toCharArray()"         
## [65] "format("                "format("               
## [67] "copyValueOf("           "copyValueOf("          
## [69] "intern()"               "wait("                 
## [71] "wait("                  "wait()"                
## [73] "getClass()"             "notify()"              
## [75] "notifyAll()"            "chars()"               
## [77] "codePoints()"
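
A similar listing can be produced on the Java side with reflection. The following is a minimal sketch of the idea, not the code rJava uses. DollarNamesSketch and names are made-up names, and the exact set and order of entries depend on the JDK version:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class.
class DollarNamesSketch {

  // Lists public fields by name and public methods by name,
  // followed by "()" for no parameters and "(" otherwise.
  static List<String> names(Class<?> cls) {
    List<String> out = new ArrayList<>();
    for (Field f : cls.getFields()) {
      out.add(f.getName());
    }
    for (Method m : cls.getMethods()) {
      out.add(m.getName() + (m.getParameterCount() == 0 ? "()" : "("));
    }
    return out;
  }

  public static void main(String[] args) {
    // Overloaded methods show up once per overload, as in the R output.
    names(String.class).forEach(System.out::println);
  }
}
```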

Signatures in JNI notation

Java Type               Signature
boolean                 Z
byte                    B
char                    C
short                   S
int                     I
long                    J
float                   F
double                  D
type[]                  [type
method type             (arg-types)ret-type
fully-qualified-class   Lfully-qualified-class;

In the fully-qualified-class row of the table above note the

  • L prefix
  • ; suffix

For example

  • the Java method: long f (int n, String s, int[] arr);
  • has type signature: (ILjava/lang/String;[I)J
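
If in doubt about a signature, we can let Java build it for us. A minimal sketch, assuming a JDK 7 or newer where java.lang.invoke.MethodType is available. SignatureSketch and descriptor are names I made up for illustration:

```java
import java.lang.invoke.MethodType;

// Hypothetical illustration class.
class SignatureSketch {

  // Builds the JNI-style method signature from the return type
  // and the argument types, in that order.
  static String descriptor(Class<?> returnType, Class<?>... argTypes) {
    return MethodType.methodType(returnType, argTypes).toMethodDescriptorString();
  }

  public static void main(String[] args) {
    // long f (int n, String s, int[] arr)
    System.out.println(descriptor(long.class, int.class, String.class, int[].class));  // prints "(ILjava/lang/String;[I)J"
    // char charAt(int index)
    System.out.println(descriptor(char.class, int.class));  // prints "(I)C"
    // int length()
    System.out.println(descriptor(int.class));  // prints "()I"
  }
}
```

The return type part of these strings, for example I for length and C for charAt, is what we passed as returnSig to .jcall earlier.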

References

  1. rJava basic crashcourse - at the rJava site on rforge, scroll down to the Documentation section
  2. The JNI Type Signatures - at Oracle JNI specs
  3. rJava documentation on CRAN
  4. Calling Java code from R by prof. Darren Wilkinson
  5. Mapping of types between Java (JNI) and native code
  6. Fixing issues with loading rJava