The Data Science Lab

This R/S4 Demo Might Take You Out of Your Comfort Zone

p2 <- Person()  # alternate S4 instantiation
print(p2)

The official R documentation isn't clear about why S4 has two instantiation mechanisms, and it offers little advice on choosing between them. I prefer to use the new() function for S4 instantiation.
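To make the two mechanisms concrete, here is a minimal sketch that builds the same object both ways. The class definition is an assumption made for illustration: the slot names follow the demo, but the slot types and the sample values are my own guesses, not the demo's actual definition.

```r
library(methods)  # already attached in most sessions; harmless to repeat

# Hypothetical Person class; slot names follow the demo, types are assumed
Person <- setClass(
  "Person",
  slots = c(
    empID = "integer",
    lastName = "character",
    hireDate = "character",
    payRate = "numeric"
  )
)

# Mechanism 1: the new() function, with the class named as a string
p1 <- new("Person", empID = 12345L, lastName = "Smith")

# Mechanism 2: the generator function that setClass() returns
p2 <- Person(empID = 12345L, lastName = "Smith")

# Both objects belong to the same class and carry the same slot values
cat("p1 is a Person:", is(p1, "Person"), "\n")
cat("p2 is a Person:", is(p2, "Person"), "\n")
cat("Same lastName:", identical(p1@lastName, p2@lastName), "\n")
```

The generator form only works if the return value of setClass() was saved under a name, which is one practical reason to favor new(): it needs nothing but the class name.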

The demo illustrates field access with this code:

cat("Setting p1 fields directly \n\n")
p1@empID <- as.integer(65565)
p1@lastName <- "Adams"
p1@hireDate <- "2010-09-15"
p1@payRate <- 43.21
display(p1)

It’s possible to define get and set methods for an S4 class, but because all fields have public scope, there's no advantage in doing so.
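For readers who want to see what such accessors would look like, the following is a rough sketch rather than part of the demo. The generic names payRate and payRate<- are hypothetical, and the class is cut down to two slots to keep the example short.

```r
library(methods)

# Reduced, hypothetical version of the Person class
setClass("Person", slots = c(lastName = "character", payRate = "numeric"))

# Hypothetical getter: a generic plus a method for Person
setGeneric("payRate", function(obj) standardGeneric("payRate"))
setMethod("payRate", "Person", function(obj) obj@payRate)

# Hypothetical setter: a replacement generic, called as payRate(p) <- value
setGeneric("payRate<-", function(obj, value) standardGeneric("payRate<-"))
setMethod("payRate<-", "Person", function(obj, value) {
  obj@payRate <- value
  validObject(obj)  # re-check the object after the change
  obj               # replacement functions must return the modified object
})

p <- new("Person", lastName = "Adams", payRate = 43.21)
payRate(p) <- 45.00
cat("Pay rate via getter =", payRate(p), "\n")

# The slot is still reachable directly, so the accessors add no protection
p@payRate <- 50.00
cat("Pay rate via slot   =", p@payRate, "\n")
```

The last two statements show the point made above: nothing stops calling code from bypassing the accessors with the @ operator.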

Next, the demo shows how to call a class method:

cat("Calling yearsService \n\n")
tenure <- yearsService(p1)
cat("Person p1 tenure = ", tenure, " years \n")

Notice that because the yearsService method was registered as a function that operates on a Person object, the method is called just like an ordinary built-in R function.
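The registration step can be sketched as follows. This is an illustration under assumptions, not the demo's actual definition: the reduced class, the body of the method, and the 365.25-day approximation are all my own choices.

```r
library(methods)

# Reduced, hypothetical version of the Person class
setClass("Person", slots = c(lastName = "character", hireDate = "character"))

# Declare the generic, then attach an implementation for Person objects
setGeneric("yearsService", function(obj) standardGeneric("yearsService"))
setMethod("yearsService", "Person", function(obj) {
  hired <- as.Date(obj@hireDate)          # assumes a "YYYY-MM-DD" string
  days <- as.numeric(Sys.Date() - hired)  # difference in days as a number
  floor(days / 365.25)                    # approximate whole years
})

p1 <- new("Person", lastName = "Adams", hireDate = "2010-09-15")
tenure <- yearsService(p1)  # dispatches on the class of p1
cat("Person p1 tenure =", tenure, "years\n")

# A method exists for Person but not for unrelated classes
print(existsMethod("yearsService", "Person"))
print(existsMethod("yearsService", "numeric"))
```

Because dispatch happens on the argument's class, the call site has no object-dot-method syntax; the object is simply passed as the first argument.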

The demo concludes by showing S4 object assignment:

cat("Making a value-copy of p1 using '<-' \n\n")
p2 <- p1
cat("\nEnd OOP with S4 demo \n\n")

Here, object p2 is a value copy of p1 because S4 copies by value rather than reference. In other words, p2 is an independent duplicate of p1 and any changes made to p2 will have no effect on p1.
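A short sketch makes the copy semantics easy to verify. The reduced class and the replacement values ("Baker", 50.00) are made up for illustration.

```r
library(methods)

# Reduced, hypothetical version of the Person class
setClass("Person", slots = c(lastName = "character", payRate = "numeric"))

p1 <- new("Person", lastName = "Adams", payRate = 43.21)
p2 <- p1                 # value copy, not a second reference

p2@lastName <- "Baker"   # modify only the copy
p2@payRate <- 50.00

# p1 keeps its original slot values after p2 is changed
cat("p1:", p1@lastName, p1@payRate, "\n")
cat("p2:", p2@lastName, p2@payRate, "\n")
```

This is the main behavioral difference from the RC model, where assignment produces a second reference to the same object.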

Wrapping Up
The demo code presented in this article should give you the information you need to get up and running with S4 classes. When I need to write OOP code in R, I often have a difficult time deciding whether to use the S3, S4 or RC model. The RC model is much more like the C# OOP model I'm used to, but in my experience, most R programmers come from a strictly R programming background and feel more comfortable with S3 and S4. So, if I'm writing code intended for my own use only, I'll usually use the RC model, but if I'm writing code for R programmers, I'll usually use S3 or S4.

In spite of the technical superiority of the S4 model over the S3 model, most of my colleagues prefer S3 to S4. I suspect that this is due mostly to the rather confusing documentation for S4, which in turn is due in large part to the many changes made to S4 since its introduction, such as deprecating the "representation" argument of setClass in favor of "slots."



About the Author

Dr. James McCaffrey works for Microsoft Research in Redmond, Wash. He has worked on several Microsoft products including Azure and Bing. James can be reached at [email protected].
