Multiple Dispatch is prominently advertised as one of the main selling points of Julia. But I am not even sure if I would consider it a feature at all. So maybe I don’t quite understand it yet.
My background: I am a Maths student (mostly Probability Theory and Statistics). Statisticians love R and tend to think of Julia as R’s successor (rather than Python’s), so I had a look at Julia’s documentation and stumbled upon Multiple Dispatch.
People seem to rave about Multiple Dispatch, even though it is one of the things I hate about R. You see, after a lot of contact with R, I had started to dislike programming itself, until I did an internship where I spent two months programming in Java. Afterwards I thought: programming isn’t that bad, let me pick up a small R project. And I realized that I simply seem to dislike R. So I tried to figure out why.
So here is one example of a particularly frustrating moment with R, and I will discuss why I think that multiple dispatch is at fault.
In an exercise, we were supposed to use a package and simulate random variables according to a model.
So I would load the package with
library(package)
Now in an object-oriented language, you could just type
package.
and the IDE could suggest the things included in the package. In R you type
?package
and get the documentation. Then you have to hope that the thing you want to use is documented there. In my case it was not. Thankfully the exercise suggested the function I was supposed to try, so this was not a problem for me. Anyway, I figured out how to generate random variables according to a model and saved the result into a variable.
y <- simulate(model, x)
So now I want to see what I have generated. So I type
y
and get
Object of class gridDataFrame
Grid topology:
cellcentre.offset cellsize cells.dim
1 1 400
Points:
coordinates variable1
1 1 -0.04098571
2 2 -0.09039331
3 3 -1.42573822
4 4 -0.60306120
5 5 0.47389710
6 6 -1.63860592
Okay neat, I want to plot it! Let’s extract the columns. No idea what this object is, but let’s just try the usual:
> y[2]
Error in `[.data.frame`(x@data, i) : undefined columns selected
> y$coordinates
Error in y$coordinates : $ operator not defined for this S4 class
Hm, a class? How about
> y.coordinates
Error: object 'y.coordinates' not found
Shit, I guess Google might help. Oh, apparently getSlots can be used on S4 classes.
> getSlots(y)
Error in getSlots(y) :
no slot of name "slots" for this object of class "gridDataFrame"
Right… Back to Google… Oh, so how about:
getSlots("gridDataFrame")
data grid .params
"data.frame" "GridTopology" "list"
What is this supposed to mean?
Okay, let’s try
?gridDataFrame
Oh, there is a method called “coordinates”, which returns the coordinates. Great!
> coordinates(y)
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 4
So let’s get the other column too then:
> variable1(y)
Error in variable1(y) : could not find function "variable1"
WTF? Okay, I guess back to the documentation. Hm, there seems to be a method called “as.data.frame”
> as.data.frame(y)
variable1
1 -0.04098571
2 -0.09039331
3 -1.42573822
4 -0.60306120
Weird, it seems to have dropped the column with the coordinates. I guess it doesn’t matter: I know how to extract the coordinates, and I know how to get a column from a data frame!
plot(coordinates(y), as.data.frame(y)$variable1)
In total it took me half an hour to simply extract the columns from an unknown class. This is incredibly frustrating! So why did that happen? I think it is because R does not have proper encapsulation.
If objects were actually boxes of things like in other programming languages, then I could just write
object.
and the IDE could start guessing what I might want to do. The IDE cannot suggest a method, though, if you have to write the method before the object.
So due to Multiple Dispatch and Dynamic Typing, the IDE is pretty much helpless and leaves you on your own. And since all methods are just floating about not belonging to one object, everything becomes a soup of unstructured dread.
So with this realization about R, I looked at Julia (talked about as the successor to R) and was surprised that people rave about multiple dispatch, the very thing that caused me so much pain. So what makes that pain worth it?
Is it that you can pass different kinds of objects to the same method? Then why not just call it a function in those cases instead of a method (since it does not belong to an object)? You just need the concept of an interface, like you have in Java, and you could pass anything which implements that interface into the function. The popular example seems to be collisions. So why not write an interface, collidable,
and pass collidable objects into the function
collide(collidable s, collidable b)
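To make the idea concrete, here is a rough Java sketch of what I have in mind. All names in it (Collidable, Ball, Wall, CollideSketch) are made up for illustration and do not come from any library, and the "physics" is just a placeholder so that the function has something to return.

```java
// Hypothetical sketch: every name here is invented for illustration.
interface Collidable {
    String name();
    double mass();
}

final class Ball implements Collidable {
    public String name() { return "ball"; }
    public double mass() { return 0.5; }
}

final class Wall implements Collidable {
    public String name() { return "wall"; }
    public double mass() { return 1000.0; }
}

final class CollideSketch {
    // A plain function: it accepts anything that implements Collidable
    // and only uses what the interface promises.
    static String collide(Collidable a, Collidable b) {
        Collidable heavier = a.mass() >= b.mass() ? a : b;
        Collidable lighter = (heavier == a) ? b : a;
        return lighter.name() + " bounces off " + heavier.name();
    }

    public static void main(String[] args) {
        // prints "ball bounces off wall"
        System.out.println(collide(new Ball(), new Wall()));
    }
}
```

Note that this single collide body treats every pair of types the same way. It does not pick a different implementation depending on both argument types, which, as far as I understand, is what multiple dispatch would do.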
What is the advantage of multiple dispatch over that? And for a more everyday case: everything could just implement a plottable interface. Then you could just use
object.plot()
on an object and you would know whether or not it will work once you type
object.p
as the IDE will start suggesting .plot() if it is available. In a multiple dispatch language, by contrast, you have to actually try it (and let the error message clutter your console if it does not work).
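Again as a rough Java sketch of what I mean, with made-up names (Plottable, GridSeries, PlotSketch) and a plot() that only returns a description instead of drawing anything, so that the example stays self-contained:

```java
// Hypothetical sketch: Plottable and GridSeries are invented names.
interface Plottable {
    // Returns a text description instead of drawing, to keep the sketch small.
    String plot();
}

final class GridSeries implements Plottable {
    private final double[] coordinates;
    private final double[] values;

    GridSeries(double[] coordinates, double[] values) {
        this.coordinates = coordinates.clone();
        this.values = values.clone();
    }

    // The accessors live on the object, so typing "series." in an IDE lists them.
    double[] coordinates() { return coordinates.clone(); }
    double[] values() { return values.clone(); }

    @Override
    public String plot() {
        return "plot of " + values.length + " points";
    }
}

final class PlotSketch {
    public static void main(String[] args) {
        GridSeries series = new GridSeries(
                new double[] {1, 2, 3},
                new double[] {-0.04, -0.09, -1.43});
        // prints "plot of 3 points"
        System.out.println(series.plot());
    }
}
```

Here both the data accessors and plot() are discoverable from the object itself, which is exactly what I was missing in the R session.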