Understanding multiple dispatch

I am eager to show others that Julia is fast thanks to its multiple dispatch.

But it is hard to prove, and the experiment below suggests that Python’s overloading is actually faster than Julia’s multiple dispatch. You can compare the times consumed.

  1. Python’s overloading
from math import ceil
from time import time

def f_py(x):
    if type(x) == str:
        x = float(x)
    if type(x) == float:
        x = ceil(x)
    return x**2 % 4

def apply_to_data(arr):
    for a in arr:
        f_py(a)
Test = ["1.5", 2.5, 3]
Repeat = list(range(10**6, 10**8 + 1, 10**6))  # 10**6, 2*10**6, ..., 10**8
Ans = []
print("Python speed:")
for ind, N in enumerate(Repeat):
    array = Test * N
    t1 = time()
    apply_to_data(array)
    total = time()-t1
    Ans.append(total)
    if (ind % 10 == 0):
        print(ind, round(total,3), " seconds")
  2. Julia’s multiple dispatch
# Dispatched Function
f(x::Int64) = x^2 % 4
f(x::Float64) = f(ceil(Int64, x))
f(x::String) = f(parse(Float64, x))

function apply_to_data_md(array)
    for a in array
        f(a)
    end
end;
Test = ["1.5", 2.5, 3]

Repeat = [10^6:10^6:10^8+1;]
ans_time = []
ans_allo = []
for R in Repeat
    array = [s for i=1:R for s in ["1.5", 2.5, 3]]
    time = @timed apply_to_data_md(array)
    push!(ans_time, time[2])
    push!(ans_allo, time[3])
    if R % 10^7 == 0
        print(R, time[2])
    end
end

I got the idea for the above experiment from this article, but my results are somewhat different from the post’s.

Do you know of any experiments that more rigorously show that Julia’s multiple dispatch can improve computational speed?

The examples you gave are just single dispatch: you only have one argument and a method for each type. That’s fine in Python, e.g.:

import functools
import math

@functools.singledispatch
def f(x):
    # some default logic
    print("I am the default implementation.")
    
@f.register(str)
def _(x):
    # some stringy logic
    return f(float(x))
    
@f.register(int)
def _(x):
    # some integer logic
    return x**2 % 4
    
@f.register(float)
def _(x):
    # some float logic
    return f(math.ceil(x))

What isn’t fine in Python is:

julia> f(x::Int64,y::String) = repeat(y,x)
f (generic function with 1 method)

julia> f(x::Int64,y::Float64) = x^y
f (generic function with 2 methods)

julia> f(3,"c")
"ccc"

julia> f(3,2.)
9.0

4 Likes

These are not multiple dispatch; these are single dispatch. Many languages have single dispatch; it is often associated with object orientation and called (a kind of) polymorphism. Because object-oriented languages often use syntax like object.method(arg1, arg2, ...), the programmer usually does not think of the object as an “argument zero”, but it can be seen this way, and it is the only “argument” that these languages “dispatch on”. In other words, in many languages there is a single argument (often implicit) that determines which function (of the many with the same name) should be called; this is single dispatch. The fact that many different types have a method with the same name, and therefore many different code bodies can be called depending on that first argument, does not amount to multiple dispatch.

What makes Julia a multiple-dispatch language is that it takes the types of all the arguments into account when choosing which method body to execute (the chosen body is the most specific one, i.e., the one that matches the argument types more tightly than any other). This is what defines multiple dispatch, and it is something most languages do not have.
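
For example, here is a small illustration of specificity (a sketch with a hypothetical g, not code from this thread):

julia> g(x::Number, y::Number) = "generic method"
g (generic function with 1 method)

julia> g(x::Int64, y::Number) = "more specific method"
g (generic function with 2 methods)

julia> g(1, 2.5)
"more specific method"

julia> g(1.5, 2)
"generic method"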

To show the flip side, here is a “problem” of multiple dispatch, one that Julia has and other languages do not:

julia> struct A; end

julia> struct B; end

julia> f(a :: A, b) = true
f (generic function with 1 method)

julia> f(a, b :: B) = false
f (generic function with 2 methods)

julia> f(A(), B())
ERROR: MethodError: f(::A, ::B) is ambiguous. Candidates:
  f(a::A, b) in Main at REPL[3]:1
  f(a, b::B) in Main at REPL[4]:1
Possible fix, define
  f(::A, ::B)
Stacktrace:
 [1] top-level scope at REPL[5]:1

Other languages do not have this “problem” because, since they only dispatch on the first (often implicit) argument, there is no way an ambiguity could arise.

Also, see how the compiler itself suggests using more multiple dispatch to solve the problem caused by multiple dispatch. If you define a third method f(a :: A, b :: B) = nothing, the ambiguity is resolved, because it is then clear which method is the most specific once all the argument types are taken into account.
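
Continuing the REPL session above with the suggested fix (a quick sketch):

julia> f(a :: A, b :: B) = nothing
f (generic function with 3 methods)

julia> f(A(), B())   # no error: this method is now the most specific match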

6 Likes

I’m the author of the article mentioned, and I’m a little embarrassed that I didn’t understand that multiple dispatch meant more than one argument! Whoops. I’ll fix it in the article.

I think that your strange speed result comes from the way print behaves in Julia. In your Julia benchmarking section, change your print statement from print(R, time[2]) to println("Array length: $R Seconds: $(time[2])").

6 Likes

Multiple dispatch isn’t what makes Julia fast; static/compile-time dispatch is (one of the reasons). Julia has to compile code before running it. This compilation ideally happens once and is saved, so all further runs can just reuse it. During compilation, Julia will try to make as many decisions as possible so it doesn’t have to spend time making the same decisions on every run. Dispatch is one such decision.

For example, take a method mymap(f, x) that applies a function f on each element of an array x. Dispatch of f can be done at compile time if the array has a concrete element type like Int64. The compiler knows that f(x[i]) will always do f(::Int64) and makes code that does so.

However, if you provide an x with an abstract element type like Any, then the compiler must emit code that checks the type of x[i] and finds the appropriate method of f on every iteration; this is called dynamic/runtime dispatch. A single dynamic dispatch runs pretty quickly, but when it happens many times in a big loop, it adds up to a major slowdown. You built an array from Test = ["1.5", 2.5, 3], so its element type became Any; you were thus measuring dynamic dispatch.
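
A minimal sketch of the difference (the function and variable names are mine, not from your experiment):

h(x::Int64) = x + 1

function sum_h(arr)
    s = 0
    for a in arr
        s += h(a)  # static dispatch if eltype(arr) is concrete,
                   # dynamic dispatch on every iteration if it is Any
    end
    return s
end

xs_int = collect(1:10^6)      # Vector{Int64}: concrete element type
xs_any = Vector{Any}(xs_int)  # Vector{Any}: abstract element type

# Timing both calls (e.g. with @time or BenchmarkTools.@btime) shows
# sum_h(xs_any) running much slower, largely due to dynamic dispatch.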

I’m not sure the timing difference you see is entirely due to dispatch; there are a lot of confounding factors. But I wouldn’t be surprised if a benchmark of Python’s single dispatch (which always happens at runtime) versus Julia’s dynamic dispatch showed that Python is faster. After all, Python only needs to take one type into account, while Julia must take many types into account. By the way, a benchmark of Python’s single dispatch should look more like dispatcher_object.method(arg1, arg2); binary operators like ** and % may do up to two single dispatches.

2 Likes

This is a reason why they say Python loops are slow and you should write “vectorized” code instead. The iteration itself isn’t the slow part; it’s that dispatches, among other things, need to happen on every iteration, whereas in “fast” languages those things are done only once, at compile time. Vectorized code, like that of NumPy, gets around this by 1) creating an array that can hold a concrete element type, e.g. np.array([1,2,3,4,5], dtype=int), and 2) having functions that check the dtype in order to dispatch to a C function with fixed static types before applying it to the array in a C loop. So you have a flexible but slow language, Python, that can use fast C code to do the hard work.

A problem is that when you want to write something new that runs fast in a loop, you have to do it in C and then write some boilerplate C and Python code to make it NumPy-compatible. A “two-language problem”, so to speak. That was a major motivation for Julia.

And to be absolutely fair, using optimized C/C++/Fortran code isn’t only something that “slow” languages like Python or R do. Julia does too; I believe the default LAPACK/BLAS is Fortran. If there’s a gold-standard optimized library out there, it’s a good thing to be able to use it, no matter your language. But it’s also a good thing to be able to write something new in your own language that isn’t so slow that you have to rewrite it in another language. In Julia, if you’re mindful of how types, even generic ones, move through your methods, you can leverage static dispatch for fast loops. As for arrays with abstract element types or structs with abstract field types, slower dynamic dispatch is simply necessary, and the compiler figures it out.
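
One common way to be mindful of this is the “function barrier” pattern (a sketch with hypothetical names, not code from this thread): do the type-unstable conversion once, then hand concretely-typed data to an inner function whose loop the compiler can specialize.

function inner(xs::Vector{Float64})
    s = 0.0
    for x in xs
        s += x^2        # eltype is concrete here, so dispatch is static
    end
    return s
end

function outer(data)        # data may be a Vector{Any}
    xs = Float64.(data)     # convert once, up front
    return inner(xs)        # a single dispatch at this call site
end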

1 Like

Your example runs >10x faster in Julia than in Python for me.
Python:

In [5]: arr = ["1.5", 2.5, 3] * 10**6

In [6]: %timeit apply_to_data(arr)
1.68 s ± 2.23 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Julia

julia> arr = repeat(["1.5", 2.5, 3], outer=10^6)

julia> @btime apply_to_data_md(arr)
  137.734 ms (0 allocations: 0 bytes)
4 Likes


Correct me if I’m wrong, but there is a subtlety here. For example, let us examine this function, which returns a 1 converted to the “largest” type among the arguments:

julia> dispatch(x,y) = promote_type(typeof(x),typeof(y))(1)
dispatch (generic function with 1 method)

julia> dispatch(1,1)
1

julia> dispatch(1,1.0)
1.0

julia> dispatch(1,1.f0)
1.0f0

If we inspect the LLVM code of one of those calls, we see that it could not be simpler; it just returns the 1 of the appropriate type:

julia> @code_llvm dispatch(1,1.0)
;  @ REPL[22]:1 within `dispatch`
define double @julia_dispatch_710(i64 signext %0, double %1) #0 {
top:
  ret double 1.000000e+00
}

The function does not need to test anything: because of multiple dispatch, it has been specialized for all of its arguments.

This makes Julia fast, even if we try to trick it, for example by writing this horrible alternative:

julia> function manual(x,y)
           if x isa Float64 && y isa Float64
               return 1.0
           elseif x isa Float64 && y isa Float32
               return 1.0
           elseif x isa Float32 && y isa Float64
               return 1.0
           elseif x isa Float32 && y isa Float32
               return 1.f0
           elseif x isa AbstractFloat && y isa Int
               if x isa Float64
                   return 1.0
               else
                   return 1.f0
               end
           elseif x isa Int && y isa AbstractFloat
               if y isa Float64
                   return 1.0
               else
                   return 1.f0
               end
           else
               return 1
           end
       end
manual (generic function with 1 method)

This function does the same, for a limited subset of types, “by hand”. However, if we inspect the generated code for a call to this function, we get:

julia> @code_llvm manual(1,1.0)
;  @ REPL[29]:1 within `manual`
define double @julia_manual_716(i64 signext %0, double %1) #0 {
top:
;  @ REPL[29]:13 within `manual`
  ret double 1.000000e+00
}

Again, it didn’t have to do anything; it just returned the appropriate “one”. This is because the compiler specialized a version of that function for the types of the input (both of them) and realized that it could eliminate all the branches.

Thus, multiple dispatch can indeed be a very important source of code specialization (besides simplification), and lead to better performance.

9 Likes

Python doesn’t even support overloading on the number of function arguments, let alone their types:

>>> def f(a):
...     return 1
...
>>> def f(a, b):
...     return 2
...
>>> f(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() missing 1 required positional argument: 'b'

So Python doesn’t have overloading either. It only has classic object-oriented single dispatch, where the method called depends on the runtime type of the receiver, i.e. the obj in obj.f(a, b).

11 Likes

Right, the compiler figures out many things besides (multiple) dispatch. On top of dispatching promote_type and whatever methods it calls based on typeof(x) and typeof(y), it also works out compile-time constants.

It doesn’t just figure out types; it figures out values too. Take the function f(x) = (1+2+3)/(6-5)^(3/2)*4 - sum(1:9) + x. @code_llvm f(1.0) shows that the compiler figured out at compile time that the constant part of the expression is -21.0, so all it needs to do at runtime is add x.
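
The constant is easy to check by hand in a fresh session (a quick sketch):

julia> f(x) = (1+2+3)/(6-5)^(3/2)*4 - sum(1:9) + x
f (generic function with 1 method)

julia> f(0.0)   # exposes the constant part of the expression
-21.0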

1 Like

I wrote a blog post about implementing an algorithm that chooses between methods, for multiple dispatch or overloading (depending on when you run it). You can read it here. I’m not sure if this is what Julia does behind the scenes, but it could give you some insight.
For single dispatch, you can modify the code to select the candidates and rank them based on the first argument only.