Passing arguments to Julia functions

Former Matlab user here, still discovering new things about type stability and multiple dispatch…

I am currently working on a private package involving mostly linear algebra, and I have tried to follow the philosophy of Julia's multiple dispatch whenever I could, maybe even too much.

In my package I have a function fun that has different methods depending on some Symbol parameter taking values in a set :p1, :p2, ..., but that always returns the same type. I am not yet used to having types as parameters to perform multiple dispatch. So my current implementation involves multiple methods defined as

function fun(<common_args...>, ::Val{:p1})::ret_type
    <some code>
end

function fun(<common_args...>, ::Val{:p2})::ret_type
    <some other code>
end

And the function calls are then made with e.g. fun(<common_args...>, Val(:p1)).

I do not know anymore where I got this trick from, but it works for sure.
However, it seems this can put a lot of stress on the compiler (this comment is based on a discussion on Slack).
Also, some @code_warntype analysis shows that this doesn't really ensure type stability, even when explicitly specifying the return type for each method of fun.
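
To make this concrete, here is a stripped-down sketch of the pattern (with made-up method bodies, not my actual code):

```julia
# Hypothetical methods mirroring the pattern above
fun(x, ::Val{:p1}) = x + 1.0
fun(x, ::Val{:p2}) = x - 1.0

# Fine: the symbol is a literal, so Val(:p1) is a compile-time constant.
fun(2.0, Val(:p1))

# Problematic: `s` is only known at runtime, so the type Val{s} cannot be
# inferred, and the call to fun is dynamically dispatched.
caller(x, s::Symbol) = fun(x, Val(s))
caller(2.0, :p2)
```

Running @code_warntype on caller shows the abstract Val intermediate even though both methods return the same concrete type.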

In another direction, yet related to the current discussion, I tried to see what was done in some base Julia code, to see if I could find a typical Julia implementation. For instance, take Hermitian from the LinearAlgebra standard library.

Hermitian takes a second argument uplo which is a Symbol (I figured Symbols are something important in Julia, but I can't quite fathom why); this comforted me in my practice of using Symbols as arguments.

However the definition of Hermitian is

function Hermitian(A::AbstractMatrix, uplo::Symbol=:U)
    n = checksquare(A)
    return hermitian_type(typeof(A))(A, char_uplo(uplo))
end

so no check on the value of uplo is done at that time, just on its type, so it seems there is no multiple dispatch involved here. Then one can see the char_uplo call… which is actually a check on the value of uplo! This check is defined as

function char_uplo(uplo::Symbol)
    if uplo === :U
        return 'U'
    elseif uplo === :L
        return 'L'
    else
        throw_uplo()
    end
end

The most surprising part of this to me is that the Symbol parameter is actually converted to a Char parameter. What is the reasoning behind this?

Overall my question is: what is the correct way in Julia to have a function with multiple implementations differing according to parameter values?

  • Is it necessarily through true multiple dispatch, where one would define types as parameters? E.g. :p1 would be replaced by a p1type and fun would be defined as

function fun(<common_args...>, ::p1type)::ret_type
    <some code>
end
function fun(<common_args...>, ::p2type)::ret_type
    <some other code>
end

  • Is it through type specification, e.g. p::Symbol, then applying the char_uplo strategy?
  • Some other way?

This has probably been discussed somewhere, but I would really like some feedback on this. As is often the case, there will probably be no unique solution, but I just need one… :slight_smile:

I think that a bit more context is needed to give a truly useful answer.

If the property that determines the version of fun to be called can be determined at compile time, a Trait would be a possible solution.

If not, you may not be able to do better than dispatch via garden variety if ... else.
On the other hand, if you want the solution to be extensible to new types, Traits may still be the best solution (you don’t get static dispatch, but you do get extendability).
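
A minimal sketch of what I mean by a Trait (all names here are made up for illustration):

```julia
# Trait instances: small singleton types that label a behaviour
struct FastAlgo end
struct SafeAlgo end

# The trait function maps an input type to a trait instance.
# New types can opt in by adding a method here, without touching process.
algostyle(::Type{<:Integer}) = FastAlgo()
algostyle(::Type{<:AbstractFloat}) = SafeAlgo()

# The entry point re-dispatches on the trait of the argument's type
process(x) = process(algostyle(typeof(x)), x)
process(::FastAlgo, x) = x + 1      # cheap integer path
process(::SafeAlgo, x) = x + 1.0    # floating-point path
```

When the argument type is concrete and known, the trait lookup is resolved at compile time, so the indirection costs nothing.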

1 Like

Symbols are basically just strings, but they have some nice features:

  • Internally, Julia represents the names of things in your code as Symbols, so they show up a lot in metaprogramming (code that generates code)
  • They are generally more efficient to compare for equality than strings, so they’re very useful when you want a lightweight way to differentiate a few different named things (and don’t want to use types for that purpose)

But the set of things you can do with a Symbol is pretty similar to the set of things you can do with a string (you can easily convert one to the other, at the cost of some copying of data). We generally use Symbols when dealing with things which appear directly in your code, while we use Strings when dealing with input from files or users or network streams.
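
A quick illustration of the conversions and the cheap comparison:

```julia
s = :uplo               # a Symbol literal
str = String(s)         # Symbol -> String (copies the data)
s2 = Symbol("uplo")     # String -> Symbol

# Symbols are interned, so === is a cheap identity check
s === s2                # true
```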

I suspect this is because if you were to dig deeper you would find a call to some existing BLAS function or some Julia code with the same interface. Lots of BLAS routines use characters like this to indicate what kind of operation should be performed. Pure Julia code probably wouldn’t do this (a Symbol might be more appropriate), but we’re talking about Fortran libraries whose interfaces haven’t changed in decades.

This will depend a lot on your case. There’s nothing at all wrong with just taking a Symbol or boolean flag and writing an if statement. That will produce very fast code and can be the easiest thing to do in some cases.

An approach like fun(..., ::p1type) and fun(..., ::p2type) is also very common. The biggest advantage of that design is that it's very easy for a future user (or you) to then go on and implement fun(..., ::p3type) without needing to modify any existing code. It *can* also give better performance, but this is likely only going to matter if fun is in the innermost loop of your code.

I wouldn’t worry about copying the char_uplo pattern unless you also need to interface with some older code that expects a literal Char somewhere.
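
To make the extensibility point concrete, here is a small sketch with made-up singleton types mirroring your p1type/p2type naming (the method bodies are placeholders):

```julia
abstract type FunKind end
struct p1type <: FunKind end
struct p2type <: FunKind end

fun(x, ::p1type) = 2x     # <some code>
fun(x, ::p2type) = x^2    # <some other code>

# A later user (or another package) can add a behaviour
# without touching any of the code above:
struct p3type <: FunKind end
fun(x, ::p3type) = x + 1
```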

3 Likes

if ... else is probably the way to go. The one exception is if the return type depends on the value as well.

Right now I would focus on the source of the type instability. The return type assertion will not necessarily solve the issue.

1 Like

Thanks for the clarification about the whole char conversion thing. That surely makes sense when interfacing wild external libraries.

In my case (and for the moment, of course), this kind of call happens at quite a high level, so it should indeed not be a performance issue. I was just wondering what would be the more appropriate way to do this in Julia beyond the obvious if ... else statement. This is probably because I discovered multiple dispatch with Julia, and now I'm trying to force it a bit… it really feels like a fancy and implicit if ... else to me, but I'm sure it's much more than that.

The parameter is given a value at runtime, basically it is some option that the user defines for some algorithm…

Can you clarify a bit the role of traits in this context?

If you can work in the problem-algorithm-solver pattern, I think it is safe to follow that. In that pattern, different algorithms are represented by different types, and the solver function solve is dispatched by different algorithm types.
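
A minimal sketch of that pattern, with a made-up root-finding problem purely for illustration:

```julia
# Algorithms are types; the solver dispatches on them.
abstract type AbstractAlgorithm end
struct Bisection <: AbstractAlgorithm end

# The problem is its own type, holding the data of the task
struct RootProblem{F}
    f::F
    lo::Float64
    hi::Float64
end

# solve(problem, algorithm): new algorithms are new methods of solve
function solve(p::RootProblem, ::Bisection; tol = 1e-10)
    lo, hi = p.lo, p.hi
    while hi - lo > tol
        mid = (lo + hi) / 2
        # keep the half-interval whose endpoints bracket the root
        if sign(p.f(mid)) == sign(p.f(lo))
            lo = mid
        else
            hi = mid
        end
    end
    return (lo + hi) / 2
end
```

A user choosing an option then writes solve(prob, Bisection()) instead of solve(prob, :bisection), and adding an algorithm never requires editing existing methods.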

1 Like

I see two potential benefits of Traits over if ... else:

  1. Performance: if you can achieve static dispatch (the value of the option can be inferred at compile time), this can be much faster than if ... else.
  2. Extendability: if the number of options may expand in the future, this is possible with a Trait (without changing the original code), but not (obviously) with the if ... else.

It sounds like neither concern applies in your case, in which case I don’t see anything wrong with if ... else.

1 Like

I did not know that dispatch is faster than if else. Can you give a reference for this please?

Static dispatch can be faster if the to-be-dispatched-to function is inlined. No jump/branch is faster than any branch, most of the time (branch predictors can get lucky).

2 Likes

I should note that type dependent if-else constructs can often be eliminated by the compiler, so it’s not always clear cut which is going to be faster if the branch/dispatch is decided by types alone.
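
For instance (a toy example), a branch decided purely by the argument's type is pruned when the function is specialized:

```julia
# The condition depends only on the type of x, so each compiled
# specialization of h keeps just one side of the branch.
h(x) = x isa Int ? x + 1 : x - 1
```

Inspecting @code_typed h(1) shows that the non-Int side has been eliminated entirely.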

1 Like

It’s a good question. The fact that static dispatch is fast and dynamic dispatch is slow seems implied in many places (and intuitive) (setting aside compiler optimizations where seemingly dynamic dispatch is converted into static dispatch, such as eliding dead branches). But I don’t have a reference.

Optimization by specialization to argument types comes with the tradeoff of increased compilation time. An extreme case of this is shown in the following example.

julia> f(x) = x^2 + 3x + 2;

julia> g(::Val{x}) where x = f(x);

julia> f(10^6), g(Val(10^6))
(1000003000002, 1000003000002)

julia> using BenchmarkTools

julia> @btime f(10^6);
  3.500 ns (0 allocations: 0 bytes)

julia> @btime g($(Val(10^6)));
  0.001 ns (0 allocations: 0 bytes)

g(Val(10^6)) is ultra fast because it is specialized to the argument type Val{10^6} and compiled to return 1000003000002.

@code_typed debuginfo=:none g(Val(10^6))
CodeInfo(
1 ─     return 1000003000002
) => Int64

However, g(Val(k)) is compiled separately for each different k. So if you run g(Val(k)) for a large number of different k's, it will perform a large number of compilations and will be very slow. (After compilation, though, it will be explosively fast.)

F(n) = [f(k) for k in 1:n]
G(n) = [g(Val(k)) for k in 1:n]

@time F(10^4)
@time G(10^4);
  0.000010 seconds (2 allocations: 78.203 KiB)
  5.160295 seconds (55.43 M allocations: 3.894 GiB, 14.56% gc time, 95.65% compilation time)

The first execution of G(10^4) is very slow.

Thus, it is not reasonable to try to optimize by specializing on a large number of different Val{k} argument types.

On the other hand, the native code specialized to argument types is very fast, so if compilation time is not an issue, optimization by specialization to argument types should be done aggressively.

In short, it is a matter of trade-off.

3 Likes

No. Don’t apply “the char_uplo strategy”.

The reason why the Hermitian type has the field uplo of type Char instead of Symbol is to conform to the LAPACK library specification.

See, for example, julia/lapack.jl at 44d484222005580432433b7889c4a56d25c0ea67 · JuliaLang/julia · GitHub

Because of these special circumstances, please forget about char_uplo in ordinary Julia programming.

It’s not so bad if you simply follow the problem-algorithm-solver pattern.

1 Like

But you have not compared this to an if-then-else approach so far.
Wouldn’t this be similarly fast?

I gave this a try, but I think the compiler outsmarted me somehow…


function if_vs_dispatch(x)
    if x == 10_000_000
        return 1000003000002
    else 
        return f(x)
    end
end

julia> @btime if_vs_dispatch(10_000_00)
  0.001 ns (0 allocations: 0 bytes)
1000003000002

julia> @btime if_vs_dispatch(10_000_000)
  0.001 ns (0 allocations: 0 bytes)
1000003000002

julia> @btime if_vs_dispatch(932)
  0.001 ns (0 allocations: 0 bytes)
871422

julia> @btime if_vs_dispatch(10_000_000)
  0.001 ns (0 allocations: 0 bytes)
1000003000002

julia> @btime if_vs_dispatch(9112)
  0.001 ns (0 allocations: 0 bytes)
83055882

julia> @btime g($(Val(10^6)));
  0.001 ns (0 allocations: 0 bytes)

julia> @btime if_vs_dispatch(123)
  0.001 ns (0 allocations: 0 bytes)
15500

Sub-nanosecond timings are always impossible for real computation; even just an addition takes 2-3 ns. Check with @code_native or @code_llvm how much has been optimized away - in this case, I suspect constant propagation is what eliminated all the computation.

Yes. I have not mentioned anything about “an if-then-else approach”.