Vectors vs. Tuples and multiple dispatch

This is my first major project in Julia and I have a core function that will be evaluated many times that I want to optimize. The function accepts a vector and returns a vector, but I only anticipate using it with vectors of 2-6 units. There’s an inner function which I define to accept only Float64 scalars:

F(x::Float64, a::Float64) = 1.2*exp(-a*x)

The simplest, most MATLAB-like definition of the function can be done in one line:

f(x,a) = vcat(1, F.(cumsum(x),a))

However, I’ve found significant performance improvements by defining additional methods for different-sized inputs as tuples rather than vectors, e.g.:

function f(x::Float64, a::Float64)
    y₁ = 1
    y₂ = F(x,a)
    return (y₁, y₂)

function f(x::NTuple{2,Float64}, a::Float64)
    y₁ = 1
    y₂ = F(x[1],a)
    y₃ = F(x[1]+x[2],a)
    return (y₁, y₂, y₃)

function f(x::NTuple{3,Float64}, a::Float64)
    y₁ = 1
    y₂ = F(x[1],a)
    y₃ = F(x[1]+x[2],a)
    y₄ = F(x[1]+x[2]+x[3],a)
    return (y₁, y₂, y₃, y₄)

The speed/memory differences are significant:

julia> using BenchmarkTools

julia> @btime f([0.1,0.2],0.5);
  1.313 μs (19 allocations: 1.00 KiB)

julia> @btime f((0.1,0.2),0.5);
  13.197 ns (0 allocations: 0 bytes)

Since I’m still learning the rope of Julia, I’d appreciate any feedback on two points:

  1. Is this the most idiomatic way to do something like this in Julia? Are there some nuances of the compiler I’m missing that would make all of this unnecessary?
  2. If this is the correct way to do things, would it be an acceptable use of metaprogramming to define these methods rather than having everything typed out in the module? If so, what would be a good way to accomplish that?

Looking forward to learning how to use and understand a beautiful language!

Use StaticArrays.jl instead of tuples. Then you can use your original f definition and it will be fast (by avoiding heap allocations and unrolling loops):

julia> using BenchmarkTools, StaticArrays

julia> F(x, a) = 1.2*exp(-a*x)
F (generic function with 1 method)

julia> f(x,a) = vcat(1, F.(cumsum(x),a))
f (generic function with 1 method)

julia> x = @SVector[0.1, 0.2]
2-element SVector{2, Float64} with indices SOneTo(2):

julia> @btime f($x, 0.5)
  0.051 ns (0 allocations: 0 bytes)
3-element SVector{3, Float64} with indices SOneTo(3):

In general, StaticArrays are a good idea when you have small (≲ 10 element) arrays whose size is more-or-less fixed (typically it should be inferred at compile-time from the function inputs).


In this case, it’s necessary to use the Ref trick to prevent the compiler from optimizing away the static argument:

julia> @btime f($(Ref(x))[], 0.5)
  13.313 ns (0 allocations: 0 bytes)
3-element SVector{3, Float64} with indices SOneTo(3):

And, if you do want to work with tuples, you can write things recursively instead of writing out cases. These should get unrolled by the compiler to pretty much what you had:

julia> g(xs::Tuple, a) = (g(Base.front(xs), a)..., F(sum(xs), a))
       g(::Tuple{}, a) = (1,);

julia> g((1.2, 3.4, 5.6), 0.78) == f((1.2, 3.4, 5.6), 0.78)

julia> @btime g($(Ref(Tuple(x)))[], 0.5)  # same as SVector