Can Julia Optimize an array expression?


#1

Suppose I write a function that operates over (simple) arrays, even just taking floats as an input to construct the array, can Julia optimize the whole thing into a simple expression?

function f(a::Float64,b::Float64)
    [a b] * [a+1; b+1]
end

versus

function g(a::Float64, b::Float64)
    a * (a+1) + b * (b+1)
end

This gist shows an example

Would it be necessary perhaps for me to use a more specialized matrix computation compiler library, for instance, or is this something we might see the Julia compiler do someday? Or is this something that might happen only in the JIT level, perhaps?


#2

can Julia optimize the whole thing into a simple expression?

No, but it works if you use static arrays

using StaticArrays, BenchmarkTools

function f(a,b)
    [a b] * [a+1; b+1]
end

function g(a,b)
    @SVector([a, b])' * @SVector([a+1, b+1])
end


julia> @btime f(3,4)
  74.892 ns (3 allocations: 288 bytes)
1-element Array{Int64,1}:
 32

julia> @btime g(3,4)
  1.266 ns (0 allocations: 0 bytes)
32

#3

You can even use MVectors if the calls get inlined:

julia> using StaticArrays, BenchmarkTools

julia> function f(a,b)
           [a, b]' * [a+1, b+1]
       end
f (generic function with 1 method)

julia> function g(a,b)
           @SVector([a, b])' * @SVector([a+1, b+1])
       end
g (generic function with 1 method)

julia> function h(a,b)
           @MVector([a, b])' * @MVector([a+1, b+1])
       end
h (generic function with 1 method)

julia> @btime f(3,4)
  53.826 ns (2 allocations: 192 bytes)
32

julia> @btime g(3,4)
  0.020 ns (0 allocations: 0 bytes)
32

julia> @btime h(3,4)
  15.013 ns (3 allocations: 80 bytes)
32

julia> @inline function mdot(a::AbstractVector{T},b::AbstractVector{T}) where {T}
           out = zero(T)
           @inbounds for n ∈ eachindex(a,b)
               out += a[n] * b[n]
           end
           out
       end
mdot (generic function with 1 method)

julia> function h2(a,b)
           mdot(@MVector([a, b]), @MVector([a+1, b+1]))
       end
h2 (generic function with 1 method)

julia> @btime h2(3,4)
  2.725 ns (0 allocations: 0 bytes)
32

I would guess that the inline heuristics will eventually become smarter about figuring that out on their own.