Function type declaration

Hi guys,

I read that one should typically declare the types of a function's arguments directly in the function definition, something like:

#OPTION1
fun1 = function(feat1::Vector{Float64}, feat2::Matrix{Float64})
    # Some calculation
    feat1' * feat2
end
fun1([.5, .5], [0.9 0.1; 0.1 0.9])

I would prefer instead to construct a struct where every variable is properly defined and then just plug that into the function, something like this:

#OPTION2
struct FnObject
    feature1::Vector{Float64}
    feature2::Matrix{Float64}
end

fun2 = function(obj::FnObject)
    # Unpack the fields
    feat1 = obj.feature1
    feat2 = obj.feature2
    # Some calculation
    feat1' * feat2
end

test = FnObject([.5, .5], [0.9 0.1; 0.1 0.9])
fun2(test)

Do I achieve the same performance with option 2, where I define the types in a summary struct and then assign the function variables at the beginning of the function? This seems to be more convenient when using functions in larger projects.

Best regards,

You probably want to write this:

function fun3(feat1::AbstractVector, feat2::AbstractMatrix)
    feat1' * feat2
end

Or leave off the types entirely; they only restrict when this method will be called, and don’t affect speed.
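To make the "restrict" part concrete, here is a minimal sketch. The names strict and generic are hypothetical and only for illustration; the point is that the unannotated method accepts any arguments for which feat1' * feat2 is defined, while the annotated one rejects everything except the exact types it names.

```julia
# Hypothetical names, for illustration only.
strict(feat1::Vector{Float64}, feat2::Matrix{Float64}) = feat1' * feat2
generic(feat1, feat2) = feat1' * feat2

w = [0.9 0.1; 0.1 0.9]

generic([0.5, 0.5], w)   # a Vector{Float64} works
generic([1, 2], w)       # so does a Vector{Int}
generic(1:2, w)          # and a range

# The strict method has no match for a range, so the call throws a MethodError.
strict_rejects_range = try
    strict(1:2, w)
    false
catch err
    err isa MethodError
end
```

Both versions get compiled for the concrete argument types at the first call, which is why the annotations make no difference to speed.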

Note that fun1 is a non-constant global variable which happens to contain an anonymous function, while fun3 is… more like a const (a named generic function), I’m not too sure of the right term. But I think this can lead to performance problems when fun1 is called from inside other functions.
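A rough way to see the difference, assuming the reflection helper Base.return_types (not exported, but available in current Julia versions) is acceptable for a quick check. All names below are hypothetical. The sketch asks inference what a caller returns when it goes through a non-const global versus a const binding.

```julia
# Non-const global holding an anonymous function (same pattern as fun1).
fun_global = function (x)
    x + 1
end

# The same function bound to a constant name.
const fun_const = function (x)
    x + 1
end

caller_global(x) = fun_global(x)
caller_const(x) = fun_const(x)

# Inference cannot assume anything about what fun_global holds,
# so the first result is typically Any; the second is Int64.
inferred_global = Base.return_types(caller_global, (Int,))
inferred_const = Base.return_types(caller_const, (Int,))
```

Calling fun1 directly at the REPL is fine; the cost shows up when other functions refer to the global by name.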

Constructing a struct should be free, if it helps organise your thoughts.


As was already mentioned, type annotations on function arguments don’t affect the speed. If you are in doubt as to which approach is better, it’s generally a good idea to use BenchmarkTools:

julia> using BenchmarkTools

julia> @btime fun1($([.5, .5]), $([0.9 0.1; 0.1 0.9]))
  90.450 ns (2 allocations: 112 bytes)
1×2 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:
 0.5  0.5

julia> @btime fun2($test)
  90.197 ns (2 allocations: 112 bytes)
1×2 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:
 0.5  0.5

Note the use of $ to interpolate variables into the expression (it avoids the penalties associated with global variables). Also, just in case you aren’t aware: leaving type annotations off function arguments doesn’t affect speed, but leaving type annotations off struct fields (so that they default to Any) does affect speed.
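A small sketch of the struct side of this, using hypothetical types Untyped, Typed and Parametric. It uses fieldtype to show what the compiler knows about each field and Base.return_types (a non-exported reflection helper) to show what it can infer for a function that reads the field.

```julia
struct Untyped
    x            # no annotation, so the field type is Any
end

struct Typed
    x::Float64   # concrete field type
end

struct Parametric{T}
    x::T         # concrete once T is known, and still generic
end

double(obj) = 2 * obj.x

fieldtype(Untyped, :x)               # Any
fieldtype(Typed, :x)                 # Float64
fieldtype(Parametric{Float64}, :x)   # Float64

# With an Any field the result cannot be inferred; with concrete fields it can.
inferred_untyped = Base.return_types(double, (Untyped,))
inferred_typed = Base.return_types(double, (Typed,))
inferred_param = Base.return_types(double, (Parametric{Float64},))
```

The parametric version is usually the one to reach for when you want concrete fields without hard-coding Float64.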


I am curious where you read this, because it is something that one should not do unless needed for dispatch: it restricts the applicability of the method, and gives no advantage.


Guys, thank you very much!

I read in the constructors section in the Julia 1.1 Documentation that one should specify types in the wrapper, and I assumed this is the same for functions. I noticed now that in the performance tips section it is explicitly stated that this is not the case - even though I am not entirely sure why.

I take it that type declarations in functions are mainly used for method restriction and I can use custom structs freely without loss of performance. Thank you, topic can be closed!

I’m by no means an expert in the intricacies of the compiler, but my simple way to think about it is this:

As you say, for functions, type declarations are used for multiple dispatch. Whether or not types are declared, the compiler will compile a version of the function for the concrete types passed.

Now, for any given compiled function, the speed will depend on how well type inference works, and that depends on how much the compiler can reason about the objects passed to the function. So if you think about sum(a, b) = a+b, then passing two Int64 to the function makes the compiler compile a version sum(a::Int64, b::Int64), and it is quite easy to prove that a+b will always return another Int64. If, however, you pass your own type, the compiler will need to compile sum(a::MyCrazyType, b::MyCrazyType), and if MyCrazyType can hold arbitrary types internally (because you didn’t annotate its fields), it is very hard to do any kind of optimisation, since inference is working with very limited information.
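The specialisation described above can be checked with a minimal sketch. The function is named add2 here (a hypothetical name) to avoid shadowing Base.sum, and Base.return_types is a non-exported reflection helper used only to display what inference concludes for each combination of argument types.

```julia
# One unannotated definition; hypothetical name to avoid clashing with Base.sum.
add2(a, b) = a + b

# Inference works on the concrete argument types, not on annotations.
ret_int = Base.return_types(add2, (Int64, Int64))        # [Int64]
ret_float = Base.return_types(add2, (Float64, Float64))  # [Float64]
ret_mixed = Base.return_types(add2, (Int64, Float64))    # [Float64]
```

Each of these corresponds to a separately compiled specialisation of the same method.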


One last comment for other beginners like me:

While it was already said that type annotations on function arguments don’t affect the speed, if you have a custom function nested within a custom function, and you do not want to pass that function as an argument to the outer function, you can speed things up drastically by declaring the return type of your inner functions, like myfun(...)::Float64.

I think this is called setting function barriers in the performance tips, and it was answered in another of my questions here: Large memory allocation in function within function - #4 by mrVeng

Only if they are not type stable (or the compiler can’t infer the return type).
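A minimal sketch of that caveat, with hypothetical functions unstable, annotated and stable. The return-type annotation inserts a convert call, which narrows the inferred type of an unstable function but adds nothing to one that is already stable. Base.return_types is a non-exported reflection helper used here only for inspection.

```julia
# Returns an Int or a Float64 depending on the value of x: not type stable.
unstable(x) = x > 0 ? 1 : 1.0

# Same body, but the declared return type converts every result to Float64.
annotated(x)::Float64 = x > 0 ? 1 : 1.0

# Already type stable; an annotation here would change nothing.
stable(x) = x > 0 ? 1.0 : 0.0

ret_unstable = Base.return_types(unstable, (Int,))    # a Union of Int64 and Float64
ret_annotated = Base.return_types(annotated, (Int,))  # [Float64]
ret_stable = Base.return_types(stable, (Int,))        # [Float64]
```

Fixing the body so that every branch returns the same type, as in stable, is usually preferable to relying on the conversion.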

A lot of code (most?) is not written generically and adding type annotations for the expected types can help readers of the code as well as give better error messages.