Function inside struct allocates when referenced

In the example below, if box1 is passed as an argument to calcA, then calling its function box1.func leads to allocation.

However, if box1.func is passed as a direct argument to calcB, then there is no allocation.

Can you help me understand this behavior?

struct Box
    func::Function
    func_array::Array{Function, 1}
end

function f1(x)
    return 1
end

box1 = Box(f1, [f1, f1, f1, f1])  # positional constructor; a plain struct does not accept keyword arguments
 
function calcA(box)
    for i=1:Int(1e6)
        y = box.func(i)
    end
end
@time calcA(box1) 
@time calcA(box1) 

function calcB(box, f)
    for i=1:Int(1e6)
        y = f(i)
    end
end
@time calcB(box1, box1.func) 
@time calcB(box1, box1.func) 

Output

 0.027969 seconds (1.00 M allocations: 15.416 MiB, 9.45% gc time, 21.37% compilation time)
 0.019493 seconds (999.49 k allocations: 15.251 MiB)
 0.000387 seconds (1.63 k allocations: 107.531 KiB, 94.27% compilation time)
 0.000008 seconds

Function is an abstract type, and you may need to parameterize your struct to avoid allocations

8 Likes

See the Performance tips in the manual:

https://docs.julialang.org/en/v1/manual/performance-tips/

1 Like

So in concrete terms, this is what a more efficient Box would look like:

struct Box{F}
    func::F
end

No need to specify F <: Function, since there are objects that are not functions but still callable.
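
As a quick check (a sketch, not from the original posts; box2 is just a new variable name), the type parameter makes the field concrete, so the timing loop stops allocating:

box2 = Box(f1)                                  # Box{typeof(f1)}(f1)
isconcretetype(fieldtype(typeof(box2), :func))  # true; the original field type Function is abstract

# calcA only reads box.func, so it also works with this parametric Box;
# the second @time call should report (close to) zero allocations
@time calcA(box2)
@time calcA(box2)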

However, this trick will not work well for func_array. Indeed, each function f in Julia has its own type typeof(f), so you cannot specify a concrete element type for the array unless all of its elements have the same type, i.e. they are the same function, which would make the array useless.
Why do you need an array of functions in your use case?

5 Likes

Minor addition to @gdall’s comment: instead of an array of functions, a tuple of functions should work just fine (well, as long as it doesn’t get too long, but I have no good idea what “too long” is in this context).
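
A sketch of what that could look like (TupleBox, f2, and f3 are made-up names, not from the thread):

f2(x) = 2x
f3(x) = x + 1

struct TupleBox{F, FT<:Tuple}
    func::F
    func_tuple::FT   # e.g. Tuple{typeof(f1), typeof(f2), typeof(f3)}, a fully concrete type
end

tbox = TupleBox(f1, (f1, f2, f3))
tbox.func_tuple[2](10)   # constant index: fully inferred; a runtime index over a small
                         # heterogeneous tuple can still be union-split by the compiler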

1 Like

This does not hold for closures; they don’t have to be singleton types:

julia> f(n) = let n = n
         x -> x + n
       end
f (generic function with 1 method)

julia> typeof(f(1)) == typeof(f(2)) <: Function
true
3 Likes

Thanks for the concrete example!

Why do you need an array of functions in your use case?

The Box struct is meant to represent a network/array of elements, where each element has an associated “behavior” function. There can be many elements (dozens/hundreds), but there are only a handful of behavior functions - hence the function array. (the actual meaning of the “behavior” function is not relevant for the MWE).

Instead of an array of functions, would it be better to have an array of strings holding function names and then add a “function lookup” function based on that string, e.g.?

struct Box
    fname_array :: Array{String, 1}
end 

function lookup(x, name)
    if name == "f1"
        return f1(x)
    elseif ...  # one branch per behavior function
    end
end

# example: call the i-th element function on variable x
lookup(x, box1.fname_array[i])

or maybe there’s a better way altogether.

I don’t think it really matters in your case how you store the information of which function to call. There will always be a dynamic dispatch, since the compiler can never know which functions there will be. Making things more stable in the struct only shifts this to a different place.

What you can try in order to improve things (assuming all behavior functions have the same signature) is FunctionWrappers.jl. I have never used it, so I don’t know how well it will work.
In principle it allows you to put a concrete element type on the vector, like so:

struct Box
    func_array :: Array{FunctionWrapper{OutputType, Tuple{Input1, Input2}}, 1}
end
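
A minimal runnable sketch of that idea, assuming all behavior functions map a Float64 to a Float64 (g1, g2, BehaviorFn, and WrappedBox are placeholder names):

using FunctionWrappers: FunctionWrapper

# hypothetical behavior functions, all Float64 -> Float64
g1(x) = 2sin(3x)
g2(x) = 3cos(x)

const BehaviorFn = FunctionWrapper{Float64, Tuple{Float64}}

struct WrappedBox
    func_array::Vector{BehaviorFn}   # concrete element type despite holding different functions
end

wbox = WrappedBox([BehaviorFn(g) for g in (g1, g2, g1)])
wbox.func_array[2](0.5)   # calls g2(0.5) through a concretely typed wrapper
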
2 Likes

What are the behavior functions like? Giving them all the same type would be a solution.

Or you could do something like this to prevent run time dispatch:

struct Box
  behavior_function_code::UInt8
end

execute_behavior(code::UInt8, args...) =
  if code == 0
    bf0(args...)
  elseif code == 1
    bf1(args...)
  else
    bf2(args...)
  end
2 Likes

Nice dispatch. Have you considered using multiple dispatch?

execute_behavior(code::UInt8, args...) = bf(Val(code), args...)
bf(::Val{0x0}, args...) = 0
bf(::Val{0x1}, args...) = 1
bf(::Val{0x2}, args...) = 2

I specifically said right there that the goal is to prevent run time dispatch. Is that not the goal here?

@mkitti’s solution doesn’t use runtime dispatch, because Val tells the compiler to consider the parameter value during compilation. See Performance Tips · The Julia Language

You can see below that Julia is able to constant propagate through the multiple dispatch.

julia> f() = execute_behavior(0x1)
f (generic function with 1 method)

julia> f()
1

julia> @code_typed f()
CodeInfo(
1 ─     return 1
) => Int64

julia> @code_warntype f()
MethodInstance for f()
  from f() @ Main REPL[12]:1
Arguments
  #self#::Core.Const(f)
Body::Int64
1 ─ %1 = Main.execute_behavior(0x01)::Core.Const(1)
└──      return %1

It would probably help to know a bit more about what you precisely want to achieve.
Most likely, FunctionWrappers will do the job.
Personally, when I needed a collection of functions in the past, the correct solution turned out to be a vector of functors (callable structs).
This is imo the best way to approach this when your functions are all similar in some sense but depend on additional data which is different for each one. For example

F1(x)  = 2sin(3x)
F2(x)  = 3cos(1x)
[...]

These can be represented by callable structs which store the amplitude, frequency, and phase of the sinusoid.
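
A possible sketch of such a functor (Sinusoid and its fields are illustrative names, not from the thread):

struct Sinusoid <: Function
    amplitude::Float64
    frequency::Float64
    phase::Float64
end

(s::Sinusoid)(x) = s.amplitude * sin(s.frequency * x + s.phase)

s1 = Sinusoid(2.0, 3.0, 0.0)     # behaves like F1(x) = 2sin(3x)
s2 = Sinusoid(3.0, 1.0, π / 2)   # behaves like F2(x) = 3cos(x), since cos(x) == sin(x + π/2)

funcs = [s1, s2]                 # a concretely typed Vector{Sinusoid}
funcs[2](0.5)                    # calling an element involves no dynamic dispatch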

Of course, it might also be that your functions are user defined, in which case FunctionWrappers is probably the only way to go.

3 Likes

What is this meant to prove? It doesn’t seem like 0x1 would be a constant in the real code.

Moving a value into the type domain involves multiple dispatch, that’s exactly what I was referring to.

3 Likes

There is no runtime dispatch involved in my use of multiple dispatch. Your if statement and my use of multiple dispatch can be compiled down to the same machine code.

In fact, because it is so simple, Julia inlined the entire thing even before handing it to LLVM’s JIT.

Yeah, but this is specifically because in your example the run time dispatch gets constant folded away.

Only when the argument is a compile-time constant. Which would be pointless.

3 Likes

Thanks for helping me wrap my mind around this. A couple of questions:

Can you comment on what “giving them all the same type” means?
In my case, all the bf functions take the same Float input (a coordinate x) and return a Float output.

In your example, can code be a string and behavior_function_code a string array?

Finally, it was noted in the comment above:

I don’t think it really matters in your case how you store the information of which function to call. There will always be a dynamic dispatch, since the compiler can never know which functions there will be.

but it seems your approach avoids dynamic dispatch?

By default, a function is an instance of a singleton type, meaning that type only has 1 instance, the function itself. However, that is not necessarily the case; if a function were to contain data from a local scope, then it would not have a singleton type.

The non-singleton function factory example nsajko provided is a bit indirect; it’d be clearer to write out the underlying functors with a type that subtypes Function, making them functions. There would still only be 1 method table associated with the functors’ type, but the functors would act differently because of the data they contain. Of course, this still restricts the functors to do almost the same thing, so it might not be applicable to your use case.

FunctionWrappers.jl lets the multiple input functions have different method tables and types as a tradeoff for attaching a call signature restriction. (Normally, a function has multiple methods, and each method can be compiled for multiple call signatures; a call is dispatched to a method based on the function’s type as well as the arguments’ types.)
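
To illustrate the “write out the underlying functors” point, here is a sketch of the explicit functor behind nsajko’s closure (AddN is a made-up name):

struct AddN{T} <: Function
    n::T
end

(a::AddN)(x) = x + a.n   # same behavior as the closure x -> x + n

AddN(1)(10)                                      # 11
typeof(AddN(1)) == typeof(AddN(2)) <: Function   # true: one type, two instances carrying different data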

To expand on the explicitness, unlike writing out a functor type to contain data in manually annotated fields, closures handle it implicitly so it’s not visible in one place:

julia> f(n, m) = let n = n, m = m
         foo(x) = x+n  # 1st method captures n
         # maybe a lot of other code
         foo() = m     # 2nd method captures m
       end
f (generic function with 1 method)

julia> f(1, 1.0) # type parameters means function contains both n and m
(::var"#foo#1"{Int64, Float64}) (generic function with 2 methods)

Instantiating functors also doesn’t involve variable scoping rules; closures are local scopes that technically capture variables, not data at instantiation, and any reassignment of a captured variable prevents the compiler from inferring its type well, even if the reassignment is unnecessary or accidental.
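
A small sketch of that reassignment pitfall (g is a made-up name):

function g()
    n = 1
    h = x -> x + n
    n = 2        # reassigning a captured variable forces it into a Core.Box
    return h
end

fieldtype(typeof(g()), :n)   # Core.Box, so the captured value is no longer concretely typed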

3 Likes

In Julia there is no connection between the type of a function and its possible type signatures. Type signatures belong to methods, not functions (while a function may have arbitrarily many methods attached to it). Benny already expanded on the meaning; however, if you want us to help, you have to give us more information on your functions. Also, it’s important to know whether there’s just a fixed number of them, or whether users of your package will be able to add more, etc. Are the functions all defined directly in your source code? If possible, just give your source code.
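
For instance (bf here is a made-up two-method function, unrelated to your code):

bf(x::Float64) = 2x          # one method signature
bf(x::Int, y::Int) = x + y   # a second method of the same function

typeof(bf)    # a single (singleton) type; it says nothing about the signatures
methods(bf)   # the signatures live in the method table, one entry per method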

Sure, but that wouldn’t be very efficient. I just used UInt8 as an example; in practice I’d probably use something enum-like, for clarity. There are a few different user packages that have Enum as a prefix to their name, so you can pick and choose. There’s also some enum-like functionality built into Julia, but I don’t like that.
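
For concreteness, a sketch using the built-in @enum (one of the enum-like options; bf0, bf1, bf2 are placeholder behavior functions as in the earlier example):

@enum BehaviorCode BF0 BF1 BF2

struct EnumBox
    behavior_function_code::BehaviorCode
end

execute_behavior(code::BehaviorCode, args...) =
    if code == BF0
        bf0(args...)
    elseif code == BF1
        bf1(args...)
    else
        bf2(args...)
    end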

Yes.

1 Like