Alternative to function as field in struct

Hi,

I was hoping you could help me out with a proper way to solve this problem in Julia.

I have defined a struct that represents a statistical model. Among other things, this holds a vector of parameters that at some point are optimized with respect to an objective function. My problem is that I want the user to be able to define a function that is applied to the parameters. The way I have currently solved this is as follows:

struct Model{T<:AbstractFloat}
    θ::Vector{T}
    tf::Function
end

Here θ is a vector of parameters and tf is a function that should accept and return a vector of the same length as θ. A model object may for instance be created as:

m = Model([1.0, 2.0],  (θ) -> θ.^2)

During optimization I may then do:

m.tf(m.θ)

I have understood (perhaps wrongly?) from reading around the forum and manual that having a function as a field in a struct is not an ideal way of coding in Julia and that things like this should rather be handled with multiple dispatch. However, I’m not sure how one could do this. Does anyone have a suggestion?

Thanks!

Espen

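Function is an abstract type, so a field declared as tf::Function is not concretely typed. I have had huge performance gains from making the function type a parameter of the struct, something like: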

struct Model{T<:AbstractFloat,F<:Function}
    θ::Vector{T}
    tf::F
end

The rest works the same and is fine:

julia> m = Model([1.0, 2.0],  (θ) -> θ.^2)
Model{Float64, var"#11#12"}([1.0, 2.0], var"#11#12"())

julia> m.tf(m.θ)
2-element Vector{Float64}:
 1.0
 4.0

Edit: I would then define for convenience:

julia> (m::Model)() = m.tf(m.θ)

julia> m()
2-element Vector{Float64}:
 1.0
 4.0
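
For anyone who wants to see the difference between the two field declarations, here is a minimal sketch. The names ModelAbstract, ModelParam and square are made up for this comparison and do not appear in the posts above; the sketch only checks whether the declared type of the tf field is concrete, which is what allows the compiler to specialize calls to m.tf.

```julia
# Hypothetical names, for comparison only.
struct ModelAbstract{T<:AbstractFloat}
    θ::Vector{T}
    tf::Function            # abstract field type
end

struct ModelParam{T<:AbstractFloat,F<:Function}
    θ::Vector{T}
    tf::F                   # the concrete type of the function is a type parameter
end

square(θ) = θ .^ 2

ma = ModelAbstract([1.0, 2.0], square)
mp = ModelParam([1.0, 2.0], square)

# The field type is abstract in the first struct and concrete in the second.
abstract_field_is_concrete = isconcretetype(fieldtype(typeof(ma), :tf))  # false
param_field_is_concrete    = isconcretetype(fieldtype(typeof(mp), :tf))  # true

# Both versions compute the same thing.
same_result = ma.tf(ma.θ) == mp.tf(mp.θ)                                  # true
```

One consequence of the parametric version is that models holding different functions have different types, so a vector of such models will have an abstract element type.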

Yet another much faster alternative is a closure, for example:

julia> myModel(f, args...) = ()->f.(args)
myModel (generic function with 2 methods)

julia> m1 = myModel(Θ->Θ.^2, 1.0, 2.0)
#15 (generic function with 1 method)

julia> m1()
(1.0, 4.0)

julia> m1.args
(1.0, 2.0)

Well, that would be something like having a Model type without the function:

struct Model{T<:AbstractFloat}
    θ::Vector{T}
end

and defining your function tf as

tf(m::Model) = m.θ .^ 2  # e.g. squaring; any function of m.θ works here

That is, tf becomes a method that dispatches on the model type instead of being a field of m. This is the dispatch way of dealing with that, and the most common pattern. But what to do exactly may depend on the exact use case.

Another alternative is to define a functor, like:

struct Model{T<:AbstractFloat}
    θ::Vector{T}
end
(m::Model)(a, b, c) = a .* m.θ .+ b .+ c  # e.g. any function of a, b, c and m.θ

with which you will be able to call the structure, like:

m = Model(...)
m(a,b,c) # execute function defined above with parameters a,b,c and the θ field of model

@pbayer and @lmiq. Thanks a lot! I see now that there are multiple ways of solving this. I think I have to play around a bit with these suggestions to grasp them and see what seems to work the cleanest.

It sounds like the solution you are building is more than just the optimization, but maybe you need not encapsulate the vector and function at all. Maybe you can just provide an optimize function that accepts the starting theta value vector and the function and returns the optimal theta. If you must encapsulate, then I think that including the function as a field of the struct works in this case. The admonition you cite is meant to dissuade people from implementing object-oriented classes.
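
To make the "just pass the function" idea concrete, here is a rough sketch. The name simple_optimize, its keyword arguments, and the finite-difference gradient descent inside it are all assumptions made for illustration; a real project would more likely call an existing optimization package. The point is only that the objective and the starting vector are plain arguments, with no struct involved.

```julia
# Hypothetical helper: minimize `objective` by gradient descent with
# central finite-difference gradients, starting from θ0.
function simple_optimize(objective, θ0; lr=0.01, iters=1000, h=1e-6)
    θ = float.(copy(θ0))
    g = similar(θ)
    for _ in 1:iters
        for i in eachindex(θ)
            θ[i] += h
            fplus = objective(θ)
            θ[i] -= 2h
            fminus = objective(θ)
            θ[i] += h                      # restore the original value
            g[i] = (fplus - fminus) / (2h)
        end
        θ .-= lr .* g                      # descent step
    end
    return θ
end

# The user-supplied transformation is used inside the objective.
tf(θ) = θ .^ 2
objective(θ) = sum(abs2, tf(θ) .- [1.0, 4.0])

θhat = simple_optimize(objective, [0.5, 1.5])   # approaches [1.0, 2.0]
```

The step size and iteration count were picked by hand for this toy objective and would need tuning for anything else.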

Thanks for your input! Yes, it is more than just the optimization. The Model struct is passed around to various functions for computations and also has a set of convenience functions defined for it. This makes it convenient to have everything (data, parameters, function(?)) encapsulated in the same object. So you think that this is an okay solution in this case?

Here’s a relevant blog post a friend of mine wrote on closures; I thought you might be interested: "Function like objects and closures"

If you have various models with different internal structures and different functions that can be applied to them, then you may still benefit from defining models and functions separately. You can use something like this

abstract type AbstractModel end

data(m::AbstractModel) = m.θ

# in optimize function
function optimize(m::AbstractModel)
    # ...
    θ = data(m)
    # do something with θ
    # ...
    # use function defined on `m`
    x = f(m)
end

Of course, it would require a more involved setup for the concrete models. For each model, you have to define your own structure and necessary functions

struct MyModel{T} <: AbstractModel
    θ::Vector{T}
end

f(m::MyModel) = m.θ .^ 2

But the benefit is that you will not be restricted by the initial type definition, i.e. you can add as many parameters as needed, and also there will be no issues with functions as a field.
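
As a small illustration of that benefit, here is a sketch with two concrete models that have different fields but share the same accessor. SquareModel, ShiftModel and transformed_sum are hypothetical names invented for this example.

```julia
abstract type AbstractModel end

# One accessor serves every subtype that stores its parameters in a field named θ.
data(m::AbstractModel) = m.θ

struct SquareModel{T} <: AbstractModel
    θ::Vector{T}
end

# A second model with an extra field; the abstract type places no limit on this.
struct ShiftModel{T} <: AbstractModel
    θ::Vector{T}
    shift::T
end

# Each model supplies its own transformation as a method of f.
f(m::SquareModel) = data(m) .^ 2
f(m::ShiftModel)  = data(m) .+ m.shift

# Generic code is written once against AbstractModel.
transformed_sum(m::AbstractModel) = sum(f(m))

s1 = transformed_sum(SquareModel([1.0, 2.0]))      # 1.0 + 4.0
s2 = transformed_sum(ShiftModel([1.0, 2.0], 0.5))  # 1.5 + 2.5
```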

This approach is used for example in DynamicHMC.jl, you can see how it looks in DynamicHMC examples

The group that responded to this question did a fantastic job! Thanks for providing the poster with the answer and the common options to consider.

Thanks, this is a really elegant solution! Looking through DynamicHMC.jl now to see if I can structure this similarly. Might be a bit too complex for my use case. This is a cool package!

Couldn’t agree more. Working my way through the suggestions now. Learned a bunch from this thread. Thanks all!

If anyone is interested I’m working on a package to fit variance component models, when the correlation structure among individuals is known. Typically quantitative genetics models.
https://github.com/espenmei/VCModels.jl

Reporting back.
For now I ended up using a variant of the different suggestions. This seems to work without having to change almost anything of the original code. First I let Model be a subtype of StatsBase.StatisticalModel (which I think makes sense).

struct Model{T<:AbstractFloat} <: StatsBase.StatisticalModel
    θ::Vector{T}
end

Then I defined a default function for the abstract type

f(m::StatsBase.StatisticalModel) = m.θ

This default returns θ unchanged, because that is the typical use case. Then, the user may define another function for a type of Model. For example

f(m::Model) = m.θ.^2

Do you think this makes sense? I think this is quite similar to what @Skoffer suggested?
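
Here is a self-contained sketch of how the fallback and a more specific method interact. To avoid depending on StatsBase, a local abstract type AbstractStatModel stands in for StatsBase.StatisticalModel; that type, PlainModel and SquaredModel are hypothetical names used only here.

```julia
# Stand-in for StatsBase.StatisticalModel, so the example runs without packages.
abstract type AbstractStatModel end

struct PlainModel{T<:AbstractFloat} <: AbstractStatModel
    θ::Vector{T}
end

struct SquaredModel{T<:AbstractFloat} <: AbstractStatModel
    θ::Vector{T}
end

# Fallback for every model: return the parameters unchanged.
f(m::AbstractStatModel) = m.θ

# More specific method: used for SquaredModel instead of the fallback.
f(m::SquaredModel) = m.θ .^ 2

plain   = f(PlainModel([1.0, 2.0]))    # fallback applies
squared = f(SquaredModel([1.0, 2.0]))  # specific method applies
```

Note that a method defined on a concrete type applies to every instance of that type, so two models that need different transformations need two different types (or a type parameter to dispatch on).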

I just found this thread which is really useful for me coming from OOP. Out of curiosity: why do you need the AbstractModel type and not just define

struct MyModel{T}
   θ::Vector{T}
end

Without the abstract type, you would now have to define this for each model:

data(m::MyModel1) = m.θ
data(m::MyModel2) = m.θ
data(m::MyModel3) = m.θ
data(m::MyModel4) = m.θ

Ah ok, that’s a good reason. Thanks!

@Imiq

Could you translate your answer into the actual solution? I’m having trouble doing it.

I think I’ve managed your first alternative:

struct Model3{T<:AbstractFloat}
    θ::Vector{T}
end

tf(m::Model3) = map(x -> x^2, m.θ)

tf(Model3([1.0, 2.0, 3.0]))

However, your second alternative has been impossible for me:

struct Model5{T<:AbstractFloat}
      θ::Vector{T}
end

# (m::Model5)(a,b,c) = function of a, b, c and \theta
(m::Model5)(a,b,c) = map(x -> x^2, [a, b, c])

m = Model5([1.0, 2.0, 3.0])
m(a, b, c) # execute function defined above with parameters a,b,c and the \theta field of model

Thanks

I’m unsure about what is going wrong for you there. If I copy/paste this code I get a "not defined" error, because you are calling m(a,b,c) and the variables a, b, and c have not been defined.

If you call this with:

julia> m(1,2,3)
3-element Vector{Int64}:
 1
 4
 9

it works. Yet, this is not using the inner information of the model at all. Maybe you want something like:

julia> (m::Model5)(a,b,c) = map(x -> x^2, m.θ .* (a, b, c))

julia> m(1,2,3)
3-element Vector{Float64}:
  1.0
 16.0
 81.0

In this case I use both the input parameters a, b, c and the model parameters in the inner function. (I also replaced the array [a,b,c] with a tuple (a,b,c) there, to avoid an unnecessary allocation.)
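
If the number of inputs should follow the length of θ rather than being fixed at three, one possible variation is to collect them with a varargs signature. This is only a sketch: ScaledModel and the argument name scales are hypothetical, and the inputs arrive as a tuple, so no temporary array is built for them.

```julia
struct ScaledModel{T<:AbstractFloat}
    θ::Vector{T}
end

# Collect any number of scale factors into a tuple and broadcast it against θ.
# The number of factors must match length(m.θ), otherwise broadcasting throws.
(m::ScaledModel)(scales...) = (m.θ .* scales) .^ 2

m = ScaledModel([1.0, 2.0, 3.0])
result = m(1, 2, 3)
```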
