I was hoping you could help me out with a proper way to solve this problem in Julia.
I have defined a struct that represents a statistical model. Among other things, this holds a vector of parameters that at some point are optimized with respect to an objective function. My problem is that I want the user to be able to define a function that is applied to the parameters. The way I have currently solved this is as follows:
struct Model{T<:AbstractFloat}
    θ::Vector{T}
    tf::Function
end
Here θ is a vector of parameters and tf is a function that should accept and return a vector of the same length as θ. A model object may for instance be created as:
m = Model([1.0, 2.0], (θ) -> θ.^2)
During optimization I may then do:
m.tf(m.θ)
I have understood (perhaps wrongly?) from reading around the forum and manual that having a function as a field in a struct is not an ideal way of coding in Julia and that things like this should rather be handled with multiple dispatch. However, I’m not sure how one could do this. Does anyone have a suggestion?
Well, that would be something like having a Model type without the function:
struct Model{T<:AbstractFloat}
    θ::Vector{T}
end
and defining your function tf as
tf(m::Model) = m.θ .^ 2  # for example; any computation on m.θ works here
instead of having tf as a field of m. This is the dispatch way of dealing with that, and the most common pattern. But what to do exactly may depend on the exact use case.
Another alternative is to define a functor, like:
struct Model{T<:AbstractFloat}
    θ::Vector{T}
end
(m::Model)(a, b, c) = (a + b + c) .* m.θ  # placeholder: any function of a, b, c and m.θ
with which you will be able to call the structure, like:
m = Model(...)
m(a, b, c) # executes the function defined above with arguments a, b, c and the θ field of the model
@pbayer and @lmiq. Thanks a lot! I see now that there are multiple ways of solving this. I think I have to play around a bit with these suggestions to grasp them and see what works the cleanest.
It sounds like the solution you are building is more than just the optimization, but maybe you need not encapsulate the vector and function at all. Maybe you can just provide an optimize function that accepts the starting theta vector and the function and returns the optimal theta. If you must encapsulate, then I think that including the function as a field of the struct works in this case. The admonition you cite is meant to dissuade people from implementing object-oriented classes.
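If the function does stay in the struct, one common refinement is to make its type a parameter, so that the field is concretely typed instead of being declared as the abstract type Function. A minimal sketch of that idea follows; the names TransformedModel and transformed are hypothetical and do not come from the thread.

```julia
# Hypothetical variant of Model: the type F of the stored function is a
# type parameter, so the field `tf` has a concrete type in each instance.
struct TransformedModel{T<:AbstractFloat,F}
    θ::Vector{T}
    tf::F
end

# Apply the stored transformation to the stored parameters.
transformed(m::TransformedModel) = m.tf(m.θ)

m = TransformedModel([1.0, 2.0], θ -> θ .^ 2)
transformed(m)  # returns [1.0, 4.0]
```

The call site stays the same as with a Function field; only the struct definition changes.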
Thanks for your input! Yes, it is more than just the optimization. The Model struct is passed around to various functions for computations and also has a set of convenience functions defined for it. This makes it convenient to have everything (data, parameters, function(?)) encapsulated in the same object. So you think that this is an okay solution in this case?
If you have various models with different internal structures and different functions that can be applied to them, then you may still benefit from defining models and functions separately. You can use something like this:
abstract type AbstractModel end
data(m::AbstractModel) = m.θ
# in optimize function
function optimize(m::AbstractModel)
    # ...
    θ = data(m)
    # do something with θ
    # ...
    # use function defined on `m` (needs a method f for the concrete type of m)
    x = f(m)
end
Of course, it would require a more involved setup for the concrete models. For each model, you have to define its own structure and the necessary functions.
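As a rough sketch of what that per-model setup could look like: the two concrete types below (IdentityModel and SquaredModel) and the helper describe are hypothetical names chosen for illustration, and f stands for the model-specific function mentioned above.

```julia
abstract type AbstractModel end

# Generic accessor shared by every model that stores its parameters in `θ`.
data(m::AbstractModel) = m.θ

# Two hypothetical concrete models with the same layout but different behavior.
struct IdentityModel{T<:AbstractFloat} <: AbstractModel
    θ::Vector{T}
end

struct SquaredModel{T<:AbstractFloat} <: AbstractModel
    θ::Vector{T}
end

# Model-specific transformation, selected by dispatch on the model type.
f(m::IdentityModel) = data(m)
f(m::SquaredModel) = data(m) .^ 2

# Generic code relies only on the `data` / `f` interface.
describe(m::AbstractModel) = (parameters = data(m), transformed = f(m))

describe(SquaredModel([1.0, 2.0]))
```

Adding a new kind of model then means adding a new struct and a new method of f, without touching the generic code.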
But the benefit is that you will not be restricted by the initial type definition, i.e. you can add as many parameters as needed, and also there will be no issues with functions as a field.
This approach is used, for example, in DynamicHMC.jl; you can see how it looks in the DynamicHMC examples.
Thanks, this is a really elegant solution! Looking through DynamicHMC.jl now to see if I can structure this similarly. Might be a bit too complex for my use case. This is a cool package!
Couldn’t agree more. Working my way through the suggestions now. Learned a bunch from this thread. Thanks all!
If anyone is interested, I’m working on a package to fit variance component models where the correlation structure among individuals is known, typically quantitative genetics models. https://github.com/espenmei/VCModels.jl
Reporting back.
For now I ended up using a variant of the different suggestions. This seems to work with almost no changes to the original code. First I let Model be a subtype of StatsBase.StatisticalModel (which I think makes sense).
using StatsBase
struct Model{T<:AbstractFloat} <: StatsBase.StatisticalModel
    θ::Vector{T}
end
Then I defined a default function for the abstract type
f(m::StatisticalModel) = m.θ
This default returns θ unchanged, because that is the typical use case. Then, the user may define another method for a specific model type. For example
f(m::Model) = m.θ.^2
Do you think this makes sense? I think this is quite similar to what @Skoffer suggested?
However, your second alternative has been impossible for me:
struct Model5{T<:AbstractFloat}
θ::Vector{T}
end
# (m::Model5)(a,b,c) = function of a, b, c and \theta
(m::Model5)(a,b,c) = map(x -> x^2, [a, b, c])
m = Model5([1.0, 2.0, 3.0])
m(a, b, c) # execute function defined above with parameters a,b,c and the \theta field of model
I’m unsure about what is going wrong for you there. If I copy/paste this code I get an UndefVarError for a, because you are calling m(a,b,c) and the variables a, b, and c are not defined.
If you call this with:
julia> m(1,2,3)
3-element Vector{Int64}:
1
4
9
it works. Yet, this is not using the inner information of the model at all. Maybe you want something like:
In which case I use both the input parameters a,b,c and the model parameters in the inner function (there I also replaced the array [a,b,c] with a tuple (a,b,c) to avoid an unnecessary allocation).
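One possible shape for such a definition, as a sketch only: the particular formula (scaling each squared argument by the sum of θ) is an assumption made for illustration and is not taken from the thread.

```julia
struct Model5{T<:AbstractFloat}
    θ::Vector{T}
end

# Hypothetical example: square each argument and scale it by the sum of θ,
# so both the call arguments and the model's own parameters are used.
# Mapping over the tuple (a, b, c) avoids allocating an input array.
(m::Model5)(a, b, c) = map(x -> sum(m.θ) * x^2, (a, b, c))

m = Model5([1.0, 2.0, 3.0])
m(1, 2, 3)  # returns (6.0, 24.0, 54.0)
```

Because map is applied to a tuple, the result is also a tuple rather than a Vector.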