A data struct is required by almost all sub functions

Merry Christmas, everyone. I have a struct containing all data required for running an algorithm to solve the problem, say:

struct ProblemData
    x::Float64
    y::Vector{Float64}
    z::Matrix{Float64}
end

This problem data is passed to the main function, which calls many sub functions. Those sub functions need the problem data too. Right now I have something like:

function main(pd::ProblemData)
    a = sub_function1(pd)
    ...
    b = sub_function2(pd)
    ....
    c = sub_function3(pd)
    ....
end

function sub_function1(pd::ProblemData)
    ... # needs pd.x
end
function sub_function2(pd::ProblemData)
    ... # needs pd.x and pd.y
end
function sub_function3(pd::ProblemData)
    ... # needs pd.z
end

I’m not sure whether the above is fine. Are there any performance tips for this case? Any design suggestions? A few options I’ve thought of:

  1. Nest the sub functions inside the main function, so pd does not have to be passed as a function argument.
  2. Instead of passing the entire pd, pass only what each sub function needs, like pd.x.
  3. Refactor the code so that I don’t need to pass pd around.

Any comments or suggestions?

1: I wouldn’t nest the sub functions unless it makes the logic simpler. That’s the case when they need local variables of main: nested functions are closures that capture those variables, so you don’t have to pass them in again (though captured variables can come with a performance penalty due to boxing). Ordinary separate functions are usually the easiest and clearest.
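
To make the comparison concrete, here is a minimal sketch of both variants. The names `main_nested`, `main_flat`, `scaled_sum` and the `scale` local are made up for illustration, and the arithmetic is only a stand-in for real work.

```julia
struct ProblemData
    x::Float64
    y::Vector{Float64}
    z::Matrix{Float64}
end

# Variant 1: nested helper (a closure) that captures `pd` and a local of the caller.
function main_nested(pd::ProblemData)
    scale = 2.0                                       # assigned exactly once
    inner_scaled_sum() = scale * pd.x * sum(pd.y)     # captures `scale` and `pd`
    return inner_scaled_sum()
end

# Variant 2: ordinary top-level helper; everything it needs is passed explicitly.
scaled_sum(x, y, scale) = scale * x * sum(y)

function main_flat(pd::ProblemData)
    scale = 2.0
    return scaled_sum(pd.x, pd.y, scale)
end

pd = ProblemData(1.5, [1.0, 2.0, 3.0], zeros(2, 2))
main_nested(pd) == main_flat(pd)   # true, both variants compute the same value
```

As far as I know, the boxing penalty mostly shows up when a captured variable is reassigned after the closure is created; a local that is assigned once, like `scale` here, is usually fine.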

2 and 3: Passing only the relevant fields to the sub functions is a good idea. You keep the sub functions simple by giving them as little to know as possible, which means they probably shouldn’t be concerned with the field names of a ProblemData struct. That saves work whenever you change the struct: you only have to look at the few places where you access its fields and pass their data on. The sub functions themselves don’t care about the struct at all and only expect simple arrays etc.
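
A rough sketch of what that could look like. The helper names `shift`, `weighted_sum` and `total` and their bodies are invented placeholders; only the struct and the field usage come from the question.

```julia
struct ProblemData
    x::Float64
    y::Vector{Float64}
    z::Matrix{Float64}
end

# Hypothetical helpers: none of them mentions ProblemData or its field names.
shift(x::Real) = x + 1                                   # needs only x
weighted_sum(x::Real, y::AbstractVector) = x * sum(y)    # needs x and y
total(z::AbstractMatrix) = sum(z)                        # needs only z

# main is the only place that knows how the struct is laid out.
function main(pd::ProblemData)
    a = shift(pd.x)
    b = weighted_sum(pd.x, pd.y)
    c = total(pd.z)
    return (a, b, c)
end

pd = ProblemData(2.0, [1.0, 2.0], [1.0 2.0; 3.0 4.0])
result = main(pd)
```

If a field is later renamed, only `main` has to change, and the helpers can be tested with plain numbers and arrays.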

Passing a ProblemData struct to main, on the other hand, is a good idea. It makes clear that the arguments belong together, it keeps the call manageable when there are a lot of them, and you can validate the parameters when the struct is constructed so that main doesn’t have to. That again gives clearer code with some separation of concerns.
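
For the validation part, one option is an inner constructor. This is only a sketch: the struct is renamed `ValidatedProblemData` to keep it apart from the original, and the two rules it checks (positive `x`, `z` with `length(y)` columns) are assumptions made up for the example, not something stated in the question.

```julia
struct ValidatedProblemData
    x::Float64
    y::Vector{Float64}
    z::Matrix{Float64}

    # Inner constructor: every instance passes these checks, so code that
    # receives the struct does not need to repeat them.
    function ValidatedProblemData(x, y, z)
        x > 0 || throw(ArgumentError("x must be positive"))
        size(z, 2) == length(y) ||
            throw(DimensionMismatch("z must have length(y) columns"))
        return new(x, y, z)
    end
end

ok = ValidatedProblemData(1.0, [1.0, 2.0], zeros(3, 2))   # passes both checks
# ValidatedProblemData(-1.0, [1.0], zeros(1, 1))          # would throw ArgumentError
```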

Oh, and with regard to performance: field accesses on normal structs are usually compiled away, so I would say it doesn’t matter whether they happen in the inner or the outer functions. If, on the other hand, your struct has fields that are not concretely typed, the accesses won’t be compiled away, and it is much better to access each field once and pass the value to the inner functions that contain tight loops etc.
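
Here is a small sketch of the difference, assuming two made-up structs (`LooseData` with an abstractly typed field, `TightData` with a type parameter) and a made-up kernel `sum_squares`.

```julia
# Abstract field type: the compiler cannot know the concrete array type from the struct.
struct LooseData
    y::AbstractVector
end

# Parametric field: the field type is concrete for every instance.
struct TightData{V<:AbstractVector{Float64}}
    y::V
end

# Kernel with the tight loop. It receives the array itself, so Julia compiles
# a specialized version for the array's concrete type.
function sum_squares(y::AbstractVector)
    s = zero(eltype(y))
    for v in y
        s += v^2
    end
    return s
end

# Access the field once, then hand the value to the kernel (a function barrier).
process(d) = sum_squares(d.y)

loose = LooseData([1.0, 2.0, 3.0])
tight = TightData([1.0, 2.0, 3.0])

isconcretetype(fieldtype(LooseData, :y))        # false
isconcretetype(fieldtype(typeof(tight), :y))    # true
process(loose) == process(tight)                # true
```

With `LooseData` there is one dynamic dispatch at the call to `sum_squares`, but the loop inside still runs on a concretely typed array. Looping over `d.y` directly in the outer function would be the slow case.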

And take a look at @unpack from Parameters.jl, which makes the code of the main function less bloated if you decide to pass only the relevant fields.
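
To keep the snippet free of package dependencies, this sketch uses Base's property destructuring instead (available from Julia 1.7, as far as I know); the `@unpack` form is shown in a comment. The body of `main` is a placeholder.

```julia
struct ProblemData
    x::Float64
    y::Vector{Float64}
    z::Matrix{Float64}
end

function main(pd::ProblemData)
    # Pull out the fields once at the top. With Parameters.jl loaded,
    # the equivalent line would be:  @unpack x, y = pd
    (; x, y) = pd
    return x * sum(y)    # placeholder for the real work on x and y
end

pd = ProblemData(2.0, [1.0, 2.0], zeros(1, 1))
main(pd)
```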

Another thing I have done is to define two methods for each sub function: one that receives the relevant field (a vector, matrix, etc.) and one that receives the struct and calls the first, passing the field to it. For example:

subfunction(y::Vector) = ...                        # actual stuff
subfunction(pd::ProblemData) = subfunction(pd.y)    # forwards the relevant field
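
A filled-in version of the same pattern, as a sketch. The name `norm_sq` and its body are invented here just so the example runs.

```julia
struct ProblemData
    x::Float64
    y::Vector{Float64}
    z::Matrix{Float64}
end

# Method 1: does the actual work on a plain vector.
norm_sq(y::AbstractVector) = sum(abs2, y)

# Method 2: convenience method that only forwards the relevant field.
norm_sq(pd::ProblemData) = norm_sq(pd.y)

pd = ProblemData(1.0, [3.0, 4.0], zeros(2, 2))
norm_sq(pd) == norm_sq([3.0, 4.0])   # true, both dispatch to the same computation
```

The vector method stays reusable and easy to test on its own, while callers that already hold a ProblemData don't have to spell out the field.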