Need to add flexibility to deeply nested functions: is a global parameter variable a good idea?

I have a general programming question (which I need to tackle using Julia). I have a couple of functions (that call other functions) to carry out my analysis (some signal processing tasks and sorting data the way I want). In turn, these functions are called by a pipeline function that runs a series of functions, takes specific outputs from them, and generates plots.

Sometimes I have to go back to the nested “analysis functions” to change something. However, these tweaks are usually very deeply nested, so I’ve been manually changing parts of the functions and regenerating the plots. This was fine when I had to run an instance or two, but now that I have to run the code many times, I have to address how to change the internals efficiently.

If I add optional arguments to the highest-level function, they will have to be passed down through several functions, which doesn’t seem right. I am currently thinking of either:

  1. Make a global parameter variable, a named tuple, that the functions access. This puts all my parameters at the beginning of the script, and I can iteratively add different parameters to it. But I will be using global variables.

  2. Change the layout of the analysis.

I would appreciate some feedback. Is a global parameter variable going to be very poor for performance? I need to be able to dynamically modify parts of the pipeline that I had not previously expected to need flexibility. Perhaps the drop in performance is okay at this stage, where I am trying different pipeline parameters? [~80% of my computation time is spent reading the data, and the analysis doesn’t take more than about 5 minutes]

So let's say that you have a bunch of functions and two kinds of parameters, processing parameters and plotting parameters, with functions like:

function processing(data,N,L=10;kw1,kw2)
function processing_bis(data,N,M,K)
function plotdata(N,size)

What I like to do is to overload these functions with either a struct, a dictionary or a tuple containing all my parameters. For example, I can define my analysis parameters as

mutable struct AnalysisParameters  # drop `mutable` if the parameters never change after construction
    N
    L
    kw1
    kw2
    M
    K
end

and overload processing functions like this

processing(data,params::AnalysisParameters)=processing(data,params.N,params.L;kw1=params.kw1,kw2=params.kw2)
processing_bis(data,params::AnalysisParameters)=processing_bis(data,params.N,params.M,params.K)

I would do the same thing for plotting by defining a (mutable) struct PlottingParameters and overloading the appropriate functions. The advantage of this is that you only have to modify your main script and add new methods that take your parameters.
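A minimal, self-contained sketch of the pattern above. The function bodies are placeholders invented for illustration (the real ones are not shown here), and the concrete field types and the `pipeline` function are assumptions.

```julia
# Parameter container; the fields mirror the struct above.
# Concrete field types are assumed here for illustration.
struct AnalysisParameters
    N::Int
    L::Int
    kw1::Float64
    kw2::Float64
    M::Int
    K::Int
end

# Placeholder "analysis" functions (hypothetical bodies).
processing(data, N, L=10; kw1, kw2) = kw1 .* data[1:N] .+ kw2 * L
processing_bis(data, N, M, K) = sum(data[1:N]) * M + K

# Overloads that unpack the struct and forward to the original methods.
processing(data, p::AnalysisParameters) =
    processing(data, p.N, p.L; kw1=p.kw1, kw2=p.kw2)
processing_bis(data, p::AnalysisParameters) =
    processing_bis(data, p.N, p.M, p.K)

# The pipeline only passes `params` down, so adding a field later
# does not change any intermediate function signature.
function pipeline(data, params::AnalysisParameters)
    a = processing(data, params)
    b = processing_bis(data, params)
    return a, b
end

params = AnalysisParameters(3, 10, 2.0, 0.5, 4, 1)
a, b = pipeline([1.0, 2.0, 3.0, 4.0], params)
```

Only the two forwarding methods know which fields each function needs; everything above them just hands the same `params` object along.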

Ultimately, if you are interested in keeping track of which parameters you used for your results, and in avoiding overwriting or repeating some simulations, you can parametrize the filename of your results with the hash of your struct/Dict/NamedTuple parameters. That is, if the hash already exists in the filenames of your results, you can skip the computation, and a slight change in parameters will result in two different filenames.
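A rough sketch of the hash-in-the-filename idea, using a NamedTuple because it hashes by its names and values. The names `result_filename` and `run_cached`, the "results_" prefix, and the toy computation are all hypothetical. Two caveats: the default `hash` of a mutable struct is based on object identity, so it would need a content-based `hash` method, and `hash` is not guaranteed to return the same value across Julia versions.

```julia
# Hypothetical parameter set.
params = (N = 256, L = 10, kw1 = 0.5, kw2 = 2.0)

# Build a filename from the hash of the parameters (as a hex string).
result_filename(p; dir = ".") =
    joinpath(dir, "results_" * string(hash(p), base = 16) * ".txt")

# Run `compute` only if no result file exists yet for these parameters.
# Returns the filename and whether the computation was actually run.
function run_cached(compute, p; dir = ".")
    file = result_filename(p; dir = dir)
    if isfile(file)
        return (file, false)            # already computed: skip
    end
    write(file, string(compute(p)))     # otherwise compute and store
    return (file, true)
end

dir = mktempdir()
toy(p) = p.N * p.L                      # stand-in for the real analysis
file1, ran1 = run_cached(toy, params; dir = dir)                    # computed
file2, ran2 = run_cached(toy, params; dir = dir)                    # skipped
file3, ran3 = run_cached(toy, merge(params, (L = 11,)); dir = dir)  # new file
```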

I hope it helps!


I would pack these parameters in a struct and just pass them around. It does look a bit tedious, but I think it results in more organized code — personally, I know I would simply forget that there is a global variable when I look at the code a few months later.


Global variables can also break parallelism, even in something as simple as a parallel for loop (such as an @floop loop), if each call to the function in the loop assigns different things to the global. As @Tamas_Papp says, globals also make debugging harder.


Depending on how the code is structured, you can use closures to pass the function that will ultimately be tweaked, with parameters, to the internal functions:

julia> function outer(f,x)
         return inner(f,x)
       end
outer (generic function with 1 method)

julia> function inner(f,x)
           f(x)
       end
inner (generic function with 1 method)

julia> f(x,a) = a*x
f (generic function with 1 method)

julia> a = 5.
5.0

julia> outer(x -> f(x,a),2)
10.0



Ah okay, so ultimately this would mean that every function that needs to have a parameter tweaked will have an input which will be the parameter struct.

Parameters.jl seems to have some helpful macros for this. Are there any differences between using a struct and a named tuple? (I saw that it can take either.)

An advantage of NamedTuples is that Revise can handle them just fine, while struct redefinition requires a restart. But if you do it right, the consumers of such types do not need to care which one they get (just use UnPack.@unpack, or property accessors). So when you are done, you can finalize to a struct if you want to.

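A small sketch of why the consumer does not need to care. It uses Base's property destructuring (available from Julia 1.7), which plays the same role as `@unpack` from UnPack.jl without the dependency. The names `FinalParameters` and `analysis`, the field types, and the toy computation are assumptions for illustration.

```julia
# Start with a NamedTuple while the parameter set is still changing.
nt_params = (N = 4, L = 10, kw1 = 0.5)

# Later, "finalize" to a struct with the same field names (types assumed).
struct FinalParameters
    N::Int
    L::Int
    kw1::Float64
end
st_params = FinalParameters(4, 10, 0.5)

# The consumer only uses property access, so it accepts either container.
# `(; N, L, kw1) = p` reads p.N, p.L and p.kw1 into local variables.
function analysis(data, p)
    (; N, L, kw1) = p
    return kw1 * sum(data[1:N]) + L
end

data = [1.0, 2.0, 3.0, 4.0, 5.0]
analysis(data, nt_params) == analysis(data, st_params)   # true
```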