How to modularize my code like a pro

Hi guys,

I work with numerical sampling methods that evolve a vector-valued parameter \boldsymbol{x} and an associated momentum \boldsymbol{p} via a force function \boldsymbol{F}(\boldsymbol{x}) that comes from an underlying probability density.
The method runs for N iterations, updating parameters, momenta, and forces in each step, and computes the moving average of certain observables \phi(\boldsymbol{x}, \boldsymbol{p}) every n iterations via \langle \phi \rangle = \frac{1}{k} \sum_{i=1}^{k} \phi(\boldsymbol{x}_i, \boldsymbol{p}_i), where k is the number of samples collected so far.

Below is a spaghetti-code example. I was wondering how one could use modules to modularize it properly in order to increase generality.
There are three main components to a simulation like that:

  1. The choice of the sampling method (different methods evolve the system (\boldsymbol{x}, \boldsymbol{p}) in different ways).
  2. The choice of the observables to collect, i.e. the functions \phi.
  3. The force function to use, i.e. the underlying probability model.

Therefore, I thought one could implement three different modules, one for the samplers, one for the observables, one for the forces. The forces and the observables are independent of one another, but the sampler method needs to “see” both the force function and the observables.

In C++, I had a setup like this using classes and inheritance, but I am not sure about the Julia way.

Any tips would be great!

using Random


# force function: writes the force evaluated at x into `force` (in place)
function compute_force!(x::Vector{<:Real}, force::Vector{<:Real})
    #force[1] = ...
    #force[2] = ...
    # ....
end




# simulation parameters, eg.
maxIter = Int(1e7)  # no. of iterations
dt      = 0.01      # step size
init_cond = [0.0, 0.0]  # initial parameter, momentum


output_file_name = "output.csv"  # print observables to file with that name
t_meas = 5                       # collect a sample every t_meas iterations


# array initialization
x = [init_cond[1]]      # initialize x and p
p = [init_cond[2]]

force = [0.0]
compute_force!(x, force)    # get first force

# specify observables to measure, in this case just p^2
p_sq_sum = 0.0
no_samples = 0       

res_vec_size = maxIter ÷ t_meas + 1       # size of the results vector
p_sq_results     = zeros( res_vec_size )  # going to store the evolving time average of the observable p^2



# define rng and some useful constants
rand_vec = zeros(size(p))             
rng = Xoshiro(1234)
# ....other numerical constants....

# run sampler
for i in 0:maxIter

    global x, p, p_sq_sum, no_samples   # needed when this loop runs at the top level of a script

    # take a measurement
    if i % t_meas == 0
    
        p_sq_sum  += p'*p

        no_samples += 1
        
        p_sq_results[1 + i ÷ t_meas] = p_sq_sum / no_samples

    end

    ### evolve (x,p) via some dynamics governed by force and random number generator
    # x=...
    # p=...

    compute_force!(x, force)    # update force

    # x=...
    # p=...


end


# print result to .csv file, in this case just the array p_sq_results
using DelimitedFiles
writedlm(output_file_name, p_sq_results, ',')

Depending on preference, experience, performance requirements etc., you have multiple options here:

  1. If you are comfortable with OOP design – “In C++, I had a setup like this using classes and inheritance” – just carry that design over: use struct instead of classes, define methods as generic functions, i.e., method(obj::MyClass, args...) instead of obj.method(args...), and use composition/delegation instead of inheritance. In the end, single dispatch from OOP is just a special case of multiple dispatch and works fine in Julia as well (see the sketch after this list).

  2. If you prefer a more functional approach, many idioms from functional programming languages can be used. For instance, in the example (assuming that the force \mathbf{F}(\mathbf{x}) is indeed a function of \mathbf{x}, as in your mathematical description), think of a method as computing new parameters and momenta from the current values – instead of updating them in place. Then, a method is basically just a function:

    function mymethod(x, p)
        f = F(x)  # compute current force
        x_ = ...; p_ = ...  # compute next state based on x, p and f
        (x_, p_)
    end
    

    and can be used to produce a state sequence, i.e.,

     using Chain, IterTools, Statistics
     @chain splat(mymethod) begin
          IterTools.iterated((x0, p0))   # infinite sequence of states
          Iterators.take(N)  # run for N steps
          partition(n)  # break into chunks of n
          map(chunk -> mean(map(splat(phi), chunk)), _)  # average phi over each chunk
     end
    

    where phi(x, p) is again just a function computing your desired statistics.

  3. Get inspiration from some Julia library that solves a similar problem, e.g., DifferentialEquations.jl, and use a similar API design, e.g., start with a generic function solve(::Problem, ::Method, (x0, p0); kw_options) and build suitable abstractions for your forces, methods, observables etc. around that.
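To make option 1 (and the generic entry point of option 3) concrete, here is a minimal sketch. All names (AbstractSampler, EulerSampler, step!, run!) are made up for illustration, and the update rule is a placeholder, not a real integrator:

using Random

abstract type AbstractSampler end

struct EulerSampler <: AbstractSampler   # one concrete sampling method
    dt::Float64
end

# one step! method per sampler type; multiple dispatch picks the right one
function step!(s::EulerSampler, x, p, force, compute_force!, rng)
    x .+= s.dt .* p               # placeholder update rule
    compute_force!(x, force)
    p .+= s.dt .* force           # (a real sampler would also draw noise from rng here)
    return nothing
end

# generic driver: knows nothing about the concrete sampler, force, or observable
function run!(sampler::AbstractSampler, x, p, compute_force!, observable;
              maxiter = 10^5, t_meas = 5, rng = Xoshiro(1234))
    force = zero(x)
    compute_force!(x, force)
    results = Float64[]
    obs_sum, n_samples = 0.0, 0
    for i in 0:maxiter
        if i % t_meas == 0
            obs_sum += observable(x, p)
            n_samples += 1
            push!(results, obs_sum / n_samples)   # running average of the observable
        end
        step!(sampler, x, p, force, compute_force!, rng)
    end
    return results
end

# toy usage: harmonic force and the p^2 observable
function harmonic!(x, force)
    force .= -x
    return nothing
end
p_squared(x, p) = p' * p

run!(EulerSampler(0.01), [0.0], [1.0], harmonic!, p_squared)

Adding another sampling method is then just another struct with its own step! method; run! and the observables never need to change.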

1 Like

You can also pass the functions to the simulate function, something like:

function simulate(
    initial_point; # some vector
    compute_forces::F,
    compute_properties::G,
    sampling_method::H,
) where {F<:Function,G<:Function,H<:Function}
    # iterate calling `compute_forces`, etc.
end

# run with any definition of the functions

my_forces(x) = ...
my_properties(x) = ...
my_sampling(x) = ...

simulate(x,
    compute_forces = my_forces,
    compute_properties = my_properties,
    sampling_method = my_sampling,
)

You can adjust the calling side by using closures, i.e.,

my_forces(x, masses) = ...  # something that needs the masses

simulate(x,
    compute_forces = (x) -> my_forces(x, masses),
    ...
)

This is useful if the functions called inside simulate require different sets of parameters.
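To make that concrete, here is a minimal runnable sketch of this pattern. The toy harmonic force, the extra keyword arguments maxiter and t_meas, and the argument list I chose for sampling_method are all just assumptions for the example:

function simulate(x0, p0;
                  compute_forces, compute_properties, sampling_method,
                  maxiter = 10^5, t_meas = 5)
    x, p = copy(x0), copy(p0)
    force = compute_forces(x)
    results = Float64[]
    obs_sum, n = 0.0, 0
    for i in 0:maxiter
        if i % t_meas == 0
            obs_sum += compute_properties(x, p)
            n += 1
            push!(results, obs_sum / n)            # running average
        end
        x, p, force = sampling_method(x, p, force, compute_forces)
    end
    return results
end

# toy callbacks
my_forces(x, masses) = -x ./ masses                # needs an extra parameter (the masses)
my_properties(x, p) = p' * p
function my_sampling(x, p, force, F; dt = 0.01)    # placeholder velocity-Verlet-like update
    ph = p .+ (dt / 2) .* force
    xn = x .+ dt .* ph
    fn = F(xn)
    pn = ph .+ (dt / 2) .* fn
    return xn, pn, fn
end

masses = [1.0]
simulate([0.0], [1.0];
    compute_forces = x -> my_forces(x, masses),    # closure captures `masses`
    compute_properties = my_properties,
    sampling_method = my_sampling,
)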

3 Likes

Thanks for the hints. Performance would actually be important. Would you recommend your option 1) or 2) then?
Sorry for the late response, was on holiday last week.

Thanks for the advice, this way seems to be straightforward. I think I will try this first.

Functional Julia isn’t as easy to make performant, because its functional primitives (e.g. map) allocate new arrays. For performance it’s much easier to go with a procedural approach. Your C++ code should translate very well to Julia, as all types should already be stable (thanks to C++'s static typing).

1 Like

Well, there’s always map!.

Yes, C++ and Julia are not that different. I think the main differences are:

  1. Julia offers less in the way of inheritance: only abstract types may be inherited from (see the small example below)
  2. Julia uses a JIT to compile code that couldn’t compile in C++, due to the static nature of the C++ type system
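
For example:

abstract type AbstractForce end                      # abstract types can be subtyped
struct HarmonicForce <: AbstractForce end            # fine

# struct ShiftedHarmonicForce <: HarmonicForce end   # error: HarmonicForce is concrete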

Coming from C++, i.e., a more imperative background, option 1 might be easier. Functional programming might require some rethinking, but in my experience often leads to more modular code that is also easier to reason about. It can require some experience to make it perform well, though.
In any case, maximum performance usually needs some manual help with memory allocations, e.g., reusing pre-allocated buffers via in-place operations such as map! (which has already been suggested); see the sketch below. Interpreted strictly, i.e., à la Haskell, such code would no longer be considered “functional” by everyone.
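
For example (a generic illustration, not tied to any particular code above):

using Statistics

phi(x, p) = p^2                     # some per-component quantity, just as an example

x = randn(1000); p = randn(1000)
buf = similar(p)                    # pre-allocated buffer, reused across iterations

vals = map(phi, x, p)               # allocates a fresh array on every call
map!(phi, buf, x, p)                # writes into buf in place, no allocation

# or skip the intermediate array entirely with a generator:
avg = mean(phi(xi, pi) for (xi, pi) in zip(x, p))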

1 Like