# What is the best way to design a long computation?

In the context of the development of a finite element code, I have a dictionary that is formed after reading an input data file containing the finite element mesh (that is, nodal coordinates and element connectivity). I am in the process of deciding how I am going to unpack the finite element mesh from the dictionary to perform the long computation. Here, I present two approaches to perform the long computation. The codes look something like these:

Approach 1:

``````function main1()
file = "inputdata.txt"
data = read_input(file)  # data is a dictionary containing a finite element mesh (nodal coordinates, element connectivity, etc.)
solve1(data)
end

function solve1(data)
# some auxiliary inexpensive computations
someoutputs = do_some_short_computations()

# long computation that depends on data
loop_over_elements1(data)

do_some_other_short_computations()
end

function loop_over_elements1(data)
# this is an array of a "struct" type containing the data of an element
elements = data["elements"]

# the loop
@inbounds for e = 1:length(elements)
element = elements[e]  # the struct containing the information of the current element
nodes = element.nodes  # this is the nodal connectivity of the current element
do_some_stuff_with_nodes(nodes)
end
end
``````

Approach 2:

``````function main2()
file = "inputdata.txt"
data = read_input(file)  # data is a dictionary containing a finite element mesh (nodal coordinates, element connectivity, etc.)
solve2(data)
end

function solve2(data)
# some auxiliary inexpensive computations
someoutputs = do_some_short_computations()

# long computation
elements = data["elements"]  # this is an array of a "struct" type containing the data of an element
loop_over_elements2(elements)

do_some_other_short_computations()
end

function loop_over_elements2(elements)
@inbounds for e = 1:length(elements)
element = elements[e]  # the struct containing the information of the current element
nodes = element.nodes  # this is the nodal connectivity of the current element
do_some_stuff_with_nodes(nodes)
end
end
``````

What do you think it is a better (in term of performance) approach? to pass the data dictionary inside the â€śloop_over_elementsâ€ť function and unpack the elements inside this function (Approach 1) or to unpack the array of elements before entering the â€śloop_over_elementsâ€ť function (Approach 2)?

1 Like

Can you build a toy example of each and then just benchmark it? I usually find it difficult to reason about what will be performant, so this is my approach. That or I make one version that works and then ask people if thereâ€™s a better way to do it.

Youâ€™re much more likely to get useful responses if you have code that actually runs that people can play with.

2 Likes

@kevbonham I donâ€™t have it yet, sorry. My question is tailored to figure out the best way to do it in Julia so that it can be optimized. The â€śelementsâ€ť array can be huge. I am tempted to think that Approach 2 is better, but perhaps there is no difference. I donâ€™t know. I am new to Julia. Anyway, Iâ€™ll do what you say of course when I have it, but in the meantime perhaps someone already did it and can share his/her findings.

`loop_over_elements2()` may be written in a way that will nail down the type of the argument so that the interior of this function is type-stable.
Retrieving something from an untyped dictionary (`Any`) within your computation routine
will not do that.

FinEtools makes sure that all heavy-lifting routines are type stable.

2 Likes

Thank you @PetrKryslUCSD . Actually, I became interested in Julia when I saw your project for the first time about a month ago. I was browsing your code by then and took some ideas from it. Nice project and very interesting. Now, I had another look at your project and found something that I believe it is similar to what I had in mind. It is given in the function stiffness() in FEMMDeforLinearBaseModule.jl of FinEtools:

``````function stiffness(self::FEMMDeforLinearAbstract, assembler::A, geom::NodalField{FFlt}, u::NodalField{T}) where {A<:SysmatAssemblerBase, T<:Number}

fes = self.integdomain.fes

npts,  Ns,  gradNparams,  w,  pc = integrationdata(self.integdomain);

dofnums, loc, J, csmatTJ, gradN, D, B, DB, elmat, elvec, elvecfix = buffers(self, geom, u)

self.material.tangentmoduli!(self.material, D, 0.0, 0.0, loc, 0)

startassembly!(assembler, size(elmat, 1), size(elmat, 2), count(fes),

u.nfreedofs, u.nfreedofs);

for i = 1:count(fes) # Loop over elements

fill!(elmat,  0.0); # Initialize element matrix

for j = 1:npts # Loop over quadrature points

locjac!(loc, J, geom.values, fes.conn[i], Ns[j], gradNparams[j])

Jac = Jacobianvolume(self.integdomain, J, loc, fes.conn[i], Ns[j]);

updatecsmat!(self.mcsys, loc, J, fes.label[i]);

At_mul_B!(csmatTJ, self.mcsys.csmat, J); # local Jacobian matrix

Blmat!(self.mr, B, Ns[j], gradN, loc, self.mcsys.csmat);

end # Loop over quadrature points

complete_lt!(elmat)

gatherdofnums!(u, dofnums, fes.conn[i]); # retrieve degrees of freedom

assemble!(assembler, elmat, dofnums, dofnums); # assemble symmetric matrix

end # Loop over elements

return makematrix!(assembler);

end
``````

It appears that self is at some point similar to what I have in â€śelementsâ€ť in the code I put in my original post â€” self appears to be some sort of struct containing the elements and my â€śelementsâ€ť is an array of structs representing the element data (each entry in elements is an element struct having the nodes defining the element, their coordinates, etc.). Under this understanding, I think that loop_over_elements2 (Approach 2) is more closer to your stiffness function. As you said, passing the untyped dictionary inside the long computation function appears to be not convenient for type stability. Did I get your message right?

Yes, that is what I had in mind. The integration domain is parameterized with the type of the element.

``````mutable struct IntegDomain{S<:FESet, F<:Function}
fes::S # finite element set object
integration_rule::IntegRule  # integration rule object
otherdimension::F # function to compute the "other" dimension (thickness, or cross-sectional area)
axisymmetric::Bool
end
``````

That makes it possible to achieve type stability.

1 Like

But it looks to me like what you care about here isnâ€™t actually about your use case, but about how Julia handles different data structures. Iâ€™m saying, fill a dictionary containing your struct with random numbers, try out your loops and see what happensâ€¦

Just to clarify, this statement:

``````elements = data["elements"]
``````

simply returns a reference to the already existing element array. Thereâ€™s no unpacking or array construction taking place there, unless Iâ€™ve misunderstood something in your code. See this example:

``````julia> struct Element e end

julia> @time original_elements = Element.(1:1_000_000);
0.073706 seconds (2.12 M allocations: 43.801 MiB)

julia> @time data = Dict("elements" => original_elements);
0.000008 seconds (9 allocations: 800 bytes)

julia> @time elements = data["elements"];
0.000004 seconds (4 allocations: 160 bytes)
``````

So in terms of where the â€śunpackingâ€ť occurs, your two implementations are almost identical. The more relevant questions IMO are 1) type-stability, as @PetrKryslUCSD points out and 2) if one form allows you to unit test / re-use your code better.

1 Like

Sorry @bennedich , I didnâ€™t use correctly the term â€śunpacking.â€ť I was refering to the process of getting out of the dictionary, which has many other things as well (Dict{String, Any}). When I wrote the original post, I was thinking about the performance issues that would cause the â€śAnyâ€ť (I didnâ€™t wrote all I was thinking and I appologize for that), so @PetrKryslUCSD gave me the answer I was expecting. Anyway, I am grateful for your kind reply.