I have a number of functions that perform updates on a number of large arrays in the same loop. Previously, I learned that this is often a good idea for performance. Now, I am facing another issue: I need a few variants of these functions that do the updates only slightly differently. Here is a generic example:
module SomeModule

# Some declarations and preallocations of variable arrays,
# let's say the following arrays came out of this:
# A, B, C, D, and E.
# All have the same size.

function modify_arrays1!(A::T, B::T, C::S, D::T, E::T) where {
    T <: Array{Float64,2},
    S <: BitArray{2}
}
    @inbounds @simd for i in eachindex(A)
        # Update A[i], B[i], C[i], D[i], and E[i] with some long (but
        # simple) arithmetic and boolean checks between them.
    end
end

function modify_arrays2!(A::T, B::T, C::S, D::T, E::T) where {
    T <: Array{Float64,2},
    S <: BitArray{2}
}
    @inbounds @simd for i in eachindex(A)
        # Same updates as in `modify_arrays1!`,
        # but with a two-line difference,
        # out of, say, around 10 or 20.
    end
end

function modify_arrays3!(A::T, B::T, C::S, D::T) where {
    T <: Array{Float64,2},
    S <: BitArray{2}
}
    @inbounds @simd for i in eachindex(A)
        # Still the same updates as in `modify_arrays1!`,
        # but now E is absent from the process, leading
        # to another 1 or 2 lines of difference in code,
        # out of, again, around 10 or 20.
    end
end

end # module SomeModule
These functions have to be run many thousands of times as part of a bigger iteration process. Is it better to factor out the few differing lines (which, by the way, still involve 2 or 3 of the input arrays) into their own functions (e.g. `function few_lines(A, B, D) ... end`) and call them from a unified `modify_arrays!` function (perhaps with E becoming an optional argument), or to leave the functions as they are, with many duplicated lines between them?
Or, is the answer: “It depends”?
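To make the first option concrete, here is a minimal sketch of what I have in mind. The names `rule_a`, `rule_b`, and the unified `modify_arrays!` are hypothetical, and the arithmetic is placeholder only, not my real update rules. The differing lines are written as small scalar functions and passed in as the first argument:

```julia
# Hypothetical per-element rules standing in for the few lines that differ.
# Each takes scalar values and returns the new value of A[i].
@inline rule_a(a, b, d) = a + b * d
@inline rule_b(a, b, d) = a - b * d

# Unified kernel (hypothetical). `rule` is the only part that varies.
# The `where {F}` parameter asks Julia to compile a specialized method
# for each concrete function that is passed in.
function modify_arrays!(rule::F, A::Matrix{Float64}, B::Matrix{Float64},
                        C::BitMatrix, D::Matrix{Float64}) where {F}
    # Assumes all arrays have the same size, as in the example above.
    @inbounds for i in eachindex(A)
        # Shared part (placeholder arithmetic).
        B[i] = 0.5 * B[i] + D[i]
        C[i] = B[i] > 1.0
        # Variant-specific part.
        A[i] = rule(A[i], B[i], D[i])
    end
    return nothing
end

# Usage: the same kernel, two variants.
A = ones(2, 2); B = fill(2.0, 2, 2); C = falses(2, 2); D = fill(3.0, 2, 2)
modify_arrays!(rule_a, A, B, C, D)
```

Whether this is as fast as the hand-duplicated versions is exactly what I am unsure about, so I would benchmark both before committing to either.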
I have also tried declaring the differing lines as expressions (`Expr`) and passing them as arguments to a unified `modify_arrays!` function, but it occasionally (after a few hundred successful iterations) gives me an `UndefVarError` on some variable name I used in the expression, and I couldn’t figure out why. At that point, I had not yet been able to find out whether it gave any performance advantage or disadvantage.
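I do not know whether this is what caused my error, but one relevant fact is that `eval` evaluates an expression in the global scope of its module, not inside the calling function. The sketch below is not my actual code. The module `ExprScopeDemo` and everything in it are hypothetical, and it only illustrates that scoping behaviour next to the alternative of passing a function:

```julia
module ExprScopeDemo

# Hypothetical stand-in for "the lines that differ", stored as an expression.
const update_expr = :(a + b)

function apply_expr(a, b)
    # `eval` runs in the module's global scope, so the local `a` and `b`
    # of this function are not visible to the expression.
    return eval(update_expr)
end

# Passing a function instead keeps everything in local scope.
apply_fn(f, a, b) = f(a, b)

end # module ExprScopeDemo

# The expression version fails because no global `a` exists in the module.
result = try
    ExprScopeDemo.apply_expr(1.0, 2.0)
catch err
    err
end

# The function version works with the local values.
total = ExprScopeDemo.apply_fn(+, 1.0, 2.0)
```

Here `result` ends up holding an `UndefVarError`, while `total` is the ordinary sum.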
(Side question: Is it better or worse, in terms of performance, to “combine” `modify_arrays2!` and `modify_arrays3!` to make use of multiple dispatch instead?)
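By "combine" I mean something like the following sketch, where one function name has two methods and Julia picks between them from the number of arguments. The name `modify_arrays_variant!` is hypothetical and the arithmetic is placeholder only:

```julia
# Five-array method (stands in for `modify_arrays2!`).
function modify_arrays_variant!(A::Matrix{Float64}, B::Matrix{Float64},
                                C::BitMatrix, D::Matrix{Float64},
                                E::Matrix{Float64})
    # Assumes all arrays have the same size.
    @inbounds for i in eachindex(A)
        A[i] = B[i] + D[i] + E[i]
        C[i] = A[i] > 0.0
    end
    return nothing
end

# Four-array method (stands in for `modify_arrays3!`): E is absent.
function modify_arrays_variant!(A::Matrix{Float64}, B::Matrix{Float64},
                                C::BitMatrix, D::Matrix{Float64})
    @inbounds for i in eachindex(A)
        A[i] = B[i] + D[i]
        C[i] = A[i] > 0.0
    end
    return nothing
end

# Usage: same name, method chosen by the argument list.
A = zeros(2, 2); B = ones(2, 2); C = falses(2, 2)
D = fill(2.0, 2, 2); E = fill(3.0, 2, 2)
modify_arrays_variant!(A, B, C, D, E)  # five-array method
modify_arrays_variant!(A, B, C, D)     # four-array method
```

This tidies the names but does not remove the duplicated loop bodies, which is why I am asking about it only as a side question.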
Please kindly advise. Any additional side comments about other aspects of the generic example above would also be very welcome. Thank you!