I have a doubt regarding a design pattern. To expose it I’m going to refer to IterativeSolvers.jl
’s Conjugate Gradient algorithm.
For the end user, they supply the function cg(A, b; kwargs...)
, which solves the equation Ax = b
and returns the solution x
. To solve it, the package internally defines a mutable struct containing all vectors and matrices that will be needed in the iterative algorithm,
mutable struct CGIterable{matT, solT, vecT, numT <: Real}
A::matT
x::solT
r::vecT
c::vecT
u::vecT
tol::numT
residual::numT
prev_residual::numT
maxiter::Int
mv_products::Int
end
and essentially all the work is done in the following function, which performs a step of the conjugate gradient algorithm, updating the struct of type CGIterable
:
###############
# Ordinary CG #
###############
function iterate(it::CGIterable, iteration::Int=start(it))
# Check for termination first
if done(it, iteration)
return nothing
end
# u := r + βu (almost an axpy)
β = it.residual^2 / it.prev_residual^2
it.u .= it.r .+ β .* it.u
# c = A * u
mul!(it.c, it.A, it.u)
α = it.residual^2 / dot(it.u, it.c)
# Improve solution and residual
it.x .+= α .* it.u
it.r .-= α .* it.c
it.prev_residual = it.residual
it.residual = norm(it.r)
# Return the residual at item and iteration number as state
it.residual, iteration + 1
end
The problem I see with this is that CG is a mathematically well-defined algorithm which needs a specific set of matrices and vectors as input to produce an output. However, the implementation above depends on the particular design of the struct of type CGIterable
, and can only be used if this object is built.
For me it seems more natural to have instead a function CG_step
that is independent of the particular design of the structs in my project, such as:
function CG_step(x, A, r, c, u, residual, prev_residual)
# Check for termination first
# if done(it, iteration)
# return nothing
# end
# u := r + βu (almost an axpy)
β = residual^2 / prev_residual^2
u .= r .+ β .* u
# c = A * u
mul!(c, A, u)
α = residual^2 / dot(u, c)
# Improve solution and residual
x .+= α .* u
r .-= α .* c
prev_residual = residual
residual = norm(r)
# Return the residual at item and iteration number as state
residual, iteration + 1
end
and then have an additional function so that I can use my structs throughout my project:
function iterate(it::CGIterable, iteration::Int=start(it))
return CG_step(it.x, it.A, it.r, it.c, it.u, it.residual, it.prev_residual)
end
That is, what I am trying to convey is that I would think it is desirable to make well-defined mathematical functions that do all the job orthogonal to the particular design of project’s structs. This way, if I ever want to change my structs, I don’t need to touch the actually important function (which could potentially introduce bugs), but just the interface function defined just above. The actual mathematical function would be coded “once and for all” in Julia.
Of course, I guess that I am missing some important points that make this approach undesirable, but I wanted to ask about your opinions on this. What I said makes any sense? What are the advantages/disadvantages of doing what I said? Why didn’t they code the algorithm this way?