Workspace in Julia

Hi all,

New to Julia, but with experience in HPC. I am looking for a canonical way to have module variables that can be shared between the different functions of the module without having to pass these variables as arguments to every function.

The use case is the following. I am solving (many times) sparse linear systems of equations Mx = b. For this I have different solvers: Conjugate Gradient (CG), BiCGstab,
, some problem dependent preconditioning, etc… Each solver needs some some additional allocated memory to actually solve the system of equations (for example, CG needs three adittional vectors).

Usually in fortran or c, one would define these vectors as module private variables, allocate them once, and re-use them in the routine that actually performs the CG. This avoids the allocation/deallocation of memory, that in large sparse systems would kill any performance.

But surprisingly there is no simple way to do something like this in Julia. One can define global variables in a file, but this is produces terrible performance. I always receive the advice to pass this memory as a struct to the function, but this is just terrible programming: There is a long chain of function calls, and one would need to pass the solver workspace to each of them in order to avoid the multiple allocation.

I managed to do something like this using a let construct:

  let Ap = similar(Vector{Float64}, 0), r = similar(Vector{Float64}, 0), p = similar(Vector{Float64}, 0)

      global function alloc_ws(n::Int64)

	  Ap = zeros(n)
	  r = zeros(n)
	  p = zeros(n)

      end

      global function CG!(x::Vector{Float64}, func::Function, b::Vector{Float64}, msq::Float64)::Int64

	  func(Ap, x, msq)
	  r .= b .- Ap
	  res_old::Float64 = dot(r,r)

	  p .= r
	  niter::Int64     = 0
	  res_new::Float64 = 0.0
	  alpha::Float64   = 0.0
	  for i in 1:length(b)
	      niter += 1
	      func(Ap,p,msq)
	      alpha = res_old / dot(p,Ap)
	      x .= x .+ alpha .*  p
	      r .= r .- alpha .* Ap
	      res_new = dot(r,r)
	      if (res_new < 1.0E-10)
		  break
	      end

	      p .= r .+ (res_new/res_old) .* p
	      res_old = res_new
	  end

	  return niter
      end


  end

This looks a reasonable solution, since the temporary arrays Ap, r, p are alllocated once. Also the code is type stable (so I do not see any reason for this not to be optimized).

Still I find it strange that such a common situation in practical numerical situations (i.e. a routine that requires some workspace and you want to avoid multiple allocations by using a type variable valid in the scope of the module), has to be done in such a strange way…

Since I am very new to Julia, am I missing something here?

You can recover the performance with global variables by declaring them const.

1 Like

Yes, you have missed the first point in the performance tips in the manual https://docs.julialang.org/en/v1/manual/performance-tips/index.html

One solution is also to put all of your global variables into a struct that then you can pass around. For example

struct MyParameters{T} where T <: AbstractArray
    Ap::T
    r::T
    p::T
end

This way you don’t need to worry about order of arguments, or having too many variables passed around. You just need to pass one object to your functions.

This won’t just help your performance. It will increase the readability of your code, no more “wait what’s the value of a again and where is it defined?”

3 Likes

Or maybe just a tuple, or a named tuple.

1 Like

Declaring them constant is not an option. Putting them in a structure is also not an option. The reason is that CG! is passed as an argument to a routine that determines a force, that is inside a routine that performs a step of molecular dynamics that is in a routine that solves some equation of motion. Passing a workspace_solver to each of this routines is not good programming practice. Moreover I can decide to change the solver, and use BiCGstab, and then the workspace is complely different.

What seems to work is using global variables and annotate the type in each use:

    global Ap
    global p
    global r

    # Allocation here
   
      function CG!(x::Vector{Float64}, func::Function, b::Vector{Float64}, msq::Float64)::Int64

	  func(Ap, x, msq)
	  r .= b .- Ap
	  res_old::Float64 = dot(r,r)

	  p .= r
	  niter::Int64     = 0
	  res_new::Float64 = 0.0
	  alpha::Float64   = 0.0
	  for i in 1:length(b)
	      niter::Int64 += 1
	      func(Ap::Vector{Float64},p::Vector{Float64},msq)
	      alpha = res_old / dot(p::Vector{Float64},Ap::Vector{Float64})
	      x .= x .+ alpha .*  p::Vector{Float64}
	      r::Vector{Float64} .= r::Vector{Float64} .- alpha .* Ap::Vector{Float64}
	      res_new = dot(r::Vector{Float64},r::Vector{Float64})
	      if (res_new < 1.0E-10)
		  break
	      end

	      p::Vector{Float64} .= r::Vector{Float64} .+ (res_new/res_old) .* p::Vector{Float64}
	      res_old = res_new
	  end

	  return niter
      end

I am happy with this, although it is still very difficult for me to understand why something as simple as global Ap::Vector{Float64} is not supported in the lenguage…

Thanks to all

Could you clarify this some more? In general, it is good programming practice to be explicit about what goes in each function. A global variable that affects behavior very far from where it is defined makes it very difficult to reason about code.

2 Likes

I couldn’t follow that description. But what was wrong with declaring them const ?

Global variables are universally agreed to be bad programming practice, though, so there has to be a better way, but I would have to be able to follow your explanation first.

2 Likes

It is certainly true that having pure functions (i.e. they only operate on its arguments), is clean programming. This is also key to generate code that can be automatically parallelized by the compiler: a compiler can decide if the order of a loop that calls a pure function is important. If the function uses global variables determining the relevance of the order is very tricky. It should be pursued whenever possible. (fortran has the keyword pure to a function to declare percisely that is free of side effects).

On the other hand, many times this is in tension with having portable and extensible code. An example is precisely what I am trying to do: A problem that requires, inside a very nested set of functions, to solve a linear system Mx=b. I can use many solvers for this: CG, BiCG, BiCGstab, SAP, … Some solvers are better for some situations. My objective is to have a function solve_problem(solver), so that I can call either solve_problem(CG) or solve_problem(BiCG). Since each solver has completely different needs regarding the workspace that is needed, there are only two solutions to this problem: Either you pass in a huge structure work then union of all possible workspaces, or you use module global variables. The first option requires that you pass the workspace in each function of your main program. If someday you develop a new solver, you have to modify all the function calls.

workspaces like this are very comon in my experience. For example many FFT implementations precompute the quantities exp(2 pi k/N) for k=0,...,N-1 and N<20 the first time that they are needed. Then they are kept in memory. We do not want to have this workspace being passed around each function that will eventually call an FFT.

There are many more. This is just a convenient tool, that has an impact in the parallelization of the code, but that many times is useful and produces code that is easier to share. In the mentioned case, this approach results in an optimized solver for the sparse system Mx=b that will allocate the neccessary workspace only one time.

I would recommend having a look at how packages in the Julia ecosystem tackle this problem - with multiple dispatch I find the stated concerns to be a non-issue. For instance, look at https://github.com/JuliaNLSolvers/Optim.jl. There is a uniform function interface (optimize) and the algorithm is toggled by passing in the type of the algorithm (e.g. LBFGS()). Solver-dependent options or workspace are then kept in that struct, but importantly the algorithm implementations are kept orthogonal to the function contract.

For your example, you’d have solve_problem(A, b, x0, CG()) or solve_problem(A, b, x0, BiCG()). Calls deeper in the problem would simply pass the algorithm being used. Also take a look at https://github.com/JuliaMath/IterativeSolvers.jl or https://github.com/Jutho/KrylovKit.jl. The former’s interface could use updating, but they are both (mostly) generic and composable - the “matrices” you pass could be matrix-free operators or stored on the GPU, and they will still work.

Another example which may be more relevant to your use case is how JuliaDiffEq handles user-specified linear solvers: https://docs.juliadiffeq.org/stable/features/linear_nonlinear/#linear_nonlinear-1. There is a high-level algorithm which solves the differential equation provided by the user, and that algorithm can rely on a choice of linear solve, also provided by the user. Again, these choices are wholly orthogonal.

8 Likes

Thanks!

This in fact looks very interesting… I am not used to anonymous functions and closures.

You didn’t really address why declaring your global variables constant was not possible. I see nothing in your posted example indicating that it would not be.

1 Like

Hi,

The arrays in the workspace have size that is only known at run time, not at compile time. It can even change from call to call, in which case the array has to be deallocated/allocated again. I do not think that this can be done with a conts declaration.

Have you actually tried?

julia> const a =[2]
1-element Array{Int64,1}:
 2

julia> resize!(a,100)
100-element Array{Int64,1}:
               2
              83
          569348
              83
              98
               0
               0
               0
 139842249674448
... 
1 Like

That doesn’t work so well with more dimensions in the arrays. But

mutable struct Workspace
    a::Matrix{Float64}
    ...
end
const w = Workspace(zeros(Float64, 0, 0), ...)

will let you reassign w.a to whatever size you want. I would still recommend against using globals though.

This is fine, but you’re not far off from re-inventing what Ref already does:

const workspace = Ref(zeros(0, 0))

workspace[] = zeros(5, 5)

Storing a const Ref is a great way of creating a const-typed global whose value can be modified without any type-instability. The only minor downside is that you use the workspace[] syntax to access its value, but I don’t think that’s a huge problem.

Of course, the real caveat here is that I’ve never built a system with mutable global state and not come to regret that decision later.

3 Likes

The way I like implementing iterative methods:

  • define an immutable iterable type to represent the problem to solve;
  • define the iteration state as a struct that holds all auxiliary quantities (including vectors);
  • implement the iteration protocol.

I’m using this in ProximalAlgorithms.jl, see for example how the Douglas-Rachford splitting method, for minimizing the sum of two convex functions, is implemented.

5 Likes