Global sensitivity analysis in a large parameter space - computational cost

Ad 1:

I tested this a bit more and it seems that memory allocation and deallocation add considerable overhead (the whole GSA process allocated around 200 GB overall). So I have changed from solving a single Ensemble to solving chunks of the parameter space at a time and reassembling the output array afterwards (the code is not yet cleaned up, but it does work).

# Define GSA solver function:

ensemble_chunked = function (p)

  global parameterset = p

  Np::Int64 = size(p, 2)
  println("Running ", Np, " cases.")

  # Np must be divisible by `chunks` so that every chunk has the same size.
  chunksize::Int64 = div(Np, chunks)
  println("Using ", chunks, " chunk(s) of size ", chunksize, ".")

  out = zeros(3, Np)

  for k in 1:chunks
      offset    = (k - 1) * chunksize
      startindx = offset + 1
      endindx   = k * chunksize

      println("Starting chunk ", k, ", from ", startindx, " to ", endindx, " with offset ", offset, ".")

      # Only this chunk's columns are materialised, which keeps the peak
      # allocation bounded by the chunk size rather than the full Np.
      pchunk = p[:, startindx:endindx]

      prob_func(prob, i, repeat) = remake(prob; u0 = [LV_Vt0, LA_Vt0, pt0sas, qt0sas, pt0sat, qt0sat, pt0svn], p = pchunk[:, i])

      ensemble_prob = EnsembleProblem(prob, prob_func = prob_func)

      @time "    ODE Ensemble solved in " sol = solve(ensemble_prob, GPUTsit5(), EnsembleThreads(); reltol = 1e-8, abstol = 1e-8, saveat = saveat, trajectories = chunksize)

      # Reduce each trajectory to the three scalar outputs used for the GSA.
      @time "    Results processed in " Threads.@threads for i in 1:chunksize
          out[1, i + offset] = whatever_measure1(sol[i])
          out[2, i + offset] = whatever_measure2(sol[i])
          out[3, i + offset] = whatever_measure3(sol[i])
      end
  end
  out
end

# Set up Sobol solution:

chunks::Int64 = 100

N = 20000

lb = [5.0, 1.0, 0.07, 1.75, 1.0, 0.21, 0.360, 0.0, 4.0, 1.0, 0.105, 0.175, 1.0, 0.0315, 0.063, 0.92, 0.0231, 0.042, 0.0, 0.056, 0.0021, 4.34e-5, 0.0, 1.12, 0.035, 0.00119, 0.35, 0.364, 0.0525, 0.0, 14.35]
ub = [5.0, 1.0, 0.13, 3.25, 1.0, 0.34, 0.585, 0.0, 4.0, 1.0, 0.165, 0.325, 1.0, 0.0585, 0.117, 0.92, 0.0429, 0.078, 0.0, 0.104, 0.0039, 8.06e-5, 0.0, 2.08, 0.065, 0.00221, 0.65, 0.676, 0.0975, 0.0, 26.65]
bounds = tuple.(lb,ub)
sampler = SobolSample()
A,B = QuasiMonteCarlo.generate_design_matrices(N, lb, ub, sampler)
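
As a quick sanity check before the full run, the chunked solver can also be called directly on a small Sobol sample. This is just a hedged smoke test: it assumes prob, the initial conditions, saveat and chunks are already defined as above, and uses 1000 samples so they divide evenly into the 100 chunks.

# Hypothetical smoke test of the chunked solver on a small sample.
# @time also reports total allocations, which is a quick way to confirm
# that chunking actually keeps the memory churn down.
p_test = QuasiMonteCarlo.sample(1000, lb, ub, SobolSample())   # 1000 sample columns
@time out_test = ensemble_chunked(p_test)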

# And call it:

Sobol_JS_C = gsa(ensemble_chunked, Sobol(), A, B, batch = true)
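
For reference, the first- and total-order indices can then be read off the result object (a sketch assuming the usual SobolResult fields from GlobalSensitivity.jl; each row should correspond to one of the three output measures and each column to one parameter):

S1 = Sobol_JS_C.S1   # first-order Sobol indices
ST = Sobol_JS_C.ST   # total-order Sobol indices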

With this I went from 100 h to 1 h wall-clock time on a 24-core node, which is pretty close to the estimate based on single ODE solver runs, so it scales pretty well.
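
One caveat (an assumption about how the job is launched): both the EnsembleThreads() solves and the Threads.@threads post-processing loop only scale if Julia was started with multiple threads, e.g. julia --threads 24, which can be checked with:

# Confirm the worker threads are actually available to the ensemble solves.
println("Julia threads available: ", Threads.nthreads())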

Ad 2:

I’m not sure that’s required anymore. I had assumed that the long run time was somehow related to the Sobol calculations, but it turns out the process dropped down to single-core execution during the memory-management tasks, which strace confirmed.

With the chunked approach the Sobol indices are calculated in a few seconds, which is negligible relative to the ODE solver runtime.
