Hi everyone,
I have been trying to improve the performance of a model that I am using to fit experimental data with LsqFit. After the theoretical solution is calculated, it must be convolved with an experimentally measured instrument response (an array of the same size) and then fit with LsqFit, which iteratively calls the whole thing many times. The convolution step takes about 10x longer than producing the theoretical curve (benchmarks shown below), and the LsqFit convergence then takes about 100x longer again.
Code
using DSP, BenchmarkTools
## theoretical model
function model(t, β::Array{Float64,1}, ρ::Float64)
    Rt1 = Array{Float64}(undef, length(t))
    Rt2 = Array{Float64}(undef, length(t))
    A::Float64 = -1/β[2]
    B::Float64 = 2/β[2] + 1/β[2]
    Threads.@threads for n in eachindex(t)
        Rt1[n] = -exp(-(ρ^2/t[n]) - β[1]*t[n])
        Rt1[n] = Rt1[n]/(t[n]^(5/2))
        Rt2[n] = A*exp(-(A^2/t[n])) - B*exp(-(B^2/t[n]))
    end
    Rt = replace!(Rt1 .* Rt2, NaN => 0)  # t = 0 produces NaN; zero it out
    return Rt ./ maximum(Rt)             # normalize to the peak
end
## function fit by LsqFit; convolves the model with the measured response
function conv_Rt(t, β::Array{Float64,1}, ρ::Float64, It)
    Rt = model(t, β, ρ)
    convRt = conv(It, Rt)  # slowest line
    return convRt ./ maximum(convRt)
end
Things I’ve tried
- Take advantage of the fact that one array (It) is constant, and that both arrays are always the same length, by precomputing a planned FFT and passing it into the conv_Rt function. I could then use mul! to apply the plan to a vector, which is faster for a single FFT calculation, but for the convolution I have to apply the plan, multiply the spectra, and then take the inverse transform, and that ended up slower than the built-in conv function. I'd imagine conv is already calling the highly optimized FFTW library. A sketch of this attempt is shown after this list.
- Decrease the array size in the convolution as much as possible. This can be done, but it's trickier because it would have to use a different time window than the one we are fitting with LsqFit.
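Roughly, the planned-FFT attempt looked like the sketch below (a minimal sketch, not my exact code; make_conv and the buffer names are placeholders). It precomputes the spectrum of It once and uses the unnormalized backward transform, since the final division by the maximum cancels the scaling:

using FFTW, LinearAlgebra

## sketch: build a closure that convolves any same-length Rt with a fixed It
## (make_conv, pad, spec, out are placeholder names for illustration)
function make_conv(It::Vector{Float64})
    n = length(It)
    m = 2n - 1                      # full linear-convolution length
    pad = zeros(Float64, m)         # zero-padded real work buffer
    pad[1:n] .= It
    P = plan_rfft(pad)              # forward plan, reused for both inputs
    Ithat = P * pad                 # spectrum of the fixed response
    spec = similar(Ithat)           # reusable complex buffer
    out = Vector{Float64}(undef, m)
    Pinv = plan_brfft(spec, m)      # unnormalized backward plan
    return function (Rt::Vector{Float64})
        fill!(pad, 0.0)
        pad[1:n] .= Rt
        mul!(spec, P, pad)          # forward transform of Rt, no allocation
        spec .*= Ithat              # pointwise product in the frequency domain
        mul!(out, Pinv, spec)       # back to the time domain (scaled by m)
        out ./ maximum(out)         # normalizing cancels the factor of m
    end
end

Even with the plans and buffers hoisted out of the hot loop like this, it still came out slower than the stock conv for me at these sizes.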
Benchmarks and Test Usage
t = 0:0.01:10
It = [Vector(1:length(t)/2); length(t)/2+1:-1:1]  # triangular stand-in for the measured instrument response
@benchmark model(t, [0.1,10.0], 1.0)
#=BenchmarkTools.Trial:
memory estimate: 40.97 KiB
allocs estimate: 66
--------------
minimum time: 15.443 μs (0.00% GC)
median time: 17.000 μs (0.00% GC)
mean time: 18.842 μs (4.91% GC)
maximum time: 1.497 ms (95.80% GC)
--------------
samples: 10000
evals/sample: 1
=#
@benchmark conv_Rt(t, [0.1,10.0], 1.0, It)
#=BenchmarkTools.Trial:
memory estimate: 166.73 KiB
allocs estimate: 167
--------------
minimum time: 149.688 μs (0.00% GC)
median time: 155.159 μs (0.00% GC)
mean time: 183.873 μs (2.23% GC)
maximum time: 6.121 ms (0.00% GC)
--------------
samples: 10000
evals/sample: 1
=#
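For completeness, the fit itself is driven along these lines (ydata and p0 below are placeholder stand-ins, not my real data; just a minimal sketch of the LsqFit call):

using LsqFit

ρ = 1.0
ydata = conv_Rt(t, [0.1, 10.0], ρ, It)  # stand-in for the measured curve
p0 = [0.2, 8.0]                         # initial guess for β

## curve_fit re-evaluates the model (and therefore conv) on every iteration,
## which is why the convolution cost dominates the total fitting time
fit = curve_fit((t, β) -> conv_Rt(t, β, ρ, It), collect(t), ydata, p0)
fit.param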
Thanks for all the help!
-Michael