# Speed up applying a function to matrices

I’ve been struggling to speed up this problem I’m working with. Apologies about posting this long snippet, I couldn’t think of a more minimal example that has all the features stumping me.

``````using BenchmarkTools
using NLsolve
using ComponentArrays

θ = ComponentArray{Float64}(m=(α=1.2, γ=[10.6, 10.03], θx=[-0.50, -0.005, 1.0]),
f=(α=1.2, γ=[10.6, 10.03], θx=[-0.50, -0.005, 1.0])
)

const ρ = 0.1

const glpoints = gausslaguerre(500)
const jump1 = 62
const jump2 = 65

const ll = 0.001
const ul = 500.0

transfθ(θ) = abs(θ)

function discountedintegral(f, r, T; glpoints=glpoints)
x, w = glpoints
exp(-r*T)/r * sum( w[i]*f(x[i]/r + T) for i in 1:length(x) )
end

function H̃i(i, ti, x; θ=θ, ρ=ρ, glpoints=glpoints)
discountedintegral(s -> Zi(i, s; θ=θ), ρ, ti; glpoints=glpoints) * φi(i, x; θ=θ)
end

function Hi(i, si, xi; θ=θ)
Zi(i, si; θ=θ) * φi(i, xi; θ=θ)
end

function φi(i, xi; θ=θ)
if i == 1
θx = θ.m.θx
elseif i == 2
θx = θ.f.θx
end
res = zero(eltype(θ))
for i in 1:length(xi)
res += θx[i + 1] * xi[i]
end
return exp(res + θx[1])
end

function Zi(i, si; θ=θ, jump1=jump1, jump2=jump2)
if i == 1
α = θ.m.α
γ = θ.m.γ
elseif i == 2
α = θ.f.α
γ = θ.f.γ
end
α = transfθ(α)
γ .= transfθ.(γ)
abs(si)^α + γ[1]*transitionF(si, (jump1 - 45)*4) + γ[2]*transitionF(si, (jump2 - 45)*4)
end

function transitionF(t, t̄)
t < t̄                && return 0
t̄ <= t < t̄ + 0.5     && return 2(t - t̄)^2
t̄ + 0.5 <= t < t̄ + 1 && return 1 - 2(t - 1 - t̄)^2
t̄ + 1 <= t           && return 1
end

function U1(t, x1; θ=θ, K=K, ρ=ρ, glpoints=glpoints)
t1 = t[1]
K1 = K[1]
K1*ρ^(-1)*(1 - exp(-ρ*t1)) + H̃i(1, t1, x1; θ=θ, ρ=ρ, glpoints=glpoints)
end

function U2(t, x2; θ=θ, K=K, ρ=ρ, glpoints=glpoints)
t2 = t[2]
K2 = K[2]
K2*ρ^(-1)*(1 - exp(-ρ*t2)) + H̃i(2, t2, x2; θ=θ, ρ=ρ, glpoints=glpoints)
end

function uni_foc(sto, t; θ=θ, x=x, K=K)
sto[1] = Hi(1, t[1], x[1]; θ=θ) - K[1]
sto[2] = Hi(2, t[2], x[2]; θ=θ) - K[2]
end

function Aifun!(ret; x=x, θ=θ, K=K, uni_t0=[10.0, 11.0], ρ=ρ, uni_meth=:trust_region, autodiff=:forward,
xtol=1e-6, ftol=1e-6)
sol = nlsolve((sto, t) -> uni_foc(sto, t; θ=θ, x=x, K=K), uni_t0, method=uni_meth, autodiff=autodiff, xtol=xtol, ftol=ftol)
ret[1] = U1(sol.zero, x[1]; θ=θ, K=K, ρ=ρ, glpoints=glpoints)
ret[2] = U2(sol.zero, x[2]; θ=θ, K=K, ρ=ρ, glpoints=glpoints)
return ret .* 0.6
end

x = [[10.0, 6.0], [6.0, 10.0]]
K = [1.6, 0.6]
ret = zeros(2)
Aifun!(ret)
``````

It’s all good, but now suppose I have matrices for `x` and `K`. I’m trying to apply function `Aifun!` to those matrices:

``````matkk = abs.(randn(500, 2))
dataca = ComponentArray{Float64}(m=(x=rand(500, 2),), f=(x=rand(500, 2),))
Amat = similar(matkk)

function Aimat!(Amat, matkk, dataca; θ=θ, uni_t0=[10.0, 11.0], ρ=ρ, uni_meth=:trust_region, autodiff=:forward,
xtol=1e-6, ftol=1e-6)
Ai = similar(matkk, 2)
Ki = similar(matkk, 2)
covs = size(dataca.m.x, 2)
xi = [similar(dataca.m.x, covs), similar(dataca.f.x, covs)]
n = size(matkk, 1)
for i in 1:n
Ai[1] = Amat[i, 1]
Ai[2] = Amat[i, 2]
Ki[1] = matkk[i, 1]
Ki[2] = matkk[i, 2]
xi[1] = dataca.m.x[i, :]
xi[2] = dataca.f.x[i, :]
Aifun!(Ai; x=xi, θ=θ, K=Ki, uni_t0=uni_t0, ρ=ρ, uni_meth=uni_meth, autodiff=autodiff,
xtol=xtol, ftol=ftol)
Amat[i, 1] = Ai[1]
Amat[i, 2] = Ai[2]
end
return Amat
end
``````

`Aimat!(Amat, matkk, dataca)` works, but I’m looking for a more efficient and faster way, if there is any.

In these two lines it seems that you should be broadcasting, i. e. with .=

Also those two-element arrays could be static arrays (MArrays for convenience, at least). Maybe that speedups things a little. I’m assuming that the loop there is taking time. Probably it would be good to profile everything and see where are the bottlenecks.

Edit: looking again, probably these changes will not be related to the overall performance.

Since I want to return a mutated matrix `Amat`, I thought about using `@view` to access elements in the first 2 lines of the `for` loop:

``````        Ai[1] = @view Amat[i, 1]
Ai[2] = @view Amat[i, 2]
``````

but I get errors:

``````ERROR: MethodError: Cannot `convert` an object of type SubArray{Float64, 0, Matrix{Float64}, Tuple{Int64, Int64}, true} to an object of type Float64
Closest candidates are:
convert(::Type{T}, ::Base.TwicePrecision) where T<:Number at twiceprecision.jl:250
convert(::Type{T}, ::AbstractChar) where T<:Number at char.jl:180
convert(::Type{T}, ::CartesianIndex{1}) where T<:Number at multidimensional.jl:136
``````

Also because I want to return the mutated matrix, I have to add the last 2 lines:

``````        Amat[i, 1] = Ai[1]
Amat[i, 2] = Ai[2]
``````

Since Julia is column major, you generally want to slice along the other direction `dataca.m.x[:, i]`, see Performance Tips · The Julia Language.

You also want to use views for slices, see Performance Tips · The Julia Language.

In general, the performance tips has a pretty good collection of stuff to try.

2 Likes

Yes, it feels like I am not using the loop efficiently, as I am calling `Ai[1] = Amat[i, 1]` at the beginning, and then reversing it at the end with `Amat[i, 1] = Ai[1]`.

I tried `views`, but it seems that I cannot assign a `view` to be an element of a vector, so I’d have to bind those `views` to variables that I then have to compose into a vector, which is again not allowed.

Uh? Sure you can, the vector just has to be able to accept those views, i.e. its element type has to match the type of the view. In the case of a view, this is `SubArray`, which is an `AbstractArray` (or an `AbstractVector`, depending on the dimensionality of the view):

``````julia> v = AbstractVector[[1],[2],[3]]
3-element Vector{AbstractVector}:
[1]
[2]
[3]

julia> v[1] = view(v[2], 1:1)
1-element view(::Vector{Int64}, 1:1) with eltype Int64:
2

julia> v
3-element Vector{AbstractVector}:
[2]
[2]
[3]

julia> typeof(v[1])
SubArray{Int64, 1, Vector{Int64}, Tuple{UnitRange{Int64}}, true}

julia> typeof(v[1]) |> supertype
AbstractVector{Int64} (alias for AbstractArray{Int64, 1})
``````

If you have a `Matrix` though, you can’t just set a single element equal to a vector (unless the elementtype of that `Matrix` is a vector as well, of course) - you’ll have to set the whole subset/column/row (and the shapes have to match, i.e. you can’t set a 2x3 matrix equal to an 3x2 matrix).

I changed `Aimat!` to this and still get an error

``````function Aimat!(Amat, matkk, dataca; θ=θ, uni_t0=[10.0, 11.0], ρ=ρ, uni_meth=:trust_region, autodiff=:forward,
xtol=1e-6, ftol=1e-6)
Ai = similar(matkk, SubArray{eltype(matkk)}, 2)
Ki = similar(matkk, 2)
covs = size(dataca.m.x, 2)
xi = [similar(dataca.m.x, covs), similar(dataca.f.x, covs)]
n = size(matkk, 1)
for i in 1:n
Ai[1] = view(Amat, i, 1)
Ai[2] = view(Amat, i, 2)
Ki[1] = matkk[i, 1]
Ki[2] = matkk[i, 2]
xi[1] = dataca.m.x[i, :]
xi[2] = dataca.f.x[i, :]
Aifun!(Ai; x=xi, θ=θ, K=Ki, uni_t0=uni_t0, ρ=ρ, uni_meth=uni_meth, autodiff=autodiff,
xtol=xtol, ftol=ftol)
end
return Amat
end

julia> Aimat!(Amat, matkk, dataca)
ERROR: MethodError: Cannot `convert` an object of type
Float64 to an object of type
SubArray{Float64, N, P, I, L} where {N, P, I, L}
``````

Sorry if this is clear already, but I have the impression that there is some confusion here. For example:

This is a vector of vectors, and you have preallocated the vectors it contains. Thus, when you do this:

You are just replacing the preallocated vector by a new array which is a slice of `destacam.x`. It would be more natural, since the vector is preallocated, to copy element-wise, i. e., with `.=`.

Alternatively, you may want to just put in the position `xi[1]` a view (not a copy) of the slice of `destacam.x`. In that case, use `xi[1] = @view(dataca.m.x[i, :])`, but then it does not really make sense to preallocate the elements of `xi`, or even having `xi` as a vector or vectors, instead of a temporary scalar (a label that is temporarily assigned to that slice) inside the loop.

The same goes for the other assignments/copies. Sorry not looking the code more carefully, but, for example:

If `Amat[i,1]` is an array and `Ai` is an array of arrays, this is not copying anything, just assigning a pointer:

``````julia> x = Vector{Vector{Float64}}(undef,2)
2-element Vector{Vector{Float64}}:
#undef
#undef

julia> y = rand(2);

julia> x[1] = y
2-element Vector{Float64}:
0.6758178015247569
0.060842494125993074

julia> x[1][1] = 0
0

julia> y # note that y[1] was changed
2-element Vector{Float64}:
0.0
0.060842494125993074

``````

thus, it does not make sense to use “views” there.

I don’t think I can eliminate the allocations arising from creating `[x1, x2]` because I am extracting data from `dataca` with hardcoded “indices”. I will also work on transposing the data matrices to be able to access them in a more efficient way.

``````function Aimat!(Amat, matkk, dataca; θ=θ, ρ=ρ)
n = size(matkk, 1)
for i in 1:n
Ai = @view(Amat[i, :])
Ki = @view(matkk[i, :])
x1 = @view(dataca.m.x[i, :])
x2 = @view(dataca.f.x[i, :])
Aifun!(Ai; x=[x1, x2], θ=θ, K=Ki, ρ=ρ)
end
return Amat
end
``````