# Speeding up contraction mapping

Hello everyone,

I’m implementing the demand estimation for Berry-Levinsohn-Pakes (1995) for my research, and it would be great to get some help on this. While my code works, it takes a while, and so I was looking to speed it up.

I have about 6600 markets and a contraction mapping has to be done for each market separately . There are about 2-20 products per market. It takes anywhere between 40-200 iterations per market for convergence to happen, and so for a given set of values the whole process takes a while. I was curious to see if anyone had suggestions for how to speed up each iteration.

Here’s my code:

``````ns = size(draws.v1, 1)

function get_params(sigma, draws)
betai = [sigma .* draws.v1 sigma .* draws.v2] # ns x 2 matrix
end

function calc_indiv_util(c)
sum_of_exputils_indiv = 1 + sum(c)
choice_prob_indiv = c ./ sum_of_exputils_indiv
return choice_prob_indiv
end

function calc_predicted_shares(betai, delta, mkt, ns)

mkt_obs = size(mkt, 1)
X2_mkt = [mkt.var1 mkt.var2]
u1 = repeat(delta, outer=(1, ns)) + X2_mkt * betai'
exputils = exp.(u1)

p_ijt = mapslices(calc_indiv_util, exputils, dims=1)
p_jt = mapslices(sum, p_ijt, dims=2) ./ ns

return p_jt

end

function contraction_mapping(betai, ns, data, market_ids, num_mkts, maxtol, maxiter)

tuning_par = 1.0
delta_container = Vector{Any}(missing, num_mkts)

mkt = data[data.market_ids .== i,:]
local delta =  log.(mkt.s_jt ./ mkt.s_0jt)
local iter = 1.0
local tol = 1e6

while ( (tol > maxtol) && (iter < maxiter) )
new_delta = delta + tuning_par .* ( log.(mkt.s_jt) - log.(calc_predicted_shares(betai, delta, mkt, ns))  )
tol = abs(maximum(new_delta .- delta))
delta = new_delta
iter = iter + 1
end
delta_container[i] = delta
end
delta_full = reduce(vcat, delta_container)

return delta_full
end

betai = get_params(sigma, draws)
delta = contraction_mapping(betai, ns, data, market_ids, num_mkts, maxtol, maxiter)

``````

EDIT: There is now a MWE in the replies.

I also realize a lot of people might be confused about the reasoning behind the algorithm itself, so my question is asking a lot - but really any general help for speeding up iterations/function calls/variable types that you see here should be helpful. I’d think a decent number of economists use BLP 1995 and/or contraction mapping in their work, so the answers should be helpful to a wide audience. I’d also prefer to use my own code than a package implementing BLP written by someone else, although I’m happy to take a look and pick up ideas.

Lot’s of free lunches here.

``````	delta_container = Vector{Any}(missing, num_mkts)
``````

should probably be narrowed. Try `Vector{Float64}(undef, num_mks)`. Unless `num_mkts` is small, in which case it doesn’t matter. Or `zeros(num_mks)`

``````mkt = data[data.market_ids .== i,:]
``````

Assuming `data` is a `DataFrame`, this is very bad for performance. `DataFrame`s are type unstable. But this doesn’t always have to mean bad performance. Create that vector of vectors earlier and use them in the loop. One thing that helps this is DataFramesMeta.jl’s `@with` which lets you make an anonymous function and work with data frames columnes in a type-stable way.

1. You allocate a lot of small arrays, such as ` [sigma .* draws.v1 sigma .* draws.v2]`. Pass an array into the function and mutate it, or use a static array.

It’s hard to really dig into this without an MWE. Isolate as many things as possible into separate functions if you need to in order to let us replicate your slowness without knowing your full dataset.

2 Likes

Okay here’s a MWE with some fake data. This has just one market; hence me commenting out the parts of the code that looped over each market. I’ve also added a println so that you can see how many iterations it took. Note that the original post also goes over the loop per market, which can (hopefully) be optimized?

``````using DataFramesMeta
using HaltonSequences
using Distributions

data = DataFrame(s_jt = [0.0011, 0.0131, 0.059, 0.074], s_0jt = repeat([0.80], 4), var1 = [-0.255, -1.35, -1.39, -1.67], var2 = [0, 1, 1, 1])

h1 = Halton(3, start=1000, length=100)
h2 = Halton(7, start=1000, length=100)

draws = DataFrame()
draws.v1 = Distributions.quantile(Normal(0, 1), h1)
draws.v2 = Distributions.quantile(Normal(0, 1), h2)

ns = size(draws.v1, 1)
sigma = [0.3, 5]
market_ids = 0
num_mkts = 1

maxtol = 1e-12
maxiter = 1900
ns = size(draws.v1, 1)

function get_params(sigma, draws)
betai = [sigma .* draws.v1 sigma .* draws.v2] # ns x 2 matrix
end

function calc_indiv_util(c)
sum_of_exputils_indiv = 1 + sum(c)
choice_prob_indiv = c ./ sum_of_exputils_indiv
return choice_prob_indiv
end

function calc_predicted_shares(betai, delta, mkt, ns)

mkt_obs = size(mkt, 1)
X2_mkt = [mkt.var1 mkt.var2]
u1 = repeat(delta, outer=(1, ns)) + X2_mkt * betai'
exputils = exp.(u1)

p_ijt = mapslices(calc_indiv_util, exputils, dims=1)
p_jt = mapslices(sum, p_ijt, dims=2) ./ ns

return p_jt

end

function contraction_mapping(betai, ns, data, market_ids, num_mkts, maxtol, maxiter)

tuning_par = 1.0
delta_container = Vector{Any}(missing, num_mkts)

#mkt = data[data.market_ids .== i,:]
mkt = data
local delta =  log.(mkt.s_jt ./ mkt.s_0jt)
local iter = 1.0
local tol = 1e6

while ( (tol > maxtol) && (iter < maxiter) )
new_delta = delta + tuning_par .* ( log.(mkt.s_jt) - log.(calc_predicted_shares(betai, delta, mkt, ns))  )
tol = abs(maximum(new_delta .- delta))
delta = new_delta
iter = iter + 1
end
println(iter)
i = 1
delta_container[i] = delta
#end
delta_full = reduce(vcat, delta_container)

return delta_full
end

betai = get_params(sigma, draws)
delta = contraction_mapping(betai, ns, data, market_ids, num_mkts, maxtol, maxiter)

``````

Interesting. Profiler shows some type instabilities. @pdeffebach is correct in that the usage of `DataFrames` in your kernels is dangerous, I changed them to `NamedTuple` in the MWE. Furthermore inhomogeneous types for `data.var1` and `data.var2` lead to `Matrix{Any}` construction in `[mkt.var1 mkt.var2]` (similar problem in `[sigma .* draws.v1 sigma .* draws.v2]`), so I tried to change these to have `eltype` `Float64`. The next dynamic dispatch was diagnosed here

``````	p_ijt = mapslices(calc_indiv_util, exputils, dims=1)::Matrix{Float64}
p_jt = mapslices(sum, p_ijt, dims=2)::Matrix{Float64} ./ ns
``````

which I fixed locally with type annotations. Complete code

``````using DataFramesMeta
using HaltonSequences
using Distributions
using BenchmarkTools

function get_params(sigma, draws)
betai = [sigma .* draws.v1 sigma .* draws.v2] # ns x 2 matrix
end

function calc_indiv_util(c)
sum_of_exputils_indiv = 1 + sum(c)
choice_prob_indiv = c ./ sum_of_exputils_indiv
return choice_prob_indiv
end

function calc_predicted_shares(betai, delta, mkt, ns)

X2_mkt = [mkt.var1 mkt.var2]
u1 = repeat(delta, outer=(1, ns)) + X2_mkt * betai'
exputils = exp.(u1)

p_ijt = mapslices(calc_indiv_util, exputils, dims=1)::Matrix{Float64}
p_jt = mapslices(sum, p_ijt, dims=2)::Matrix{Float64} ./ ns

return p_jt

end

function contraction_mapping(betai, ns, data, market_ids, num_mkts, maxtol, maxiter)

tuning_par = 1.0
delta_container = Vector{Any}(undef, num_mkts)

#mkt = data[data.market_ids .== i,:]
mkt = data
local delta =  log.(mkt.s_jt ./ mkt.s_0jt)
local iter = 1.0
local tol = 1e6

while ( (tol > maxtol) && (iter < maxiter) )
new_delta = delta + tuning_par .* ( log.(mkt.s_jt) - log.(calc_predicted_shares(betai, delta, mkt, ns))  )
tol = abs(maximum(new_delta .- delta))
delta = new_delta
iter = iter + 1
end
i = 1
delta_container[i] = delta
#end
delta_full = reduce(vcat, delta_container)

return delta_full
end

function test()
data = (
s_jt = Float64[0.0011, 0.0131, 0.059, 0.074],
s_0jt = repeat([0.80], 4),
var1 = Float64[-0.255, -1.35, -1.39, -1.67],
var2 = Float64[0, 1, 1, 1])

h1 = Halton(3, start=1000, length=100)
h2 = Halton(7, start=1000, length=100)

draws = (
v1 = Distributions.quantile(Normal(0, 1), h1),
v2 = Distributions.quantile(Normal(0, 1), h2))

ns = size(draws.v1, 1)
sigma = Float64[0.3, 5]
market_ids = 0
num_mkts = 1

maxtol = 1e-12
maxiter = 1900
ns = size(draws.v1, 1)

betai = get_params(sigma, draws)
@btime delta = contraction_mapping(\$betai, \$ns, \$data, \$market_ids, \$num_mkts, \$maxtol, \$maxiter)
end
``````

Benchmark before

`````` 3.709 ms (53659 allocations: 3.30 MiB)
``````

and after

`````` 3.275 ms (52396 allocations: 3.28 MiB)
``````

Now the runtime of `contraction_mapping` seems to be dominated by

``````	p_ijt = mapslices(calc_indiv_util, exputils, dims=1)::Matrix{Float64}
``````

which shows significant internal dynamic dispatch. If I’d know what that does I’d try to find an alternate formulation.

1 Like

Replacing `calc_predicted_shares` with

``````function calc_predicted_shares(betai, delta, mkt, ns)

X2_mkt = [mkt.var1 mkt.var2]::Matrix{Float64}
u1 = repeat(delta, outer=(1, ns)) + X2_mkt * betai'
exputils = exp.(u1)

# p_ijt = mapslices(calc_indiv_util, exputils, dims=1)::Matrix{Float64}
# p_jt = mapslices(sum, p_ijt, dims=2)::Matrix{Float64} ./ ns

p_ijt = hcat((calc_indiv_util(col) for col in eachcol(exputils))...)::Matrix{Float64}
p_jt = [sum(row) for row in eachrow(p_ijt)] ./ ns

return p_jt

end
``````

yields

``````  1.272 ms (16578 allocations: 2.50 MiB)
``````

Now I don’t see any dynamic dispatch, and

``````	p_ijt = hcat((calc_indiv_util(col) for col in eachcol(exputils))...)::Matrix{Float64}
``````

is the hotspot due to allocations.

1 Like

Have you thought about accelerating this by using a better algorithm?

What you are doing In Berry-Levinson-Pakes demand estimation is essentially just a fixed-point iteration, right? Why not use something like Anderson accleration from e.g. FixedPointAcceleration.jl to see if you can get it to converge in fewer iterations?

(If you are able to compute Jacobians then there are even more options.)

1 Like

Ah I’ve heard of that but didn’t know Julia had a package that does this. Thanks for pointing out!

You should definitely experiment with a few options here, as the best algorithm for solving nonlinear equations is pretty problem-dependent.

2 Likes