 # Help with bottleneck in dynamic programming

My goal is to calculate all the possible outcomes in a finite dynamic programming problem. The grid size is 100, there are three state variables and T periods.

A bottleneck is the following function, which I’ve started to improve in version 2. According to profiling, the issue is in line 28. Something about abstractarray.jl and multidimensional.jl. I’ve read that this is expected when indexing arrays. But I don’t know what can be done to speed up the code in spite of it. I’ve tried using inbounds and views, but these didn’t help much.

Also, according to code_warntype, Julia seems to have problems inferring the data types in lines 25 and 28.

Would you please have a look at it and tell me how I can improve it? Any help is appreciated.

Thanks!

``````using LinearAlgebra, Statistics

using BenchmarkTools

#packages to increase the speed of the code
using Profile # @profile command
using ProfileView #profiling using graphs

#number of periods
T = 13

#grid size
grid = 100

#continuation values
#period t (not t+1), θkt, ℓkt, ℓmt
s_W = zeros(T+1, grid, grid, grid)
j_W = similar(s_W)

#optimal payments
#period, θkt, ℓmt, outputs
Pst_array = zeros(T, grid, grid, 6) #we won't need payment proposal for period T, a.k.a. "T+1"
Pjt_array = similar(Pst_array)

#probability mass function
pmf = Array{Float64}(undef,grid,grid)

#example
pmf[:,:] .= 0.01

t, hkt, lkt, lmt, Pmt_array, k_W = T, 50, 50, 50, Pjt_array, s_W

continuation_value = 0.011
payment_threshold = 91

s_W[T+1, :, :, :] .= continuation_value
Pjt_array[t, payment_threshold:end, lkt, 1] .= continuation_value
``````
``````#function to calculate the payment offer threshold ϕmt

#test with searchsorted first

function threshold_m(t, lkt, lmt, Pmt_array)

#searches the index where the payment offer is positive
ts = @views searchsortedfirst(Pmt_array[t, :, lkt, 1], 0.01)

#returns error message if threshold does not exist (101 is returned if missing by searchsortedfirst)
@assert ts < 101

return ts

end

#test
# t, lkt,lmt, Pmt_array = T, 50, 51, Pjt_array
# @btime threshold_m(t, lkt,lmt,  Pmt_array)
# @code_warntype threshold_m(t, lkt,lmt,  Pmt_array)
``````
``````function respond_paymentv1(t, hkt, lkt, lmt, Pmt_array, k_W)

ts = threshold_m(t, lkt, lmt, Pmt_array)

lmt = max(ts, lmt)

#the possible values of payment given by the opponent
Pmt_grid = Pmt_array[t, lmt:end, lkt, 1]

#the screening cutoffs associated with each payment proposal, simplified
cutoffs = Array{Int64}(undef, length(Pmt_grid))
cutoffs .= lkt

#A1###

#payoff when accepting the payment is the best response
pA1 = copy(Pmt_grid)

#weighting by the probabilities
probk = pmf[hkt, hkt:end]
probm = pmf[lmt, lmt:end]

lowerbounds = collect(lmt:1:grid)

#K and M to make the shape of the arrays easier to see
K, M = length(probk), length(probm)

pA2 = Array{Float64}(undef, K, M)
for i in eachindex(cutoffs)
pA2[:, i] .= (k_W[t+1, hkt:end, cutoffs[i], lowerbounds[i]])
end

pA2

#indicator function
# IA1 = np.where(k_W[t+1, hkt:, cutoffs, np.arange(lmt, grid, step = 1)].T <= Pmt_grid, 1, 0) #python code

IA1 = Array{Int64}(undef, K, M)
IA1[findall(pA2 .<= transpose(repeat(Pmt_grid, 1, K)))] .= 1

#IA1 has S rows and J columns
#we want a matrix of probabilities with the same dimension
# prob = np.outer(probk, probm) #python code
#     prob = repeat(probk, 1, M) .* transpose(repeat(probm, 1, K))

prob = probm' .* probk #outer product

pA1 = transpose(pA1) .* prob

A1 = pA1 .* IA1

#A2####

IA2 = @. ifelse(IA1==1, 0, 1)

pA2 = pA2 .* prob

A2 = pA2 .* IA2

payoff = A1 + A2
payoff = sum(payoff)

#returns the expected value of answering a payment offer and the probability of liquidation
return payoff
end

#test1: expected payoff of S
t, hkt, lkt, lmt, Pmt_array, k_W = T, 50, 50, 50, Pjt_array, s_W
@btime respond_paymentv1(t, hkt, lkt, lmt, Pmt_array, k_W)
``````
``````

function respond_paymentv2(t, hkt, lkt, lmt, Pmt_array, k_W)

ts = threshold_m(t, lkt, lmt, Pmt_array)

lmt = max(ts, lmt)

#the possible values of payment given by the opponent
Pmt_grid = Pmt_array[t, lmt:end, lkt, 1]

#the screening cutoffs associated with each payment proposal, simplified
cutoffs = Array{Int64}(undef, length(Pmt_grid))
cutoffs .= lkt

payoff = zeros(length(Pmt_grid))
#     payoff = Array{Float64}(undef, length(Pmt_grid))

for (i, pmt) in enumerate(Pmt_grid)
#receives pmt if his skill level is below the cutoff
payoff[i] += pmt * sum(pmf[hkt, hkt:(cutoffs[i]-1)]) * pmf[lmt, lmt+(i-1)]

#receives pmt if his skill level greather than or equal to the cutoff
payoff[i] += dot(k_W[t+1, cutoffs[i]:end , cutoffs[i], lmt+(i-1)], pmf[hkt, cutoffs[i]:end]) * pmf[lmt, lmt+(i-1)]

end

return sum(payoff)

end

#test1: expected payoff of S
t, hkt, lkt, lmt, Pmt_array, k_W = T, 50, 50, 50, Pjt_array, s_W
@btime respond_paymentv2(t, hkt, lkt, lmt, Pmt_array, k_W)

``````
``````@code_warntype respond_paymentv2(t, hkt, lkt, lmt, Pmt_array, k_W)
``````
``````using Profile
Profile.clear()
for i in 1:10000
@profile respond_paymentv2(t, hkt, lkt, lmt, Pmt_array, k_W)
end

ProfileView.view()
``````

PS1: Context
I’m studying a game where two players bargain with each other. I’ll call K the player who is taking the actions and M his adversary.

The gamel has asymmetric information. Each player bargaining power increases on his true skill level, hkt, which only he knows. But there are lower bounds, lkt and lmt, that help the adversary to infer hkt.

The function I’m struggling with calculates the expected payoff when K answers a payment proposal. K receives the payment pmt(hmt) in case he accepts it and k_W(hkt+1) if he rejects it. So he chooses the maximum between the two options. There are different possible values for hmt and for hkt+1, so we use the pmf (probability mass function) to calculate the expected payoff.

There is a cutoff level which is the minimum level of hkt+1 necessary to be lucrative to reject a payment offer. I’ve used this in version 2 to speed up the code.

PS2: If you have interest, the complete code is here. Again, tips are welcome

Most of the problem is `cutoffs = Array{Int64}(undef, length(Pmt_grid))`
`Array{Int64}` is an abstract type. You presumably meant `Vector{Int64}`. Note however, that this line and the following can be replaced with
`cutoffs = fill(lkt, length(Pmt_grid))`

Similarly, you want `pmf = fill(.01, grid, grid)

Also, indexing with a range makes a copy. (ie `pmf[hkt, hkt:(cutoffs[i]-1)]`) To fix this, you can put `@views` before the `for` loop.

5 Likes

Hi Oscar, thanks for the tips.

I followed your suggestions and the function takes now 4.857 microseconds, down from 6.720 microseconds.

I put `@view` before each array indexed with range because it was faster than putting `@views` before the `for` loop.

Also, thanks for the tip on using `fill`, but these values are just an example. In the real code, each cell of `cutoffs` and `pmf` has a different value.

Version 3 of the code is:

``````

function respond_paymentv3(t, hkt, lkt, lmt, Pmt_array, k_W)

ts = threshold_m(t, lkt, lmt, Pmt_array)

lmt = max(ts, lmt)

#the possible values of payment given by the opponent
Pmt_grid = Pmt_array[t, lmt:end, lkt, 1]

#the screening cutoffs associated with each payment proposal, simplified
cutoffs = Vector{Int64}(undef, length(Pmt_grid))
cutoffs .= lkt

payoff = zeros(length(Pmt_grid))
#     payoff = Array{Float64}(undef, length(Pmt_grid))

for (i, pmt) in enumerate(Pmt_grid)
#receives pmt if his skill level is below the cutoff
payoff[i] += pmt * sum(@view(pmf[hkt, hkt:(cutoffs[i]-1)])) * pmf[lmt, lmt+(i-1)]

#receives pmt if his skill level greather than or equal to the cutoff
payoff[i] += dot(@view(k_W[t+1, cutoffs[i]:end , cutoffs[i], lmt+(i-1)]), @view(pmf[hkt, cutoffs[i]:end])) * pmf[lmt, lmt+(i-1)]

end

return sum(payoff)

end

#test1: expected payoff of S
t, hkt, lkt, lmt, Pmt_array, k_W = T, 50, 50, 50, Pjt_array, s_W
@btime respond_paymentv3(t, hkt, lkt, lmt, Pmt_array, k_W)

``````

Now I’m working on two issues that came up in this newer version, in case anybody can help:

1. `@code_warntype` show possible issues with `SubArrays`, which I thinks is due to the use of `@view`:
``````@code_warntype respond_paymentv3(t, hkt, lkt, lmt, Pmt_array, k_W)
Variables
#self#::Core.Compiler.Const(respond_paymentv3, false)
t::Int64
hkt::Int64
lkt::Int64
lmt@_5::Int64
Pmt_array::Array{Float64,4}
k_W::Array{Float64,4}
ts::Int64
Pmt_grid::Array{Float64,1}
cutoffs::Array{Int64,1}
payoff::Array{Float64,1}
@_12::Union{Nothing, Tuple{Tuple{Int64,Float64},Tuple{Int64,Int64}}}
i::Int64
pmt::Float64
@_15::Int64
lmt@_16::Int64
@_17::SubArray
@_18::SubArray{Float64,1,Array{Float64,4},Tuple{Int64,UnitRange{Int64},Int64,Int64},true}
@_19::SubArray

Body::Float64
1 ──       (lmt@_16 = lmt@_5)
│          (ts = Main.threshold_m(t, lkt, lmt@_16, Pmt_array))
│          (lmt@_16 = Main.max(ts, lmt@_16))
│    %4  = lmt@_16::Int64
│    %5  = Base.lastindex(Pmt_array, 2)::Int64
│    %6  = (%4:%5)::UnitRange{Int64}
│          (Pmt_grid = Base.getindex(Pmt_array, t, %6, lkt, 1))
│    %8  = Core.apply_type(Main.Vector, Main.Int64)::Core.Compiler.Const(Array{Int64,1}, false)
│    %9  = Main.length(Pmt_grid)::Int64
│          (cutoffs = (%8)(Main.undef, %9))
│    %11 = cutoffs::Array{Int64,1}
│          Base.materialize!(%11, %12)
│    %14 = Main.length(Pmt_grid)::Int64
│          (payoff = Main.zeros(%14))
│    %16 = Main.enumerate(Pmt_grid)::Base.Iterators.Enumerate{Array{Float64,1}}
│          (@_12 = Base.iterate(%16))
│    %18 = (@_12 === nothing)::Bool
│    %19 = Base.not_int(%18)::Bool
└───       goto #10 if not %19
2 ┄─ %21 = @_12::Tuple{Tuple{Int64,Float64},Tuple{Int64,Int64}}::Tuple{Tuple{Int64,Float64},Tuple{Int64,Int64}}
│    %22 = Core.getfield(%21, 1)::Tuple{Int64,Float64}
│    %23 = Base.indexed_iterate(%22, 1)::Core.Compiler.PartialStruct(Tuple{Int64,Int64}, Any[Int64, Core.Compiler.Const(2, false)])
│          (i = Core.getfield(%23, 1))
│          (@_15 = Core.getfield(%23, 2))
│    %26 = Base.indexed_iterate(%22, 2, @_15::Core.Compiler.Const(2, false))::Core.Compiler.PartialStruct(Tuple{Float64,Int64}, Any[Float64, Core.Compiler.Const(3, false)])
│          (pmt = Core.getfield(%26, 1))
│    %28 = Core.getfield(%21, 2)::Tuple{Int64,Int64}
│    %29 = Base.getindex(payoff, i)::Float64
│    %30 = pmt::Float64
│          true
│    %32 = Base.getindex(cutoffs, i)::Int64
│    %33 = (%32 - 1)::Int64
│    %34 = (hkt:%33)::UnitRange{Int64}
│          (@_17 = (view)(Main.pmf, hkt, %34))
└───       goto #4
3 ──       Core.Compiler.Const(:(@_17 = false), false)
4 ┄─ %38 = @_17::SubArray
│    %39 = Main.sum(%38)::Any
│    %40 = lmt@_16::Int64
│    %41 = lmt@_16::Int64
│    %42 = (i - 1)::Int64
│    %43 = (%41 + %42)::Int64
│    %44 = Base.getindex(Main.pmf, %40, %43)::Any
│    %45 = (%30 * %39 * %44)::Any
│    %46 = (%29 + %45)::Any
│          Base.setindex!(payoff, %46, i)
│    %48 = Base.getindex(payoff, i)::Float64
│          true
│    %50 = (t + 1)::Int64
│    %51 = Base.getindex(cutoffs, i)::Int64
│    %52 = (lastindex)(k_W, 2)::Int64
│    %53 = (%51:%52)::UnitRange{Int64}
│    %54 = Base.getindex(cutoffs, i)::Int64
│    %55 = lmt@_16::Int64
│    %56 = (i - 1)::Int64
│    %57 = (%55 + %56)::Int64
│          (@_18 = (view)(k_W, %50, %53, %54, %57))
└───       goto #6
5 ──       Core.Compiler.Const(:(@_18 = false), false)
6 ┄─ %61 = @_18::SubArray{Float64,1,Array{Float64,4},Tuple{Int64,UnitRange{Int64},Int64,Int64},true}
│          true
│    %63 = Base.getindex(cutoffs, i)::Int64
│    %64 = (lastindex)(Main.pmf, 2)::Any
│    %65 = (%63:%64)::Any
│          (@_19 = (view)(Main.pmf, hkt, %65))
└───       goto #8
7 ──       Core.Compiler.Const(:(@_19 = false), false)
8 ┄─ %69 = @_19::SubArray
│    %70 = Main.dot(%61, %69)::Any
│    %71 = lmt@_16::Int64
│    %72 = lmt@_16::Int64
│    %73 = (i - 1)::Int64
│    %74 = (%72 + %73)::Int64
│    %75 = Base.getindex(Main.pmf, %71, %74)::Any
│    %76 = (%70 * %75)::Any
│    %77 = (%48 + %76)::Any
│          Base.setindex!(payoff, %77, i)
│          (@_12 = Base.iterate(%16, %28))
│    %80 = (@_12 === nothing)::Bool
│    %81 = Base.not_int(%80)::Bool
└───       goto #10 if not %81
9 ──       goto #2
10 ┄ %84 = Main.sum(payoff)::Float64
└───       return %84
``````

`@_17` is `@view(pmf[hkt, hkt:(cutoffs[i]-1)])` and `@_19` is `@view(pmf[hkt, cutoffs[i]:end])`. So the problem is creating views of `pmf`. I’m searching how to fix it.
I’ve tried to change `pmf` to `pmf = fill(0.01, grid, grid)` and `pmf = zeros(grid, grid)`, but it didn’t work.

1. I noted that I didn’t need to create an array for `payoff`. It is just an escalar, so I created the following version of the code, which is slightly faster (4.2 microseconds). The problem is that Julia cannot infer that the type of `payoff` is Float64:
``````

function respond_paymentv4(t, hkt, lkt, lmt, Pmt_array, k_W)

ts = threshold_m(t, lkt, lmt, Pmt_array)

lmt = max(ts, lmt)

#the possible values of payment given by the opponent
Pmt_grid = Pmt_array[t, lmt:end, lkt, 1]

#the screening cutoffs associated with each payment proposal, simplified
cutoffs = Vector{Int64}(undef, length(Pmt_grid))
cutoffs .= lkt

payoff = Float64(0.0)
#     payoff = Array{Float64}(undef, length(Pmt_grid))

for (i, pmt) in enumerate(Pmt_grid)
#receives pmt if his skill level is below the cutoff
payoff += pmt * sum(@view(pmf[hkt, hkt:(cutoffs[i]-1)])) * pmf[lmt, lmt+(i-1)]

#receives pmt if his skill level greather than or equal to the cutoff
payoff += dot(@view(k_W[t+1, cutoffs[i]:end , cutoffs[i], lmt+(i-1)]), @view(pmf[hkt, cutoffs[i]:end])) * pmf[lmt, lmt+(i-1)]

end

return payoff

end

#test1: expected payoff of S
t, hkt, lkt, lmt, Pmt_array, k_W = T, 50, 50, 50, Pjt_array, s_W
@btime respond_paymentv4(t, hkt, lkt, lmt, Pmt_array, k_W)

``````
``````@code_warntype respond_paymentv4(t, hkt, lkt, lmt, Pmt_array, k_W)
payoff::Any
``````

In the output of `@code_warntype` there is

That is you are accessing `pmf` as a global variable. You should pass it as a parameter to `respond_payment`.

1 Like

Thanks, @mikkoku! This made the function 4x faster.

Also, `@code_warntype` shows that now Julia can infer the types of all variables. So I think it’s everything that can be done to speed up the function.

``````function respond_paymentv5(t, hkt, lkt, lmt, Pmt_array, k_W, pmf)

ts = threshold_m(t, lkt, lmt, Pmt_array)

lmt = max(ts, lmt)

#the possible values of payment given by the opponent
Pmt_grid = Pmt_array[t, lmt:end, lkt, 1]

#the screening cutoffs associated with each payment proposal, simplified
cutoffs = Vector{Int64}(undef, length(Pmt_grid))
cutoffs .= lkt

payoff = Float64(0.0)
#     payoff = Array{Float64}(undef, length(Pmt_grid))

for (i, pmt) in enumerate(Pmt_grid)
#receives pmt if his skill level is below the cutoff
payoff += pmt * sum(@view(pmf[hkt, hkt:(cutoffs[i]-1)])) * pmf[lmt, lmt+(i-1)]

#receives pmt if his skill level greather than or equal to the cutoff
payoff += dot(@view(k_W[t+1, cutoffs[i]:end , cutoffs[i], lmt+(i-1)]), @view(pmf[hkt, cutoffs[i]:end])) * pmf[lmt, lmt+(i-1)]

end

return payoff

end

#test1: expected payoff of S
t, hkt, lkt, lmt, Pmt_array, k_W, pmf = T, 50, 50, 50, Pjt_array, s_W, pmf
@btime respond_paymentv5(t, hkt, lkt, lmt, Pmt_array, k_W, pmf)

973.267 ns (3 allocations: 336 bytes)
``````

Summary:

``````@btime respond_paymentv2(t, hkt, lkt, lmt, Pmt_array, k_W)
6.800 μs (174 allocations: 13.45 KiB)

@btime respond_paymentv3(t, hkt, lkt, lmt, Pmt_array, k_W)
4.914 μs (174 allocations: 5.02 KiB)

@btime respond_paymentv4(t, hkt, lkt, lmt, Pmt_array, k_W)
4.414 μs (152 allocations: 4.53 KiB)

@btime respond_paymentv5(t, hkt, lkt, lmt, Pmt_array, k_W, pmf)
973.267 ns (3 allocations: 336 bytes)

``````

From v2 to v5, I made the following changes:

• created `cutoffs` as `Vector{Int64}` instead of `Array{Int64}` to help Julia to infer the data types
• created `payoffs` as an escalar instead of an array in order to avoid using `sum()` unnecessarily
• put `@view` before arrays indexed with ranges, i.e. `@view(pmf[hkt, hkt:(cutoffs[i]-1)]` , to avoid copies of the arrays
• passed `pmf` as an argument to the function to avoid global variables

Thank you again, @Oscar_Smith and @mikkoku