I am just interested in knowing if anyone here sees any performance improvements in my code. Either algorithmically or Julia specific.
The code in question is a function which calculates local maxima and minima.
The function does this by calculating the rolling maximum (the maximum in a moving window) and uses that to obtain local optima.
Below is the code! vect is usually on the order of 1000 entries.
using DataStructures
function rolling_optima(vect::Any, win_size::Integer; debug=false)
roll_max = zeros(Float64, length(vect))
roll_min = zeros(Float64, length(vect))
opt_index = zeros(Int16, length(vect))
opt_type = zeros(Int8, length(vect))
d_max = Deque{Float64}()
d_min = Deque{Float64}()
win_forward = win_size
win_backward = win_forward*2
opt_ind = 0
#comp = 10
@inbounds for i in eachindex(vect)
@inbounds for k in 1:length(d_max)
if vect[i] >= front(d_max) popfirst!(d_max)
else break end
end
@inbounds for k in 1:length(d_min)
if vect[i] <= front(d_min) popfirst!(d_min)
else break end
end
pushfirst!(d_max,vect[i])
pushfirst!(d_min,vect[i])
j = i + 1 - win_forward
if j >= 1
roll_max[i] = back(d_max)
roll_min[i] = back(d_min)
if vect[j] == back(d_max)
if roll_max[i] == roll_max[j]
if j-win_backward > 0
if roll_max[j] > maximum(vect[j-win_backward:j-1])
opt_ind +=1
opt_index[opt_ind] = j
opt_type[opt_ind] = 1
end
end
end
pop!(d_max)
end
if vect[j] == back(d_min)
if roll_min[i] == roll_min[j]
if j-win_backward > 0
if roll_min[j] < minimum(vect[j-win_backward:j-1])
opt_ind +=1
opt_index[opt_ind] = j
opt_type[opt_ind] = -1
end
end
end
pop!(d_min)
end
end
end
opt_index = opt_index[1:opt_ind]
opt_type = opt_type[1:opt_ind]
if debug
return opt_index, opt_type, roll_max, roll_min
else
return opt_index, opt_type
end
end
What definition are you using? The usual definition of a local maximum would be a value that is ≥ both its neighbors, which you could implement in a trivial loop that checks vect[i-1] ≤ vect[i] ≥ vect[i+1].
To be defined as maxima there should be no greater entries win_forward places after the spot and win_backward places before that spot. So it has to check a larger area around each point.
While it seems you need locations as well as values, I would bet you could use the ImageFiltering code directly by creating a new AbstractArray wrapper where indexing returns an index=>val pair and comparison looked only at the val. (You’d want to define a custom Pair type so that you weren’t pirating Base’s Pair.)
Thanks! I will look into that. My current algorithm does not do the naive thing of checking all neighbours every step though. So I don’t think there is a huge win in performance in improving the algorithm (I can be wrong) but I know that memory handling is often at least as important as the algorithm in practice.
Having looked at the pseudo code from the article I could improve the runtime of my code by about 10-15%. I already used most of the concepts from that article, why the improvement wasn’t larger.
The main improvement came from realising that the algorithm either finds a larger or a smaller element than the previous one in the same step, and never none of them and never both, so there could be an if - else instead of two checks. I.e:
if vect[i] > vect[i-1] # ...
If anyone is interested, I post the improved code below. I am still interested in more Julia specific improvements which I suspect would further improve the performance.
Improved code:
function rolling_optima(vect::Any, win_size::Integer; debug=false)
roll_max = zeros(Float64, length(vect))
roll_min = zeros(Float64, length(vect))
opt_index = zeros(Int16, length(vect))
opt_type = zeros(Int8, length(vect))
d_max = Deque{Float64}()
d_min = Deque{Float64}()
win_forward = win_size
win_backward = win_forward*2
opt_ind = 0
#comp = 10
pushfirst!(d_max,vect[1])
pushfirst!(d_min,vect[1])
@inbounds for i in 2:length(vect)
if vect[i] > vect[i-1]
@inbounds for k in 1:length(d_max)
if vect[i] >= front(d_max) popfirst!(d_max)
else break end
end
else
@inbounds for k in 1:length(d_min)
if vect[i] <= front(d_min) popfirst!(d_min)
else break end
end
end
pushfirst!(d_max,vect[i])
pushfirst!(d_min,vect[i])
j = i + 1 - win_forward
if j >= 1
roll_max[i] = back(d_max)
roll_min[i] = back(d_min)
if vect[j] == back(d_max)
if roll_max[i] == roll_max[j]
if j-win_backward > 0
if roll_max[j] > maximum(vect[j-win_backward:j-1])
opt_ind +=1
opt_index[opt_ind] = j
opt_type[opt_ind] = 1
end
end
end
pop!(d_max)
elseif vect[j] == back(d_min)
if roll_min[i] == roll_min[j]
if j-win_backward > 0
if roll_min[j] < minimum(vect[j-win_backward:j-1])
opt_ind +=1
opt_index[opt_ind] = j
opt_type[opt_ind] = -1
end
end
end
pop!(d_min)
end
end
end
opt_index = opt_index[1:opt_ind]
opt_type = opt_type[1:opt_ind]
if debug
return opt_index, opt_type, roll_max, roll_min
else
return opt_index, opt_type
end
end
I’m curious to know happens if you replace Deque{Float64} with Vector{Float64}? (The native Julia array type also supports O(1) amortized push/pop from both ends, and I’m curious to know how it compares.)
Note that maximum(vect[j-win_backward:j-1]) allocates a copy of the slice; you might want to use @views or just replace it with a loop. (And it seems like you could do even better, e.g. with a heap-like data structure, so that you don’t have to loop over the entire window for every j.) Similarly for the minimum call.
Using Vector{Float64} seems to increase the runtime by almost 40% compared to using the Deque{Float64}
Yes! This is the part of the code that is the newest functional addition and yes it’s probably far from optimal, however I don’t think the code usually spends a lot of time here anyway so the benefits of optimizing this part are probably small.
if j-win_backward > 0
failed = false
for v in @view vect[j-win_backward:j-1]
if roll_max[j] < v
failed = true
break end
end
if !failed
opt_ind +=1
opt_index[opt_ind] = j
opt_type[opt_ind] = 1
end
it added a slight improvement in speed (again a few %). Is that what you were thinking about or were you thinking about something more sophisticated?
I think your algorithm is O(N), so there is nothing more to improve here. I checked that your code allocates a lot, despite of using views. Maybe I have no your latest code, so please veryfy.
using LinkedLists
function movingmax(array::AbstractArray{T}, K::Integer) where T <: Real
@assert K > 0
@assert K < length(array)
result = zeros(T, length(array)-K+1)
list = LinkedLists.LinkedList{Tuple{Int, T}}()
windowindex = 0
resultindex = 1
isready = false
for item in array
if !isempty(list) && first(list)[1] == windowindex
popfirst!(list)
end
while !isempty(list) && last(list)[2] < item
pop!(list)
end
push!(list, (windowindex, item))
windowindex = (windowindex + 1) % K
if windowindex == 0
isready = true
end
if isready
result[resultindex] = first(list)[2]
resultindex = resultindex + 1
end
end
return result
end
My latest code below: (note that I have changed the name of a few functions since they were changed in the updated DataStructures.jl package which I now use.
function rolling_optima(vect::Any, win_size::Integer; debug=false)
roll_max = zeros(Float64, length(vect))
roll_min = zeros(Float64, length(vect))
opt_index = zeros(Int16, length(vect))
opt_type = zeros(Int8, length(vect))
d_max = Deque{Float64}()
d_min = Deque{Float64}()
win_forward = win_size
win_backward = win_forward*2
opt_ind = 0
#comp = 10
pushfirst!(d_max,vect[1])
pushfirst!(d_min,vect[1])
@inbounds for i in 2:length(vect)
if vect[i] > vect[i-1]
@inbounds for k in 1:length(d_max)
if vect[i] >= first(d_max) popfirst!(d_max)
else break end
end
else
@inbounds for k in 1:length(d_min)
if vect[i] <= first(d_min) popfirst!(d_min)
else break end
end
end
pushfirst!(d_max,vect[i])
pushfirst!(d_min,vect[i])
j = i + 1 - win_forward
if j >= 1
roll_max[i] = last(d_max)
roll_min[i] = last(d_min)
if vect[j] == last(d_max)
if roll_max[i] == roll_max[j]
if j-win_backward > 0
failed = false
for v in @view vect[j-win_backward:j-1]
if roll_max[j] < v
failed = true
break end
end
if !failed
opt_ind +=1
opt_index[opt_ind] = j
opt_type[opt_ind] = 1
end
end
end
pop!(d_max)
elseif vect[j] == last(d_min)
if roll_min[i] == roll_min[j]
if j-win_backward > 0
failed = false
for v in @view vect[j-win_backward:j-1]
if roll_min[j] > v
failed = true
break end
end
if !failed
opt_ind +=1
opt_index[opt_ind] = j
opt_type[opt_ind] = -1
end
end
end
pop!(d_min)
end
end
end
opt_index = opt_index[1:opt_ind]
opt_type = opt_type[1:opt_ind]
if debug
return opt_index, opt_type, roll_max, roll_min
else
return opt_index, opt_type
end
end
Now to your results. It seems that something was strange with running my function. Here are my current results.
While not as good as the results for your function movingmax not as bad as what you had. (I of course did “warm up” runs first when using @time) Could you verify my results?
This is very nice to hear! And you are right, my function is doing more. Do you have a feeling that your function is close to what is possible in terms of speed? It’s actually very satisfying to se how few allocations my code is currently performing!
In terms of complexity my code (your also) is optimal - O(N). Anyway, I think much more could be improved in my function, for instance when I changed LinkedList to Deque I am 15% faster.
Very nice! Did you get rid of some of the allocations? Maybe the part of my code that does the simple rolling maximum is very similar to your code after that…?
Just for comparison I reduced my function to a rolling_maxmin() and a rolling_max() one. The last one of course loses some tricks but should be very close to yours?
function rolling_maxmin(vect::Any, win_size::Integer)
roll_max = zeros(Float64, length(vect))
roll_min = zeros(Float64, length(vect))
d_max = Deque{Float64}()
d_min = Deque{Float64}()
pushfirst!(d_max,vect[1])
pushfirst!(d_min,vect[1])
@inbounds for i in 2:length(vect)
if vect[i] > vect[i-1]
@inbounds for k in 1:length(d_max)
if vect[i] >= first(d_max) popfirst!(d_max)
else break end
end
else
@inbounds for k in 1:length(d_min)
if vect[i] <= first(d_min) popfirst!(d_min)
else break end
end
end
pushfirst!(d_max,vect[i])
pushfirst!(d_min,vect[i])
j = i + 1 - win_size
if j >= 1
roll_max[i] = last(d_max)
roll_min[i] = last(d_min)
if vect[j] == last(d_max)
pop!(d_max)
elseif vect[j] == last(d_min)
pop!(d_min)
end
end
end
return roll_max, roll_min
end
function rolling_max(vect::Any, win_size::Integer)
roll_max = zeros(Float64, length(vect))
d_max = Deque{Float64}()
pushfirst!(d_max,vect[1])
@inbounds for i in 2:length(vect)
if vect[i] > vect[i-1]
@inbounds for k in 1:length(d_max)
if vect[i] >= first(d_max) popfirst!(d_max)
else break end
end
end
pushfirst!(d_max,vect[i])
j = i + 1 - win_size
if j >= 1
roll_max[i] = last(d_max)
if vect[j] == last(d_max)
pop!(d_max)
end
end
end
return roll_max
end