# An efficient way to compute the average length of streaks of a given number

Suppose I have a large vector of integers ranging from 1 to 5, and I want to compute the average length of streaks of, say, the number 5. A single “5” followed by a different number counts as a streak of length 1.

So, for example, for the vector [1,2,5,5,1,5,3,5,5,5], the average streak length of the number 5 is (2 + 1 + 3) / 3 = 2.

What is an efficient way to compute the average length of streaks of a given number in Julia?

I would just take the naive approach with a simple, straightforward `for` loop:

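``````julia
function count_streak(pred::Function, itr)
    # n_streaks: completed streaks; cumlen: their total length; len: current streak length
    n_streaks, cumlen, len = 0, 0, 0
    # Close the current streak: count it, add its length, and reset `len` to zero
    reset(n_streaks, cumlen, len) = (n_streaks + 1, cumlen + len, 0)
    for i in itr
        if pred(i)
            len += 1
        elseif len > 0
            (n_streaks, cumlen, len) = reset(n_streaks, cumlen, len)
        end
    end
    # Close a streak that runs to the end of the input
    if len > 0
        (n_streaks, cumlen, len) = reset(n_streaks, cumlen, len)
    end
    cumlen / n_streaks  # NaN (0 / 0) if there are no streaks
end
``````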

If you want a much slower one-liner, this is one option:

``````julia
using IterTools, Statistics  # groupby comes from IterTools, mean from Statistics
mean(length(v) for v in filter(v -> v[1] == 5, collect(groupby(==(5), x))))
``````

I think you are looking for run-length encoding, which is available in StatsBase as `rle`:

https://juliastats.org/StatsBase.jl/v0.18/misc.html#StatsBase.rle
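A minimal sketch of how `rle` could be used here, assuming StatsBase is installed. The function name `mean_streak_rle` is made up for illustration; `rle(v)` returns the run values and the run lengths as two parallel vectors, so the streak lengths for one number are the lengths whose run value equals it.

```julia
using StatsBase   # provides rle
using Statistics  # provides mean

# Hypothetical helper: average run length of `target` in `v`, via run-length encoding.
function mean_streak_rle(v, target)
    vals, lens = rle(v)              # run values and their lengths, in order
    streaks = lens[vals .== target]  # keep only the runs of `target`
    # Return NaN when `target` never occurs, matching the 0 / 0 result of the loop version
    return isempty(streaks) ? NaN : mean(streaks)
end

mean_streak_rle([1, 2, 5, 5, 1, 5, 3, 5, 5, 5], 5)  # 2.0
```

This allocates the two run vectors, so for very large inputs the plain loop is likely to be cheaper.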


Another one-liner version of `count_streak`:

``````julia
using IterTools

# matches / streak starts (a leading match, plus every false -> true transition)
cnt_strk(p, v) =
    sum(p.(v)) / (p(v[1]) + sum(map(x -> <(x...), partition(p.(v), 2, 1))))
``````

Test (on a vector `V` that is not shown here):

``````julia
julia> cnt_strk(==(5),V)
1.248118414450577

julia> count_streak(==(5),V)
1.248118414450577
``````

Or, the same idea but a little more code-golfy:

``````julia
# s = matches, t = adjacent pairs that both match, so s - t = number of streaks
cnt_strk_back(p, v) =
    ((s, t) -> s / (s - t))(sum(p.(v)), sum(map(all, partition(p.(v), 2, 1))))
``````
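For anyone who would rather avoid the IterTools dependency, the same transition-counting idea can be sketched with Base only. This is an illustration rather than a benchmarked replacement: the name `mean_streak_transitions` is made up, and it assumes `v` is a non-empty, 1-indexed vector.

```julia
# Hypothetical Base-only helper: matches divided by the number of streak starts.
function mean_streak_transitions(p, v)
    b = map(p, v)     # Bool mask: true where the predicate holds
    total = count(b)  # total number of matching elements
    # A streak starts at index 1 if b[1] is true, or wherever false is followed by true
    starts = Int(first(b)) + count(i -> !b[i-1] && b[i], 2:length(b))
    return total / starts  # NaN (0 / 0) when nothing matches
end

mean_streak_transitions(==(5), [1, 2, 5, 5, 1, 5, 3, 5, 5, 5])  # 2.0
```

Every streak has exactly one start, so dividing the number of matches by the number of starts gives the mean streak length.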