Pushing a large array into CircularDeque fails

Adriel · April 23, 2018, 11:53pm

Can anyone explain why the following fails?

julia> using DataStructures

julia> d = CircularDeque{Float64}(100_000)
CircularDeque{Float64}([])

julia> push!(d, randn(), randn())   # no problem
CircularDeque{Float64}([0.2451865901032298,0.3001460029950263])

julia> push!(d, randn(25_000)...)   # wait 2 minutes...
ERROR: StackOverflowError:

I couldn’t find the root definition of push! for multiple values in Base, but I expect it’s a recursion along the n input values, passing n-1 values to a new push! call at every stage. That would explain the excessive memory use.

I can easily move forward via a simple loop so it’s not a problem I need solved - I’m just wondering why push! is implemented this way and if it could be improved, e.g. with an iterative solution.

mohamed82008 · April 24, 2018, 12:33am

Splatting is bad for the function’s health, too many arguments to gulp down. ~~Try append! instead.~~

Edit: append! does not work. A good old loop does the trick though.

TsurHerman · April 24, 2018, 8:02am

Or use Broadcasting tricks:

push!.([d], randn(25_000))

jlapeyre · April 24, 2018, 11:25am

No, it has nothing to do with splatting:

function mypush!(a,items...)
    for i in items
        push!(a,i)
    end
    a
end

julia> using DataStructures;

julia> d = CircularDeque{Float64}(100_000);


julia> @time mypush!(d,randn(25_000)...);
  0.089943 seconds (113.93 k allocations: 6.478 MiB, 5.70% gc time)

julia> @time mypush!(d,randn(25_000)...);
  0.000831 seconds (25.01 k allocations: 781.625 KiB)

Another way to handle this is reduce(push!,d,randn(25_000)).

Or write special methods

function Base.push!(a::DataStructures.CircularDeque{T}, item::T) where T
    invoke(push!, Tuple{DataStructures.CircularDeque{T},Any}, a,item)
end

function Base.push!(a::DataStructures.CircularDeque{T}, item::T, items::T...) where T
    push!(a,item)
    for i in items
        push!(a,i)
    end
    a
end

This gives:

julia> @time push!(d,randn(25_000)...);
  0.002142 seconds (25.01 k allocations: 781.625 KiB)

mohamed82008 · April 24, 2018, 11:35am

I stand corrected, it has to do with how the splatted arguments are treated inside, my bad.

jlapeyre · April 24, 2018, 11:49am

There are several functions like this. I used to think it was due to splatting. Even many of the most experienced people say the inefficiency is due to splatting too many arguments until they look into it more deeply.

Note that I am not saying that splatting 10^4 or 10^5 arguments is good or bad practice, just that it is not the cause of the huge inefficiency observed.

tkoolen · April 24, 2018, 4:56pm

Vararg functions often (but not always) recurse. Recursing 25000 times leads to a stack overflow.

Topic		Replies	Views
Possible Bug with push! and overwriting array General Usage question	5	844	May 18, 2018
Why is push! still so slow? Internals & Design	39	1732	March 29, 2024
Buffer for CircularDeque Performance	5	466	August 16, 2020
Push!() giving Method type error New to Julia question	5	710	February 13, 2022
Avoid calling push! in Base.find Internals & Design	2	800	January 7, 2017

Pushing a large array into CircularDeque fails

Related topics