Push! and insert! very slow? (julia 1.11) (EDIT: No; just one-time allocations.)

yes, my sentence was also not clear at all. Thanks a lot for your help!

Yes, thanks! The ordering is important in my case, but, as you say, maybe there are better ways to achieve that. I will think about this!

To extend my last post and for people who run into similar issues with insert! and deleteat!. With a relatively simple trick I was able to drastically improve the run time of the simulation algorithm, eliminating the cost associated with these two methods entirely.

I introduced a temporary positions array that allowed to switch to push! and pop! methods instead, and sorting the array only in the end into the correct order. Run time improvement scales with the length of the input array; in my use cases it was ~100 fold.

Reduced versions of the simulation algorithm are posted below; together with some benchmarks. Thanks for all the help and hints along the way; for my needs I can consider this solved! :slight_smile:

# original version with insert! and deleteat!
function sim_0!(array, nstep)
    for __ in 1:nstep
        isempty(array) && break

        arrind = rand(1:length(array))

        if rand() < 0.5
            insert!(array, arrind, array[arrind])
        else
            deleteat!(array, arrind)            
        end
    end

    array
end

# improved version with push! and pop! and position array
function sim_1!(array, nstep)
    positions = collect(1:length(array))

    for __ in 1:nstep
        isempty(array) && break

        arrind = rand(1:length(array))

        if rand() < 0.5
            push!(array, array[arrind])
            push!(positions, positions[arrind])
        else
            array[arrind], array[end] = array[end], array[arrind]
            positions[arrind], positions[end] = positions[end], positions[arrind]
            pop!(array)
            pop!(positions)
        end
    end

    # sort array according to positions
    array = array[sortperm(positions)]

    array
end

using BenchmarkTools

@btime sim_0!(a, 1_000_000) setup=(a = collect(1:100)) evals=1
@btime sim_1!(a, 1_000_000) setup=(a = collect(1:100)) evals=1
# 28.981 μs (0 allocations: 0 bytes)
# 5.332 μs (4 allocations: 992 bytes)
# => same order of magnitude

@btime sim_0!(a, 1_000_000) setup=(a = collect(1:10_000)) evals=1
@btime sim_1!(a, 1_000_000) setup=(a = collect(1:10_000)) evals=1
# 199.213 ms (3 allocations: 216.00 KiB)
# 10.512 ms (16 allocations: 676.62 KiB)
# => 10 times difference

@btime sim_0!(a, 1_000_000) setup=(a = collect(1:100_000)) evals=1
@btime sim_1!(a, 1_000_000) setup=(a = collect(1:100_000)) evals=1
# 3.686 s (3 allocations: 1.36 MiB)
# 15.469 ms (16 allocations: 5.70 MiB)
# => >200 times difference

a0 = sim_0!(collect(1:100_000), 1_000_000)
a1 = sim_1!(collect(1:100_000), 1_000_000)

# check statistical properties (are identical)
using StatsBase
mean(values(countmap(a0))), mean(values(countmap(a1)))
std(values(countmap(a0))), std(values(countmap(a1)))
1 Like