Initializing an array of mutable objects is surprisingly slow

How can one performantly preallocate a vector of mutable objects? Currently allocating and initializing a vector of mutables is ~60x slower than doing the same thing for mutables:

# same result on v0.6.3 and latest master
using Compat
using BenchmarkTools
struct A
    x::Int
end
mutable struct B
    x::Int
end
function f1(x::Vector{T}) where{T}
    @inbounds for i = 1:length(x)
        x[i] = T(i)
    end
end
const N = 1000000
const xa = Vector{A}(undef, N)
const xb = Vector{B}(undef, N)
@benchmark f1($xa)
"""
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     316.354 ÎĽs (0.00% GC)
  median time:      317.092 ÎĽs (0.00% GC)
  mean time:        317.859 ÎĽs (0.00% GC)
  maximum time:     751.201 ÎĽs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
"""
@benchmark f1($xb)
"""
  memory estimate:  15.26 MiB
  allocs estimate:  1000000
  --------------
  minimum time:     5.012 ms (0.00% GC)
  median time:      19.824 ms (74.42% GC)
  mean time:        15.386 ms (66.95% GC)
  maximum time:     22.908 ms (64.39% GC)
  --------------
  samples:          326
  evals/sample:     1
"""

Apparently Vector{mutable_type}(undef, N) only allocate space for pointers, and results in significant slowdown as illustrated by the aboving example.

In C, structs are mutable and the whole vector can be allocated at once.

My questions are:

  1. I remember reading from some post that immutability enables many optimizations that is not possible for mutable types. So is there any chance that the performance gap will converge?

  2. What can I do if I absolutely need fast allocation for a vector of mutable objects now?
    One solution will be allocating a vector of UInt8s and unsafe_load everything from it. But I suppose that’s not the julian way and would like to learn the alternatives.

Thank you for your help!

Indeed, isbits type are stored inline in the array, non isbits typed are stored with pointers.

If your end use case is known, it is likely that the answers might be more relevant.

1 Like

I need to keep a list of record of an incoming signal (x_t, y_t). Each record is like (a=cumsum(x_t), b=cum_sum(y_t)). Whenever a>2b, a new record is started afresh.

I thought about the following solution, but acctually I have a1,…a8 and b1,…b12, and I don’t like the idea of typing it twice.

struct Rec
a
b
end
mutable struct Recs
a
b
r::Vector{Rec}
end

The title of this thread should probably be “Initializing an array of immutable objects is surprisingly fast” :grin:

3 Likes

Indeed! The compiler somehow managed to reuse memory for the construction call A(i). Really impressive that it all comes free.

I just found this issure (#21912), indicating that mutating an immutable probably will become a feature.

For my use case, I can live with a mutable wrapper l mentioned above. Maybe save some typing with a macro. Still interested to see if you have alternative solutions :wink:

Mutables are the wrong approach. You want one giant data vector (one Tuple or struct for each sampling time) and store offsets into this. Then you either use views or pointer-views into this giant data vector in order to retrieve records (depending on whether your specific use-case requires that the giant data array is kept alive by the views). Think about how you would lay out your data in C; that’s how you should write it in julia (C is a macro-assembler, and julia can be viewed the same way). Only feature I’m really missing are stack-allocated mutable buffers (llvm alloca); tuples are too inconvenient due to their immutability.

2 Likes

@foobar_lv2 @StefanKarpinski

Recently, I also bump on this issue. I have to define some mutable numbers, but Array with mutable elements are not as performant as non-mutable ones due to the optimization with respect to bits types
https://docs.julialang.org/en/v1/devdocs/isbitsunionarrays/

Now I am using Base.RefArray as a temporary solution, however, I can not reuse [mutable types...] and zeros(MutableType, size...) anymore.

Is there any known workflow in Julia to provide high performance meanwhile reusability?

Mutability is inherently problematic for performance, I’m afraid. The union optimization is unlikely to be the actual issue, the real problem is probably that you have to represent an array of mutable values as an array of pointers to individually heap-allocated values. Are you sure you can’t make the type immutable?

2 Likes