Slow instantiation of immutable type with string field


#1

I find that an immutable type is usually faster to instantiate than a mutable type with the same field. For example,

julia> type MutableWithTuple
           t::NTuple{5,Int64}
       end

julia> immutable ImmutableWithTuple
           t::NTuple{5,Int64}
       end

julia> using BenchmarkTools

julia> @benchmark MutableWithTuple((0,1,2,3,4))
BenchmarkTools.Trial:
  memory estimate:  48.00 bytes
  allocs estimate:  1
  --------------
  minimum time:     10.504 ns (0.00% GC)
  median time:      22.430 ns (0.00% GC)
  mean time:        30.561 ns (18.68% GC)
  maximum time:     3.945 μs (97.80% GC)
  --------------
  samples:          10000
  evals/sample:     999
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> @benchmark ImmutableWithTuple((0,1,2,3,4))
BenchmarkTools.Trial:
  memory estimate:  0.00 bytes
  allocs estimate:  0
  --------------
  minimum time:     2.653 ns (0.00% GC)
  median time:      3.238 ns (0.00% GC)
  mean time:        4.549 ns (0.00% GC)
  maximum time:     1.333 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000
  time tolerance:   5.00%
  memory tolerance: 1.00%

Here, the two types are the same except that one is mutable and the other is immutable. We can see that it is a lot faster to instantiate the immutable type. The main reason seems to be the difference in the number of allocations.

However, if we create similar types with string fields instead of tuples, we do not see such a performance difference:

julia> type MutableWithString
           s::String
       end

julia> immutable ImmutableWithString
           s::String
       end

julia> @benchmark MutableWithString("01234")
BenchmarkTools.Trial:
  memory estimate:  16.00 bytes
  allocs estimate:  1
  --------------
  minimum time:     10.401 ns (0.00% GC)
  median time:      11.835 ns (0.00% GC)
  mean time:        15.367 ns (11.84% GC)
  maximum time:     3.035 μs (97.66% GC)
  --------------
  samples:          10000
  evals/sample:     999
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> @benchmark ImmutableWithString("01234")
BenchmarkTools.Trial:
  memory estimate:  16.00 bytes
  allocs estimate:  1
  --------------
  minimum time:     10.190 ns (0.00% GC)
  median time:      11.933 ns (0.00% GC)
  mean time:        15.787 ns (12.76% GC)
  maximum time:     3.684 μs (98.17% GC)
  --------------
  samples:          10000
  evals/sample:     999
  time tolerance:   5.00%
  memory tolerance: 1.00%

Here are my questions:

  1. When does making a type immutable reduce the number of allocations during instantiation, and when doesn’t it? According to the above examples, this seems to be related to the kinds of fields the type has, but I thought both String and Tuple are immutable, so I don’t understand why there is a difference between the type with a tuple and the type with a string.

  2. For the type with a string field, is there a way to make the number of allocations zero? I am creating some type, and as soon as I introduce a string field (to give its instance a name), instantiation of the type becomes a lot slower, which is very frustrating.


#2

String is an immutable type, but it contains a reference to a mutable value (the data field is a Vector{UInt8}). As a result, when you create a new String, you also need to allocate this array. NTuple{5,Int64}, on the other hand, contains immutable objects all the way down, so it doesn’t require allocating any extra memory. If you changed this to, e.g., NTuple{5, Vector{Float64}}, you would have the same problem.

The distinction, then, is not so much immutability as isbits (which you can roughly think of as “immutable all the way down”):

julia> isbits(ImmutableWithTuple)
true

julia> isbits(ImmutableWithString)
false
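
To make the "immutable all the way down" idea concrete, here is a small sketch that runs the same check on three wrapper types. It assumes Julia 1.x, where `type`/`immutable` are spelled `mutable struct`/`struct` and the type-level check is `isbitstype(T)` rather than `isbits(T)`; the type names are hypothetical and chosen only for illustration.

```julia
# Assumes Julia 1.x syntax; the type names below are illustrative only.
struct PlainTuple                  # every field is plain data, no references
    t::NTuple{5,Int64}
end

struct TupleOfVectors              # immutable wrapper, but it holds references to mutable arrays
    t::NTuple{5,Vector{Float64}}
end

struct WithString                  # immutable wrapper around a heap-allocated String
    s::String
end

println(isbitstype(PlainTuple))      # true
println(isbitstype(TupleOfVectors))  # false
println(isbitstype(WithString))      # false
```

Only the first type is plain bits; the other two are immutable on the outside but still point at separately allocated data.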

As for your second question: not easily. The problem ultimately is that the compiler can’t know ahead of time how big the string is. You might be able to use some “unsafe” approaches like WeakRefStrings.jl.
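
If the names are known to be short, one possible workaround is to store them in a fixed-width byte tuple, so that the size is known to the compiler and the type stays isbits. The following is only a sketch of that idea, not the WeakRefStrings.jl approach: it assumes Julia 1.x, names of at most 8 bytes that contain no NUL bytes, and the names `Named`, `pack_name` and `unpack_name` are hypothetical.

```julia
# Sketch under stated assumptions: Julia 1.x, names of at most 8 bytes, no NUL bytes.
struct Named
    name::NTuple{8,UInt8}   # fixed-width, zero-padded name
    value::Int64
end

# Copy the bytes of `s` into an 8-byte tuple, padding with zeros.
function pack_name(s::String)
    bytes = codeunits(s)
    n = length(bytes)
    n <= 8 || throw(ArgumentError("name longer than 8 bytes"))
    return ntuple(i -> i <= n ? bytes[i] : 0x00, Val(8))
end

# Recover the String by dropping the zero padding (this step does allocate).
unpack_name(t::NTuple{8,UInt8}) = String(filter(!iszero, collect(t)))

x = Named(pack_name("abc"), 1)
println(isbitstype(Named))
println(unpack_name(x.name))
```

The trade-off is a hard length limit and a conversion step whenever the name is needed as a String, so this only pays off when instances are created far more often than their names are read.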


#3

That makes sense. I didn’t know isbits() has such a meaning. Thanks @simonbyrne!

Follow-up questions:

  1. I find (see the result below) that instantiating a mutable type uses only one allocation, no matter how many mutable or immutable fields it has (as long as the type has at least one field and the default constructor is called). Exactly what does this single allocation do when a mutable type is instantiated? How can this single allocation allocate multiple fields?
julia> type MutableWith1String
           s::String
       end

julia> type MutableWith2Strings
           s1::String
           s2::String
       end

julia> type MutableWith3Strings
           s1::String
           s2::String
           s3::String
       end

julia> @benchmark MutableWith1String("abc")
BenchmarkTools.Trial:
  memory estimate:  16.00 bytes
  allocs estimate:  1
  ...

julia> @benchmark MutableWith2Strings("abc", "ABC")
BenchmarkTools.Trial:
  memory estimate:  32.00 bytes
  allocs estimate:  1
  ...

julia> @benchmark MutableWith3Strings("abc", "ABC", "123")
BenchmarkTools.Trial:
  memory estimate:  32.00 bytes
  allocs estimate:  1
  ...
  2. isbits(String) is false, so I thought instantiating a string would use one allocation according to your answer. On the contrary, I find that instantiating a string doesn’t use any allocations (like instantiating a tuple), whereas instantiating an array does. (See the result below.) How can we explain this?
julia> @benchmark "123"
BenchmarkTools.Trial:
  memory estimate:  0.00 bytes
  allocs estimate:  0
  ...

julia> @benchmark (1,2,3)
BenchmarkTools.Trial:
  memory estimate:  0.00 bytes
  allocs estimate:  0
  ...

julia> @benchmark [1,2,3]
BenchmarkTools.Trial:
  memory estimate:  112.00 bytes
  allocs estimate:  1
  ...

#4

I’m not 100% sure what is going on here, but I suspect that the string literal has already been allocated by the time the @benchmark macro sees it, so the allocation measurement is only for the wrapper object.
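
One way to probe this suspicion is to compare a string literal with a string built at run time. The sketch below assumes Julia 1.x; `literal` and `built` are hypothetical helper names. It compares the data pointers of the returned strings rather than relying on benchmark output.

```julia
# Assumes Julia 1.x; helper names are illustrative only.
literal() = "123"             # the literal is part of the compiled code
built(n) = string("12", n)    # constructs a new String on every call

a, b = literal(), literal()
c, d = built(3), built(3)

println(pointer(a) == pointer(b))   # do both calls return the same underlying data?
println(pointer(c) == pointer(d))   # do two run-time strings share their data?
println(c == d)                     # contents are equal either way
```

If the literal is indeed created once ahead of time, the first comparison should hold while the second should not, which would be consistent with the zero-allocation result for `@benchmark "123"`.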