Memory Comprehensions

Is there a Memory comprehension in Julia, which I missed so far?

We have Array comprehensions, but when you don’t need the multiple dimensions and resizing properties of an Array and a Memory is enough, it would be nice to construct a Memory directly without going through a (non-public) field access of an Array or an uninitialized Memory and a loop.
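For reference, the manual route today looks something like this (a minimal sketch; Memory requires Julia 1.11 or later):

```julia
# Manual construction: allocate an uninitialized Memory, then fill it in a loop.
mem = Memory{Int}(undef, 4)
for i in eachindex(mem)
    mem[i] = i^2
end
# mem now holds 1, 4, 9, 16
```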

[… for …] and (… for …) are already taken. {… for …} might be free: but are these cases so common that a dedicated syntax is warranted? On the one hand, Julia’s performance would probably benefit from everyone considering whether the full functionality of Array is really needed and makes sense. On the other hand, it only removes some inefficiencies, so there is not too much to gain here.

Would a Memory constructor accepting a generator make sense? This sounds very useful in general and could be a good compromise between code verbosity and syntax complexity.
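Such a constructor does not exist in Base today; the closest approximation I know of collects the generator into a Vector first and then copies into a Memory (so two allocations instead of one):

```julia
# Approximating a generator-accepting Memory constructor today:
# collect into a Vector, then copy the elements into a Memory.
v = collect(i^2 for i in 1:4)              # Vector{Int}
mem = Memory{eltype(v)}(undef, length(v))
copyto!(mem, v)
```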

Or is there any other already existing elegant solution to the problem?

I think the generator answer is probably our best option at this point.


It turned out that I didn’t need a generator; a function was enough. So I could mirror ntuple, which makes sense when you think about it.
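For comparison, ntuple already uses exactly this calling convention: a function of the index plus a length.

```julia
# ntuple(f, n) builds the tuple (f(1), f(2), ..., f(n)).
t = ntuple(i -> i^2, 4)
# t == (1, 4, 9, 16)
```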

function memory(f::Function, n::Integer)
    n >= 1 || lazy"Creation of empty `Memory` is not supported using `memory`." |> ArgumentError |> throw
    temp = f(1)
    mem = Memory{typeof(temp)}(undef, n)
    mem[1] = temp
    for i in 2 : n
        mem[i] = f(i)
    end
    return mem
end

julia> memory(i -> 4-i, 4)
4-element Memory{Int64}:
 3
 2
 1
 0

However, as Memory needs the element type before construction (in contrast to a Tuple), this is either a little less efficient (as shown) or needs the type as an additional parameter. Using the additional parameter supports the empty Memory, too. However, the user then effectively needs to provide the type twice for a non-empty Memory (as f’s return type and as a separate argument to memory).

Improvements welcome!

Should I submit this to Base (or Core)?

I cannot see where you show this. Could you add a benchmark?

For performance, you could try

function memory(f::F, n::Integer) where {F} 

in order to specialize on the function type (also, it doesn’t have to be a Function at all).
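The heuristic behind this tip (per the Julia performance tips) is that Julia may avoid specializing on a bare ::Function argument that is only passed through to another function, while a type parameter forces specialization. A minimal sketch, with made-up names for illustration:

```julia
# f is only passed along to map, not called directly, so Julia's
# heuristic may skip specializing on it:
pass_plain(f::Function, xs) = map(f, xs)

# The type parameter F forces a specialized method for each function type;
# the annotation also no longer requires f to be a Function subtype:
pass_spec(f::F, xs) where {F} = map(f, xs)
```

Both definitions return the same results; only the generated code can differ.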


Thanks! This is indeed both slightly faster and more generic.

Sure. The effect is small, but seems to be consistent here.

function memory(f::F, n::Integer) where F # use a parameter to enforce specialization
    n >= 1 || lazy"Creation of empty `Memory` is not supported using `memory`." |> ArgumentError |> throw
    temp = f(1)
    mem = Memory{temp |> typeof}(undef, n)
    mem[1] = temp
    for i in 2 : n
        mem[i] = f(i)
    end
    return mem
end

function memory(f::F, ::Type{T}, n::Integer) where {F, T} # ::Type{T} instead of T::DataType also accepts union and UnionAll types
    mem = Memory{T}(undef, n)
    for i in 1 : n
        mem[i] = f(i)
    end
    return mem
end

julia> @b memory(zero, $7)
9.653 ns (1 allocs: 80 bytes)

julia> @b memory(zero, Int, $7)
9.539 ns (1 allocs: 80 bytes)

I wouldn’t put much stock in <1 ns or <10% differences like that; they are just as likely due to random chance or some artifact of your evaluation order.

julia> @benchmark memory(zero, Int, $7)
BenchmarkTools.Trial: 10000 samples with 998 evaluations per sample.
 Range (min … max):  16.232 ns … 719.639 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     17.836 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   22.084 ns ±  26.904 ns  ┊ GC (mean ± σ):  6.76% ± 6.16%

  ▆█▆▅▄▃▂▁        ▁                                            ▁
  █████████▆▅▆▆▇▇▆██▇▇▇▆▄▅▄▃▃▃▃▄▅▅▆▇▇▆▆▇▆▆▇▆▇▆▇▇▆▆▆▆▅▅▅▃▄▄▂▃▃▂ █
  16.2 ns       Histogram: log(frequency) by time      68.4 ns <

 Memory estimate: 80 bytes, allocs estimate: 1.

julia> @benchmark memory(zero, $7)
BenchmarkTools.Trial: 10000 samples with 999 evaluations per sample.
 Range (min … max):  16.316 ns …  1.192 μs  ┊ GC (min … max):  0.00% … 95.13%
 Time  (median):     17.518 ns              ┊ GC (median):     0.00%
 Time  (mean ± σ):   23.684 ns ± 41.628 ns  ┊ GC (mean ± σ):  12.41% ±  7.46%

  ██▅▃▁                    ▁▂▂▁▁                              ▂
  ████████▅▅▄▆▆▅▅▅█▇▅▆█▇▅▄▇███████▇█▆▆▆▆▆▆▅▆▆▆▅▅▄▅▅▅▅▅▃▅▃▃▄▁▄ █
  16.3 ns      Histogram: log(frequency) by time      77.5 ns <

 Memory estimate: 80 bytes, allocs estimate: 1.

The distributions are noticeably different even with 10000 samples each, those small differences in statistics are to be expected. If I just ran @btime, I’d see that the extra Int input makes it a bit faster, but the actual stats usually went the other way.

Yes, the difference is indeed small. That’s why I selected the user-friendly approach here (only two parameters, although you could argue that it’s not user friendly when it fails for n = 0).

However, the effect was noticeable across different sizes. Additionally, the three-argument version seems easier to optimize in general: since the element type is known up front, it could use SIMD from the start (although I don’t think it currently does), while the two-argument version would need some really clever compiler tricks to do so.
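As a sketch of that point, the three-argument fill loop can at least be annotated for vectorization (memory_simd is a hypothetical name; whether SIMD actually fires still depends on f and T):

```julia
# Hedged sketch: with the element type known before the loop, the fill
# can be marked @inbounds @simd. This does not guarantee vectorization,
# it merely permits it when f is simple enough.
function memory_simd(f::F, ::Type{T}, n::Integer) where {F, T}
    mem = Memory{T}(undef, n)
    @inbounds @simd for i in 1:n
        mem[i] = f(i)
    end
    return mem
end
```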

If we want to have it as a PR to Julia, I am open as to which version to prepare. We could even provide both methods.