Iterating over cols, vcat, repeat?

I come from the R world and am here for performance. @time reports (123.55 k allocations: 6.260 MiB) for the code below, which is more allocations than I expected. So I was wondering whether the code could be improved?

m = reshape(1:12, 3, :);
map([m[:, i] for i = 1:size(m, 2)]) do x
    vcat(repeat(x, 1, 2), zeros(Int, 1, 2))
end

More specifically, I have two sub-questions:

  • Is iterating over columns with [m[:, i] for i = 1:size(m, 2)] expensive? The issue https://github.com/JuliaLang/julia/issues/14491 has not been closed yet.
  • Are vcat and/or repeat expensive? If so, any suggestions for improvement? Thanks!

You are benchmarking in global scope; try this:

julia> m = reshape(1:12, 3, :);

julia> @time map([m[:, i] for i = 1:size(m, 2)]) do x
       vcat(repeat(x, 1, 2), zeros(Int, 1, 2))
       end
  0.088479 seconds (123.55 k allocations: 6.259 MiB)
4-element Array{Array{Int64,2},1}:
 [1 1; 2 2; 3 3; 0 0]      
 [4 4; 5 5; 6 6; 0 0]      
 [7 7; 8 8; 9 9; 0 0]      
 [10 10; 11 11; 12 12; 0 0]

julia> function f()
           m = reshape(1:12, 3, :);
           map([m[:, i] for i = 1:size(m, 2)]) do x
           vcat(repeat(x, 1, 2), zeros(Int, 1, 2))
           end
       end
f (generic function with 1 method)

julia> using BenchmarkTools

julia> @btime f()
  929.750 ns (23 allocations: 2.23 KiB)

The manual tells you all you need to know about benchmarking and performance.
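
As an aside, BenchmarkTools also lets you benchmark an expression that uses a global by interpolating the global with $. A minimal sketch, assuming BenchmarkTools is installed:

using BenchmarkTools

m = reshape(1:12, 3, :)

# Interpolating the global with $ compiles the expression as if m were a local,
# so the timing reflects the actual work rather than the global-scope penalty.
@btime map(x -> vcat(repeat(x, 1, 2), zeros(Int, 1, 2)), [$m[:, i] for i = 1:size($m, 2)]);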

If you are worried about allocations, you can reduce them further like this:

julia> function f()
           m = reshape(1:12, 3, :);
           map(@view(m[:, i]) for i = 1:size(m, 2)) do x
           vcat(repeat(x, 1, 2), zeros(Int, 1, 2))
           end
       end
f (generic function with 1 method)

julia> using BenchmarkTools

julia> @btime f()
  800.585 ns (17 allocations: 1.67 KiB)

The changes are

  • A generator instead of a comprehension as the argument to map; this avoids allocating the intermediate vector.
  • @view, which avoids allocating a copy of each slice m[:, i] (see the sketch below).
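
A minimal sketch of the two effects, assuming BenchmarkTools is installed; sum is only there so the columns are actually consumed:

using BenchmarkTools

m = reshape(1:12, 3, :)

# Comprehension of slices: allocates the outer vector, and each m[:, i] copies its column.
@btime map(sum, [$m[:, i] for i = 1:size($m, 2)]);

# Generator of views: the columns are produced lazily and without copying.
# (eachcol(m) also yields views and is often the most readable spelling.)
@btime map(sum, (@view($m[:, i]) for i = 1:size($m, 2)));
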
4 Likes

For optimal performance, consider a devectorized solution:

function f()
    m = reshape(1:12, 3, :)
    p,q = size(m)
    A = Vector{Matrix{Int64}}(undef, q)     # one output matrix per column of m
    @inbounds for i = 1:q
        A[i] = Matrix{Int64}(undef, p+1, 2)
        for k = 1:p
            A[i][k,1] = A[i][k,2] = m[k,i]  # duplicate the column into both output columns
        end
        A[i][p+1,1] = A[i][p+1,2] = 0       # trailing row of zeros
    end
    A
end

julia> @btime f()
  172.140 ns (5 allocations: 688 bytes)
5 Likes

Many thanks for all the help. I tested the built-in function repeat against the “devectorized” approach and found that:

julia> @btime repeat(1:5, 2, 3);
  121.025 ns (1 allocation: 336 bytes)

julia> @btime v = Vector{Int}(undef, 30); for i=1:5:30; v[i:(i+4)] = 1:5; end; reshape(v, 10, 3);
  34.007 ns (1 allocation: 336 bytes)

Well, could I conclude that it is better to avoid built-in vectorized array functions for optimal performance in Julia? Thanks!

This doesn’t do what you think. You’re only benchmarking v = Vector{Int}(undef, 30), and then running the remaining lines after the benchmark is finished. The semicolon ends the expression, and so only the part before the first semicolon is picked up by @btime.

Try running it in a new Julia session to see that it actually errors when trying to reference v after the benchmark completes:

julia> @btime v = Vector{Int}(undef, 30); for i=1:5:30; v[i:(i+4)] = 1:5; end; reshape(v, 10, 3);
  27.772 ns (1 allocation: 336 bytes)
ERROR: UndefVarError: v not defined
Stacktrace:
 [1] top-level scope at ./REPL[23]:1 [inlined]
 [2] top-level scope at ./none:0
4 Likes

And note that you don’t necessarily have to squeeze everything onto one line; you can use begin/end instead to create a compound expression.

@btime begin
    v = Vector{Int}(undef, 30)
    for i=1:5:30
        v[i:(i+4)] = 1:5
    end
    reshape(v, 10, 3)
end
2 Likes

My advice: Don’t worry so much about performance at the initial stage of your project. Instead try to make your code as readable and flexible as possible. That typically means using the built-in functions over devectorized code. During development, add plenty of unit tests for your code. Once your project is nearing completion, profile the code, with production-sized input data, to see where the bottlenecks are. Then optimize those parts only. Your unit tests will make sure that you don’t mess anything up while refactoring your code.
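
If you have not profiled Julia code before, here is a minimal sketch using the Profile standard library (my_pipeline is a hypothetical stand-in for your own entry point):

using Profile

my_pipeline(data) = sum(abs2, data)   # stand-in workload; replace with your real code

data = rand(10^7)                     # production-sized input
my_pipeline(data)                     # run once first so compilation is not profiled

@profile my_pipeline(data)
Profile.print(format = :flat, sortedby = :count)   # flat listing of the hottest lines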

In my experience, with enough work, you can usually speed up code by devectorizing it, but in practice there is rarely a need to do so. Either the speedup is negligible, the part you’re optimizing is not a bottleneck, or the built-in function does such a good job that it’s not worth the effort and added complexity in your code to try to beat it.

3 Likes