I wonder if there is a simple way to construct a container (Array or Tuple) with comprehensions or similar but with the performance of a hand-written container:
julia> @time [fill(1, 1000), fill(1, 1000), fill(1, 1000), fill(1, 1000)]
0.000024 seconds (9 allocations: 32.016 KiB)
julia> @time [fill(1, 1000) for i in 1:4]
0.101525 seconds (49.46 k allocations: 2.536 MiB)
I am also asking whether there is a way to declare variables in a @threads for-loop independently on every thread. I arrived at the example above while working around that problem.
You’re just measuring compilation time here — a comprehension creates a function to do its work, so at the REPL each time you run it, it’ll create a new function that needs to be compiled. But when the comprehension itself is in a function, then you’ll only pay that cost once:
julia> f() = [fill(1, 1000) for i in 1:4]
f (generic function with 1 method)
julia> @time f();
0.049694 seconds (102.72 k allocations: 5.554 MiB)
julia> @time f();
0.000039 seconds (9 allocations: 32.016 KiB)
I highly recommend using the BenchmarkTools package — it’ll deal with many of these intricacies for you.
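A minimal sketch of what that looks like, reusing the `f` from above. `@belapsed` is an additional BenchmarkTools macro I'm bringing in for illustration; timings vary by machine, so none are shown:

```julia
using BenchmarkTools: @btime, @belapsed

f() = [fill(1, 1000) for i in 1:4]

# @btime evaluates the call many times and prints the minimum time and
# the allocations, so the one-off compilation cost is not what you see.
@btime f();

# @belapsed does the same sampling but returns the minimum time in seconds.
t = @belapsed f()
```

If you benchmark an expression that uses global variables, interpolate them with `$` (for example `@btime sum($x)`) so that you are not also measuring the overhead of accessing a non-constant global.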
For tuples, the situation is a bit more complicated, since you might want the compiler to correctly infer the tuple type (including its size).
Example
Hand-written version:
julia> using Test: @inferred
julia> using BenchmarkTools: @btime
julia> f1() = (fill(1,1000), fill(1,1000), fill(1,1000), fill(1,1000))
f1 (generic function with 1 method)
julia> @inferred f1();
julia> @btime f1();
2.575 μs (5 allocations: 31.80 KiB)
Comprehension: not only do you lose a bit of performance, but also (and more importantly) you lose the correct inference of the return type (the tuple size is not deduced):
julia> f2() = Tuple(fill(1,1000) for i in 1:4)
f2 (generic function with 1 method)
julia> @inferred f2();
ERROR: return type NTuple{4,Array{Int64,1}} does not match inferred return type Tuple{Vararg{Array{Int64,1},N} where N}
julia> @btime f2();
3.451 μs (6 allocations: 31.91 KiB)
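If you want to look at the inferred type directly rather than through the `@inferred` error, here is a small sketch. It relies on `Base.return_types`, an unexported reflection helper (my addition, not part of the example above), and what it reports for `f2` may differ between Julia versions:

```julia
f1() = (fill(1, 1000), fill(1, 1000), fill(1, 1000), fill(1, 1000))
f2() = Tuple(fill(1, 1000) for i in 1:4)

# At run time both functions produce a tuple of the same concrete type.
same_runtime_type = typeof(f1()) == typeof(f2())

# What the compiler infers ahead of time for a call with no arguments.
inferred_f1 = Base.return_types(f1, ())[1]
inferred_f2 = Base.return_types(f2, ())[1]

# The hand-written tuple has a concrete inferred type (its length is known);
# for the generator version, compare the result with the @inferred error.
isconcretetype(inferred_f1)
isconcretetype(inferred_f2)
```

A non-concrete inferred type matters mostly for the code that consumes the tuple: callers can no longer be specialized on its length.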
To get the nice properties of the hand-written form in a more concise and DRY way, you can use the ntuple function: