What's the fastest way to generate `[1,2,....,n]`?


#1

For an algorithm I need to generate x = [1,2,...,n] for a large n and I can’t do x = 1:n as I need to reassign the values in x so I can do something like x[10] = x[2].

Currently I am doing x=collect(1:n) and I tested x =[1:n...] and they are both similar in speed. But am wondering if those are the fastest way already?


#2

Perhaps Array(1:10)? The speed should be the same though.


#4

For collect and Array in v0.6.1, it is almost the same:

using BenchmarkTools
f1(n) = Array(1:n)
f2(n) = collect(1:n)
n = 1_000_000

julia> @benchmark f1($n)
BenchmarkTools.Trial: 
  memory estimate:  7.63 MiB
  allocs estimate:  3
  --------------
  minimum time:     514.845 μs (0.00% GC)
  median time:      1.730 ms (0.00% GC)
  mean time:        1.729 ms (28.30% GC)
  maximum time:     5.186 ms (64.53% GC)
  --------------
  samples:          2878
  evals/sample:     1

julia> @benchmark f2($n)
BenchmarkTools.Trial: 
  memory estimate:  7.63 MiB
  allocs estimate:  3
  --------------
  minimum time:     407.858 μs (0.00% GC)
  median time:      579.133 μs (0.00% GC)
  mean time:        808.760 μs (27.26% GC)
  maximum time:     7.762 ms (57.44% GC)
  --------------
  samples:          6156
  evals/sample:     1

Preallocating and reusing x will save you a bit.


#5
  [1:10;]

#6

Why would that be faster? You are generating 5 random numbers that you then throw away and then fill the array with what you actually want.


#7

OK, reread it 5 times now. I misunderstood. Removing comment as to keep this clean.


#8

Will lazy.jl suit this purpose?


#9

Because your question on SO I think you probably want to have array with 2 billion elements. And that you like some concrete problem to solve.

I am just curious. If you cannot use something like array(n) = n instead of array = [1:2^31...] then you probably need to change these values.

But are you planning shrinking this array with deleteat! or something similar?

Maybe you could set array by init values: array = zeros(2^31) (which has to be quick) and use something like getvalue(index) = array[index] == 0 ? index : array[index] if you don’t need to look at values too often.

Don’t you want describe your problem in gist or blog (or here if it is not too long) to help us understand what to optimize?


#10

It’s actually something different to SO post now. Basically I am trying to
create an algorithm to sort a vector and produce its sortperm at the same
time. To start the algo I need an index vector from 1:n which I then
permute.

Anyway I think collect(v) is pretty fast but just wanted to make sure


#11

@benchmark zeros(Int64, $n) # I am surprised - this is really not faster?


#12

Maybe use a dict storing all all elements where xi !=I ?


#13
using BenchmarkTools
function genn(n)
    x = zeros(Int,n)
    x .= 1:n
    x
end
@benchmark genn(100_000_000)

function genn1(n)
    collect(1:n)
end
@benchmark genn1(100_000_000)

got roughly the same results


#14

If you can use Int32, it might be faster.


#15

Indeed.

using Base.Threads
function fcollect(N,T=Int)
    nt = nthreads()
    n,r = divrem(N,nt)
    a = Vector{T}(N)
    @threads for i=1:nt
        ioff = (i-1)*n
        nn = ifelse(i == nt, n+r, n)
        @inbounds for j=1:nn
            a[ioff+j] = ioff+j
        end
    end
    a
end

This is twice as fast for T=Int32 as for T=Int64, and both are substantially faster than collect(1:N) with several threads. But it’s still a factor of 2 under the memory bandwidth limit (on my system, at least) so the field is still open.


#16

Wow! This is amazing


#17

In the future, it would be nice to do this threading automatically at least in simple cases. Should be doable once threading is more solid and turned on by default.


#18

It’d be great if collect(1:n) ends up as the fastest implementation of this.