Concatenate Generators

Hi,
I am trying to initialize a vector in two parts, for example a vector with fifty 0s followed by fifty 1s.
I can create it in a loop, but it is not very concise:

function foo1()
    a = Array{Int64}(undef, 100)
    for i in 1:50
        a[i]=0
    end
    for i in 51:100
        a[i]=1
    end
    return a
end

julia> @btime foo1()
  189.182 ns (1 allocation: 896 bytes)

I can also use vcat, but then unnecessary allocations happen:

julia> @btime [zeros(Int, 50); ones(Int, 50)]
  341.602 ns (3 allocations: 1.84 KiB)

However, when working with ranges, vcat is smart enough not to allocate temporaries:

julia> @btime [1:50;1:50]
  155.489 ns (1 allocation: 896 bytes)

I tried to vcat generators like this, but it did not work:

vcat((0 for i in 1:50), (1 for i in 1:50))

I also tried this, but it allocates a surprising amount (and it is pretty ugly for such a simple task):

julia> @btime collect(Iterators.Flatten(((0 for i in 1:50), (1 for i in 1:50))))
  16.240 μs (313 allocations: 16.50 KiB)

Is there a way to concatenate generators properly, without extra allocations?

julia> VERSION
v"1.6.0"

How about

[x > 50 ? 1 : 0 for x = 1:100]

function foo(N)
    a = Vector{Int64}(undef, 2N)
    a[1:N] .= 0
    a[N+1:end] .= 1
    return a
end

# or, a little bit slower

function bar(N)
    a = zeros(Int, 2N)
    a[N+1:end] .= 1
    return a
end

You can have zero-step ranges with StepRangeLen(start, step, length):

julia> @btime [StepRangeLen(0,0,50); StepRangeLen(1,0,50)];
  65.886 ns (1 allocation: 896 bytes)

julia> @btime [1:50;1:50];
  64.460 ns (1 allocation: 896 bytes)
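
For comparison, here is a small sketch of two other spellings next to the zero-step ranges. It assumes only Base functions (repeat with the inner keyword, and fill); the variable names are illustrative, and timings are not claimed here, so benchmark with @btime before picking one.

```julia
# Three ways to build fifty 0s followed by fifty 1s as a Vector{Int}.
a = repeat(0:1, inner=50)        # repeat each element of 0:1 fifty times
b = [fill(0, 50); fill(1, 50)]   # vcat of two temporary vectors
c = [StepRangeLen(0, 0, 50); StepRangeLen(1, 0, 50)]  # zero-step ranges

println(a == b == c)             # all three hold the same values
```

The fill version builds two temporaries before concatenating, much like the zeros/ones version from the question, so it is shown only for readability.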

OK, very interesting solutions so far!

Works very well for this example, nice solution.

I thought that slicing would make a copy, but I guess this is not the case here because it is an lvalue. I think this is the cleanest solution.

I knew about the length, step and stop arguments of range, but I didn't know about this function, nice!

Slicing syntax on the left-hand side of an assignment does not create a copy. A plain a[I] = x calls the setindex! function, and the broadcasted form a[I] .= x writes in place through a view of the array at the supplied indices.
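
A short sketch of the difference, using a small made-up array (the names a, s and v are only for illustration):

```julia
a = zeros(Int, 6)

a[1:3] .= 7        # broadcasted assignment: writes into `a` in place
a[4:5] = [8, 9]    # plain indexed assignment: calls setindex!(a, [8, 9], 4:5)

s = a[1:3]         # on the right-hand side, slicing makes a copy
s[1] = -1          # so this does not touch `a`

v = @view a[1:3]   # a view shares memory with `a`
v[2] = 0           # so this does modify `a`

println(a)
println(s)
```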

This returns a view of the last 50 indices, not the entire vector.


Oops, you are right… deleting it :slight_smile:

Yes, this makes sense, thank you very much!

This bit did not need the collect, unless you really need a Vector instead of a generator. However, it seems that even without the collect it allocates a lot, which is really surprising to me.

The following creates a BitVector of 0s and 1s. Not the fastest, but simple:

julia> @btime (1:100) .> 50
  192.980 ns (2 allocations: 128 bytes)

I really need a Vector, but personally, I have no allocations without the collect.

Ah, I am using Julia 1.5.3; you are probably using 1.6?

Well, if you need a Vector then you will always need at least one allocation of all its elements.


One issue is that in Iterators.Flatten(((0 for i in 1:50), (1 for i in 1:50))) the eltype-trait mechanism has given up and returns Any:

julia> eltype(Iterators.Flatten(((0 for i in 1:50), (1 for i in 1:50))))
Any

That doesn't matter for iteration, but it does for collect.
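
If the element type is known in advance (Int is assumed here), one possible way around the Any element type is to hand that type to collect, or to append each generator to a typed vector. This is a sketch, not a measured fix: whether it actually reduces allocations on a given Julia version should be checked with @btime.

```julia
gens = ((0 for i in 1:50), (1 for i in 1:50))
flat = Iterators.flatten(gens)

# The caller supplies the element type, so collect does not rely on eltype(flat).
a = collect(Int, flat)

# Alternative: grow a typed vector, reserving space for 100 elements up front.
b = sizehint!(Int[], 100)
append!(b, gens[1])
append!(b, gens[2])

println(typeof(a), " ", a == b)
```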

Yes, I'm using 1.6. I have no problem with the allocations in the previous examples, as they do not allocate more than needed. I still don't understand where the allocations in this specific example come from.

Edit :

Oh, I see, maybe there is room for improvement here?

Not so much. A generator stores an anonymous function for its inner expression (here, a function returning 0), so working out its element type runs into the general issue of return-type inference for functions in Julia:

julia> G = (0 for i in 1:50)
Base.Generator{UnitRange{Int64}, var"#13#14"}(var"#13#14"(), 1:50)

julia> G.f
#13 (generic function with 1 method)

julia> G.f(1)
0
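
To make the two fields concrete, here is a small sketch that rebuilds the generator's output by hand. It relies only on the f and iter fields shown in the REPL output above; the names v and same are illustrative.

```julia
G = (0 for i in 1:50)

println(G.iter)                # the collection being iterated, 1:50
println(G.f(1))                # the stored function applied to one element

v = collect(G)                 # collect picks the element type from the values
same = v == map(G.f, G.iter)   # collecting is equivalent to mapping f over iter

println(eltype(G))             # the generator itself reports Any
println(eltype(v), " ", same)
```

So the values are inferred fine once they are produced; it is only the generator's advertised eltype that stays at Any.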

This one is cool:

@btime round.(LinRange(0,1,100))
  69.485 ns (1 allocation: 896 bytes)

and better:

@btime (sign.(-49:50) .+ 1).÷2
  57.986 ns (1 allocation: 896 bytes)

I don't see where the problem is. Julia seems to be able to infer the return type?

julia> Base.return_types(G.f, (eltype(G.iter),))
1-element Vector{Any}:
 Int64

As a workaround, you could use e.g.

Iterators.flatten(map(x -> (x for i in 1:50), (0, 1)))

For me, it runs into the same issue: the inferred element type is still Any.
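
Since the final length is known here, another option is to skip Flatten entirely and write each generator into a preallocated vector. The helper below, fill_from!, is a hypothetical name and not part of Base; it is a minimal sketch that assumes the destination is at least as long as all the iterators combined, and its performance has not been measured.

```julia
# Hypothetical helper: copy the elements of several iterators, in order,
# into an existing vector. Assumes `dest` is long enough to hold them all.
function fill_from!(dest::AbstractVector, iters...)
    i = firstindex(dest)
    for it in iters
        for x in it
            dest[i] = x
            i += 1
        end
    end
    return dest
end

a = fill_from!(Vector{Int}(undef, 100), (0 for _ in 1:50), (1 for _ in 1:50))
println(a[50], " ", a[51])
```

The only array allocated by this sketch is the destination itself, which is the same single allocation the loop version foo1 from the question makes.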