Concatenating iterables without allocating memory

Hi, for a library I am working now I need to find minimum and maximum values. I am looking elegant way to express that.

signal1 = rand(10000000)
signal2 = rand(10000000)
@time lo, hi = extrema(vcat(signal1, signal2))
  0.159702 seconds (7 allocations: 152.588 MiB)

Code above looks good but allocates lots of memory just to iterate values, so it’s not making sense for me.

In C# I can write

(lo, hi) = signal1.Concat(signal2).Extrema();

and there is no allocation at all. It there any predefined method in Base that works in a similar way?

2 Likes

Iterators.flatten

7 Likes
julia> vec1 = rand(100);

julia> vec2 = rand(100);

julia> extrema1 = extrema(vec1)
(0.019555577401858537, 0.9995712478383711)

julia> extrema2 = extrema(vec2)
(0.016606741252352952, 0.9849117868980357)

julia> extrema_combined = extrema(vcat(collect(extrema1),collect(extrema2)))
(0.016606741252352952, 0.9995712478383711)
1 Like

collect will allocate, and so will vcat.

1 Like

@ChrisRackauckas it still allocates memory

@time Iterators.flatten((signal1, signal2)) |> extrema
  0.285192 seconds (20.00 M allocations: 610.352 MiB, 24.17% gc time)
1 Like
julia> length(extrema1)
2

Yes it will allocate! I am losing sleep over the remaining of my 8 gigabytes of computer memory.

1 Like

@ChrisRackauckas, ok this works for me, I need to read implementations to understand what is going on

@time Iterators.flatten(zip(signal1, signal2)) |> extrema
  0.080821 seconds (8 allocations: 256 bytes)

Just because an example is length 2 doesn’t mean the real code is… it’s still a valid question to compose iterators without allocating intermediates…

2 Likes

I think @StevenSiew is right here, extrema always returns 2 elements no matter how long input is, and it does not allocate.

I will stick to solution with Iterators.flatten, I wrote a helper method

function concatenate(iterables...)
    return zip(iterables...) |> Iterators.flatten
end

@time lo, hi = extrema(concatenate(signal1, signal2))
  0.072824 seconds (12 allocations: 352 bytes)
(4.430056321780285e-9, 0.9999998165880861)

I cannot reason why sometimes flatten allocates even more than the size of arrays is

@time Iterators.flatten((signal1, signal2)) |> extrema
  0.281161 seconds (20.00 M allocations: 610.352 MiB, 23.62% gc time)
(4.430056321780285e-9, 0.9999998165880861)

@time Iterators.flatten([signal1; signal2;]) |> extrema
  0.174777 seconds (8 allocations: 152.588 MiB, 11.98% gc time)
(4.430056321780285e-9, 0.9999998165880861)

@time Iterators.flatten(vcat(signal1, signal2)) |> extrema
  0.159507 seconds (8 allocations: 152.588 MiB)

@time Iterators.flatten(zip(signal1, signal2)) |> extrema
  0.073975 seconds (8 allocations: 256 bytes)
(4.430056321780285e-9, 0.9999998165880861)

It seems implementation is here, correct me if I am wrong:
https://github.com/JuliaLang/julia/blob/2d5741174ce3e6a394010d2e470e4269ca54607f/base/iterators.jl#L898
I will try to figure it out tomorrow, it’s very late in my time zone.

1 Like

It looks like the problem is with extrema function. I put details in existing github issue. It describes different problem but they can be fixed together. I think I have draft of the solution, if no one will reply I will open pull request.

https://github.com/JuliaLang/julia/issues/31442#issuecomment-573939429

Is CatViews.jl useful for your purposes?

1 Like

Nope, it is slow and allocates memory like a crazy, but thanks, the API is nice.

ulia> @time CatView(signal1, signal2) |> minimum
 24.332661 seconds (320.16 M allocations: 14.611 GiB, 6.30% gc time)
2.4065947235030194e-8

julia> @time CatView(signal1, signal2) |> extrema
 25.199472 seconds (340.00 M allocations: 14.901 GiB, 6.13% gc time)
(2.4065947235030194e-8, 0.9999999501303669)

julia> @time Iterators.flatten((signal1, signal2)) |> extrema
  0.311673 seconds (20.00 M allocations: 610.352 MiB, 25.23% gc time)
(2.4065947235030194e-8, 0.9999999501303669)

Wow, interesing! I thought that was what CatViews was for!

Argel Ramírez Reyes

In addition to the problems with extrema, you shouldn’t use the @time macro for benchmarking, and especially avoid benchmarking with non-constant globals.

Use the BenchmarkTools package, and remember to interpolate the variables.

1 Like

Flatten is a typical example where performance is not great with the lazy iterator approach. Fortunately, we’ll have transducers in Julia 1.4. It means that we can write efficient “fused” extrema quite easily:

julia> using BenchmarkTools

julia> @btime extrema(Iterators.flatten(($signal1, $signal2)));
  258.958 ms (20000002 allocations: 610.35 MiB)

julia> @btime mapfoldl(
           x -> (x, x),
           ((a, b), (c, d)) -> (min(a, c), max(b, d)),
           Iterators.flatten(($signal1, $signal2)),
       );
  42.014 ms (0 allocations: 0 bytes)

OK, I lied. Above example actually does not work yet, until this bug fix is merged

https://github.com/JuliaLang/julia/pull/34369

Trying to comment here actually helped me find out this bug :slight_smile:

5 Likes

This solution is not particularly nice but it is allocation free in released Julia 1.x versions:

julia> @btime extrema(Iterators.flatten(extrema.(($signal1, $signal2))));
  47.482 ms (0 allocations: 0 bytes)
1 Like

Is this the syntax with transducers? I must admit I find it a bit hard to read.

The ugly mapfoldl I wrote above is just a plain old mapfoldl. It should be executable in all Julia 1.x.

I should’ve clarified that transducers built in Base is just an implementation detail for executing foldl (see Transducer as an optimization: map, filter and flatten by tkf · Pull Request #33526 · JuliaLang/julia · GitHub). Things like sum(x for x in xs if x !== missing) are processed through transducers “behind your back” so that it can be compiled to a better machine code. But there is no stable API for touching transducers in Base.

I think it’s better to use transducers in reduce as well so that you can just write extrema(Iterators.flatten(...)) and it’s fast.

2 Likes

@tkf, thank you so much, I am happy my example helped you to fix that :smiley:
I am coming from C# and F# world, and I am really missing those pipelining features. When I first realised that in Julia map(x->2x, someArray) allocates and array for its result I was more than disappointed :slight_smile: