List Comprehension and Map

I know there are a number of topics already open on this subject, but I haven’t found this specific pattern yet. I think I’m looking for the map ... do equivalent of a conditional array comprehension.

Here’s a small example, and why it isn’t working:

a = rand(10)
f = exp  # or any other function

b = map(a) do x
  c = f(x)
  if c > 2
    return c
  end  # no else branch, so rejected elements evaluate to nothing
end

filter(!isnothing, b)

The problem with this approach is that the final type is Vector{Union{Nothing, Float64}} when I need it to simply be Vector{Float64} for subsequent steps.
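To make the type issue concrete, here is a minimal sketch with fixed inputs (the values in `a` are made up so that the outcome does not depend on `rand`). It only illustrates that `filter` keeps the element type of its input; the `collect(Float64, ...)` call at the end is one possible way to narrow the type afterwards, not necessarily the best one.

```julia
# Hypothetical fixed inputs so the result does not depend on rand.
a = [0.1, 0.9, 0.5, 1.0]
f = exp

b = map(a) do x
    c = f(x)
    c > 2 ? c : nothing   # explicit `nothing` for rejected elements
end

kept = filter(!isnothing, b)

println(eltype(b))     # element type of the map result
println(eltype(kept))  # filter keeps the element type of its input

# One possible fix: convert explicitly once all `nothing`s are gone.
narrowed = collect(Float64, kept)
println(eltype(narrowed))
```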

If I use conditional array comprehension the type is correct, but I need to repeat the calculation of the intermediate variable c = f(x) which is more complicated in my use case.

b = [f(x) for x in a if f(x) > 2]

Is there a third way that returns the correct type without requiring I calculate c = f(x) twice?

I sometimes use this pattern for that sort of thing:

b = [c for x in a for c = f(x) if c > 2]
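In case it helps to see why this works: numbers in Julia are iterable and yield themselves exactly once, so the inner `for c = f(x)` just binds `c` to the result. The sketch below uses made-up inputs, and `g` is a hypothetical array-valued function added only to show the 1-tuple variant for results that are not scalars.

```julia
a = [0.1, 0.9, 0.5, 1.0]   # hypothetical fixed inputs
f = exp

# `for c = f(x)` iterates over the scalar f(x), which yields exactly one value,
# so f is evaluated once per element of a.
b = [c for x in a for c = f(x) if c > 2]

# If the function returns an array, iterating over it would walk its elements.
# Wrapping the result in a 1-tuple binds the whole array to c instead.
g(x) = [x, 2x]             # hypothetical array-valued function
d = [c for x in a for c in (g(x),) if sum(c) > 2]
```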

This should do the trick

using BenchmarkTools 

a = rand(1000000)
f = exp  # or any other function

b1 = @btime [$f(x) for x in $a if $f(x) > 2]
b2 = @btime [fx for fx in ($f(x) for x in $a) if fx > 2]


  14.796 ms (14 allocations: 5.23 MiB)
  14.368 ms (14 allocations: 5.23 MiB)

but maybe we need to use a more expensive function to see an effect.

This should be quite fast:

b = filter(>(2), f.(a))

It is 🙂

b3 = @btime filter(>(2), f.(a))


  7.972 ms (7 allocations: 15.26 MiB)

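It is even faster to use filter! here, since f.(a) is already a fresh array and filter! can shrink it in place instead of allocating a second one.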
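A minimal sketch of the difference, with made-up inputs. Both calls give the same values; the in-place version mutates only the temporary produced by `f.(a)`, so `a` itself is left untouched.

```julia
a = [0.1, 0.9, 0.5, 1.0]   # hypothetical fixed inputs
f = exp

b_copy    = filter(>(2), f.(a))    # allocates f.(a), then a second, filtered array
b_inplace = filter!(>(2), f.(a))   # shrinks the temporary f.(a) in place
```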


I don’t know if that’s the case for you, but if your function is invertible (and monotonically increasing, as exp is), you could filter the inputs first and do something like this:

julia> f.(filter(>(log(2)), a))==filter(>(2), f.(a))

julia> @btime f.(filter(>(log(2)), a))
  2.317 ms (7 allocations: 9.96 MiB)

Thanks for the help, everyone. I realize that my MWE was a little too minimal. Here’s a better one below.

function foo(x)  # expensive function
    for _ in 1:1000
        x = sqrt(x^2)
    end
    return x
end

function bar(x)  # more expensive function
    for _ in 1:10000
        x = sqrt(x^2)
    end
    return x
end

# desired functionality
b0 = map(a) do x
    c = foo(x)
    if c > 0.5
        return bar(c)
    end
end
filter(!isnothing, b0)

@goerch’s and @Seif_Shebl’s suggestions are still applicable and give the following:

a = rand(100)
b1 = @btime [$bar($foo(c)) for c in $a if $foo(c) > 0.5]
b2 = @btime [$bar(c) for c in ($foo(x) for x in $a) if c > 0.5]
b3 = @btime $bar.(filter!(>(0.5), $foo.($a)))


  3.304 ms (5 allocations: 1.98 KiB)
  3.097 ms (5 allocations: 1.98 KiB)
  3.031 ms (2 allocations: 1.41 KiB)

Unfortunately, I can’t assume my function is invertible @rocco_sprmnt21.

Is there a more efficient way to use filter! in this context?


In the case that you can’t filter out data before applying foo, this is about as efficient as it gets, since the time spent calling those functions vastly dominates the time it takes to make a few allocations along the way:

# since we want only the time necessary for execution but not for allocating, we need something to prevent calls from being compiled away entirely
julia> @noinline noop(x) = x
noop (generic function with 1 method)

# time it takes for the foo and bar calls only:
julia> @btime for x in $a
           noop(foo(x))
       end
  511.200 μs (0 allocations: 0 bytes)

julia> barinputs = filter!(>(0.5), foo.(a));

julia> @btime for x in $barinputs
           noop(bar(x))
       end
  2.552 ms (0 allocations: 0 bytes)

so this is fairly optimal already:

julia> b3 = @btime bar.(filter!(>(0.5), foo.($a)));
  3.057 ms (2 allocations: 1.31 KiB)

In case you worry about the allocations: if you are fine with overwriting your input a, you can get rid of them, but unless the intermediate values are huge in storage, the net gain is negligible:

julia> function foowithoutallocations!(a)
           @inbounds for i in eachindex(a)
               a[i] = foo(a[i])
           end
           return a  # return the array so the call can be chained into filter!
       end
foowithoutallocations! (generic function with 1 method)

julia> function barwithoutallocations!(a)
           @inbounds for i in eachindex(a)
               a[i] = bar(a[i])
           end
           return a  # return the array so the result can be assigned
       end
barwithoutallocations! (generic function with 1 method)

julia> b4 = @btime barwithoutallocations!(filter!(>(0.5), foowithoutallocations!(acopy))) setup = (acopy=copy(a));
  3.025 ms (0 allocations: 0 bytes)
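The same in-place idea can probably be written with Base's `map!` instead of hand-written loops. This is only a sketch with made-up inputs, reusing the `foo` and `bar` definitions from the question; I have not benchmarked it against the versions above.

```julia
function foo(x)  # expensive function, as defined in the question
    for _ in 1:1000
        x = sqrt(x^2)
    end
    return x
end

function bar(x)  # more expensive function, as defined in the question
    for _ in 1:10000
        x = sqrt(x^2)
    end
    return x
end

a = [0.2, 0.8, 0.4, 0.6]   # hypothetical fixed inputs
acopy = copy(a)            # work on a copy so a itself is preserved

map!(foo, acopy, acopy)    # overwrite each element with foo(element)
filter!(>(0.5), acopy)     # drop rejected elements in place
map!(bar, acopy, acopy)    # overwrite the survivors with bar(element)
```

Since `map!` returns its destination, the three calls can also be nested into one expression, in the same shape as the `barwithoutallocations!(filter!(...))` call above.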

I’d throw in this for readability:

julia> b5 = @btime [bar(c) for c in foo.($a) if c > 0.5];
  3.083 ms (6 allocations: 2.86 KiB)
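For completeness, the closest thing to the original `map ... do` block is probably a plain loop that pushes into a typed vector. This is a rough sketch: `filtermap_loop` is a made-up name, and the `Float64[]` output type is an assumption about what the second function returns.

```julia
# Hypothetical helper: apply foo once per element, keep results above
# threshold, and store bar(result) in a concretely typed vector.
function filtermap_loop(foo, bar, a, threshold)
    out = Float64[]          # assumes bar returns Float64 values
    for x in a
        c = foo(x)           # intermediate value is computed only once
        c > threshold && push!(out, bar(c))
    end
    return out
end

# Usage with cheap stand-in functions and made-up inputs.
a = [0.1, 0.9, 0.5, 1.0]
b6 = filtermap_loop(exp, sqrt, a, 2)
```

The result is a `Vector{Float64}` by construction, so no `nothing` filtering or type narrowing is needed afterwards.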