List Comprehension and Map

I know there are a number of topics already open on this subject, but I haven’t found this specific pattern yet. I think I’m looking for the map do equivalent of a conditional array comprehension.

Here’s a small example and why it isn’t working

a = rand(10)
f = exp  # or any other function

b = map(a) do x
  c = f(x)
  if c > 2
    return c
  end
end

filter(!isnothing, b)

The problem with this approach is that the final type is Vector{Union{Nothing, Float64}} when I need it to simply be Vector{Float64} for subsequent steps.
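To make the type issue concrete, here is a minimal sketch. The fixed input vector is an assumption standing in for rand(10) so the result is reproducible, and the final conversion is only one possible workaround (it costs an extra copy), not necessarily the best one.

```julia
a = [0.1, 0.5, 0.9, 1.2]   # fixed stand-in for rand(10)
f = exp

b = map(a) do x
    c = f(x)
    if c > 2
        return c
    end
    # when the condition is false, the block evaluates to `nothing`
end

kept = filter(!isnothing, b)

eltype(b)      # Union{Nothing, Float64}
eltype(kept)   # Union{Nothing, Float64}: filter keeps the input's element type

# Possible workaround: convert after filtering (allocates another array)
narrowed = Vector{Float64}(kept)
```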

If I use a conditional array comprehension, the type is correct, but I need to repeat the calculation of the intermediate variable c = f(x), which is more complicated in my use case.

b = [f(x) for x in a if f(x) > 2]

Is there a third way that returns the correct type without requiring I calculate c = f(x) twice?


I sometimes use this pattern for that sort of thing:

b = [c for x in a for c = f(x) if c > 2]
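As far as I understand it, this works because a Number in Julia is iterable and yields itself exactly once, so the inner `for c = f(x)` simply binds c to f(x). A small sketch with a fixed input (my own stand-in for rand(10)):

```julia
a = [0.1, 0.5, 0.9, 1.2]   # fixed stand-in for rand(10)
f = exp

# A scalar behaves like a one-element collection when iterated
length(f(0.5))             # 1
first(f(0.5)) == f(0.5)    # true

# f is evaluated once per element here ...
b = [c for x in a for c = f(x) if c > 2]

# ... and twice per kept element here
b_two_calls = [f(x) for x in a if f(x) > 2]

b == b_two_calls           # true
```

Note that this relies on f returning a scalar. If f returned a collection, the inner loop would iterate over its elements instead.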

This should do the trick

using BenchmarkTools 

a = rand(1000000)
f = exp  # or any other function

b1 = @btime [$f(x) for x in $a if $f(x) > 2]
b2 = @btime [fx for fx in ($f(x) for x in $a) if fx > 2]

yielding

  14.796 ms (14 allocations: 5.23 MiB)
  14.368 ms (14 allocations: 5.23 MiB)

but maybe we need to use a more expensive function to see an effect.

This should be quite fast:

b = filter(>(2), f.(a))

It is :)

b3 = @btime filter(>(2), f.(a))

yielding

  7.972 ms (7 allocations: 15.26 MiB)

Even faster to use filter! here.
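A minimal sketch of what that looks like, assuming a small fixed input: since f.(a) already allocates a fresh temporary, filter! can shrink that temporary in place instead of allocating a second array.

```julia
a = [0.1, 0.5, 0.9, 1.2]   # fixed stand-in for rand(1000000)
f = exp

tmp = f.(a)                # broadcasting allocates one new Vector{Float64}
b = filter!(>(2), tmp)     # removes failing elements in place and returns the same vector

b === tmp                  # true: no second array was allocated
```

The input a is left untouched, because only the temporary is mutated.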


I don’t know if that’s the case for you, but if your function is invertible, you could do something like this

julia> f.(filter(>(log(2)), a))==filter(>(2), f.(a))
true

julia> @btime f.(filter(>(log(2)), a))
  2.317 ms (7 allocations: 9.96 MiB)


Thanks for the help, everyone. I realize that my MWE was a little too minimal. Here’s a better one below.

function foo(x)  # expensive function
    for _ in 1:1000
        x = sqrt(x^2)
    end
    return x
end

function bar(x)  # more expensive function
    for _ in 1:10000
        x = sqrt(x^2)
    end
    return x
end

# desired functionality
b0 = map(a) do x
    c = foo(x)
    if c > 0.5
        return bar(c)
    end
end
filter(!isnothing, b0)

@goerch’s and @Seif_Shebl’s suggestions are still applicable and give the following

a = rand(100)
b1 = @btime [$bar($foo(c)) for c in $a if $foo(c) > 0.5]
b2 = @btime [$bar(c) for c in ($foo(x) for x in $a) if c > 0.5]
b3 = @btime $bar.(filter!(>(0.5), $foo.($a)))

yielding

  3.304 ms (5 allocations: 1.98 KiB)
  3.097 ms (5 allocations: 1.98 KiB)
  3.031 ms (2 allocations: 1.41 KiB)

Unfortunately, I can’t assume my function is invertible @rocco_sprmnt21.

Is there a more efficient way to use filter! in this context?

Hi,

In the case that you can’t filter out data before applying foo, this is about as efficient as it gets, since the time spent calling those functions vastly dominates the time it takes to make a few allocations along the way:


# since we want only the time necessary for execution but not for allocating, we need something to prevent calls from being compiled away entirely
julia> @noinline noop(x) = x
noop (generic function with 1 method)

# time it takes for the foo and bar calls only:
julia> @btime for x in $a
           noop(foo(x))
       end
  511.200 μs (0 allocations: 0 bytes)

julia> barinputs = filter!(>(0.5), foo.(a));

julia> @btime for x in $barinputs
           noop(bar(x))
       end
  2.552 ms (0 allocations: 0 bytes)

so this is fairly optimal already:

julia> b3 = @btime bar.(filter!(>(0.5), foo.($a)));
  3.057 ms (2 allocations: 1.31 KiB)

In case you worry about the allocations: if you are fine with overwriting your input a, you can get rid of those, but unless the intermediate values are huge in storage, the net gain is negligible:

julia> function foowithoutallocations!(a)
           @inbounds for i in eachindex(a)
               a[i] = foo(a[i])
           end
           a
       end
foowithoutallocations! (generic function with 1 method)

julia> function barwithoutallocations!(a)
           @inbounds for i in eachindex(a)
               a[i] = bar(a[i])
           end
           a
       end
barwithoutallocations! (generic function with 1 method)

julia> b4 = @btime barwithoutallocations!(filter!(>(0.5), foowithoutallocations!(acopy))) setup = (acopy=copy(a));
  3.025 ms (0 allocations: 0 bytes)

I’d throw in this for readability:

julia> b5 = @btime [bar(c) for c in foo.($a) if c > 0.5];
  3.083 ms (6 allocations: 2.86 KiB)
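For completeness, the same single-pass idea can also be written as an explicit loop with push!. This is only a sketch: mapfiltermap is a hypothetical helper name (not in Base), the output element type is assumed to be Float64 unless passed explicitly, and foo_demo and bar_demo are cheap stand-ins for foo and bar. It has not been benchmarked against the versions above.

```julia
# Hypothetical helper, not part of Base or of any post in this thread.
# T is the assumed output element type (Float64 by default).
function mapfiltermap(g, pred, f, xs, ::Type{T}=Float64) where {T}
    out = T[]
    for x in xs
        c = f(x)                     # intermediate value, computed once per element
        pred(c) && push!(out, g(c))  # keep g(c) only when the predicate holds
    end
    return out
end

# Cheap stand-ins for foo and bar, with a call counter on the first one
const foo_calls = Ref(0)
foo_demo(x) = (foo_calls[] += 1; abs(x))
bar_demo(x) = 2x

a = [0.2, 0.6, 0.4, 0.9]
b = mapfiltermap(bar_demo, >(0.5), foo_demo, a)

foo_calls[] == length(a)   # true: foo_demo ran once per input element
```

The result is a Vector{Float64} directly, with no Union{Nothing, Float64} intermediate and no repeated call to the first function.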