What is the correct syntax for the following:
julia> xV = 1:5; yV = 2:6;
julia> findall((x,y) -> ((x > 2) & (y > 4)), (xV, yV))
The docs only have examples with one argument.
findall(x -> x[1] > 2 && x[2] > 4, collect(zip(xV, yV)))
will do what you want, though this doesn’t answer your question.
You can also write this as:
((x, y),) -> x > 2 && y > 4
i.e. a function which takes a single argument that is unpacked into the tuple (x, y). For example:
julia> f = ((x, y),) -> x > 2 && y > 4
#11 (generic function with 1 method)
julia> args = (1, 2)
(1, 2)
julia> f(args)
false
Thank you for that. Is this any better than
findall((xV .> 2) .& (yV .> 4))
I was trying to avoid this because it would allocate two vectors.
Sorry if I am dense here, but how would I apply this to findall?
Well, you’ll want to use the logical && instead of the bitwise & operator, for one. Other than that, it’s probably not much different in terms of allocations. The collect is necessary because you can’t index a zip (unless there’s some really cool thing I don’t know about), so you’re going to allocate for that.
You’d use that as a replacement for my single-argument function:
findall(f, collect(zip(xV, yV)))
Thanks again.
This is something very common in my line of work. It’s unfortunate that the allocations are not easily avoided. I guess I need to think a bit more.
You can do it with a list comprehension
[i for (i,(x,y)) in enumerate(zip(xV,yV)) if x > 2 && y > 4]
but that gets a bit unwieldy. Maybe just a loop?
You should benchmark to see what’s fastest (BenchmarkTools.jl).
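For completeness, such an explicit loop might look like the following sketch. It collects matching indices without allocating any intermediate boolean vectors; findboth is just an illustrative name, not something from the thread:

```julia
# Sketch: collect indices where both conditions hold, with no intermediate
# boolean arrays. `findboth` is an illustrative name.
function findboth(xV, yV)
    idx = Int[]
    for i in eachindex(xV, yV)   # errors if the two arrays differ in shape
        if xV[i] > 2 && yV[i] > 4
            push!(idx, i)
        end
    end
    return idx
end

findboth(1:5, 2:6)  # -> [4, 5]
```

The push!-based vector grows geometrically, so the loop still allocates the result vector, but nothing else.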
julia> @btime findall(($xV .> 2) .& ($yV .> 4))
101.396 ns (3 allocations: 224 bytes)
2-element Array{Int64,1}:
4
5
julia> @btime findall(x -> x[1] > 2 && x[2] > 4, collect(zip($xV, $yV)))
184.162 ns (7 allocations: 432 bytes)
2-element Array{Int64,1}:
4
5
The first one allocates a bitvector, which is very memory efficient.
This is a great point and the difference between the two approaches will only get more pronounced as the vectors get bigger.
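The memory efficiency of the bitvector can be seen directly. This comparison is illustrative, not from the thread: broadcasting a comparison yields a BitVector, which packs one bit per element, while a Vector{Bool} stores one byte per element:

```julia
# Illustrative: BitVector packs 1 bit/element; Vector{Bool} uses 1 byte/element.
using Random
v = rand(MersenneTwister(1), 10_000) .> 0.5  # broadcast comparison -> BitVector
b = collect(Bool, v)                          # same values as a Vector{Bool}
Base.summarysize(v)  # ≈ 10_000 / 8 bytes of data, plus a small header
Base.summarysize(b)  # ≈ 10_000 bytes of data, plus a small header
```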
I really like this solution. It is a bit unwieldy, but it can be packaged into a function that makes the intention clear.
But before I do this, I need to benchmark the naive solution
findall((xV .> 2) .& (yV .> 4))
against it. The point made by @DNF that this allocates a bitvector perhaps means that I should not worry too much about allocations. Though I am running this on vectors of length around 10,000 in a loop that gets called about 100 times for each model solution.
Again, thanks for all the suggestions.
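Packaging the broadcasted version into a named helper, as suggested above, could look like this sketch (find_matches is an illustrative name, not from the thread):

```julia
# Sketch: wrap the broadcasted findall so the call site states the intent.
# `find_matches` is an illustrative name.
find_matches(xV, yV, x0, y0) = findall((xV .== x0) .& (yV .== y0))

find_matches(UInt8[1, 2, 2, 3], UInt8[4, 4, 5, 4], UInt8(2), UInt8(4))  # -> [2]
```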
By my benchmark, the simple findall wins hands down:
using BenchmarkTools, Random
rng = MersenneTwister(123);
n = 10_000;
xV = rand(rng, UInt8.(1:4), n);
yV = rand(rng, UInt8.(1:5), n);
x0 = UInt8(2);
y0 = UInt8(4);
idxV = findall((xV .== x0) .& (yV .== y0));
idx2V = [i for (i,(x,y)) in enumerate(zip(xV,yV)) if x == x0 && y == y0];
@assert isequal(idxV, idx2V)
println("findall")
@benchmark idxV = findall(($xV .== $x0) .& ($yV .== $y0))
println("comprehension")
@benchmark idx2V = [i for (i,(x,y)) in enumerate(zip($xV,$yV)) if x == $x0 && y == $y0]
with output
findall
BenchmarkTools.Trial:
memory estimate: 9.55 KiB
allocs estimate: 4
--------------
minimum time: 1.892 μs (0.00% GC)
median time: 2.402 μs (0.00% GC)
mean time: 3.665 μs (29.99% GC)
maximum time: 877.968 μs (99.38% GC)
--------------
samples: 10000
evals/sample: 9
comprehension
BenchmarkTools.Trial:
memory estimate: 8.52 KiB
allocs estimate: 15
--------------
minimum time: 31.903 μs (0.00% GC)
median time: 32.940 μs (0.00% GC)
mean time: 34.938 μs (0.00% GC)
maximum time: 135.270 μs (0.00% GC)
--------------
samples: 10000
evals/sample: 1
I used UInt8 because that’s what I am using in my actual application.
Thanks again to everyone for helping out.
I’ve benchmarked that kind of thing about half a dozen times when trying to improve efficiency. The allocating version was as fast or faster. This may change in the future due to internal changes. And of course, which is faster may depend on the details of your problem, eg size of the vectors.