Boolean short circuit "fusion"


#1

I have an array A. I want to test if there are any values in A satisfying some test:

julia> f1(x) = any(x .> 99)

Easy code to read, but inefficient since we allocate the whole boolean array.

We could also define the following:

julia> function f2(xs)
           for x in xs
               if x > 99
                   return true
               end
           end
           false
       end

which will be faster, but we have to write a new for loop for every boolean expression.

Has anyone explored macros / tricks to get the syntactical lightness of the first with the performance of the second? i.e. to somehow get short-circuited broadcast fusion of elementwise boolean expressions? This should extend naturally to more complex expressions e.g. any((A .> 0) .& (A .< 1) .| (B .== 5)). The same pattern applies to all - i.e. return false as soon as the first false is encountered.


#2
f1(itr) = any(x > 99 for x in itr)

#3

Or

any(x->x>99, itr)

Since it is possible to do this:

any(==(99), itr)

would it be an idea to define >(99) too? (and >=(99), <=(99), etc?)

any(>(99), itr)

#4

Yes, it’s been proposed and I think it should be done.


#5

Yes, thanks, doing it as an iterator works nicely for one array, but gets a bit clunky when you have a few arrays:

julia> any((a > 0) & (b < 0) & (c == 0) for (a, b, c) in zip(A, B, C))

#6

Note that you should use && instead of & to short circuit the inner expression.


#7
any((x) -> x[1]>0 && x[2]<0 && x[3]==0, zip(X,Y,Z))

is a bit lighter, but, yes, it would be great if there were some (terse) way to get ‘fusion’ of broadcasting with reductions.


#8

I like do syntax for this:

any(zip(A, B, C)) do (a, b, c)
    (a > 0) && (b < 0) && (c == 0)
end

#9

I think that ideally this should be solved by something like #24990 in a generic way, instead of defining one-argument versions of multiple functions.

Or, for the slightly less generic case of partial application, one could define a @pa macro such that

@pa f(x)

expands to

y -> f(y, x)

which would take care of >, isequal, and similar.


#10

My impression was that #24990 was some distance away from reaching consensus. I really like that notation, but it’s not obvious to me if it would be equally well-suited to multi-argument lambdas, such as

(_ > 0) && (_ < 0) && (_ == 0)

A really generic solution might be some sort of lazy evaluation, perhaps obviating the need for the reducer(f, itr) pattern entirely.


#11

For some reason, I haven’t become comfortable with the do-notation. Every time I encounter it, I try to read it as the with/as notation from Python. Some times that clarifies it, but some times it doesn’t.


#12

I am not sure that

is a multi-argument lambda — are the _ placeholders different arguments?

In any case, while I am looking forward to the _ syntax, I think I would mostly use it for simple stuff.

For some of that, a simple partial application construct is also feasible; the question is

  1. whether we make some functions PA manually, like isequal (this has already happened, should it be extended like you suggested for >?),
  2. introduce a macro to make (1) more convenient, but still manually applied (ie at function definition),
  3. make PA automatic (I don’t think this meshes well with Julia’s semantics though, and is a large and breaking change),
  4. introduce a macro for using PA, like the @pa I suggested above.

I am always undecided between these (except 3, which I don’t consider to be a good solution) and alternatives (the _ syntax for closures, etc). Sometimes I would find the more convenient, but at the same time closures are universal, quite readable, and just take a few extra characters.


#13

Exactly, it’s not obvious what the meaning ought to be. In this case I intended it to be different arguments, but it could just as easily be one argument.


#14

Yes, that’s why I find

(x, y, z) -> (x > 0) && (y < 0) && (z == 0)

to be the best solution, even if more verbose.

It’s not that there aren’t more concise ways of writing this, but none of these are as general and/or run into corner cases (as shown eg in the discussion of the PR above).


#15

I’ve wondered for a while whether it is possible to abuse the broadcast machinery to do something like this (basically to avoid materialize being called on the lazy broadcasted object). I’ve not found a nice way of doing it (i.e., a way that doesn’t obscure/muddy the semantics of dot broadcasting) but here is a not-so-nice way of doing it.

myany(x) = any(x)  # avoid piracy!
Broadcast.broadcasted(::typeof(myany), x) = myany(x)  # exploit the fact that the broadcasted object is iterable

On a simple test case this gives

julia> a = rand(10000) ; b = rand(10000) ; c = rand(10000) ;

julia> @btime any(($a .> 0.5) .& ($b .> 0.5) .& ($c .> 0.5))
  19.828 μs (3 allocations: 5.55 KiB)
true

julia> @btime myany.(($a .> 0.5) .& ($b .> 0.5) .& ($c .> 0.5))
  59.478 ns (10 allocations: 240 bytes)
true

So much faster than the original case but not as fast as the anonymous function or generator expression versions as mentioned above (and it allocates more).

julia> @btime any(((aa, bb, cc),) -> (aa > 0.5) && (bb > 0.5) && (cc > 0.5), zip($a, $b, $c))
  13.820 ns (2 allocations: 64 bytes)
true

julia> @btime any((aa > 0.5) && (bb > 0.5) && (cc > 0.5) for (aa, bb, cc) in zip($a, $b, $c))
  16.822 ns (3 allocations: 80 bytes)
true

#16

That would be very nice if there was a way to make that work generically; this pattern comes up pretty frequently, e.g. dot(x,y) = sum(x.*y), exit = all(abs.(err) <.= tol) and friends. Can’t all reducers in Base just define broadcasted like you do?


#17

Wouldn’t that be bad if you want to apply the reduction on a collection of iterables? Seems like it would be bette to have a way to signal this explicitly.


#18

Yes, I don’t like this solution since it obscures the meaning of the dot notation. I’ve been trying to find a way to do this explicitly but I can’t find a way to coerce the lowering code to remove the final materialize call that appears when the dot notation is expanded out. (It doesn’t help that I don’t really know enough Scheme to be able to understand the lowering code properly.)


#19

See the discussion here:


#20

After reading that issue I had some inspiration -

@inline _lazy(x) = x[1]  # unwrap the tuple
@inline Broadcast.broadcasted(::typeof(_lazy), x) = (x,)  # wrap the Broadcasted object in a tuple to avoid materializing
macro lazy(x)
	return esc(:(_lazy(_lazy.($x))))
end

Which gives

julia> a = rand(10000) ; b = rand(10000) ;

julia> @btime sum(a .+ b)
  10.896 μs (5 allocations: 78.27 KiB)
9902.652963666198

julia> @btime sum(@lazy a .+ b)
  13.095 μs (3 allocations: 64 bytes)
9902.652963666245

julia> @btime any(a .+ b .> 1.9)
  15.549 μs (7 allocations: 5.64 KiB)
true

julia> @btime any(@lazy a .+ b .> 1.9)
  1.016 μs (4 allocations: 96 bytes)
true

So not materializing the array isn’t always faster (depends on the operation) but it does save on allocations.


Best practice for broadcastable callables