Strict left-to-right *order* of comparisons [chaining] considered harmful [by me]

Just to illustrate, e.g. for:

slow(big(10000000)) < 1 < 0

1 < 0 could have been evaluated to false first (fast, also for variables). In fact it’s allowed by semantics.** [Then this code could have been dead-code eliminated, since in this case I used constants to show it.]

The current implementation of Julia evaluates left-to-right (here). It’s allowed NOT to, see docs (meant for a future implementation, it seems):

“However, the order of evaluations in a chained comparison is undefined. It is strongly recommended not to use expressions with side effects (such as printing) in chained comparisons. If side effects are required, the short-circuit && operator should be used explicitly”

In other words e.g.:

a(x) < b < c(y)

is not the same as:

a(x) < b && b < c(y)

and I would in fact want the same flexibility in the absence of the syntactic sugar for “comparison chaining”, by doing:

a(x) < b and b < c(y)

where and doesn’t have to mean &&. If strict left-to right is desired, when you optimize, you could always substitute && for and.

** I’m assuming this doesn’t have side-effects (e.g. I found a way to make such function):

function slow(n)
    a=0
    for i in 1:n
        a=a*i
    end
    return a
end

If you know that a part of a chained comparison in your own code might be expensive, then just manually arrange the control flow to avoid calling it if possible. This is everyday programming.

If you know how to implement this optimization in general — without slowing down any of the more mundane cases — then great! It wouldn’t even require a change to the documentation. As you note, the order of evaluation isn’t defined.

The current behavior is hardly harmful. Please respect the sensibilities of users and developers alike; posts like this aren’t productive.

2 Likes

You can’t know in general where the slow function is (and it doesn’t have to be that expensive). The slow function may be data dependent, so the same function isn’t always the slowest. A function call (ignoring inlining and outlining) is always likely to be slower than evaluating a variable.

I do have ideas… including for in general. I’m just pointing out a specific case first, something that is easier to change.

Scientists, are going to copy formulas from papers and inequalities. Are they likely to rearrange for speed?

Note, not all scientists are computer scientists.

On your other comments, I’ve just expanded the post, and evaluations order is defined for && (and ||), so your comment doesn’t apply to them - you in fact must evaluate the slow function - even when you could have dead-code eliminated it otherwise.

The title of the post referred to left-to-right order (I didn’t only have the special syntax for chaining, in mind). I’ve now made it more clear, reflecting the updated post.

With strict left to right (as implemented now), you may be lucky, the longer the chain, the likelihood of that order being optimal goes down.

I’m just pointing that out, and:

“The order in which the operations shall be performed in every particular case is a very interesting and curious question…”
— Ada Byron’s notes on the analytical engine 1842.

I think the point is that the compiler can’t know which part of the comparison will be slow either, so doesn’t know that it should have evaluated the 1 < 0 first.

If you have useful code, please submit a Pull Request that people can look at.
Otherwise there is no use in continuing such discussions.

One can probably find a way to give the compiler lattitude in general to do this optimization (note that if there’s no side effects, it already does). However, I fear it would be very surprising, esp. if one of the terms in the comparison has side effects. There is no other case where optimizations/heuristics in julia can lead to observable behavior differences, and adding one here doesn’t necessarily seem worth it. In some respect this is similar to type-inference dependent comprehensions (though they’re worse since the behavior difference is significantly more observable). As one may remember that didn’t turn out too well.

In any case, for the non side-effect case in this thread, the way you’d do it is to make sure llvm can infer that it doesn’t touch memory, and then write an llvm pass that switches comparisons based on estimated runtime (it may already have one).

1 Like

Yes, I’ve found several (I was going to delay responding until with my PR…), and as you know it’s difficult with side-effects (and it’s hard to know if there are any). I’m confused with:

I thought I showed that it didn’t, good to know if it can happen in other cases? [E.g. my case (and I guess then other) can take arbitrarily longer time.] Do you mean any function call presumes side-effects?

I’ll look into it more, maybe I can locate where you think alternative codegen happens.

just as a remark: for constants, it is already happening, because of compile time constant evaluation

julia> f(a) = a < 1 < 0

julia> @code_llvm f(2)

; Function Attrs: uwtable
define i8 @julia_f_71996(i64) #0 {
top:
  ret i8 0
}

the situation is more interesting when you have function calls and variables. however, let me also point out that comparing is a function which can be in general just as slow as the function call. consider

f()::T1 < a::T2 < b::T2

there are three invocations here, <(T1, T2), <(T2, T2) and f(). you can not be sure that the second comparison is any faster.

What I mean is that it has lattitude to do the optimization if it can prove there’s no side effect. Whether that happens depends on whether or not it notices that it can do this and determines this is profitable. How to improve upon that is what I suggested in the last paragraph.

You’ve been helpful as expand(:(a < 1 < 0)) in contrast gave misleading info.

Your way is good to know to confirm what the optimizer does, but for constants it isn’t always smart as shows with: 1 < a < 0

It would be nice if transitive would be considered=>can help.

< is a function in general, yes, but for common cases: machine integers and floats should be fast.