Fun One Liners

Hm, if we had a way to mark a function as pure (no side effects - I don’t mean Base.@pure), the Julia compiler could potentially recognize such duplications and optimize them away. I assume this has already been considered multiple times by core (and other) Julia developers - but does anybody know if there’s concrete work going on in that direction currently?
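
To make this concrete, a tiny illustration of the kind of duplication I mean (my own example, assuming g has no side effects):

f(x) = g(x)^2 + 2*g(x) + 1   # g(x) appears twice; a compiler that knew g was pure could evaluate it once and reuse the result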

1 Like

There was

https://github.com/rdeits/CommonSubexpressions.jl

but I am not sure this is being considered at the language level.

IMO just assigning the common part to a variable is much cleaner anyway (in terms of code readability).
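
For instance, a duplicated call like g(x)^2 + 2*g(x) + 1 simply becomes (same result, assuming g has no side effects):

gx = g(x)         # compute the common part once
gx^2 + 2*gx + 1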

1 Like

I agree, in many cases it is, especially with more heavyweight functions.

But with more lightweight functions, e.g. used several times in a (not too long) mathematical expression, manual re-use of results can make the code less readable, as it increases the difference between the code and the “natural” representation. Of course if the functions are very lightweight, inlining and @fastmath can sometimes do the job automatically, but in many cases they can’t.

1 Like

You can also use my pkg Reduce.jl for that, if you want symbolically optimized common sub-expressions:

julia> using Reduce; load_package(:scope)

julia> Algebra.optimize(:(z = a^2*b^2+10*a^2*m^6+a^2*m^2+2*a*b*m^4+2*b^2*m^6+b^2*m^2))
quote
    g40 = b * a
    g44 = m * m
    g41 = g44 * b * b
    g42 = g44 * a * a
    g43 = g44 * g44
    z = g41 + g42 + g40 * (2g43 + g40) + g43 * (2g41 + 10g42)
end

Or, if you really want @pure, you can just locally define a new method for that, like I described here.

4 Likes

I am a little stuck on this because I agree with both of you.

In machine learning and physics there are often large block matrices where certain computations are repeated. So defining a bunch of ‘extra’ variables to save on compute is a simple way to optimize. Meanwhile, having it handled internally would be awesome - it would lead to some expressions reading exactly as they do mathematically, at no cost after compilation. However, end users may want to construct blocks with non-pure functions (unbeknownst to them) and then wonder why their code is so inefficient.

Wasn’t there some warning against @pure functions using non-@pure functions?

No, but @pure is not very well documented and can cause your program to segfault, so only use it if you are extremely confident that you fully understand what you are doing with it.

Also, it is not exported from Base, and it could be pulled out from under you or change behavior at any time.

2 Likes

It would be interesting to see an example.

I have the opposite view (but I recognize it is possible for rational people to disagree about these things, it is mostly a matter of taste). To make this concrete, consider the

e^{-\frac{1}{2} (X - \mu)^T \Sigma^{-1} (X - \mu)}

in the PDF of a normal distribution. For practical purposes, I would always introduce z = X - \mu and code it that way. I find code like this more maintainable.
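
Concretely, a minimal sketch of what I mean (my own illustration; Sinv stands for the precomputed inverse of Σ):

z = x .- mu
exp(-0.5 * (z' * Sinv * z))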

2 Likes

Cf Bug in doc system? And question on "pure" [in docs]; meaning Julia 1.0 is close? - #5 by mbauman

That’s a good example - I would often prefer the “textbook” mathematical form in the code in such cases - having the code closer to the math helps readability, in my opinion. I don’t see why this would be more readable if broken up into several lines.

Though in this specific case, automatic elimination of common sub-expressions wouldn’t be enough - I would rewrite this completely to also avoid the creation of short-lived temporary arrays (esp. if \Sigma is smallish).

In practical code I would try to work with (ie store) the Cholesky decomposition \Sigma = LL^T, eg

exp(-0.5 * sum(abs2, L \ (x .- mu)))
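
Spelled out a little more (my own sketch; S is the covariance matrix):

using LinearAlgebra
L = cholesky(S).L            # store the factor once; S == L * L'
z = L \ (x .- mu)            # triangular solve, no explicit inverse
exp(-0.5 * sum(abs2, z))     # same value as the textbook quadratic form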

I think that most “textbook” formulas facilitate proof techniques, not computation (unless, of course, the textbook is on numerical methods :wink:). So preserving them in code is not always a good choice. YMMV.

3 Likes

In practical code I would try to work with (ie store) the Cholesky decomposition

Yes, obviously, in this specific case.

I think that most “textbook” formulas facilitate proof techniques … So preserving them in code is not always a good choice

Certainly - not always. And you’re right: in cases like the above, as soon as a bit of linear algebra is concerned, the actual implementation one would choose is often very different from the textbook - and that’s a bit much to ask from the compiler.

However, there are certainly also many cases where the only optimization that will happen is the manual extraction of common sub-expressions into variables. That, a compiler can do.

Of course it’s not just a matter of adding an “ispure” tag to a function - with multiple dispatch, the purity may be hard for the function’s author to guarantee, depending on what code the function uses internally. Still, conventions like bang-functions may be sufficient to make this workable.

YMMV.

Indeed - I would be surprised if there weren’t quite a lot of cases in which optimizations based on compiler knowledge about the purity of functions would be beneficial.

My posting wasn’t intended as a feature request in any way, I was just wondering if there was work going on in this direction.

1 Like

I wouldn’t call this “not very well documented” https://docs.julialang.org/en/latest/base/base/#Base.@pure :

@pure gives the compiler a hint for the definition of a pure function, helping for type inference.

A pure function can only depend on immutable information. This also means a @pure function cannot use any global mutable state, including generic functions. Calls to generic functions depend on method tables which are mutable global state. Use with caution, incorrect @pure annotation of a function may introduce hard to identify bugs. Double check for calls to generic functions. This macro is intended for internal compiler use and may be subject to changes.

In particular, I think it is very clear that you can’t use generic functions inside @pure functions. Please notice that this is incompatible with your usage:

I think it’s OK to depend on implementation details in some private code, or maybe even in public code, if you are aware that it is an implementation detail and hence cannot rely on the semver promises. For example, ForwardDiff.jl uses Threads.atomic_add! inside a @generated generator, which also is documented as having to be pure. Also, there are (many?) packages using @generated to hoist argument checking out to compile time. Since throw is a side effect, this is arguably not a valid use case. CUDAnative.jl mentions that it can be real trouble if you want to use it with a GPU: https://juliagpu.gitlab.io/CUDAnative.jl/man/hacking/#Generated-functions-1
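
To illustrate, here is a minimal sketch (my own, not taken from any of the packages mentioned) of hoisting an argument check to compile time with @generated - exactly the pattern that is arguably questionable, since the throw is a side effect inside the generator:

@generated function first3(t::NTuple{N,Any}) where {N}
    N >= 3 || throw(ArgumentError("tuple too short"))   # runs at generation time, once per tuple length
    :((t[1], t[2], t[3]))                                # the generated body does no checking at run time
end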

@NHDaly’s JuliaCon talk is a great summary of the current status of this topic and explains why you can’t use @pure or @generated in this way safely:

Personally, I often just lift values to the type domain as soon as possible and compute things using recursion.
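
For example (a minimal sketch of that pattern, my own illustration):

tsum(n::Integer) = tsum(Val(n))                    # lift the value into the type domain once
tsum(::Val{0}) = 0
tsum(::Val{N}) where {N} = N + tsum(Val(N - 1))    # recurse on the type parameter

tsum(4)   # 4 + 3 + 2 + 1 == 10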

5 Likes

Ha, I like the frog problem.

Here’s a progression of solutions:

Experimental solution:

frog2(n) = n==0 ? 0 : frog2(rand(0:n-1)) + 1
using Statistics  # for mean
mean(frog2(10) for _ in 1:10^8)

Analytic solution:

frog3(n) = n==0 ? 0 : sum(1 + frog3(n-i) for i = 1:n) / n

Or with exact rational result:

frog3(n) = n==0 ? 0 : sum(1 + frog3(n-i) for i = 1:n) // n

And for the mathematicians:

frog4(n) = sum(1/i for i = 1:n)
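
A quick sanity check (my own addition), comparing the exact rational version of frog3 above with the harmonic-number form:

julia> frog3(10) == sum(1//i for i = 1:10)
true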
4 Likes

Find indexes of elements in array b matching those of array a

I wonder if there is a built-in method for this?
(m -> findfirst(x -> x == m, b)).(a)

b= [ -1, 0, 1, 2, 3, 4, 5, 6, 7, 8]
a = [1, 2, 6, 3]

julia> (m-> findfirst(x -> x >= m ,b)).(a)
4-element Array{Int64,1}:
3
4
8
5

I use this to bulk plot PDE solutions at particular times t as f(x,t) vs x

1 Like

indexin:

julia> b = [ -1, 0, 1, 2, 3, 4, 5, 6, 7, 8];
       a =  [1, 2, 6, 3];

julia> indexin(a, b)
4-element Array{Union{Nothing, Int64},1}:
 3
 4
 8
 5
5 Likes

Ok… cool, but is there also a method returning the indexes of values close to the given ones, like:

b= [-1, 0, 1.2, 2.1, 3.5, 4.87, 5.1, 6, 7, 8]
a = [1, 2, 6, 3]
julia> indexnear(a, b)
4-element Array{Union{Nothing, Int64},1}:
 3
 4
 8
 5

Here I can simply use (m -> findfirst(x -> x >= m, b)).(a), however this solution is far from perfect.

EDIT: Find indexes of elements in array b close to those in array a.
With the help of this thread we can do:

(m -> findmin(abs.(b .- m))[2]).(a)
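
Or, wrapped up as the hypothetical indexnear from above (my own sketch, not a Base function):

indexnear(a, b) = [argmin(abs.(b .- m)) for m in a]   # -> [3, 4, 8, 5] for the vectors above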

1 Like

Here’s one I found funny:
2 ^ 64

1 Like

Why is it funny?

2 ^ 64

First of all, the number 2 has type Int64.
When you raise it to the 64th power, the true result no longer fits in 64 bits, and the first bit of an Int64 is the sign bit, telling whether the number is negative.

So the arithmetic silently wraps around: 2^63 already overflows to a negative Int64, and 2^64 wraps all the way back around to 0.

Also, the largest number an Int64 can hold is 2^63 - 1 (for an unsigned UInt64 it is 2^64 - 1),
just like the largest number in an unsigned byte is 2^8 - 1 (aka 255).
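
For reference, this is what it actually looks like in the REPL (a quick check, not from the original post):

julia> 2^64
0

julia> 2^63
-9223372036854775808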

When in doubt use BigInt

julia> big(2)^64
18446744073709551616
4 Likes

I am in favour of a solutions thread - that is, solutions you discovered on your own and think are useful enough to share with the community and spark further discussion. It would be a superset of one-liners.

1 Like