Fun One Liners

oschulz · September 16, 2019, 11:02am

Hm, If we had a way to mark a function as pure (no side effects, I don’t mean Base.@pure), the Julia compiler could potentially recognize such duplications and optimize them away. I assume this will have been considered already multiple times by core (and other) Julia developers - but does anybody know if there’s concrete work going in in that direction, currently?

Tamas_Papp · September 16, 2019, 11:09am

There was

https://github.com/rdeits/CommonSubexpressions.jl

but I am not sure this is being considered at the language level.

IMO just assigning the common part to a variable is much cleaner anyway (in terms of code readability).

oschulz · September 16, 2019, 11:13am

I agree, in many cases it is, especially with more heavyweight functions.

But with more lightweight functions, e.g. used several times in a (not too long) mathematical expression, manual re-use of results can make the code less readable, as it increases the difference between the code and the “natural” representation. Of course if the functions are very lightweight, inlining and @fastmath can sometimes do the job automatically, but in many cases they can’t.

chakravala · September 16, 2019, 11:17am

You can also use my pkg Reduce.jl for that if you want symbolic optimized common sub-expressions

julia> using Reduce; load_package(:scope)

julia> Algebra.optimize(:(z = a^2*b^2+10*a^2*m^6+a^2*m^2+2*a*b*m^4+2*b^2*m^6+b^2*m^2))
quote
    g40 = b * a
    g44 = m * m
    g41 = g44 * b * b
    g42 = g44 * a * a
    g43 = g44 * g44
    z = g41 + g42 + g40 * (2g43 + g40) + g43 * (2g41 + 10g42)
end

or if you really want @pure you can just locally define a new method for that like I described here

Marking more standard library methods as pure to enable LLVM optimizations?

module Grassmann
@pure binomial(n::Int,k::Int) = Base.binomial(n,k)
end
This replaces the Base.binomial method with a local method that is pure.

anon92994695 · September 16, 2019, 11:24am

I am a little stuck on this because I agree with both of you.

In machine learning and physics there are often large block matrices where certain computations are repeated. So defining a bunch of ‘extra’ variables to save on compute is a simple way to optimize. Meanwhile having it internally handled would be awesome - would lead to some expressions reading exactly how they do mathematically at no cost post compilation. However, end users may want to construct blocks with nonpure functions(unbeknownst to them) then wonder why their code is so inefficient.

oschulz · September 16, 2019, 11:25am

Wasn’t there some warning against @pure functions using non-@pure functions?

chakravala · September 16, 2019, 11:26am

No, but @pure is not very well documented and can cause your program to segfault, so only use it if you are extremely confident that you fully understand what you are doing with it.

Also, it is not exported from Base and could be pulled from under the rug at any time and change behavior.

Tamas_Papp · September 16, 2019, 11:27am

It would be interesting to see an example.

I have the opposite view (but I recognize it is possible for rational people to disagree about these things, it is mostly a matter of taste). To make this concrete, consider the

e^{-\frac{1}{2} (X - \mu)^T \Sigma^{-1} (X - \mu)}

in the PDF of a normal distribution. For practical purposes, I would always introduce z = X - \mu and code it that way. I find code like this more maintainable.

Tamas_Papp · September 16, 2019, 11:29am

Cf Bug in doc system? And question on "pure" [in docs]; meaning Julia 1.0 is close? - #5 by mbauman

oschulz · September 16, 2019, 12:38pm

That’s a good example - I would often prefer the “textbook” mathematical form in the code in such cases - having the code closer to the math helps readability, in my opinion. I don’t see why this would be more readable if broken up into several lines.

Though in this specific case, automatic elimination of common sub-expressions wouldn’t be enough - I would rewrite this completely to also avoid the creation of short-lived temporary arrays (esp. if \Sigma is smallish).

Tamas_Papp · September 16, 2019, 1:06pm

In practical code I would try to work with (ie store) the Cholesky decomposition \Sigma = LL^T, eg

exp(0.5 * sum(abs2, L \ (x .- mu)))

I think that most “textbook” formulas facilitate proof techniques, not computation (unless, of course, the textbook is on numerical methods ) So preserving them in code is not always a good choice. YMMV.

oschulz · September 16, 2019, 1:13pm

In practical code I would try to work with (ie store) the Cholesky decomposition

Yes, obviously, in this specific case.

I think that most “textbook” formulas facilitate proof techniques … So preserving them in code is not always a good choice

Certainly - not always. An you’re right, in cases like the above, as soon as a bit of linear algebra is concerned, the actual implementation one would choose is often very different from the textbook - and that’s a bit much to ask from the compiler

However, there are certainly also many cases where the only optimization that will happen is manual extraction of common sub-expression into variables. That, a compiler can do.

Of course it’s not just adding an “ispure” tag to a function - with multiple dispatch, the purity may be hard for the function’s author to guarantee, depending on what code the function uses internally. Still conventions like bang-functions may be sufficient to make this workable.

YMMV.

Indeed - i would be surprised if there wouldn’t be quite a lot of cases in which optimizations based on compiler knowledge about purity of functions would be beneficial.

My posting wasn’t intended as a feature request in any way, I was just wondering if there was work going on in this direction.

tkf · September 17, 2019, 12:37am

I wouldn’t call this “not very well documented” https://docs.julialang.org/en/latest/base/base/#Base.@pure :

@pure gives the compiler a hint for the definition of a pure function, helping for type inference.

A pure function can only depend on immutable information. This also means a @pure function cannot use any global mutable state, including generic functions. Calls to generic functions depend on method tables which are mutable global state. Use with caution, incorrect @pure annotation of a function may introduce hard to identify bugs. Double check for calls to generic functions. This macro is intended for internal compiler use and may be subject to changes.

In particular, I think it is very clear that you can’t use generic functions inside @pure functions. Please notice that this is incompatible with your usage:

Marking more standard library methods as pure to enable LLVM optimizations?

module Grassmann
@pure binomial(n::Int,k::Int) = Base.binomial(n,k)
end
This replaces the Base.binomial method with a local method that is pure.

I think it’s OK to depend on implementation details in some private code or maybe even in public code if you are aware that is an implementation detail and hence cannot rely on the semver promises. For example, ForwardDiff.jl uses Threads.atomic_add! inside @generated generator which also is documented that it must be pure. Also, there are (many?) packages using @generated to hoist out argument checking to compile time. Since throw is a side-effect, this is arguably not a valid use case. CUDAnative.jl is mentioning that it can be a real trouble if you want to use it with GPU: https://juliagpu.gitlab.io/CUDAnative.jl/man/hacking/#Generated-functions-1

@NHDaly’s JuliaCon talk is a great summary of the current status on this topic and explaining why you can’t use @pure or @generated in this way safely:

Personally, I often just lift values to type domain as soon as possible and compute things using recursion.

greg_plowman · September 18, 2019, 7:02am

MrUrq:

Simulate the frog problem shown recently on standupmaths with one line and
frog(n,cur_lily,steps) = (cur_lily == n) ? steps : frog(n,rand(cur_lily+1:n),steps+1)
estimate the mean number of steps needed to jump a 10 step wide river with another
mean(frog(10,0,0) for _ in 1:1_000_000)

Ha, I like the frog problem.

Here’s a progression of solutions:

Experimental solution:

frog2(n) = n==0 ? 0 : frog2(rand(0:n-1)) + 1
mean(frog2(10) for _ in 1:10^8)

Analytic solution:

frog3(n) = n==0 ? 0 : sum(1 + frog3(n-i) for i = 1:n) / n

Or with exact rational result:

frog3(n) = n==0 ? 0 : sum(1 + frog3(n-i) for i = 1:n) // n

And for the mathematicians:

frog4(n) = sum(1/i for i = 1:n)

MatFi · September 19, 2019, 2:09pm

Find indexes of elements in array b matching those of array a

I wonder if there is a built in method for this?
(m-> findfirst(x -> x = m ,b)).(a)

b= [ -1, 0, 1, 2, 3, 4, 5, 6, 7, 8]
a = [1, 2, 6, 3]

julia> (m-> findfirst(x -> x >= m ,b)).(a)
4-element Array{Int64,1}:
3
4
8
5

I use this to bulk plot PDE solutions at particular times t as f(x,t) vs x

fredrikekre · September 19, 2019, 2:16pm

indexin:

julia> b = [ -1, 0, 1, 2, 3, 4, 5, 6, 7, 8];
       a =  [1, 2, 6, 3];

julia> indexin(a, b)
4-element Array{Union{Nothing, Int64},1}:
 3
 4
 8
 5

MatFi · September 19, 2019, 2:23pm

Ok… cool, but is there is also a method returning the indexes of values close to the given ones like :

b= [-1, 0, 1.2, 2.1, 3.5, 4.87, 5.1, 6, 7, 8]
a = [1, 2, 6, 3]
julia> indexnear(a, b)
4-element Array{Union{Nothing, Int64},1}:
 3
 4
 8
 5

here i can simply use (m-> findfirst(x -> x >= m ,b)).(a)
how ever this solution is far from being perfect

EDIT: Find indexes of elements in array b close to those in array a
With the help of this thread
we can do:

(m-> findmin(abs.(a.-m))[2]).(b)

anon92994695 · October 16, 2019, 1:18am

Here’s one I found funny,
2 ^ 64

StevenSiew · October 16, 2019, 7:06am

Why is it funny?

2 ^ 64

First of all the number 2 has type Int64
when you raise it to 64th power, it becomes a large number
but the first bit is about whether it is a negative number

so it represent a negative number in Int64

Also the largest number in Int64 is 2^64 - 1
just like the largest number in a byte is 2^8 - 1 (aka 255)

When in doubt use BigInt

julia> big(2)^64
18446744073709551616

dataDiver · October 16, 2019, 10:02am

I am in favour of a solutions thread. That is solutions you discovered on your own and think they are useful enough to share with the community and spark further discussions. It would superset one liners.

Topic		Replies	Views
Sum operations between arrays Performance	21	5628	April 7, 2020
Julia-ism for two-dimensional map? New to Julia	22	3091	May 1, 2018
Seven Lines of Julia (examples sought) General Usage	180	28393	August 1, 2025
Efficient creation of power series matrix or array of arrays Performance question	23	4922	March 29, 2019
Usage of lazy vector instead of function Performance	22	954	November 7, 2022

Fun One Liners

Related topics