A blog post involving Julia

stla · October 22, 2020, 10:06am

Hello,

I’ve just blogged a post involving Julia. It compares the speed of the same algorithm in C++, Haskell, and Julia. Haskell won vs Julia 1.2.0, but now it is beaten by Julia 1.5.2.

Cheers.

lungben · October 22, 2020, 10:56am

Interesting comparison!

I think there are still some micro-optimizations possible in Julia, maybe you can beat the 1s

Examples:

NewLast = hcat(NewLast, [fin+j, manque-j, j])

This line copies the whole array in each step, this could be done more efficiently by preallocating the complete output array and writing the results in the elements of the corresponding indices.
Furthermore, you may use the @inbounds macro to switch off bounds checks.
Maybe @avx from LoopVectorization.jl could give further speedup.

stla · October 22, 2020, 12:44pm

Thanks. Indeed, I’m not an expert in Julia (neither in C++ and Haskell ^^). I will try to understand your comments later.
By the way it would be nice to have a package with this implementation of the hypergeometric function of a matrix argument. I don’t know how to proceed. If someone is interested…

DNF · October 22, 2020, 2:00pm

Did you look at https://github.com/JuliaMath/HypergeometricFunctions.jl ?

The documentation is, well, I couldn’t find any, but if you are familiar with hypergeometric functions in general, you may be able to figure it out, or ask the developers. I’m guessing that performance is pretty good, based on who the developers are.

Might be interesting to compare.

DNF · October 22, 2020, 2:43pm

A couple of comments about your code, if you are interested:

You really like type annotations, like most people who start out with Julia tend to. But I think it makes your code a bit hard to read, and perhaps needlessly restrictive. I would reduce the annotations, and just see if things work out by themselves, then selectively add back annotations when you find them necessary.
Nested functions make your code harder to understand, and it’s more difficult to debug and optimize them separately. Maybe you could pull the nested functions out, and pass the arguments in and out explicitly?
A general performance tip is to reduce allocations. That could mean to re-use arrays, or to avoid allocations by doing things like count(>(0), mu) instead of sum(mu .> 0), as an example (count is basically the same as sum here, but I like the name better), or avoid creating new arrays over and over, as in mup = mup[mup .> 0].

Ahmed_Salih · October 22, 2020, 5:45pm

Could you explain in a bit more detail what the difference between your sum and count approaches are at point 3?

I just tried:

mu = rand(1000)

 @time sum(mu .> 0)
  0.000011 seconds (6 allocations: 4.500 KiB)
1000

@time count(>(0),mu)
  0.000008 seconds (4 allocations: 80 bytes)
1000

Which is quite a big difference. Why doesn’t the “sum” use “count” internally then?

Julia 1.5.0

Kind regards

Mason · October 22, 2020, 5:49pm

When you write sum(mu .> 0), you have to create a temporary array for the result of mu .> 0. You can avoid this by doing sum(>(0), mu) which should bring the result closer to count, but count will still be faster.

Note also that you should really use BenchmarkTools.jl and it’s @btime macro rather than @time for reliable results.

julia> using BenchmarkTools

julia> let μ = rand(1000)
           @btime sum($μ .> 0)
           @btime sum(>(0), $μ)
           @btime count(>(0), $μ)
       end
  704.414 ns (3 allocations: 4.42 KiB)
  364.611 ns (0 allocations: 0 bytes)
  74.871 ns (0 allocations: 0 bytes)
1000

Ahmed_Salih · October 22, 2020, 5:56pm

Thanks, sometimes I forget about btime!

Does anyone of you know if there is a blog post about all of these tricks? I really love writing high performant code but these options always slip out of my mind, so would like to know if such a ressource exists.

I do not understand why sum(>(0),$mu) still gives a lot less speed though, since it does not have to make the array? Is it an if-statement which slows it down?

Kind regards

Mason · October 22, 2020, 6:18pm

I don’t think there’s an obvious answer to this. sum is generally just a more complicated function than count.

sum for these arguments ends up dispatches to mapreduce and doing all sorts of complicated things that are important, whereas count just dispatches to

function _simple_count(pred, itr)
    n = 0
    for x in itr
        n += pred(x)::Bool
    end
    return n
end

which presumably is easier to optimize in this case.

To make a bad analogy, asking why sum is slower than count is a little like asking why x + sin(x)^2 + cos(x)^2 is slower than x + 1.0. Sure, they are equivalent, but the code for computing the trigonometric functions is more complicated because it’s designed for something else.

Ahmed_Salih · October 22, 2020, 7:13pm

Thanks for your effort!

You are probably right in that it can be quite difficult to exactly decode where count is winning out on sum, due to sum’s additional complexity.

Kind regards

stla · October 23, 2020, 7:02am

Yes. My code is for the hypergeometric function of a matrix argument, which is much more general.
You don’t see any matrix in the code, yeah ^^ That’s because the value of this hypergeometric function only depends on the eigen values of the matrix, and here the user has to give the eigen values, not the matrix.

Thanks for the comments, I will give a try.

DNF · October 23, 2020, 7:28am

Yes, that’s interesting. It occurred to me that HypergeometricFunctions.jl might not support matrices, but it was honestly a bit hard to tell.

Perhaps the maintainers would be interested in a contribution of that kind? There may be some infrastructure you could build on, and generalizing the package would probably be good.

Skoffer · October 23, 2020, 7:32am

If you want to make it a package (and not just extend any other existing package), then there are few options

This nice presentation from Chris Rackauckas: Developing Julia Packages - Stochastic Lifestyle It is somewhat outdated, but still covers most of the necessary things.
These tutorials from Aurelio Amerio “From zero to Julia!”: Working with Packages - TechyTok and Package Registration and Tests - TechyTok
If you still experience any difficulties, you can ask questions here on discourse, or you can join Zulip or Slack chats for a less formal discussions. Some things are easier to solve in quick question/answer manner.
If you want a more personal approach you can contact me (Andrey Oskin on Zulip). To be honest, I know very little about hypergeometric functions but I’ll be glad to help you set up and register the package.

stla · October 23, 2020, 7:47am

@Skoffer

Thanks a lot.

You don’t need to know about the hypergeometric functions. I just implemented the algorithm given in this paper. It is not easy. I also implemented it in R(cpp) and intensively tested it, and I get the same results with Julia.

DNF · October 23, 2020, 7:53am

Sounds great!

You can make your very own package, which is probably a good learning experience; but I think it would also be a good alternative to open an issue at the HypergeometricFunctions.jl repository and ask whether they would be interested in this contribution, or if it belongs in a separate package.

stla · October 23, 2020, 9:25am

Cool. Thanks to your comments I improved the code. The duration passed from 3 seconds to < 2 seconds. Now I use count instead of sum and filter! instead of mup = mup[mup .> 0].

Topic		Replies	Views
How to improve performance of sum() Performance question , fast-math , tullio , loopvectorization , sum	19	4845	July 19, 2021
Why the function sum1 is faster than builtin sum Performance question , profiling	14	1039	September 21, 2023
Announcing my first Julia package, 'HypergeoMat' Package Announcements math	12	1398	July 18, 2022
Sum with generic implementation is slower than double sum Performance	2	254	May 9, 2023
Compare julia sum to a cpp implementation - julia is extremely slow?! Performance question	35	1794	October 7, 2019

A blog post involving Julia

Related topics