Julia v1.3 runs this script at least 10x slower than Julia v1.1 on my computer

Summary: This memory-hungry script runs more than 10x slower on my computer using any of the newer versions of Julia while it ran fine on Julia 1.1.1 and even early DEV version of Julia 1.3

Environment:
Xubuntu 16.04.6 x86_64
Linux kernel 4.15.0-58-generic
2 cores: “Intel® Celeron® CPU N2830 @ 2.16GHz”
System memory: 1879MiB, per lshw

# peuler92.jl
# Project Euler #92 - https://projecteuler.net/problem=92
# Square digit chain: Given z in Z+, sum the squares of the digits of z.
#   Use the sum as the new z & repeat. If z becomes 1 or 89, the chain will
#   loop back to that value. Claim is that *all* z will chain to 1 or 89.
# Wanted: Number of chains w/ start vals in [1, 10 000 000) that loop on 89

sumsqdig(n::Int) = sum(i*i for i in digits(n))

function digchain(n::Int)
    while true
        n == 89 && return true
        n == 1  && return false
        n = sumsqdig(n)
    end
end

eu92(N) = sum(digchain(i) for i = 1:(N-1))

# Forget trying the full range with latest Julia: 5_000_000 is sufficient to
# show the difference

Let me make an observation before someone else does. :slight_smile:

"Use a loop with divrem in sumsqdig". This is actually a very good optimization. So, “problem solved”? Not so, for a least three reasons.

  1. The memory hungry use of many sums of small digits vectors is a natural solution for exploratory programming, i.e. programming to explore the problem space and make sure a basic algorithm gives the correct answer.
  2. There are many programs where the use of many small vectors is the appropriate solution.
  3. While my small (RAM) & slow laptop is almost certainly unusual in many settings, there are certainly others who are limited in what they can use. Also, embedded systems (which are notoriously stingy when it comes to RAM) are probably not a primary goal for Julia, but do we want to completely exclude this interesting problem domain?

Here’s some example runs on my elderly laptop. These are obviously not “proof” in a benchmarking sense, but the difference is so large it didn’t seem necessary. Also, all these runs where done one after the other to minimize environmental differences.

$ julia-1.1.1 --startup-file=no --banner=no
julia> include("peuler92.jl")
eu92 (generic function with 1 method)

julia> eu92(100_000) # Compilation
85623

julia> @time eu92(5_000_000)
 10.041794 seconds (74.40 M allocations: 3.991 GiB, 3.28% gc time)
4295843

julia> 

$ julia-1.3.0-rc1 --startup-file=no --banner=no
julia> include("peuler92.jl")
eu92 (generic function with 1 method)

julia> eu92(100_000) # Compilation
85623

julia> @time eu92(5_000_000)
148.563905 seconds (138.79 M allocations: 4.950 GiB, 0.87% gc time)
4295843

julia> 

– Actually, I can use some saved nightlies to narrow the timeframe when the change to Julia occurred

$ julia-13dev164 --startup-file=no --banner=no
julia> Base.GIT_VERSION_INFO
Base.GitVersionInfo("9b5fea51ca639f62bc65e5d92aced79bdb28b651", "9b5fea51ca", "master", 164, "2019-05-04 12:32 UTC", false, 0, 1.556973126e9)

julia> include("peuler92.jl")
eu92 (generic function with 1 method)

julia> eu92(100_000) # Compilation
85623

julia> @time eu92(5_000_000)
  8.304364 seconds (69.40 M allocations: 3.916 GiB, 4.38% gc time)
4295843

julia> 

$ julia-13dev308 --startup-file=no --banner=no
julia> Base.GIT_VERSION_INFO
Base.GitVersionInfo("6d23d520dfc666d307951b020f5204fbaf5c4291", "6d23d520df", "master", 308, "2019-05-29 15:38 UTC", false, 0, 1.559144318e9)

julia> include("peuler92.jl")
eu92 (generic function with 1 method)

julia> eu92(100_000) # Compilation
85623

julia> @time eu92(5_000_000)
139.023717 seconds (138.79 M allocations: 4.950 GiB, 0.34% gc time)
4295843

So, what happened to Julia in mid-May, between DEV.164 & DEV.308? Was there a
deliberate design decision to be less agressive in garbage collection? (My
best guess.) If so, please consider a command-line option so the user can tune
gc agressiveness, somewhat like the --optimize option.

13 Likes

This is a great summary: a clear, small example; precise information about what changed and even an analysis of the dates. Thank you!

I can reproduce with 1.1 vs a recent 1.3; looking at @code_warntype (optimize=true) it seems the compiler is making very different inlining decisions. This might be the reason for the change in time and number of allocations rather than anything to do with the GC as such.

It’s also interesting that a difference can be observed in sumsqdig without the other machinery:

julia> VERSION
v"1.3.0-alpha.156"

julia> @btime sumsqdig(5000_000)
  661.931 ns (4 allocations: 192 bytes)
25
julia> VERSION
v"1.1.0"

julia> @btime sumsqdig(5000_000)
  91.624 ns (3 allocations: 176 bytes)

Seeing that leads to finding https://github.com/julialang/julia/pull/31722 which seems rather likely to have caused this (possibly only indirectly, by affecting the compiler’s inlining heuristics).

13 Likes

Cross reference to the github bug report: https://github.com/JuliaLang/julia/issues/33014

2 Likes

@Chris_Foster, thanks for the compliments & for looking into this. I hope my report helps resolve the problem.