Summary: This memory-hungry script runs more than 10x slower on my computer using any of the newer versions of Julia while it ran fine on Julia 1.1.1 and even early DEV version of Julia 1.3
Environment:
Xubuntu 16.04.6 x86_64
Linux kernel 4.15.0-58-generic
2 cores: “Intel(R) Celeron(R) CPU N2830 @ 2.16GHz”
System memory: 1879MiB, per lshw
# peuler92.jl
# Project Euler #92 - https://projecteuler.net/problem=92
# Square digit chain: Given z in Z+, sum the squares of the digits of z.
# Use the sum as the new z & repeat. If z becomes 1 or 89, the chain will
# loop back to that value. Claim is that *all* z will chain to 1 or 89.
# Wanted: Number of chains w/ start vals in [1, 10 000 000) that loop on 89
sumsqdig(n::Int) = sum(i*i for i in digits(n))
function digchain(n::Int)
while true
n == 89 && return true
n == 1 && return false
n = sumsqdig(n)
end
end
eu92(N) = sum(digchain(i) for i = 1:(N-1))
# Forget trying the full range with latest Julia: 5_000_000 is sufficient to
# show the difference
Let me make an observation before someone else does.
“Use a loop with divrem
in sumsqdig
”. This is actually a very good optimization. So, “problem solved”? Not so, for a least three reasons.
- The memory hungry use of many sums of small digits vectors is a natural solution for exploratory programming, i.e. programming to explore the problem space and make sure a basic algorithm gives the correct answer.
- There are many programs where the use of many small vectors is the appropriate solution.
- While my small (RAM) & slow laptop is almost certainly unusual in many settings, there are certainly others who are limited in what they can use. Also, embedded systems (which are notoriously stingy when it comes to RAM) are probably not a primary goal for Julia, but do we want to completely exclude this interesting problem domain?
Here’s some example runs on my elderly laptop. These are obviously not “proof” in a benchmarking sense, but the difference is so large it didn’t seem necessary. Also, all these runs where done one after the other to minimize environmental differences.
$ julia-1.1.1 --startup-file=no --banner=no
julia> include("peuler92.jl")
eu92 (generic function with 1 method)
julia> eu92(100_000) # Compilation
85623
julia> @time eu92(5_000_000)
10.041794 seconds (74.40 M allocations: 3.991 GiB, 3.28% gc time)
4295843
julia>
$ julia-1.3.0-rc1 --startup-file=no --banner=no
julia> include("peuler92.jl")
eu92 (generic function with 1 method)
julia> eu92(100_000) # Compilation
85623
julia> @time eu92(5_000_000)
148.563905 seconds (138.79 M allocations: 4.950 GiB, 0.87% gc time)
4295843
julia>
– Actually, I can use some saved nightlies to narrow the timeframe when the change to Julia occurred
$ julia-13dev164 --startup-file=no --banner=no
julia> Base.GIT_VERSION_INFO
Base.GitVersionInfo("9b5fea51ca639f62bc65e5d92aced79bdb28b651", "9b5fea51ca", "master", 164, "2019-05-04 12:32 UTC", false, 0, 1.556973126e9)
julia> include("peuler92.jl")
eu92 (generic function with 1 method)
julia> eu92(100_000) # Compilation
85623
julia> @time eu92(5_000_000)
8.304364 seconds (69.40 M allocations: 3.916 GiB, 4.38% gc time)
4295843
julia>
$ julia-13dev308 --startup-file=no --banner=no
julia> Base.GIT_VERSION_INFO
Base.GitVersionInfo("6d23d520dfc666d307951b020f5204fbaf5c4291", "6d23d520df", "master", 308, "2019-05-29 15:38 UTC", false, 0, 1.559144318e9)
julia> include("peuler92.jl")
eu92 (generic function with 1 method)
julia> eu92(100_000) # Compilation
85623
julia> @time eu92(5_000_000)
139.023717 seconds (138.79 M allocations: 4.950 GiB, 0.34% gc time)
4295843
So, what happened to Julia in mid-May, between DEV.164 & DEV.308? Was there a
deliberate design decision to be less agressive in garbage collection? (My
best guess.) If so, please consider a command-line option so the user can tune
gc agressiveness, somewhat like the --optimize option.