# Mysterious allocations in nested scalar loops

I am writing some code to solve Project Euler #60. Given an upper bound N, the program uses 5 nested for loops to find the smallest sum of quintuples (p_1, p_2, p_3, p_4, p_5) of prime numbers, p_i \leqslant N such that concatenating any two of them in any order produces a prime. What surprises me is that there is only scalar code in the loops, yet benchmarking shows a bazillion allocations.

``````julia> @btime euler60(10_000)
712.060 ms (2325284 allocations: 35.49 MiB)
26033
``````
Full code
``````using Primes

function num_digits(n)
d = 0
while n > 0
d += 1
n Γ·= 10
end
d
end

function concatenate(a, b)
d = num_digits(b)
10^d * a + b
end

# check if a prime p can be extended to a tuple of already found primes
for q in values
if !isprime(concatenate(p, q)) || !isprime(concatenate(q, p))
return false
end
end
true
end

function euler60(upper_bound)
prime_list = Primes.primes(3, upper_bound)
n = length(prime_list)
min_sum = typemax(Int)

@inbounds for i1 in 1:(n - 4)
p1 = prime_list[i1]
5p1 > min_sum && break
for i2 in (i1 + 1):(n - 3)
p2 = prime_list[i2]
p1 + 4p2 >= min_sum && break

for i3 in (i2 + 1):(n - 2)
p3 = prime_list[i3]
p1 + p2 + 3p3 >= min_sum && break

for i4 in (i3 + 1):(n - 1)
p4 = prime_list[i4]
p1 + p2 + p3 + 2p4 >= min_sum && break
is_admissible((p1, p2, p3), p4) || continue

for i5 in (i4 + 1):n
p5 = prime_list[i5]
s = p1 + p2 + p3 + p4 + p5
s >= min_sum && break
is_admissible((p1, p2, p3, p4), p5) || continue

# println("primes: \$p1 \$p2 \$p3 \$p4 \$p5, sum = \$s")
min_sum = s
end
end
end
end
end
min_sum
end
``````

There doesnβt seem to be any type instabilities either.

@code_warntype output
``````julia> @code_warntype euler60(10_000)
MethodInstance for euler60(::Int64)
from euler60(upper_bound) @ Main ~/project-euler/euler60.jl:72
Arguments
#self#::Core.Const(euler60)
upper_bound::Int64
Locals
@_3::Union{Nothing, Tuple{Int64, Int64}}
val::Nothing
min_sum::Int64
n::Int64
prime_list::Vector{Int64}
@_8::Union{Nothing, Tuple{Int64, Int64}}
i1::Int64
p1::Int64
@_11::Union{Nothing, Tuple{Int64, Int64}}
i2::Int64
p2::Int64
@_14::Union{Nothing, Tuple{Int64, Int64}}
i3::Int64
p3::Int64
@_17::Union{Nothing, Tuple{Int64, Int64}}
i4::Int64
p4::Int64
i5::Int64
s::Int64
p5::Int64
Body::Int64
1 ββ        Core.NewvarNode(:(val))
β    %2   = Primes.primes::Core.Const(Primes.primes)
β           (prime_list = (%2)(3, upper_bound))
β           (n = Main.length(prime_list))
β           (min_sum = Main.typemax(Main.Int))
β           nothing
β    %7   = (n - 4)::Int64
β    %8   = (1:%7)::Core.PartialStruct(UnitRange{Int64}, Any[Core.Const(1), Int64])
β           (@_3 = Base.iterate(%8))
β    %10  = (@_3 === nothing)::Bool
β    %11  = Base.not_int(%10)::Bool
ββββ        goto #39 if not %11
2 ββ        Core.NewvarNode(:(@_8))
β    %14  = @_3::Tuple{Int64, Int64}
β           (i1 = Core.getfield(%14, 1))
β    %16  = Core.getfield(%14, 2)::Int64
β           (p1 = Base.getindex(prime_list, i1))
β    %18  = (5 * p1)::Int64
β    %19  = (%18 > min_sum)::Bool
ββββ        goto #4 if not %19
3 ββ        goto #39
4 ββ %22  = (i1 + 1)::Int64
β    %23  = (n - 3)::Int64
β    %24  = (%22:%23)::UnitRange{Int64}
β           (@_8 = Base.iterate(%24))
β    %26  = (@_8 === nothing)::Bool
β    %27  = Base.not_int(%26)::Bool
ββββ        goto #37 if not %27
5 ββ        Core.NewvarNode(:(@_11))
β    %30  = @_8::Tuple{Int64, Int64}
β           (i2 = Core.getfield(%30, 1))
β    %32  = Core.getfield(%30, 2)::Int64
β           (p2 = Base.getindex(prime_list, i2))
β    %34  = p1::Int64
β    %35  = (4 * p2)::Int64
β    %36  = (%34 + %35)::Int64
β    %37  = (%36 >= min_sum)::Bool
ββββ        goto #7 if not %37
6 ββ        goto #37
7 ββ %40  = Main.is_admissible(p1, p2)::Bool
ββββ        goto #9 if not %40
8 ββ        goto #10
9 ββ        goto #35
10 β %44  = (i2 + 1)::Int64
β    %45  = (n - 2)::Int64
β    %46  = (%44:%45)::UnitRange{Int64}
β           (@_11 = Base.iterate(%46))
β    %48  = (@_11 === nothing)::Bool
β    %49  = Base.not_int(%48)::Bool
ββββ        goto #35 if not %49
11 β        Core.NewvarNode(:(@_14))
β    %52  = @_11::Tuple{Int64, Int64}
β           (i3 = Core.getfield(%52, 1))
β    %54  = Core.getfield(%52, 2)::Int64
β           (p3 = Base.getindex(prime_list, i3))
β    %56  = p1::Int64
β    %57  = p2::Int64
β    %58  = (3 * p3)::Int64
β    %59  = (%56 + %57 + %58)::Int64
β    %60  = (%59 >= min_sum)::Bool
ββββ        goto #13 if not %60
12 β        goto #35
13 β %63  = Core.tuple(p1, p2)::Tuple{Int64, Int64}
ββββ        goto #15 if not %64
14 β        goto #16
15 β        goto #33
16 β %68  = (i3 + 1)::Int64
β    %69  = (n - 1)::Int64
β    %70  = (%68:%69)::UnitRange{Int64}
β           (@_14 = Base.iterate(%70))
β    %72  = (@_14 === nothing)::Bool
β    %73  = Base.not_int(%72)::Bool
ββββ        goto #33 if not %73
17 β        Core.NewvarNode(:(@_17))
β    %76  = @_14::Tuple{Int64, Int64}
β           (i4 = Core.getfield(%76, 1))
β    %78  = Core.getfield(%76, 2)::Int64
β           (p4 = Base.getindex(prime_list, i4))
β    %80  = p1::Int64
β    %81  = p2::Int64
β    %82  = p3::Int64
β    %83  = (2 * p4)::Int64
β    %84  = (%80 + %81 + %82 + %83)::Int64
β    %85  = (%84 >= min_sum)::Bool
ββββ        goto #19 if not %85
18 β        goto #33
19 β %88  = Core.tuple(p1, p2, p3)::Tuple{Int64, Int64, Int64}
ββββ        goto #21 if not %89
20 β        goto #22
21 β        goto #31
22 β %93  = (i4 + 1)::Int64
β    %94  = (%93:n)::UnitRange{Int64}
β           (@_17 = Base.iterate(%94))
β    %96  = (@_17 === nothing)::Bool
β    %97  = Base.not_int(%96)::Bool
ββββ        goto #31 if not %97
23 β %99  = @_17::Tuple{Int64, Int64}
β           (i5 = Core.getfield(%99, 1))
β    %101 = Core.getfield(%99, 2)::Int64
β           (p5 = Base.getindex(prime_list, i5))
β           (s = p1 + p2 + p3 + p4 + p5)
β    %104 = (s >= min_sum)::Bool
ββββ        goto #25 if not %104
24 β        goto #31
25 β %107 = Core.tuple(p1, p2, p3, p4)::NTuple{4, Int64}
ββββ        goto #27 if not %108
26 β        goto #28
27 β        goto #29
28 β %112 = Base.string("primes: ", p1, " ", p2, " ", p3, " ", p4, " ", p5, ", sum = ", s)::String
β           Main.println(%112)
ββββ        (min_sum = s)
29 β        (@_17 = Base.iterate(%94, %101))
β    %116 = (@_17 === nothing)::Bool
β    %117 = Base.not_int(%116)::Bool
ββββ        goto #31 if not %117
30 β        goto #23
31 β        (@_14 = Base.iterate(%70, %78))
β    %121 = (@_14 === nothing)::Bool
β    %122 = Base.not_int(%121)::Bool
ββββ        goto #33 if not %122
32 β        goto #17
33 β        (@_11 = Base.iterate(%46, %54))
β    %126 = (@_11 === nothing)::Bool
β    %127 = Base.not_int(%126)::Bool
ββββ        goto #35 if not %127
34 β        goto #11
35 β        (@_8 = Base.iterate(%24, %32))
β    %131 = (@_8 === nothing)::Bool
β    %132 = Base.not_int(%131)::Bool
ββββ        goto #37 if not %132
36 β        goto #5
37 β        (@_3 = Base.iterate(%8, %16))
β    %136 = (@_3 === nothing)::Bool
β    %137 = Base.not_int(%136)::Bool
ββββ        goto #39 if not %137
38 β        goto #2
39 β        (val = nothing)
β           nothing
β           val
ββββ        return min_sum
``````

System information

``````julia> versioninfo()
Julia Version 1.10.0-DEV.15
Commit 3e7d796c17 (2022-11-16 23:01 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: 16 Γ 12th Gen Intel(R) Core(TM) i5-12500H
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, alderlake)
Threads: 16 on 16 virtual cores
``````

When I run your code I get:

``````julia> euler60(10_000)
ERROR: MethodError: no method matching is_admissible(::Int64, ::Int64)
``````

So you may have some stray method definition lying around.

The reason for the error is in this line:

which should be

``````is_admissible((p1,), p2) || continue  # note extra comma to get a tuple
``````

After fixing this, I still get the same amount of allocations, though.

Another note: this seems to beg for a recursive solution. One loop depth per tuple length seems untenable. What will you do for an octuple, or a quinvigintuple?

1 Like

Itβs probably the `isprime` function. It allocates for some large integers.

julia> @btime isprime(65521)
3.268 ns (0 allocations: 0 bytes)
true

julia> @btime isprime(65537)
212.412 ns (1 allocation: 16 bytes)
true

julia> @btime isprime(6553765601)
4.250 ΞΌs (0 allocations: 0 bytes)
true

julia> @btime isprime(1000723)
474.679 ns (1 allocation: 16 bytes)
true

1 Like

I did some backtracking at first but donβt know how to efficiently prune the search tree (for example if `5p1 > min_sum` I want to stop immediately instead of unwinding the recursion stack).

The main difficulty here is that you donβt know how large the largest number can get . It could hypothetically be the case that 3, 7, 109, 673, 19xxx produces a smaller sum. So to be absolutely sure you need to check with the upper bound 26,000, which is much faster with the nested for loops approach than naive recursion.

Thanks, that seems to be the culprit. I wonder if those allocations negatively affect performance too much.

Seems like this could be fixed with some refactoring, if someone wants to make a PR to Primes.jl:

1 Like