Too many allocations for this couple of small functions

Hi there, newbie here

solved Project Euler #4 problem with Julia and, althogh the timing is good, I got too many allocations (around 100k) for such a small program.
I’ve tried with explicit types in the variables and didn’t get any improvements, so obviously I’m doing something wrong.

Here is the code

> function ispalindrome(n)
>     for i in zip(digits(n), reverse!(digits(n)))
>         if i[1] ≠ i[end]
>             return false
>         end
>     end
>     return true
> end
> 
> 
> function Problem4()
>     #=
>     A palindromic number reads the same both ways. The largest palindrome made
>     from the product of two 2-digit numbers is 9009 = 91 x 99.
> 
>     Find the largest palindrome made from the product of two 3-digit numbers. 
>     =#
> 
>     ans = 10_001
>     for i = 100:999, j = i:999
>         if (i*j > ans) && ispalindrome(i*j)
>             ans = i*j
>         end
>     end
>     return ans
> 
> end
>
> @btime Problem4()
> println("Result: ", Problem4())

The digits function creates an array each time you call it. That storage space requires the expensive allocation.

And you perform it twice for each n, which is an additional expense.

I would suggest:

  1. If you create such an array, create it only once and use something like digit_array[index] != digit_array[end+1-index] when traversing it.
  2. Try to not even create the array, rather make your own “get_nth_digit” function that uses modulo division and remainders.
  3. If you prefer to create such an array in advance (because it is much simpler than creating a “get_nth_digit”), create a “scratch space” buffer array that gets overwritten and reused on each iteration. Such buffers are a common way to minimize allocations, because overwriting an already allocated array is drastically faster than searching for enough free space on the heap to allocate a new array. You might need to write a new function digits!(buffer,n) which writes the digits of n in the already allocated array buffer.

3 would probably be my preferred solution over 2.

A side note: explicit types in the arguments of functions do not provide speedup in Julia (with some caveats to that statement). Types in function signatures mainly serve to organize how multiple dispatch would be happening (when you have multiple definitions of the same function with different type signatures). If the type signature is not specified, Julia would still compile a specific implementation for the type being encountered, so things will still be fast.

On the other hand, types specified in the definition of structs can be very important for performance, as having a full type specification at compile time permits the compiler to not use “boxed” objects (objects with unknown type that needs to be checked at runtime).

5 Likes

I would go with a get_first_digit and get_last_digit, compare them, and then recurse on the remaining digits. If you use divrem and precalculate the length of the number (in digits), you can do this with zero allocations. It will also bail out early, so if the first and last digits are different, ispalindrome runs in nearly zero time.

Furthermore, in the main loop of Problem4, make sure to iterate from large numbers to small ones instead of from small to large. Then bail out early from each loop.

You’ve already got everything you need I believe but just a small addition to make the issue obvious:

julia> f(x) = (digits(x); reverse!(digits(x)))
f (generic function with 1 method)

julia> @btime f(12345)
  72.485 ns (2 allocations: 192 bytes)

That’s often how I go about performance tuning - start with the smallest, simplest bit of the code that you can isolate, benchmark it, think about whether it can be improved/allocations reduced.

2 Likes

Here’s an implementation of ispalindrome for comparison

ispalindrome(x::Integer, numdigits=ndigits(x)) = _ispalindrome(x, 10^(numdigits-2))
function _ispalindrome(x::Integer, divider)
    x < 10 && return iszero(x) || iszero(divider)
    (x, digit1) = divrem(x, 10)
    (digit2, x) = divrem(x, divider)
    return digit1 == digit2 && _ispalindrome(x, div(divider, 100))
end

If I try that on 12345 I get

julia> @btime ispalindrome(12345)
  9.710 ns (0 allocations: 0 bytes)
false

Trying it on an actual palindrome, it’s a bit slower:

julia> @btime ispalindrome(12321)
  21.386 ns (0 allocations: 0 bytes)
true

But if you know the length of your number (which you actually do in your application), there’s a significant speedup:

julia> @btime ispalindrome(12321, 5)   # actual palindrome
  10.711 ns (0 allocations: 0 bytes)
true

julia> @btime ispalindrome(12345, 5)  # early bailout
  1.900 ns (0 allocations: 0 bytes)
false

The vast majority of your tests will be false, and you can bail out early, meaning that you can do palindrome testing very fast indeed. One caveat: Benchmark results around ~1ns can be unreliable, but I think it’s safe to say we are close to optimal.

Edit: For longer integers, using the longest possible Int64 palindrome:

julia> @btime ispalindrome(1234567890987654321)
  74.743 ns (0 allocations: 0 bytes)
true

julia> @btime ispalindrome(1234567890987654321, 19)  # supplying numdigits
  60.183 ns (0 allocations: 0 bytes)
true

With early bailout for non-palindrome:

julia> @btime ispalindrome(1234567890987654327)
  16.000 ns (0 allocations: 0 bytes)
false

julia> @btime ispalindrome(1234567890987654327, 19)  # supplying numdigits
  1.200 ns (0 allocations: 0 bytes)
false
2 Likes