I’m trying to compare two numbers to check whether they have identical digits in a different order. I extract the digits and sort them, then compare the two resulting vectors for equality. This works fine, but I feel there is a more efficient way. Any ideas? Thanks.
function isPerm(n, phi)
    n_digits = sort(digits(n))
    phi_digits = sort(digits(phi))
    return n_digits == phi_digits
end
x = rand(9485221:9485229,1000);
y = rand(9485221:9485229,1000);
using BenchmarkTools
@benchmark isPerm.($x,$y)
BenchmarkTools.Trial:
memory estimate: 566.92 KiB
allocs estimate: 4003
--------------
minimum time: 320.700 μs (0.00% GC)
median time: 327.600 μs (0.00% GC)
mean time: 337.926 μs (1.59% GC)
maximum time: 1.079 ms (0.00% GC)
--------------
samples: 10000
evals/sample: 1
Algorithmically, this should be fine; perhaps you can bail out early if the numbers have a different number of digits, because then the condition cannot hold. This in turn can be checked with e.g. ndigits before calling digits. Benchmark below; YMMV depending on how often you expect this case.
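As a quick sketch, the ndigits early exit bolted onto the original sorting approach could look like this (isPermEarly is a hypothetical name; the isPerm2 shown further down takes a different, divrem-based approach):

```julia
# Hypothetical sketch: bail out early when the digit counts differ,
# before paying for the allocations in digits() and sort().
function isPermEarly(n, phi)
    ndigits(n) == ndigits(phi) || return false
    return sort(digits(n)) == sort(digits(phi))
end
```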
julia> x = rand(9485221:9485229,1000);
julia> y = rand(9485221:9485229,1000);
julia> @benchmark isPerm2.($x,$y)
BenchmarkTools.Trial:
memory estimate: 160.67 KiB
allocs estimate: 1003
--------------
minimum time: 62.100 μs (0.00% GC)
median time: 63.400 μs (0.00% GC)
mean time: 76.013 μs (3.00% GC)
maximum time: 1.122 ms (87.89% GC)
--------------
samples: 10000
evals/sample: 1
julia> @benchmark isPerm_old.($x,$y)
BenchmarkTools.Trial:
memory estimate: 566.92 KiB
allocs estimate: 4003
--------------
minimum time: 303.400 μs (0.00% GC)
median time: 310.600 μs (0.00% GC)
mean time: 337.907 μs (2.39% GC)
maximum time: 1.221 ms (0.00% GC)
--------------
samples: 10000
evals/sample: 1
Doesn’t look too much better, right? But don’t worry, that’s because your initial numbers are small:
julia> x = rand(94852219876975:9485229765865123,1000);
julia> y = rand(94852219876975:9485229765865123,1000);
julia> @benchmark isPerm2.($x,$y)
BenchmarkTools.Trial:
memory estimate: 160.67 KiB
allocs estimate: 1003
--------------
minimum time: 81.700 μs (0.00% GC)
median time: 83.000 μs (0.00% GC)
mean time: 94.922 μs (2.26% GC)
maximum time: 878.900 μs (86.01% GC)
--------------
samples: 10000
evals/sample: 1
julia> @benchmark isPerm_old.($x,$y)
BenchmarkTools.Trial:
memory estimate: 819.28 KiB
allocs estimate: 4020
--------------
minimum time: 807.300 μs (0.00% GC)
median time: 814.300 μs (0.00% GC)
mean time: 857.359 μs (1.36% GC)
maximum time: 2.376 ms (41.96% GC)
--------------
samples: 5811
evals/sample: 1
Here’s the code:
function isPerm2(a, b)
    test = fill(0, 10)
    while a > 0 && b > 0
        a, a_ind = divrem(a, 10)
        b, b_ind = divrem(b, 10)
        test[a_ind + 1] += 1
        test[b_ind + 1] -= 1
    end
    # note the parentheses: `&&` binds tighter than `||`, so without them
    # this would parse as `a != 0 || (b != 0 && return false)`
    (a != 0 || b != 0) && return false
    return !any(!=(0), test)
end
Now if only the allocation of the test array was elided…
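One way to elide the array entirely (my own sketch, not code from the thread; isPerm3 is a hypothetical name) is to pack the ten per-digit counts into a single UInt64: an Int64 has at most 19 decimal digits, so every count fits in a 6-bit field, and 10 fields × 6 bits = 60 bits fit in one machine word.

```julia
# Sketch of an allocation-free variant: pack per-digit counts into a UInt64.
# Assumes nonnegative inputs, as the thread states.
function isPerm3(a, b)
    ndigits(a) == ndigits(b) || return false
    ca = zero(UInt64)
    cb = zero(UInt64)
    while a > 0
        a, d = divrem(a, 10)
        ca += UInt64(1) << (6d)   # 6-bit field per digit value; counts stay < 64
    end
    while b > 0
        b, d = divrem(b, 10)
        cb += UInt64(1) << (6d)
    end
    # since no field can overflow into its neighbor, equal packed words
    # mean equal digit multisets
    return ca == cb
end
```

Counting each number into its own word avoids the negative-count problem that bit-packing would otherwise have with the increment/decrement scheme above.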
I just noticed, this doesn’t take negative numbers into account
Actually, checking the lengths early made no noticeable difference; I guess ndigits is not very cheap? I’m thinking of an ad hoc algorithm that checks this equality with a loop and can exit early at the first difference. This function is called repeatedly in a hot loop.
That’s hard to do! The first difference isn’t enough, since that might be fixed later on. You could check the remaining number of digits against the remaining number of inconsistencies though.
Thanks, that is definitely significantly faster than the original function. And the numbers are guaranteed to be positive integers, so negatives are not a problem. The only thing I don’t like is the test array with its fixed length of 10.
No, that was deliberate: in theory, any can bail early, whereas all has to check… well, all elements. The array is only 10 elements long, so it shouldn’t matter too much, but I did see a smaller average maximum time compared to the same code with all. We don’t care whether more than one digit caused a mismatch; we just care whether any mismatch occurred.
But yes, the result of !any(!=(0), test) is the same as that of all(==(0), test).
It’s the maximum time: 227 µs vs. 158 µs. In the latest version with @inbounds, I think that’s due to the extra ! the any version has to do.
For the versions before that, I don’t have better statistics myself; it’s just what I saw across repeated runs, and I don’t have an explanation for it either.