# Compare permutaion of two numbers efficiently?

I’m trying to compare two numbers to check if they have identical digits but in different orders. I extracted the digits and sorted them, then I compare the two extracted vectors to check equality. This works fine, but I feel there is a more efficient way, any ideas? Thanks.

``````function isPerm(n,phi)
n_digits = sort(digits(n))
phi_digits = sort(digits(phi))
n_digits == phi_digits
end

x = rand(9485221:9485229,1000);
y = rand(9485221:9485229,1000);

using BenchmarkTools

@benchmark isPerm.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  566.92 KiB
allocs estimate:  4003
--------------
minimum time:     320.700 μs (0.00% GC)
median time:      327.600 μs (0.00% GC)
mean time:        337.926 μs (1.59% GC)
maximum time:     1.079 ms (0.00% GC)
--------------
samples:          10000
evals/sample:     1
``````

Algorithmically, this should be fine; perhaps you can bail early if the numbers have a different number of digits because then the condition cannot hold. This in turn can be checked with eg `ndigits` before calling `digits`. Benchmark, YMMV depending on how of then you expect this.

``````julia> x = rand(9485221:9485229,1000);

julia> y = rand(9485221:9485229,1000);

julia> @benchmark isPerm2.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  160.67 KiB
allocs estimate:  1003
--------------
minimum time:     62.100 μs (0.00% GC)
median time:      63.400 μs (0.00% GC)
mean time:        76.013 μs (3.00% GC)
maximum time:     1.122 ms (87.89% GC)
--------------
samples:          10000
evals/sample:     1

julia> @benchmark isPerm_old.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  566.92 KiB
allocs estimate:  4003
--------------
minimum time:     303.400 μs (0.00% GC)
median time:      310.600 μs (0.00% GC)
mean time:        337.907 μs (2.39% GC)
maximum time:     1.221 ms (0.00% GC)
--------------
samples:          10000
evals/sample:     1
``````

Doesn’t look too much better, right? But don’t worry, that’s because your initial numbers are small:

``````julia> x = rand(94852219876975:9485229765865123,1000);

julia> y = rand(94852219876975:9485229765865123,1000);

julia> @benchmark isPerm2.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  160.67 KiB
allocs estimate:  1003
--------------
minimum time:     81.700 μs (0.00% GC)
median time:      83.000 μs (0.00% GC)
mean time:        94.922 μs (2.26% GC)
maximum time:     878.900 μs (86.01% GC)
--------------
samples:          10000
evals/sample:     1

julia> @benchmark isPerm_old.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  819.28 KiB
allocs estimate:  4020
--------------
minimum time:     807.300 μs (0.00% GC)
median time:      814.300 μs (0.00% GC)
mean time:        857.359 μs (1.36% GC)
maximum time:     2.376 ms (41.96% GC)
--------------
samples:          5811
evals/sample:     1
``````

Here’s the code:

``````function isPerm2(a,b)
test = fill(0, 10)
while a > 0 && b > 0
a, a_ind = divrem(a, 10)
b, b_ind = divrem(b, 10)
test[a_ind + 1] += 1
test[b_ind + 1] -= 1
end
a != 0 || b != 0 && return false
!any(!=(0), test)
end
``````

Now if only the allocation of the `test` array was elided…

I just noticed, this doesn’t take negative numbers into account 2 Likes

Actually, checking the lengths early had no noticable difference, I guess `ndigits` is not very light?. I’m thinking of an adhoc algorithm to check this equality using a loop or something that can exit early with the first difference. This function is called repeatedly in a hot loop.

That’s hard to do! The first difference isn’t enough, since that might be fixed later on. You could check the remaining number of digits against the remaining number of inconsistencies though.

Thanks, that definitely is significantly faster than original function. And the numbers are guaranteed to be positive integers, so no problem for negatives. Only thing I don’t like is the `test` array with the fixed length of 10.

Hm, I’m not even sure the allocations come from the `test` array. Its statically sized and should be elided.

The size is 10 since there are 10 digits - the algorithm works by getting the current digit via modulo and indexing into the array.

If all your test numbers have the same number of digits (as they did in your benchmark), there’s no gain.

3 Likes

This was really hard to read for some reason. You mean

``````all(==(0), test)
``````

right? (or `all(iszero, test)`)

1 Like

No, that was deliberate - in theory, `any` can bail early, whereas `all` has to check… well `all` elements The array is only 10 elements large, so it shouldn’t matter too much, but I did see smaller average maximum time compared to the same code with `all`. We don’t care if there’s more than one digit that caused a mismatch, we just care for `any` mismatch to occur.

But yes, the result of `!any(!=(0), test)` the same as with `all(==(0), test)`.

Are you sure? `all` should bail immediately if it finds one false.

2 Likes

What happens if you use `StaticArrays.MVector`?

2 Likes

Yes, `StaticArrays.MVector` is 2X faster on my machine. Surprisingly, allocations are the same.

Well, it’s what I see on my machine High Maximum with all
``````julia> function isPerm2(a,b)
test = zeros(UInt, 10)
while a > 0 && b > 0
a, a_ind = divrem(a, 10)
b, b_ind = divrem(b, 10)
test[a_ind + 1] += one(eltype(test))
test[b_ind + 1] -= one(eltype(test))
end
a != 0 || b != 0 && return false
all(==(0), test)
end
isPerm2 (generic function with 1 method)

julia> @benchmark isPerm2.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  160.67 KiB
allocs estimate:  1003
--------------
minimum time:     82.400 μs (0.00% GC)
median time:      83.600 μs (0.00% GC)
mean time:        96.823 μs (3.53% GC)
maximum time:     2.159 ms (83.55% GC)
--------------
samples:          10000
evals/sample:     1

julia> @benchmark isPerm2.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  160.67 KiB
allocs estimate:  1003
--------------
minimum time:     82.500 μs (0.00% GC)
median time:      83.900 μs (0.00% GC)
mean time:        97.933 μs (3.68% GC)
maximum time:     1.457 ms (90.87% GC)
--------------
samples:          10000
evals/sample:     1

julia> @benchmark isPerm2.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  160.67 KiB
allocs estimate:  1003
--------------
minimum time:     82.700 μs (0.00% GC)
median time:      89.400 μs (0.00% GC)
mean time:        117.816 μs (4.09% GC)
maximum time:     4.955 ms (93.30% GC)
--------------
samples:          10000
evals/sample:     1
``````

Faster overall, but this comes at a cost of larger maximum time, significantly so:

``````julia> function isPerm2(a,b)
test = zeros(MVector{10,UInt})
while a > 0 && b > 0
a, a_ind = divrem(a, 10)
b, b_ind = divrem(b, 10)
test[a_ind + 1] += one(eltype(test))
test[b_ind + 1] -= one(eltype(test))
end
a != 0 || b != 0 && return false
!any(!=(0), test)
end
isPerm2 (generic function with 1 method)

julia> @benchmark isPerm2.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  98.17 KiB
allocs estimate:  1003
--------------
minimum time:     56.600 μs (0.00% GC)
median time:      58.000 μs (0.00% GC)
mean time:        69.160 μs (7.61% GC)
maximum time:     2.781 ms (95.85% GC)
--------------
samples:          10000
evals/sample:     1
``````

`@inbounds` to the rescue!

Version with any
``````julia> function isPerm2(a,b)
test = zeros(MVector{10,UInt})
@inbounds while a > 0 && b > 0
a, a_ind = divrem(a, 10)
b, b_ind = divrem(b, 10)
test[a_ind + 1] += one(eltype(test))
test[b_ind + 1] -= one(eltype(test))
end
a != 0 || b != 0 && return false
!any(!=(0), test)
end
isPerm2 (generic function with 1 method)

julia> @benchmark isPerm2.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  4.42 KiB
allocs estimate:  3
--------------
minimum time:     52.900 μs (0.00% GC)
median time:      53.200 μs (0.00% GC)
mean time:        55.344 μs (0.00% GC)
maximum time:     227.600 μs (0.00% GC)
--------------
samples:          10000
evals/sample:     1
``````

And now, the difference between the extra negation from `any` and no negation from `all` is important:

Version with all
``````julia> function isPerm2(a,b)
test = zeros(MVector{10,UInt})
@inbounds while a > 0 && b > 0
a, a_ind = divrem(a, 10)
b, b_ind = divrem(b, 10)
test[a_ind + 1] += one(eltype(test))
test[b_ind + 1] -= one(eltype(test))
end
a != 0 || b != 0 && return false
all(==(0), test)
end
isPerm2 (generic function with 1 method)

julia> @benchmark isPerm2.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  4.42 KiB
allocs estimate:  3
--------------
minimum time:     52.800 μs (0.00% GC)
median time:      53.200 μs (0.00% GC)
mean time:        54.569 μs (0.00% GC)
maximum time:     158.900 μs (0.00% GC)
--------------
samples:          10000
evals/sample:     1
``````

Notably, `@inbounds` doesn’t help if the array isn’t an `MVector`:

Regular zeros call
``````julia> function isPerm2(a,b)
test = zeros(UInt, 10)
@inbounds while a > 0 && b > 0
a, a_ind = divrem(a, 10)
b, b_ind = divrem(b, 10)
test[a_ind + 1] += one(eltype(test))
test[b_ind + 1] -= one(eltype(test))
end
a != 0 || b != 0 && return false
all(==(0), test)
end
isPerm2 (generic function with 1 method)

julia> @benchmark isPerm2.(\$x,\$y)
BenchmarkTools.Trial:
memory estimate:  160.67 KiB
allocs estimate:  1003
--------------
minimum time:     78.500 μs (0.00% GC)
median time:      79.800 μs (0.00% GC)
mean time:        93.693 μs (4.20% GC)
maximum time:     3.420 ms (93.06% GC)
--------------
samples:          10000
evals/sample:     1
``````

For the ultimate speed, switch from broadcasting to `map` to save 2 more allocations, leaving only the allocation of the result array:

Map for speed
``````julia> function isPerm2(a,b)
test = zeros(MVector{10,UInt})
@inbounds while a > 0 && b > 0
a, a_ind = divrem(a, 10)
b, b_ind = divrem(b, 10)
test[a_ind + 1] += one(eltype(test))
test[b_ind + 1] -= one(eltype(test))
end
a != 0 || b != 0 && return false
all(==(0), test)
end
isPerm2 (generic function with 1 method)

julia> @benchmark map(a -> isPerm2(a...), zip(\$x,\$y))
BenchmarkTools.Trial:
memory estimate:  1.06 KiB
allocs estimate:  1
--------------
minimum time:     51.600 μs (0.00% GC)
median time:      51.900 μs (0.00% GC)
mean time:        54.417 μs (0.00% GC)
maximum time:     142.000 μs (0.00% GC)
--------------
samples:          10000
evals/sample:     1
``````
1 Like

I don’t buy it. There’s something fishy going on. `all` shouldn’t be slower than `any`. They definitely have the same bailout behaviour.

Edit: I can’t see any difference between any and all in your benchmark.

It’s the maximum time - 227µs vs 158µs. In the latest version with `@inbounds`, I think that’s due to the extra `!` the `any` version has to do.

In the versions before that, I don’t have better statistics myself It’s just what I saw across repeated runs, I don’t have an explanation for that either.

That’s just some random fluctuation from the OS or whatever, no? Anyway, in that case, it’s the `all` version that is faster.

For clarity, you can write the return statement as

``````return iszero(a) && iszero(b) && all(iszero, test)
``````

or

``````return a == 0 && b == 0 && all(==(0), test)
``````

if you prefer.

1 Like