Slow permutations?

natemcintosh · May 13, 2020, 8:27pm

I have a an array of strings, of length ~2,000. I would like to produce all permutations of lengths 1 to n (n is always going to be less than or equal to 2).

The problem: this is taking much longer than I think it should. Am I missing an obvious optimization?

My current attempt is

import Combinatorics

"Return all permutations of sizes 1:`max_length`"
function power_permutations(iterable, max_length)    
    Iterators.flatten(Combinatorics.permutations(iterable, r) for r in 1:max_length)
end

and simply running through it we get

using BenchmarkTools

@benchmark foreach(identity, power_permutations(collect(1:2_000), 2))

BenchmarkTools.Trial: 
  memory estimate:  60.80 GiB
  allocs estimate:  20000015
  --------------
  minimum time:     17.592 s (36.62% GC)
  median time:      17.592 s (36.62% GC)
  mean time:        17.592 s (36.62% GC)
  maximum time:     17.592 s (36.62% GC)
  --------------
  samples:          1
  evals/sample:     1

Woah, that is a lot of memory.

Let’s compare this to a similar version in Python.

import itertools

def power_permutations(iterable, max_length):
    "Return all permutations of sizes 1:`max_length`"
    for r in range(max_length + 1):
        yield from itertools.permutations(iterable, r)

%timeit len(list(power_permutations(list(range(2000)), 2)))                                                                                       
484 ms ± 8.73 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

hmmm, much faster than the Julia version.

Let’s try again with arrays of strings. In Julia:

@benchmark foreach(identity, power_permutations(collect(Iterators.repeated("hello",2_000)), 2))
BenchmarkTools.Trial: 
  memory estimate:  60.80 GiB
  allocs estimate:  20000015
  --------------
  minimum time:     17.404 s (36.79% GC)
  median time:      17.404 s (36.79% GC)
  mean time:        17.404 s (36.79% GC)
  maximum time:     17.404 s (36.79% GC)
  --------------
  samples:          1
  evals/sample:     1

And in Python:

%timeit len(list(power_permutations(["hello"]*2000, 2)))                                                                                          
473 ms ± 3.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

We see more or less the same performance.

Am I missing something obvious here? Is this a shortcoming of the permutations() function in JuliaMath/Combinatorics?

tamasgal · May 13, 2020, 9:02pm

I thought making it a bit more type safe (without checking if it was type safe in first place) and provide a size hint for the output vector would make a big difference, but I only got it down from 18 seconds to 12 seconds, which means the allocations are clearly coming from Combinatorics.permutations:

function power_permutations(iterable, max_length)
    out = Vector{Vector{eltype(iterable)}}()
    sizehint!(out, length(out)^max_length)
    for r ∈ 1:max_length
        for permutation ∈ Combinatorics.permutations(iterable, r)
            push!(out, permutation)
        end
    end
    out
end

julia> @benchmark foreach(identity, power_permutations(collect(1:2_000), 2))
BenchmarkTools.Trial: 
  memory estimate:  60.83 GiB
  allocs estimate:  20000027
  --------------
  minimum time:     12.716 s (12.95% GC)
  median time:      12.716 s (12.95% GC)
  mean time:        12.716 s (12.95% GC)
  maximum time:     12.716 s (12.95% GC)
  --------------
  samples:          1
  evals/sample:     1

tkluck · May 13, 2020, 9:03pm

It looks like Combinatorics uses a pretty inefficient algorithm for generating permutations. There’s allocations every iteration here and here.

It’s probably much better to use Heap’s algorithm; I have an unpackaged version here.

rdeits · May 13, 2020, 9:07pm

I looked at the code for permutations in Combinatorics.jl, and I don’t completely understand what it’s doing. So here’s a pretty simple implementation that seems to perform well for small numbers of permutations. It’ll probably scale badly as the number of permutations increases, but it seems to be reasonably fast on your example:

struct PermutationIterator{T}
    data::T
    length::Int
end

function has_repeats(state)
    for outer in 1:length(state)
        for inner in (outer + 1):length(state)
            if state[outer] == state[inner]
                return true
            end
        end
    end
    return false
end

function increment!(state, max)
    state[end] += 1
    for i in length(state):-1:2
        if state[i] > max
            state[i] = 1
            state[i - 1] += 1
        end
    end
end

function next_permutation!(state, max)
    while true
        increment!(state, max)
        if !has_repeats(state)
            break
        end
    end
end

function Base.iterate(p::PermutationIterator, state = ones(Int, p.length))
    next_permutation!(state, length(p.data))
    if state[1] > length(p.data)
        return nothing
    end
    [p.data[i] for i in state], state
end

Base.length(p::PermutationIterator) = Int(div(factorial(big(length(p.data))), factorial(big(length(p.data) - p.length))))
Base.eltype(p::PermutationIterator) = Vector{eltype(p.data)}

permutations(x, length) = PermutationIterator(x, length)

Compared with Combinatorics.permutations, it’s about 40x faster:

julia> @btime foreach(identity, permutations(1:2000, 2))
  230.554 ms (11994001 allocations: 671.05 MiB)

julia> @btime foreach(identity, Combinatorics.permutations(1:2000, 2))
  9.889 s (19990001 allocations: 60.77 GiB)

It should be possible to do a lot better than what I’ve posted here, particularly if we switch from an iterator producing Vector elements to one that produces Tuple elements

StefanKarpinski · May 13, 2020, 9:18pm

Would be great to at least open an issue on Combinatorics about the slowness. Even if it’s relatively easy to write a bespoke permutation generator, having a fast one in Combinatorics by default would be much better than the current state.

natemcintosh · May 14, 2020, 12:53pm

done

natemcintosh · July 29, 2022, 7:56pm

I finally learned enough that I was able to make a pull request. All tests pass

Topic		Replies	Views
Combinatorics.permutations creating a lot of allocations Performance question	6	146	July 14, 2025
More efficient way of getting permutations New to Julia	3	362	August 13, 2021
Get all Permutations of Array? New to Julia	2	8434	October 18, 2020
With_repetitions_permutations v0.6 VS v1 General Usage question	12	1748	February 1, 2019
In what format does the "permutations" command in the Combinatorics library compute the permutations? General Usage package , combinatorics	2	636	January 12, 2022

Slow permutations?

Related topics