Converting CartesianIndices to a vector of vectors

Gus_Hart · August 14, 2023, 4:18pm

I’m using CartesianIndices to generate lists resembling base-k numbers. I want the convert the output into a “flat” vector of vectors, irrespective of the number of “digits” passed to CartesianIndices. The lists will generally be short (less than a thousand or so) and I’ll be doing this many, many times for all kinds of cases. I want to do it in a good, Julian way, but I feel that my best solution is a bit roundabout. Here’s my solution:

[[j for j in Tuple(i)] for i in CartesianIndices((2,2,2))][1:end]|>sort

8-element Vector{Vector{Int64}}:
 [1, 1, 1]
 [1, 1, 2]
 [1, 2, 1]
 [1, 2, 2]
 [2, 1, 1]
 [2, 1, 2]
 [2, 2, 1]
 [2, 2, 2]

Any suggested improvements?

mbauman · August 14, 2023, 4:29pm

Something like this?

julia> vec(collect.(Tuple.(CartesianIndices((2,2,2))))) |> sort
8-element Vector{Vector{Int64}}:
 [1, 1, 1]
 [1, 1, 2]
 [1, 2, 1]
 [1, 2, 2]
 [2, 1, 1]
 [2, 1, 2]
 [2, 2, 1]
 [2, 2, 2]

frylock · August 14, 2023, 4:57pm

Yours:

using BenchmarkTools
@btime [[j for j in Tuple(i)] for i in CartesianIndices((2,2,2))][1:end]|>sort
  355.623 ns (11 allocations: 992 bytes)

@mbauman’s:

@btime sort(vec(collect.(Tuple.(CartesianIndices((2,2,2))))))
  333.601 ns (12 allocations: 960 bytes)

I tried this:

@btime [Tuple(c) for c in CartesianIndices((2, 2, 2))][:] |> sort
  134.421 ns (3 allocations: 768 bytes)

I also tried this, but it is slower:

using Base.Iterators
@btime collect(Tuple(product(1:2, 1:2, 1:2))) |> sort
  458.046 ns (12 allocations: 1.20 KiB)

frylock · August 14, 2023, 5:10pm

Also, taking into count that the CartesianIndices seems to increment the left-most index most rapidly:

@btime [reverse(Tuple(c)) for c in CartesianIndices((2, 2, 2))][:]
  68.113 ns (2 allocations: 512 bytes)
 (1, 1, 1)
 (1, 1, 2)
 (1, 2, 1)
 (1, 2, 2)
 (2, 1, 1)
 (2, 1, 2)
 (2, 2, 1)
 (2, 2, 2)

Gus_Hart · August 14, 2023, 5:20pm

The reverse is a good idea. The main speed difference between yours and @mbauman is the missing collect. His returns vectors but yours tuples. That makes me wonder if I really need vectors, but right now that is how my code works.

Gus_Hart · August 14, 2023, 5:21pm

Lots of nice improvements. You eliminated two list comprehensions with the dots. I don’t know why I didn’t see that…

frylock · August 14, 2023, 5:22pm

I was wondering about my use of Tuple vs. Vector … both of you had pretty fast implementations to start with though. I tried Tuple in order to flatten the CartesianIndices … perhaps there is a better way to do that, that would also allow it to be done with Vector.

Gus_Hart · August 14, 2023, 5:24pm

Getting rid of the sort really helps. Thanks. I just tried several other versions, and getting rid of sort helps more anything else.

Gus_Hart · August 14, 2023, 5:32pm

Why is it 5x slower in a function?!

julia> @btime [collect(reverse(Tuple(c))) for c in CartesianIndices(ntuple(i->2,3))][:]
  280.923 ns (10 allocations: 880 bytes)
8-element Vector{Vector{Int64}}:
 [1, 1, 1]
 [1, 1, 2]
 [1, 2, 1]
 [1, 2, 2]
 [2, 1, 1]
 [2, 1, 2]
 [2, 2, 1]
 [2, 2, 2]

versus

function getSvecCands(k,n)
    return vec(collect.(reverse.(Tuple.(CartesianIndices(ntuple(i->k-1,n))))))
end

julia> @btime getSvecCands(3,3)
  1.433 μs (19 allocations: 1.16 KiB)
8-element Vector{Vector{Int64}}:
 [1, 1, 1]
 [1, 1, 2]
 [1, 2, 1]
 [1, 2, 2]
 [2, 1, 1]
 [2, 1, 2]
 [2, 2, 1]
 [2, 2, 2]

stevengj · August 14, 2023, 5:38pm

Probably because here n is a runtime value, not a literal constant: that makes ntuple type-unstable. If you want tuples to be fast, you need to make sure the compiler knows the length — either because it is a literal constant, or because you passed it as a type (via Val(n)).

(If you want arrays of small fixed-length vectors, it’s always going to be better to use either arrays of tuples or arrays of StaticArrays, provided that the length of the vectors is known at compile time.)

Gus_Hart · August 14, 2023, 5:46pm

Ah, brilliant. Thanks for the education. I’m still learning…

julia> @btime getSvecCands(3,Val(3))
  238.700 ns (10 allocations: 880 bytes)
8-element Vector{Vector{Int64}}:
 [1, 1, 1]
 [1, 1, 2]
 [1, 2, 1]
 [1, 2, 2]
 [2, 1, 1]
 [2, 1, 2]
 [2, 2, 1]
 [2, 2, 2]

I also changed the function and the REPL version to be identical, which did make it more of an apples and apples comparison, but the change for 3 => Val(3) made the lion’s share of the difference.

DNF · August 14, 2023, 5:46pm

The most ‘Julian’ way is probably to not do it at all, or at least only lazily. The most efficient thing is to do nothing, avoid redundancies and allocations. I don’t know exactly what you want this for, but it smells a bit of ‘unnecessary wastefulness’

Maybe you can give some more context?

Gus_Hart · August 15, 2023, 1:21am

Hmm. That is good feedback.

Each one of my vectors represents a polynomial, a polynomial that is invariant under the action of a group (it has some symmetry in it, for the non-groupists). But some of these polynomials are equivalent by symmetry (they are on the same orbit). I need to go through each list, find the equivalent ones and delete them from the list.

So I was going to go through the list, apply the group action (as a permutation of the indices) and then keep a running list of the unique ones…now that you make me explain it…I might be able to use a custom hash function to turn the indices into a base-10 number and then just keep track of the unique polynomials in a hash table…