Elegant way to construct a set of multi-index


#1

Good evening,

I wish to build an array containing specific multi-index. More precisely, in the 2-D case the multi-index are given by (i, j) where 0 <= i , j <= N+1 such that i+j <= N+1. The way I do it is make a (N+1)x(N+1) array and cut it in two along its anti-diagonal, namely

βtab = CartesianIndices((N+1, N+1))
βtab = Tuple.( vec(rotl90( UpperTriangular( rotr90(βtab) ) )) )
filter!(x -> sum(x) ≠ 0, βtab)

Does Julia have features that could be used to implement this in a more elegant way ?


#3

I’d just use logical indexing and construct the logical array just like you explained it:

julia> N = 5; A = reshape(1:N^2, N, N);

julia> B = [i+j <= N+1 for i in 1:N, j in 1:N]
5×5 Array{Bool,2}:
 true   true   true   true   true
 true   true   true   true  false
 true   true   true  false  false
 true   true  false  false  false
 true  false  false  false  false

julia> A[B]
15-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 11
 12
 13
 16
 17
 21

#4

Second attempt.

julia> N = 5
julia> a=[ ]
julia> for j in 1:N+1
           a=[a;[(i,j) for i in j:N+1]]
       end
julia> a
1-element Array{Tuple{Int64,Int64},1}:
 (1, 1)
 (2, 1)
 (3, 1)
 (4, 1)
 (5, 1)
 (6, 1)
 (1, 2)
 (2, 2)
 (3, 2)
 (4, 2)
 (5, 2)
 (1, 3)
 (2, 3)
 (3, 3)
 (4, 3)
 (1, 4)
 (2, 4)
 (3, 4)
 (1, 5)
 (2, 5)
 (1, 6)

#5

Thank you for the suggestions! I should have been more precise, my goal is to implement it so that Julia can automatically treat the N-D case.
From both suggestions, it would mean that Julia can automatically increase the number of loop (since they rely on [something for i in I]).
I cannot seem to exploit the geometry of the problem. The code should link M+1 nodes with straight lines and fill the interior. That is, in 2D we link (0,0), (N,0) and (0,N) [triangle]. In 3D we link (0,0,0), (N,0,0), (0,N,0) and (0,0,N) [tetrahedron].


#6

Attempt 3 with caveats

  • it is not elegant but it gets the job done.
  • Your explanation of the geometry of the problem got me confused. I’m not sure if my solution addresses the correct problem.
macro makenodes(N,D)
    quote
        a = []
        @makenodes($N,$D,1)
        a
    end
end
macro makenodes(N, D, i)
    varname = Symbol(:d, i)
    if i > D
        # generate the statement inside the inner loop
        vars = ()
        for j in 1:D
            vars = (vars..., Symbol(:d,j))
        end
        expr1 = Expr(:call, :+, vars...)
        expr2 = Expr(:call, :push!, :a, Expr(:tuple, vars...))
        return Expr(:if, Expr(:call, :(<=), expr1, N+1), expr2)
    else
        # generate as many loops as there are dimensions
        return quote
            for $varname in 0:$(N+1)
                @makenodes($N, $D, $(i+1))
            end
        end
    end
 end
julia> @makenodes(5,3)
84-element Array{Any,1}:
 (0, 0, 0)
 (0, 0, 1)
 (0, 0, 2)
 (0, 0, 3)
 (0, 0, 4)
 (0, 0, 5)
 (0, 0, 6)
 (0, 1, 0)
 (0, 1, 1)
 (0, 1, 2)
 (0, 1, 3)
 (0, 1, 4)
 (0, 1, 5)
 ⋮        
 (3, 2, 1)
 (3, 3, 0)
 (4, 0, 0)
 (4, 0, 1)
 (4, 0, 2)
 (4, 1, 0)
 (4, 1, 1)
 (4, 2, 0)
 (5, 0, 0)
 (5, 0, 1)
 (5, 1, 0)
 (6, 0, 0)

#7

You could do something with the @nloops macro in Base.Cartesian, for example:

@generated function gentups(::Val{d}, N::Integer) where {d}
    quote
        L = NTuple{d,Int}[]
        $(Symbol("i_",d+1)) = 1
        @nloops $d i (j->i_{j+1}:N) begin
            push!(L, @ntuple $d j->i_j)
        end
        return L
    end
end
gentups(d::Integer, N::Integer) = gentups(Val(Int(d)), Int(N))

so that we get, for example:

julia> gentups(3, 5)
35-element Array{Tuple{Int64,Int64,Int64},1}:
 (1, 1, 1)
 (2, 1, 1)
 (3, 1, 1)
 (4, 1, 1)
 (5, 1, 1)
 (2, 2, 1)
 (3, 2, 1)
 (4, 2, 1)
 (5, 2, 1)
 (3, 3, 1)
 (4, 3, 1)
 (5, 3, 1)
 (4, 4, 1)
 ⋮        
 (5, 4, 2)
 (5, 5, 2)
 (3, 3, 3)
 (4, 3, 3)
 (5, 3, 3)
 (4, 4, 3)
 (5, 4, 3)
 (5, 5, 3)
 (4, 4, 4)
 (5, 4, 4)
 (5, 5, 4)
 (5, 5, 5)

(I’m not sure if this is the output you want, or if you also want all permutations of each tuple?)


#8

@jandehaan Oh nice. That seems to be it. We build all the m-tuple = (x_1, …, x_m) (where m = dimension) whose entry are the integer from 0 to N (you chose 0 to N+1 in your function) satisfying x_1 + … + x_m <= N.
The geometry approach did not seem to give anything pretty anyways.

@stevengj I have to look more precisely, but it does not seem to give the right tuples. If I set N= 2 and d = 3 I got an empty array. However, thank you for showing me the @nloops macro! That will definitely come in handy.


#9

I had a bug, which is now corrected above. It now gives:

julia> gentups(3,2)
4-element Array{Tuple{Int64,Int64,Int64},1}:
 (1, 1, 1)
 (2, 1, 1)
 (2, 2, 1)
 (2, 2, 2)

#10

Indeed, as you said, it is just missing the permutations to get the full structure! I am looking into adding this into your code.


#11

Ok so there was a slight issue with the code. From my understanding, your code does

for i_3 in 1:N
    for i_2 in i_3:N
        for i_1 in i_2:N

But, I think the loops needed are

for i_3 in 1:N
    for i_2 in 1:N-i_3+1
        for i_1 in 1:N-i_2-i_3+2
            println((i_1, i_2, i_3))
        end
    end
end

Would you know how to pass this to @nloops ?


#12

I think by far the easiest is

tf = [sum(I) <= N+1 for I in Iterators.product(ntuple(d->0:N+1, D)...)]

where D is the dimensionality.


#13

Now that is elegant!
To get the individual tuples:

julia> makenodes(N,D) = [I for I in Iterators.product(ntuple(d->0:N, D)...) if sum(I) <= N]
julia>makenodes(5,3)
84-element Array{Tuple{Int64,Int64,Int64},1}:
 (0, 0, 0)
 (1, 0, 0)
 (2, 0, 0)
 (3, 0, 0)
 (4, 0, 0)
 (5, 0, 0)
 (6, 0, 0)
 (0, 1, 0)
 (1, 1, 0)
 (2, 1, 0)
 (3, 1, 0)
 (4, 1, 0)
 (5, 1, 0)
 ⋮    

#14

Indeed, that is very nice! I am going to use it right. Thank you very much :slightly_smiling_face:

Just a note, with this procedure we end up building a very large object to only use a tiny part of it (especially as the dimension increases).
So the approach used by @stevengj should be the most memory efficient (provided one can fix the loop situation mentioned above). So if we take the 3D case it should be like

function gentuples(N)
    L = NTuple{3, Int}[]
    for i_3 in 1:N
        for i_2 in 1:N-i_3+1
            for i_1 in 1:N-i_2-i_3+2
                push!(L, tuple(i_1,i_2,i_3))
            end
        end
    end
    return L
end

I am surprised that the version proposed by @tim.holy (using the adapted form suggested by @jandehaan) is much faster than the loops and is even more memory efficient. What would explain this behaviour ?


#15

Depends on what you are benchmarking, but when you provide a constant D and N, the compiler will unroll accordingly. Eg look at

@code_typed makenodes(3, 5)

Note that when N and D are only known at runtime, you will not get this.


#16

I am not sure I know what I am looking at, I will do some readings about @code_typed. I simply used @time before calling the two functions (choosing N=500 for instance).
Could you clarify what happens if I do not declare N and D as constants and I call the function makenodes ?


#17

I am surprised your gentuples version isn’t fast in 3d. Are you sure you timed it correctly, not including the compilation time? (Using BenchmarkTools.jl is recommended.)

For one in arbitrary dimensionality, the question is whether the compiler knows the dimensionality. You can use the Val construct illustrated above (see the help ?Val) to pass them as compile-time constants.


#18

Maybe you want:

using Base.Cartesian
@generated function gentups(::Val{d}, N::Integer) where {d}
    quote
        L = NTuple{d,Int}[]
        $(Symbol("s_",d+1)) = 0
        @nloops $d i (j->1:(j == $d ? N : N-s_{j+1}+1)) (j->s_{j}=i_{j}+s_{j+1}) begin
            push!(L, @ntuple $d j->i_j)
        end
        return L
    end
end
gentups(d::Integer, N::Integer) = gentups(Val(Int(d)), Int(N))

If you do gentups(3, 5), all of the tuples sum to ≤ 6 as I think you want.


#19

@tim.holy using BenchmarkTools.jl gave better results indeed. I will do some reading about Val then! In all cases, your approach works really well.

@stevengj it does seem to work but I have to think twice before using it. Given a N I would expect to have, in particular, the tuples (1,...,1) all the way down to (N,1, ...,1). While

julia> gentups(2,3)
6-element Array{Tuple{Int64,Int64},1}:
 (1, 1)
 (2, 1)
 (3, 1)
 (1, 2)
 (2, 2)
 (1, 3)

gives the expected result, if I increase the dimension we get

gentups(3,3)
4-element Array{Tuple{Int64,Int64,Int64},1}:
 (1, 1, 1)
 (2, 1, 1)
 (1, 2, 1)
 (1, 1, 2)

so I need to also increase N to have what I want, that is

julia> gentups(3,4)
10-element Array{Tuple{Int64,Int64,Int64},1}:
 (1, 1, 1)
 (2, 1, 1)
 (3, 1, 1)
 (1, 2, 1)
 (2, 2, 1)
 (1, 3, 1)
 (1, 1, 2)
 (2, 1, 2)
 (1, 2, 2)
 (1, 1, 3)

May I ask some technical questions ?
Why did you choose to make this function into a macro ? I have done some readings about macros but it is still not clear. Is it just because it is convenient to use macro sometimes ?
Also, I am not sure what the line gentups(d::Integer, N::Integer) = gentups(Val(Int(d)), Int(N)) at the end really does.


#20

You’ll find more answers in the documentation on Base.Cartesian (it seems a little out of date, it is no longer “one of the few ways to write compact and performant multidimensional code.”)

The answer to your question about a macro is that the number of loops is a variable (d), and hence if you were typing it out manually you’d need to write different versions for different values of d. The macros of Base.Cartesian automate this process—macros operate on code itself rather than on values, and so they are good for this kind of application. The Val construct, and making the function @generated, ensures that d is only used at compile time; the compiled function will, in the end, look essentially exactly like the version you’d type out by hand for each dimensionality individually.

Even though I wrote Base.Cartesian, I recommend against using it unless you have a compelling reason—the code is harder to understand, and generated functions have their disadvantages. However, this might be such a case: it would depend on whether the comprehension solution is in practice a performance bottleneck for you. If it isn’t, I would personally recommend keeping it simple, but as you point out in principle you can beat it by limiting the looping to just the values you plan to keep, and the Base.Cartesian solution can be leveraged to do that.


#21

It seems what you want is for tuple sums to be ≤ N+d-1, for which the @nloops line in gentups would need to be tweaked:

using Base.Cartesian
@generated function gentups(::Val{d}, N::Integer) where {d}
    quote
        L = NTuple{d,Int}[]
        $(Symbol("s_",d+1)) = 0
        @nloops $d i (j->1:(j == $d ? N : N-s_{j+1}+$d-1)) (j->s_{j}=i_{j}+s_{j+1}) begin
            push!(L, @ntuple $d j->i_j)
        end
        return L
    end
end
gentups(d::Integer, N::Integer) = gentups(Val(Int(d)), Int(N))

This then gives

julia> gentups(3,3)
10-element Array{Tuple{Int64,Int64,Int64},1}:
 (1, 1, 1)
 (2, 1, 1)
 (3, 1, 1)
 (1, 2, 1)
 (2, 2, 1)
 (1, 3, 1)
 (1, 1, 2)
 (2, 1, 2)
 (1, 2, 2)
 (1, 1, 3)

julia> gentups(4,2)
5-element Array{NTuple{4,Int64},1}:
 (1, 1, 1, 1)
 (2, 1, 1, 1)
 (1, 2, 1, 1)
 (1, 1, 2, 1)
 (1, 1, 1, 2)

as I guess you’re expecting.