Finding array indices without creating an array using vectors?

I apologize for the poor phrasing of the question - it sounds like an oxymoron!

I have been struggling to write a function F which, given:

  1. vectors of sizes N1, N2, N3… etc
  2. a number 1<= i <= N1 * N2 * N3,

returns indices x1, x2, x3… etc. that correspond to i.

For example, assuming vectors

A = [1, 2], B = [3, 4], C = [5,6],

we can imagine a combined array ABC.

ABC =   1   3   5
        2   3   5
        1   4   5
        2   4   5
        1   3   6
        2   3   6
        1   4   6
        2   4   6

Then, given an input of i = 2, the elements on the second row in ABC are [2,3,5].
These correspond to A[2] = 2, B[1] = 3, C[1] = 5.

The function F would return the indices 2,1,1. ie.

2,1,1 = F(2,A,B,C)

I was wondering if there were functions in Julia that can do the same job as the function that I described above? I am especially hoping to find indices without actually creating the array ABC to minimize memory use.

julia> function F(idx::Integer, arys...)
           res = zeros(Int, length(arys))
           for (i, A) in enumerate(arys)
              l = length(A)
              res[i] = mod1(idx, l)
              idx = cld(idx, l)

I think this does what you want, not 100% sure, my brain is small and it’s kinda late. The idea is to think of each array as n-bit counter. We start with the bit that cycles the quickest, find out how many times it cycled, use that cycle number for the next bit.


I have made a naive attempt by writing down a function as follows.

function returnIndex(index, N1, N2, N3)

    index0 = 1

    tmp10 = div(index - 0.05, index0)
    tmp11 = mod(tmp10, N1)
    A1 = convert(Int, floor(tmp11))  + 1
    index0 = index0*N1
    tmp20 = div(index-0.05, index0)
    tmp21 = mod(tmp20, N2)
    A2 = convert(Int, floor(tmp21)) + 1
    index0 = index0*N2
    tmp30 = div(index-0.05, index0)
    tmp31 = mod(tmp30, N3)
    A3 = convert(Int, floor(tmp31)) + 1
    index0 = index0*N3

    return A1, A2, A3

To make the code more generalizable, I have tried to write the code in a recursive format.

I was wondering if the above function is too expensive for computing?

Also, is there an easy way to let the function accept variable number of arguments?

Thanks!! I will try out the code.

I think this does what you want:

function f(i, vecs::AbstractVector...)
    cis = CartesianIndices(length.(vecs))
    ci = cis[i]

We pretend we have an n-dimensional array where each vector corresponds to one dimension, with CartesianIndices describing all indices of that array in a lazy manner. Then you can index with an integer into that and will get the CartesianIndex at that position.

julia> A = [1, 2]; B = [3, 4]; C = [5,6]
2-element Vector{Int64}:

julia> f(2, A, B, C)
CartesianIndex(2, 1, 1)

Another possibility for collecting the indices sought, edited for variable number of input arrays:

function g(n, vecs...)
    it = [1:length(v) for v in vecs]

A = [1, 2]
B = [3, 4, 5]
C = [6, 7, 8, 9, 10]

julia> g(2, A, B, C)
(2, 1, 1)
julia> g(2, B, C)
(2, 1)
julia> g(2, B)

Just noticed that this function runs out of memory for large arrays. For sake of understanding, could someone point out the issue(s)?
Thanks in advance.

This creates an array that contains the Cartesian product of all the iterators. Only after that (possibly huge) array is created, does [n] extract one value from it.

You could do something like first(Iterators.drop(Iterators.product(it...), n-1)) to avoid creating the array.

1 Like

Is this for complete polynomials? If so, I remember seeing a function that did this in one of the related packages providing orthogonal polynomial utilities.

1 Like
floor(Int, tmp11) 
1 Like

@sudete, thanks for the explanation, it makes sense. However, this function will be very slow for very large arrays.
The direct computation by @jling seems to be more natural/a more direct solution.
@jules’ elegant CartesianIndices solution ran into problems in the first example below, though. It’s as if the choice of the integer types for the array indices was not appropriate for their Cartesian product?

function f1(n::Integer, vecs...)  # by @jling
    res = Vector{Int}(undef, length(vecs))
    @inbounds for i in eachindex(vecs)
        l = length(vecs[i])
       res[i] = mod1(n, l)
       n = cld(n, l)

function f2(n, vecs::AbstractVector...)  # by @jules
    cis = CartesianIndices(length.(vecs))
    ci = cis[n]

using BenchmarkTools
A = randn(1_000_000);
B = randn(2_000_000);
C = randn(5_000_000);
@btime f1(11_000_000_000,$A,$B,$C)   # 41 ns (1 allocation: 112 bytes)
@btime f2(11_000_000_000,$A,$B,$C)   # ERROR: BoundsError: attempt to access 1000000×2000000×5000000

A = randn(10_000_000);
B = randn(20_000_000);
C = randn(50_000_000);
@btime f1(11_000_000_000,$A,$B,$C)   #  43 ns (1 allocation: 112 bytes)
@btime f2(11_000_000_000,$A,$B,$C)   #  310 ns (6 allocations: 192 bytes)
1 Like

That’s probably because:

julia> 1_000_000 * 2_000_000 * 5_000_000

The bounds check integer overflows. I don’t know if that’s worth an issue, because “real” arrays probably never have sizes that can cause that :slight_smile:

julia> function f1(n, vecs...)  # by @jling
           res = Vector{Int}(undef, length(vecs))
           @inbounds for i in eachindex(vecs) # this inbounds can already be inferred by compiler I think
              l = length(vecs[i])
              res[i] = mod1(n, l)
              n = cld(n, l)
f1 (generic function with 1 method)

didn’t know we’re competing for last drop of performance but here’s a much faster version.


@jling, thank you. This is just for Julia learning purposes.

Fixed the code above with your new turbo version and given the new running times, I think logic finally prevailed.

Ah! This was for parallelization of a grid search problem. I was trying to find values for each element of a large functional array.

BTW, hi Taeuk :smiley:

1 Like

Hi Tyler! What a small world!! :smiley:

1 Like