How can I easily skip indices in an array?

amrods · October 25, 2021, 4:46am

I have two-dimensional arrays whose elements are only valid for certain index values. For example, consider the arrays A and B below:

A = round.(rand(10, 2) .* 100)
A[1:3, 2] .= NaN
B = round.(rand(10, 2) .* 100)
B[1:3, 2] .= NaN

If it helps, you can think of the arrays as recording numbers of people, where the first index represents the age group and the second index represents the education level (HS and college). Younger individuals, say those at indices 1 to 3, are too young to have a college degree, so those entries are illegal.

I have to perform operations on the arrays, say sum or prod. How can I skip the illegal indices in those operations? I can keep track of those indices manually, as below:

s = sum(A[a, e]*B[a, e] for a in 1:10, e in 1:2 if (e == 1) || ((e == 2) && (a > 3)))
p = prod(A[a, e]*B[a, e] for a in 1:10, e in 1:2 if (e == 1) || ((e == 2) && (a > 3)))

Is there a better, less cumbersome solution?

jishnub · October 25, 2021, 5:29am

julia> sum(x -> isnan(x) ? zero(x) : x, A[i]*B[i] for i in eachindex(A))
65709.0

julia> C = A .* B; sum(@view C[:,1]) + sum(@view C[4:end, 2])
65709.0
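
To make the idea behind these two lines easier to check, here is a small deterministic sketch with made-up 4-by-2 arrays (not the random data from the question). NaN propagates through multiplication, so the invalid entries of the product can be replaced by the identity of the reduction: zero(x) for sum, one(x) for prod. Filtering out the NaN entries first should give the same results.

```julia
# Illustrative data (not from the original post): column 2 is invalid for rows 1:2
A = [1.0 NaN; 2.0 NaN; 3.0 5.0; 4.0 6.0]
B = [1.0 NaN; 1.0 NaN; 2.0 2.0; 2.0 2.0]

C = A .* B   # NaN wherever either input is NaN

# Replace NaN by the additive identity when summing ...
s = sum(x -> isnan(x) ? zero(x) : x, C)
# ... and by the multiplicative identity when taking the product
p = prod(x -> isnan(x) ? one(x) : x, C)

# Alternative: drop the NaN entries before reducing
s2 = sum(filter(!isnan, C))
p2 = prod(filter(!isnan, C))
```

Note that this relies on NaN appearing only at the illegal positions; a genuine NaN elsewhere in the data would be skipped silently as well.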

Per · October 25, 2021, 6:26am

The Missing type is useful for denoting missing data. Example:

A = [a ∈ 1:3 && e == 2 ? missing : 100*rand() for a = 1:10, e = 1:2]
B = [a ∈ 1:3 && e == 2 ? missing : 100*rand() for a = 1:10, e = 1:2]
sum(skipmissing(A .* B))
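
A deterministic variant of the same idea, with made-up values so the results can be checked by hand. It also shows that skipmissing can be passed to prod, not only to sum.

```julia
# Illustrative 4-by-2 arrays (values made up); rows 1:2 of column 2 are invalid
A = [a ∈ 1:2 && e == 2 ? missing : Float64(a + e) for a = 1:4, e = 1:2]
B = [a ∈ 1:2 && e == 2 ? missing : 2.0 for a = 1:4, e = 1:2]

C = A .* B                  # missing propagates through the product
s = sum(skipmissing(C))     # sum over the valid entries only
p = prod(skipmissing(C))    # product over the valid entries only
n = count(!ismissing, C)    # number of valid entries
```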

rafael.guerra · October 25, 2021, 6:51am

You could also try CartesianIndices:

CI = filter(x -> x ∉ CartesianIndices((1:3,2:2)), CartesianIndices(A))
s = sum(A[I]*B[I] for I in CI)

amrods · October 25, 2021, 7:11am

How can I build the indices from given array sizes? For example, I defined A above to be 10-by-2. If I have variables like x = 10, y = 2, how can I build CI using those instead?

rafael.guerra · October 25, 2021, 7:20am

Not sure if this is what you are asking for:

CI = filter(x -> x ∉ CartesianIndices((1:3,2:2)), CartesianIndices((1:x,1:y)))
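
If the validity rule itself is known, the index list could also be built from a predicate instead of excluding a block. A minimal sketch, assuming the rule from the first post (education level 2 is only valid above age group 3); `isvalid_index`, `valid_indices` and the `cutoff` keyword are hypothetical names introduced for illustration.

```julia
# Hypothetical helper names; the rule mirrors the original question:
# education level 2 is only valid for age groups above `cutoff`.
isvalid_index(a, e; cutoff = 3) = e == 1 || a > cutoff

# Collect the valid CartesianIndex values for an x-by-y array
valid_indices(x, y; cutoff = 3) =
    [I for I in CartesianIndices((x, y)) if isvalid_index(I[1], I[2]; cutoff = cutoff)]

x, y = 10, 2
CI = valid_indices(x, y)

# For comparison: the same list obtained by filtering out the excluded block
CI2 = filter(I -> I ∉ CartesianIndices((1:3, 2:2)), CartesianIndices((1:x, 1:y)))
```

Both versions return a plain vector of CartesianIndex values in column-major order, so they can be used interchangeably in expressions like sum(A[I]*B[I] for I in CI).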

amrods · October 25, 2021, 11:19pm

Can I use that CartesianIndices object to iterate inside a sum? Something like

sum(a*e for (a, e) in CI)

Is it possible?

rafael.guerra · October 25, 2021, 11:25pm

Please check this other post.

amrods · October 25, 2021, 11:26pm

Got it working like this:

sum( *(Tuple(i)...) for i in CI)

rafael.guerra · October 25, 2021, 11:30pm

I’m not at a computer, but doesn’t this work?

sum(c[1]*c[2] for c in CI)
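
For reference, the `(a, e)` destructuring from the question fails because a CartesianIndex deliberately does not support iteration. Converting each index with Tuple first makes the destructuring form work. A small sketch using the x = 10, y = 2 sizes from above:

```julia
x, y = 10, 2
CI = filter(I -> I ∉ CartesianIndices((1:3, 2:2)), CartesianIndices((1:x, 1:y)))

# Destructure via Tuple; the inner generator avoids allocating an array of tuples
s1 = sum(a * e for (a, e) in (Tuple(I) for I in CI))

# Same computation with positional indexing into each CartesianIndex
s2 = sum(I[1] * I[2] for I in CI)
```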

amrods · October 25, 2021, 11:31pm

It does! I didn’t think about it that way.

amrods · November 17, 2021, 4:53am

How can I streamline this?

using StatsBase

L = rand(2, 9, 2, 2, 5)
idx0 = sample(0:1, ProbabilityWeights([0.4, 0.6]), size(L))
L .= L .* idx0

skipzeros = filter(x -> x ∉ findall(L .== 0), CartesianIndices((1:2, 1:9, 1:2, 1:2, 1:5)))

gaek = ones((2, 9, 2, 5))

for idx in skipzeros
    g, a, e, _, k = Tuple(idx)
    t1 = L[g, a, e, 1, k]
    t2 = L[g, a, e, 2, k]
    gaek[g, a, e, k] = t1 + t2
end

gek = ones((2, 2, 5))

for idx in skipzeros
    g, _, e, _, k = Tuple(idx)
    tt1 = gaek[1, :, 1, k]
    tt2 = gaek[2, :, 1, k]
    gek[g, e, k] = sum(tt1 + tt2)
end

The problem is that the second loop is wasteful, in the sense that I am assigning the same tt1, tt2 and gek several times. Perhaps a more specific question would be: how can I “extract” the g, e, and k components from skipzeros?
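
As an aside on the construction of skipzeros: the filter above appears to re-evaluate findall(L .== 0) for every index it tests, which can get expensive for larger arrays. Asking findall for the nonzero positions directly should give the same list. A sketch with a small made-up array standing in for L:

```julia
# Small made-up array standing in for L (the original uses random data)
L = zeros(2, 3, 2)
L[1, 1, 1] = 0.5
L[2, 3, 1] = 1.5
L[1, 2, 2] = 2.5

# Filtering approach, with the zero positions computed once outside the closure
zero_positions = findall(iszero, L)
skip_a = filter(I -> I ∉ zero_positions, CartesianIndices(L))

# Direct approach: ask for the nonzero positions
skip_b = findall(!iszero, L)
```

Both results are vectors of CartesianIndex in column-major order.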

rafael.guerra · November 17, 2021, 6:13am

You can broadcast getindex() for that purpose:

g = getindex.(skipzeros, 1)
e = getindex.(skipzeros, 3)
k = getindex.(skipzeros, 5)

amrods · November 17, 2021, 6:18am

I was hoping to get another collection of CartesianIndices. Is it possible?

amrods · November 17, 2021, 6:24am

And how can I access all indices along a dimension within the scope of skipzeros? For example, in gaek[1, :, 1, k], I would like the : to range only over the second-dimension indices that are present in skipzeros.

rafael.guerra · November 17, 2021, 6:37am

The following comprehension works but there should be a simpler way:

CI = [CartesianIndex(i[1],i[3],i[5]) for i in skipzeros]
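
Because several entries of skipzeros share the same (g, e, k) triple, the projected list contains repeats. Wrapping the projection in unique keeps each triple once, which may help with the repeated assignments mentioned earlier. A sketch with made-up indices standing in for skipzeros:

```julia
# Made-up 5-dimensional indices standing in for skipzeros
skipzeros = [CartesianIndex(1, 1, 1, 1, 1),
             CartesianIndex(1, 2, 1, 2, 1),   # same (g, e, k) as the first entry
             CartesianIndex(2, 1, 2, 1, 3)]

# Project each index onto dimensions 1, 3 and 5, keeping each triple once
gek_idx = unique(CartesianIndex(I[1], I[3], I[5]) for I in skipzeros)
```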

amrods · November 17, 2021, 6:40am

I guess I can also create a generator:

CI = (CartesianIndex(i[1],i[3],i[5]) for i in skipzeros)

Any ideas about the other question?
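
One possible approach to that other question, as a sketch only: collect the second-dimension values of the entries in skipzeros whose g, e and k components match, and use that vector in place of the colon. `present_along_dim2` is a hypothetical helper name, and the data below is made up.

```julia
# Hypothetical helper: second-dimension values that occur in `indices`
# together with the given g, e and k (dimensions 1, 3 and 5)
present_along_dim2(indices, g, e, k) =
    sort!(unique(I[2] for I in indices if I[1] == g && I[3] == e && I[5] == k))

# Made-up indices standing in for skipzeros
skipzeros = [CartesianIndex(1, 4, 1, 1, 2),
             CartesianIndex(1, 4, 1, 2, 2),   # same a = 4, other value in dimension 4
             CartesianIndex(1, 7, 1, 1, 2),
             CartesianIndex(2, 5, 1, 1, 2)]   # different g, so not selected

avals = present_along_dim2(skipzeros, 1, 1, 2)

# Use the result in place of `:` when slicing a (g, a, e, k) array
gaek = reshape(collect(1.0:180.0), 2, 9, 2, 5)
slice = gaek[1, avals, 1, 2]
```

This scans skipzeros once per call; if it is needed for many (g, e, k) combinations, grouping the indices once up front would likely be cheaper.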
