# How can I easily skip indices in an array?

I have arrays with 2 dimensions. Elements in the arrays are only considered valid for certain values of the indices. For example, consider the arrays `A` and `B` below:

``````A = round.(rand(10, 2) .* 100)
A[1:3, 2] .= NaN
B = round.(rand(10, 2) .* 100)
B[1:3, 2] .= NaN
``````

If it helps, you can think of the arrays as recording the number of people, where the first index represents age groups and the second index represents education level (HS and college). Younger individuals, say for indices 1 to 3, are too young to have a college degree, so those entries are illegal.

I have to perform operations on the arrays, say `sum` or `prod`. How can I skip the illegal indices for those operations? I can manually keep track of those indices, as below:

``````s = sum(A[a, e]*B[a, e] for a in 1:10, e in 1:2 if (e == 1) || ((e == 2) && (a > 3)))
p = prod(A[a, e]*B[a, e] for a in 1:10, e in 1:2 if (e == 1) || ((e == 2) && (a > 3)))
``````

Is there a better, less cumbersome, solution?

``````julia> sum(x -> isnan(x) ? zero(x) : x, A[i]*B[i] for i in eachindex(A))
65709.0

julia> C = A .* B; sum(@view C[:,1]) + sum(@view C[4:end, 2])
65709.0
``````

The `Missing` type is useful for denoting missing data. Example:

``````A = [a ∈ 1:3 && e == 2 ? missing : 100*rand() for a = 1:10, e = 1:2]
B = [a ∈ 1:3 && e == 2 ? missing : 100*rand() for a = 1:10, e = 1:2]
sum(skipmissing(A .* B))
``````
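The same idea carries over to `prod` and to counting the valid entries. Here is a minimal sketch, assuming a small hand-written matrix (the values are made up so the results can be checked by hand):

```julia
# Illustrative data: rows are age groups, columns are education levels.
# Entry [1, 2] is marked as invalid with `missing`.
A = [1.0 missing; 2.0 3.0]
B = [4.0 missing; 5.0 6.0]

C = A .* B                  # `missing` propagates through the arithmetic
s = sum(skipmissing(C))     # 1*4 + 2*5 + 3*6 = 32.0
p = prod(skipmissing(C))    # 4 * 10 * 18 = 720.0
n = count(!ismissing, C)    # number of valid entries: 3
```

One advantage over `NaN` is that `missing` also works for integer arrays, since the element type becomes `Union{Missing, Int}`.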

Try also Cartesian Indices:

``````CI = filter(x -> x ∉ CartesianIndices((1:3,2:2)), CartesianIndices(A))
s = sum(A[I]*B[I] for I in CI)
``````
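The filtered index collection only needs to be built once and can then be reused for several reductions. A small sketch with made-up data, where only `A[1, 2]` is treated as invalid (the names `invalid` and `valid` are just for illustration):

```julia
A = [1.0 NaN; 2.0 3.0; 4.0 5.0]   # made-up data; A[1, 2] is the invalid entry
B = [1.0 NaN; 1.0 2.0; 1.0 2.0]

invalid = CartesianIndices((1:1, 2:2))                  # block of invalid positions
valid = filter(I -> I ∉ invalid, CartesianIndices(A))   # Vector of CartesianIndex{2}

s = sum(A[I] * B[I] for I in valid)    # 1 + 2 + 4 + 6 + 10 = 23.0
p = prod(A[I] * B[I] for I in valid)   # 1 * 2 * 4 * 6 * 10 = 480.0
```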

How can I build the indices from given array sizes? For example, I defined `A` above to be 10-by-2. If I have variables like `x = 10, y = 2`, how can I build `CI` using those instead?

Not sure if this is what you are asking for:

``````CI = filter(x -> x ∉ CartesianIndices((1:3,2:2)), CartesianIndices((1:x,1:y)))
``````
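If this is needed for several sizes, it could be wrapped in a small function. This is only a sketch: `valid_indices`, `nyoung` and `badcol` are hypothetical names, not part of Julia, and it assumes the invalid entries always form a block of leading rows in one column. Note that `CartesianIndices((x, y))` covers the same range as `CartesianIndices((1:x, 1:y))`.

```julia
# `valid_indices` is a hypothetical helper name, not part of Julia.
# `nyoung` is the number of leading rows that are invalid in column `badcol`.
function valid_indices(x, y; nyoung = 3, badcol = 2)
    invalid = CartesianIndices((1:nyoung, badcol:badcol))
    return filter(I -> I ∉ invalid, CartesianIndices((x, y)))
end

x, y = 10, 2
CI = valid_indices(x, y)
length(CI)   # 10*2 - 3 = 17
```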

Can I use that `CartesianIndices` object to iterate inside a `sum`? Something like:

``````sum(a*e for (a, e) in CI)
``````

Is it possible?

I got it working like this:

``````sum( *(Tuple(i)...) for i in CI)
``````

I'm not at a computer right now, but doesn't this work?

``````sum(c[1]*c[2] for c in CI)
``````

It does! I didn't think of it that way.
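For reference, the `for (a, e) in CI` form fails because, as far as I know, a `CartesianIndex` deliberately does not support iteration, so it cannot be destructured directly. Converting to a `Tuple` first makes destructuring possible. A sketch comparing the variants on a small made-up index set:

```julia
# 3-by-2 index set with position (1, 2) removed
CI = filter(I -> I ∉ CartesianIndices((1:1, 2:2)), CartesianIndices((3, 2)))

s1 = sum(c[1] * c[2] for c in CI)          # index into each CartesianIndex
s2 = sum(a * e for (a, e) in Tuple.(CI))   # destructure plain tuples instead
s3 = sum(prod(Tuple(c)) for c in CI)       # works for any number of dimensions
```

All three give the same result here (1 + 2 + 3 + 4 + 6 = 16).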


How can I streamline this?

``````using StatsBase

L = rand(2, 9, 2, 2, 5)
idx0 = sample(0:1, ProbabilityWeights([0.4, 0.6]), size(L))
L .= L .* idx0

skipzeros = filter(x -> x ∉ findall(L .== 0), CartesianIndices((1:2, 1:9, 1:2, 1:2, 1:5)))

gaek = ones((2, 9, 2, 5))

for idx in skipzeros
    g, a, e, _, k = Tuple(idx)
    t1 = L[g, a, e, 1, k]
    t2 = L[g, a, e, 2, k]
    gaek[g, a, e, k] = t1 + t2
end

gek = ones((2, 2, 5))

for idx in skipzeros
    g, _, e, _, k = Tuple(idx)
    tt1 = gaek[1, :, 1, k]
    tt2 = gaek[2, :, 1, k]
    gek[g, e, k] = sum(tt1 + tt2)
end
``````

The problem is that the second loop is wasteful, in the sense that I am assigning the same `tt1`, `tt2` and `gek` several times. Perhaps a more specific question would be: how can I “extract” the `g`, `e`, and `k` components from `skipzeros`?
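As a side note on building `skipzeros` itself: the `filter` over `CartesianIndices` tests every index against the result of `findall`, which can get slow for large arrays. A possibly simpler route is to ask `findall` for the nonzero positions directly. A sketch on a small made-up 2-D array (the names `slow` and `fast` are just for illustration):

```julia
L = [0.0 1.5; 2.0 0.0; 0.3 0.7]   # small made-up array with some zeros

# Construction in the style used above, generalized via size(L)
slow = filter(x -> x ∉ findall(L .== 0), CartesianIndices(size(L)))

# Direct construction: indices where the entry is nonzero
fast = findall(!iszero, L)
```

Both give the same vector of `CartesianIndex` values in the same order, and the second form works unchanged for the 5-dimensional `L`.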

You may use `getindex()`, broadcasted for that purpose:

``````g = getindex.(skipzeros, 1)
e = getindex.(skipzeros, 3)
k = getindex.(skipzeros, 5)
``````

I was hoping to get another collection of `CartesianIndices`. Is it possible?

And how can I access all indices along a dimension within the scope of `skipzeros`? For example `gaek[1, :, 1, k]`, but that `:` would be among all the indices in the second dimension that are present in `skipzeros`.

The following comprehension works, but there should be a simpler way:

``````CI = [CartesianIndex(i[1],i[3],i[5]) for i in skipzeros]
``````

I guess I can also create a generator:

``````CI = (CartesianIndex(i[1],i[3],i[5]) for i in skipzeros)
``````
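Note that both the comprehension and the generator repeat the same `(g, e, k)` combination once for every matching entry of `skipzeros`. Wrapping the generator in `unique` keeps each combination once, which may help with the repeated assignments in the second loop. The same pattern can collect the second-dimension indices present for one combination. This is only a sketch with made-up indices standing in for `skipzeros`; `gek_idx` and `a_present` are illustrative names:

```julia
# Made-up 5-D indices standing in for `skipzeros`
skipzeros = [CartesianIndex(1, 1, 1, 1, 1),
             CartesianIndex(1, 2, 1, 2, 1),   # same (g, e, k) as the first entry
             CartesianIndex(2, 1, 2, 1, 3)]

# Keep dimensions 1, 3 and 5, then drop repeated combinations
gek_idx = unique(CartesianIndex(i[1], i[3], i[5]) for i in skipzeros)

# Second-dimension indices that occur in `skipzeros` for (g, e, k) == (1, 1, 1)
a_present = unique(i[2] for i in skipzeros if (i[1], i[3], i[5]) == (1, 1, 1))
```

With these, something like `gaek[1, a_present, 1, 1]` would select only the second-dimension entries that appear in `skipzeros` for that combination.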

Any ideas about the other question?