# Help needed with some matrix manipulation

I need some code review… Suppose you are given a matrix A \in \mathbb{N}^{m \times n} you are to determine all the rows k \leq m of A that have a given maximum interaction j which is given by

j = \sum_{i=0}^{n} \mathbf{1}_{A_{k,i} }, \quad \text{where} \quad \mathbf{1}_{A_{k,i}} = \begin{cases} 1, & A_{k,i} > 0, \\ 0, & \text{otherwise}.\end{cases}

Here’s an example. Take

julia> inds
15×4 Array{Int64,2}:
0  0  0  0
1  0  0  0
0  1  0  0
0  0  1  0
0  0  0  1
2  0  0  0
1  1  0  0
1  0  1  0
1  0  0  1
0  2  0  0
0  1  1  0
0  1  0  1
0  0  2  0
0  0  1  1
0  0  0  2


and set j = 2, then

julia> get_interaction(inds,2)
6×4 Array{Int64,2}:
1  1  0  0
1  0  1  0
1  0  0  1
0  1  1  0
0  1  0  1
0  0  1  1


So, the function get_interaction finds all rows of a matrix that have j non-zero entries.

function get_interaction(inds::Matrix{Int},j::Int)
M = .!(iszero.(inds)) # auxiliary variable that is true for every non-zero element of inds
J = findall(x->x==j,sum(M;dims=2))
vcat(map(j->inds[j.I,:],J)'...) # return as a matrix of integers
end


This is one way to implement this, but it feels too bulky. There’s probably a more clever way to exploit the CartesianIndices returned by findall.

thanks!

p.s.: The last row could be simplified to the more performant

    map(j->inds[j.I,:],J)


which would return an arrray of arrays of integers

julia> get_interaction(inds,2)
6-element Array{Array{Int64,1},1}:
[1, 1, 0, 0]
[1, 0, 1, 0]
[1, 0, 0, 1]
[0, 1, 1, 0]
[0, 1, 0, 1]
[0, 0, 1, 1]


Imho a quite julian way would be:

 julia> interactions(inds,j) = Iterators.filter(x -> count(!iszero,x) == j,
eachrow(inds))


which quite closely matches your description. Note that this returns an iterator and to get the resulting vectors you need to collect the result or just iterate through it - might be advantageous if you only work on one row at a time.

The standard filter doesn’t work on eachrow unless you collect first - that’s why I used the one from Iterators but there might be a better solution.

edit:
Comparing the performance for your example input I get roughly an order of magnitude speedup. For larger problems still factor 4 but note that even collecting the iterator returns SubArrays. To get the same, you can e.g. collect each vector individually - the performance advantage is smaller but still exists.

Thanks, this looks promising!

It’s not crucial to return a matrix type of integers, SubArrays are fine (as long as I adapt the functions that use get_interactions downstream).

If it’s only about bulkiness and not performance, I would do

get_interaction(inds, j) = inds[vec(sum(inds .!= 0, dims = 2)) .== j, :]


This is quite fast and returns full array:

inds[ [count(!iszero,r) == j for r in eachrow(inds)], : ]
6×4 Array{Int64,2}:
1  1  0  0
1  0  1  0
1  0  0  1
0  1  1  0
0  1  0  1
0  0  1  1

1 Like