Here’s the relevant code from a toy problem I’m working with:
```julia
# Inputs
data = [1 1;
        1 1;
        0 1;
        0 1;
        0 0;
        1 0;
        0 0;
        0 0]
C = ["A", "B"];
Interactions = [[1,2],[5,7,8]]
```
I’m trying to build sets of row indices corresponding to duplicate entries in data. Rows 1 and 2 of data are all ones, so the first element of Interactions needs to be [1,2]; rows 5, 7, and 8 are all zeros, so the second element needs to be [5,7,8].
I’m trying to write the code so that it scales up to more columns and higher-order interactions. For instance, in a dataset with 3 columns it should build a set of indices for entries with ones in the same two columns, zeros in the same two columns, ones in all three columns, and zeros in all three columns. Any help is appreciated!
I am writing an optimization problem with lots of constraints, and I figured that only finding duplicates of 1s and 0s would be less computationally demanding.
That being said, finding rows 3 and 4 as duplicates should still produce the same result while the problem is still small. Perhaps narrowing it down to the ones and zeros is a later step to take.
julia> function indmap(v)
           dd = Dict{Array{Int64}, Array{Int64}}()
           for i in eachindex(v)
               if v[i][1] == v[i][2]
                   push!(get!(() -> Int[], dd, v[i]), i)
               end
           end
           dd
       end
indmap (generic function with 1 method)

julia> indmap(eachrow(data))
Dict{Array{Int64}, Array{Int64}} with 2 entries:
  [0, 0] => [5, 7, 8]
  [1, 1] => [1, 2]
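An ambitious attempt to generalize:

    # Group the indices of v by value, keeping only entries that satisfy pred.
    # v must support indexing (eachrow(data) does on Julia 1.9 or later).
    function indmap(v, pred = (_) -> true)
        dd = Dict{Array{Int64}, Array{Int64}}()
        for i in eachindex(v)
            if pred(v[i])
                push!(get!(() -> Int[], dd, v[i]), i)
            end
        end
        dd
    end

    indmap(eachrow(data))                      # group every duplicate row
    indmap(eachrow(data), r -> allequal(r))    # only all-ones / all-zeros rows (allequal needs Julia 1.8+)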
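For the higher-order case from the original question (ones or zeros in the same subset of columns), one possible direction is to loop over column subsets of a given size and apply the same `get!`/`push!` grouping per subset. The following is only a rough sketch: `interaction_sets` and its `order` argument are made-up names, not something from this thread, and it assumes the entries of `data` are only 0 or 1. It enumerates subsets with a bitmask, so it will only be practical for a small number of columns.

```julia
# Hypothetical helper (not from the thread): for every subset of `order` columns,
# collect the indices of rows that are constant (all 0 or all 1) on that subset.
# Assumes the entries of `data` are 0 or 1.
function interaction_sets(data::AbstractMatrix{<:Integer}, order::Integer)
    ncols = size(data, 2)
    # key: (column subset, shared value) => row indices
    result = Dict{Tuple{Vector{Int},Int},Vector{Int}}()
    for mask in 1:(2^ncols - 1)          # every non-empty column subset as a bitmask
        cols = [j for j in 1:ncols if (mask >> (j - 1)) & 1 == 1]
        length(cols) == order || continue
        for (i, row) in enumerate(eachrow(data))
            vals = row[cols]
            # keep the row only if all selected entries are equal
            if all(==(vals[1]), vals)
                push!(get!(() -> Int[], result, (cols, Int(vals[1]))), i)
            end
        end
    end
    return result
end

data = [1 1; 1 1; 0 1; 0 1; 0 0; 1 0; 0 0; 0 0]
sets = interaction_sets(data, 2)
# sets[([1, 2], 1)] == [1, 2]
# sets[([1, 2], 0)] == [5, 7, 8]
```

With two columns and `order = 2` this reproduces the `Interactions` from the toy problem. For three columns you could call it with `order = 2` and again with `order = 3` and merge the two dictionaries. If the number of columns grows, something like `combinations` from Combinatorics.jl would likely be a better way to enumerate the subsets than the bitmask loop.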