I want to compute marginal distributions over binary data. For example, I have several vectors like
a = [0, 1, 0, 1]
b = [1, 0, 0, 1]
and I want to compute the probability of every combination of values, e.g. P(a=0, b=0) = 1/4, P(a=0, b=1) = 1/4, P(a=1, b=0) = 1/4, P(a=1, b=1) = 1/4. My code is as follows:
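    using Statistics  # provides mean

    function marginal_distribution(data::Matrix, port::Vector)
        m, n = size(data)
        len = 2 ^ length(port)
        res = zeros(Float64, len)
        # i enumerates every 0/1 assignment of the selected columns;
        # bit (j - 1) of i is the value required of column port[j].
        for i in 0:(len-1)
            tmp = ones(eltype(data), m)
            for (j, cols) in enumerate(port)
                # t is the complement of the required bit, so t ⊻ x == 1 exactly when x matches it
                t = 1 - ((i >> (j - 1)) & 1)
                tmp .*= t .⊻ data[:, cols]
            end
            # fraction of rows that match the whole assignment
            res[i+1] = mean(tmp)
        end
        return res
    end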
Each column of data represents one vector, and port selects which columns to include in the calculation. But this function is too slow. How can I speed it up?
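
One direction that might help is a single pass over the rows: instead of testing every row against each of the 2^length(port) assignments, build the outcome index directly from the selected bits of each row and increment a counter. Below is a minimal sketch of that idea, not a tuned implementation. It assumes every entry of data is 0 or 1 (an integer or Bool), and the name marginal_distribution_counts is hypothetical. The output uses the same ordering as the function above: bit (j - 1) of the index holds the value of column port[j].

```julia
# Hypothetical single-pass variant; assumes every entry of `data` is 0 or 1.
function marginal_distribution_counts(data::AbstractMatrix{<:Integer},
                                      port::AbstractVector{<:Integer})
    m = size(data, 1)
    counts = zeros(Int, 2^length(port))
    for r in 1:m
        idx = 0
        for (j, col) in enumerate(port)
            # Bit (j - 1) of idx holds the value of column port[j] in row r.
            idx |= Int(data[r, col]) << (j - 1)
        end
        counts[idx + 1] += 1
    end
    # Convert the counts to relative frequencies.
    return counts ./ m
end

data = [0 1; 1 0; 0 0; 1 1]   # columns are a and b from the example above
marginal_distribution_counts(data, [1, 2])   # each of the four outcomes gets 0.25
```

This does work proportional to m * length(port) rather than m * length(port) * 2^length(port), and it avoids allocating a temporary vector and a column copy in every inner iteration. Whether it is fast enough will depend on the size of data, so it is worth timing on the real input.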