Compute Distribution among binary vector

GoYetChallenged · March 12, 2022, 3:26pm

I want to compute marginal distributions over binary data. For example, I have several vectors like

a = [0, 1, 0, 1]
b = [1, 0, 0, 1]

and I want to compute the probability of every combination of their values, e.g. P(a=0, b=0) = 1/4, P(a=0, b=1) = 1/4, P(a=1, b=0) = 1/4, P(a=1, b=1) = 1/4. My code is as follows:

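using Statistics  # provides mean

function marginal_distribution(data::Matrix, port::Vector)
    m, n = size(data)
    len = 2 ^ length(port)
    res = zeros(Float64, len)

    # i enumerates every 0/1 assignment; bit (j - 1) of i is the value for column port[j]
    for i in 0:(len-1)
        tmp = ones(eltype(data), m)

        for (j, cols) in enumerate(port)
            # t is the complement of the target bit, so t ⊻ x is 1 exactly when x equals that bit
            t = 1 - (1 & (i >> (j - 1)))
            tmp .*= t .⊻ data[:, cols]
        end
        # fraction of rows that match the assignment in every selected column
        res[i+1] = mean(tmp)
    end
    return res
end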

Each column of data represents a vector, and port selects the columns to include in the calculation. But this function is too slow. How can I accelerate it?
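To make the ordering of the returned vector concrete, here is a small sketch that decodes each result index into the values it stands for. It follows the bit convention read off the loop in marginal_distribution (bit j - 1 of the index belongs to port[j]); the names nvars and labels are illustrative only and do not come from the original post.

```julia
# Decode each result index into the 0/1 values it represents.
# Convention taken from the loop in marginal_distribution:
# bit (j - 1) of i is the value for the j-th selected column.
nvars = 2  # assumed here: two selected columns, e.g. port = [1, 2] for a and b

labels = [[(i >> (j - 1)) & 1 for j in 1:nvars] for i in 0:(2^nvars - 1)]

for (k, bits) in enumerate(labels)
    parts = [string("x", j, "=", bits[j]) for j in 1:nvars]
    println("res[", k, "] = P(", join(parts, ", "), ")")
end
```

Under this convention the first selected column is the least significant bit, so the second entry of the result is P(a=1, b=0) rather than P(a=0, b=1).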

goerch · March 12, 2022, 8:21pm

Hi @GoYetChallenged,

This question could be directed either at domain experts, who might introduce you to better algorithms, or at the general public. For a general audience, it would be better to post a complete M(inimal) W(orking) E(xample) and tell us why you think this function is too slow.
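As a possible starting point for such an MWE, here is a minimal sketch of one alternative: instead of scanning the whole matrix once per outcome, encode each row as an integer and count it in a single pass. The function name marginal_distribution_onepass is hypothetical, and the sketch assumes every entry of data is 0 or 1 and that port contains valid column indices. It is not a benchmarked or definitive solution.

```julia
# Hypothetical single-pass variant (name and signature are illustrative).
# Assumes all entries of `data` are 0 or 1.
function marginal_distribution_onepass(data::AbstractMatrix{<:Integer},
                                       port::AbstractVector{<:Integer})
    m = size(data, 1)
    counts = zeros(Int, 2^length(port))
    for r in 1:m
        idx = 0
        # Bit (j - 1) of idx holds the value found in column port[j] of row r,
        # matching the index convention of the original function.
        for (j, col) in enumerate(port)
            idx |= Int(data[r, col]) << (j - 1)
        end
        counts[idx + 1] += 1
    end
    return counts ./ m  # relative frequencies
end

# The example from the question: columns are the vectors a and b.
a = [0, 1, 0, 1]
b = [1, 0, 0, 1]
data = hcat(a, b)
p = marginal_distribution_onepass(data, [1, 2])
println(p)
```

This visits each selected entry once, so the work grows with m * length(port) rather than m * length(port) * 2^length(port), and it avoids allocating temporary vectors inside the loop. Whether that is fast enough depends on the actual data sizes, which is why timings from a complete example would still help.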
