I made an EM algorithm for Gaussian Mixture Models, but it is not working


Error message: matrix is not Hermitian; Cholesky factorization failed.

Why does this error happen? Could someone explain?
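For context, this message comes from LinearAlgebra's `cholesky` (used internally when MvNormal processes a covariance matrix): it refuses any matrix that is not exactly Hermitian in floating point. A minimal reproducer, independent of the EM code above:

```julia
using LinearAlgebra

# A covariance that is symmetric "on paper" but carries a tiny
# floating-point asymmetry, as can appear after repeated EM updates.
A = [1.0 0.5; 0.5 + 1e-12 1.0]

ishermitian(A)   # false: the two off-diagonal entries differ by 1e-12

# cholesky(A) now throws a PosDefException saying the matrix is not
# Hermitian, even though A is positive definite for all practical purposes.
```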

using LinearAlgebra
using Distributions
using Random   # needed for MersenneTwister

struct GMMsResults
    post_prob
    mean
    sigma
    mixing_coef
end


function GMMs!( data, num_class ;seed=1, max_iter=100 )
    
    num_data, dims = size( data )
    rng = MersenneTwister( seed )
    
    # Initialization
    post_prob   = fill( 1/num_class, num_data, num_class ) 
    mean        = rand( rng, dims, num_class )
    sigma       = [ Matrix{Float64}( I, dims, dims ) for _ in 1:num_class ] # one Σ per class, stored in a vector
    mixing_coef = fill( 1/num_class, num_class )
    
    for _ in 1:max_iter

        update_post_prob!( data, num_class, post_prob, mean, sigma, mixing_coef )
        sum_class_assignment = sum( post_prob', dims=2 )
        update_mean!( data, num_class, post_prob, mean, sum_class_assignment )
        update_sigma!( data, num_class, post_prob, mean, sigma, sum_class_assignment )
        update_mixing_coef!( data, num_class, mixing_coef, sum_class_assignment )
        
    end
    
    return GMMsResults( post_prob, mean, sigma, mixing_coef )
end


function update_post_prob!( data, num_class, post_prob, mean, sigma, mixing_coef )
    
    num_data = size( data, 1 )
    
    for n in 1:num_data
        sum_tmp = 0
        for k in 1:num_class
            sum_tmp += mixing_coef[k] * pdf( MvNormal( mean[:,k], sigma[k] ), data[n,:] )
        end
        
        for k in 1:num_class
            post_prob[n,k] = ( mixing_coef[k] * pdf( MvNormal( mean[:,k], sigma[k] ), data[n,:] ) ) / sum_tmp
        end
    end
    
end

function update_mean!( data, num_class, post_prob, mean, sum_class_assignment )
    
    num_data, dims = size( data )
    
    for k in 1:num_class
        sum_tmp = zeros( dims )
        for n in 1:num_data
            sum_tmp += post_prob[n,k] .* data[n,:]
        end    

        mean[:,k] = sum_tmp ./ sum_class_assignment[k]
    end
    
end

function update_sigma!( data, num_class, post_prob, mean, sigma, sum_class_assignment )
    
    num_data, dims = size( data )

    for k in 1:num_class
        sum_tmp = zeros( dims, dims )
        for n in 1:num_data
            sum_tmp += post_prob[n,k] .* ( data[n,:] - mean[:, k] ) * ( data[n,:] - mean[:, k] )'
        end    

        sigma[k] = sum_tmp ./ sum_class_assignment[k]
    end
    
end


function update_mixing_coef!( data, num_class, mixing_coef, sum_class_assignment )
    
    num_data = size( data, 1 )
    
    for k in 1:num_class
        mixing_coef[k] = sum_class_assignment[k] / num_data
    end
    
end

Does it happen with any data, or just in specific cases?

You can compare your algorithm with this one: Clustering · BetaML.jl Documentation

If I remember correctly, I had the same problem when I didn’t implement a minimum variance/covariance in the Gaussians…
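A minimal sketch of that idea (the helper name `regularize` is my own, not from the original code): after each M-step, enforce exact symmetry and add a small ε to the diagonal so the covariance stays safely positive definite.

```julia
using LinearAlgebra

# Hypothetical helper: symmetrize the estimate and impose a minimum
# variance of ε on the diagonal.
regularize(Σ::AbstractMatrix, ε::Float64=1e-6) = Symmetric(Σ) + ε * I

# A nearly singular, slightly asymmetric covariance estimate.
Σ = [1.0 1.0; 1.0 + 1e-12 1.0]
Σ_safe = regularize(Σ)

cholesky(Σ_safe)   # succeeds, where cholesky(Σ) would fail
```

In the EM code above this would be applied to `sigma[k]` right after the `update_sigma!` step.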


Thanks for the reply.
It happens with any data. The data are generated from multivariate normal distributions.

e.g.

using Distributions, LinearAlgebra, StatsPlots  # covellipse! comes from StatsPlots

μ = ones( 2 )
Σ = I
x = MvNormal( μ, Σ );
y = MvNormal( μ .+ 5 , Σ + ones( 2,2 ) );

X = rand( x, 50 ) 
Y = rand( y, 50 ) 
data = hcat( X, Y )';

plot( aspect_ratio = 1 )
scatter!( data[:,1], data[:,2], markersize=8 )
covellipse!( x.μ, x.Σ )
covellipse!( y.μ, y.Σ )

I’ll compare my algorithm with that one.

I solved this problem. The cause of the error is MvNormal in Distributions.jl: even when I passed it a covariance matrix that should be positive definite, it raised the error. So I replaced MvNormal with a Gaussian density I implemented myself, and now my algorithm works correctly.
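The replacement density was not posted; here is a minimal sketch of a hand-written multivariate normal pdf (the name `gaussian_pdf` is my own) that sidesteps the Hermitian check by wrapping the matrix in `Symmetric` before the solve:

```julia
using LinearAlgebra

# Hand-written multivariate normal density. Symmetric() discards any
# floating-point asymmetry before the linear solve and determinant.
function gaussian_pdf(x::AbstractVector, μ::AbstractVector, Σ::AbstractMatrix)
    d  = length(μ)
    Σs = Symmetric(Σ)
    diff = x - μ
    return exp(-0.5 * dot(diff, Σs \ diff)) / sqrt((2π)^d * det(Σs))
end

gaussian_pdf([0.0, 0.0], [0.0, 0.0], Matrix(1.0I, 2, 2))  # 1/(2π) ≈ 0.1591549
```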


Another solution is to wrap the covariance in the Symmetric type. When my algorithm computes the variance/covariance matrix, floating-point rounding introduces a tiny asymmetry, and that is why MvNormal() raised the error.
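A minimal sketch of that fix: wrap the covariance in `Symmetric` right where it is produced, so the Cholesky factorization sees an exactly Hermitian matrix. MvNormal itself is left as a comment so the snippet stays self-contained; `cholesky` is the operation that was failing.

```julia
using LinearAlgebra

# Covariance with a tiny rounding asymmetry, like one produced by the
# accumulation in update_sigma!.
Σ = [2.0 0.3; 0.3 + 1e-13 1.0]

# cholesky(Σ) would throw "matrix is not Hermitian"; the Symmetric
# wrapper keeps one triangle and lets the factorization succeed.
F = cholesky(Symmetric(Σ))

# In the EM update one would write, e.g.:
#   sigma[k] = Symmetric(sum_tmp ./ sum_class_assignment[k])
#   MvNormal(mean[:, k], sigma[k])
```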