I have a dataset of CpG observed/expected ratios (an estimate of cytosine methylation in the genome), and I am trying to fit it with a two-component, one-dimensional Gaussian mixture model using the GaussianMixtures package. Example input data below:

    740736×2 Matrix{Any}:
     "RDRX01000001"  0.251146
     "RDRX01000002"  0.279088
     "RDRX01000003"  0.428743
     "RDRX01000004"  0.327771
     "RDRX01000005"  0.350942
     "RDRX01000006"  0.90064
     "RDRX01000007"  0.267866
     "RDRX01000008"  0.360507
     "RDRX01000009"  0.600434
     ⋮
     "RDRX01799874"  1.40488
     "RDRX01799875"  1.65668
     "RDRX01799876"  1.66154
     "RDRX01799877"  1.04891
     "RDRX01799878"  0.987179
     "RDRX01799879"  1.29231
     "RDRX01799880"  1.72998
     "RDRX01799881"  1.08387

The histogram looks like:

I constructed the GMM with:

`g = GMM(2,1,kind=:full)`

and received this output, which I take as a success:

    GMM{Float64} with 2 components in 1 dimensions and full covariance
    Mix 1: weight 0.500000
      mean: [0.0]
      covariance: 1×1 Matrix{Float64}:
     1.0
    Mix 2: weight 0.500000
      mean: [0.0]
      covariance: 1×1 Matrix{Float64}:
     1.0

However, when I run the training function:

`em!(g,x1)`

where `x1` is the matrix shown above, I get the following error:

    ERROR: Inconsistent size gmm and x
    Stacktrace:
     [1] error(s::String)
       @ Base .\error.jl:33
     [2] em!(gmm::GMM{Float64, Vector{LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, x::Matrix{Any}; nIter::Int64, varfloor::Float64, sparse::Int64, debug::Int64)
       @ GaussianMixtures ~.julia\packages\GaussianMixtures\1pQcF\src\train.jl:238
     [3] em!(gmm::GMM{Float64, Vector{LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, x::Matrix{Any})
       @ GaussianMixtures ~.julia\packages\GaussianMixtures\1pQcF\src\train.jl:238
     [4] top-level scope
       @ REPL[16]:1

What am I missing here?
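
One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: the model was built for 1-dimensional data, while the matrix passed to `em!` has two columns (an ID string and the value) and element type `Any`. Assuming the message comes from a check that the number of columns equals the model dimension, the data would need to be an N×1 `Float64` matrix. The snippet below uses only Base Julia on a three-row stand-in for the real data; the variable names `cpg` and `x` are made up for illustration.

```julia
# Toy stand-in for the real 740736×2 Matrix{Any}:
# sequence IDs in column 1, CpG observed/expected in column 2.
x1 = Any["RDRX01000001" 0.251146;
         "RDRX01000002" 0.279088;
         "RDRX01000003" 0.428743]

# Keep only the numeric column and give it a concrete element type.
cpg = Float64.(x1[:, 2])

# Reshape to an N×1 matrix: one row per observation, one column per dimension.
x = reshape(cpg, :, 1)

# x is now a 3×1 Matrix{Float64}; size(x, 2) == 1 matches a 1-dimensional model.
# Assumption (not verified here): em!(g, x) would then pass the size check.
```

A separate point to check: the `GMM(2,1,kind=:full)` output above shows both components starting at mean 0.0 with the same covariance, so EM may have nothing to tell them apart. If I recall the package README correctly, there is also a data-driven constructor along the lines of `GMM(2, x)` that initializes from the data before training, which may be a better starting point.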