I’m not sure what is causing the segfault, but for diagnosis I suggest you start with:
- Instead of a DataFrame, wrap your matrix as a table using `MLJ.table(mat)` (or `Tables.table(mat)`).
- If that still segfaults, try using MultivariateStats.jl directly, without the MLJ interface. In that case you call it directly on a matrix, but with columns as observations. You do this with something like:
using MultivariateStats
mat = rand(20, 10000)
theta = fit(PCA, mat, pratio=0.99, maxoutdim=10)
transform(theta, mat)
By the way, I see that the MLJ interface computes a matrix transpose (where I reckon it ought to compute an adjoint), which means an extra copy of your data.
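To illustrate the copy-versus-no-copy distinction, here is a minimal sketch using only Base and the LinearAlgebra standard library. It assumes the extra copy comes from a materialized transpose such as `permutedims`; I haven't confirmed which call the MLJ interface actually uses, and the variable names are just for illustration.

```julia
using LinearAlgebra

X = rand(5, 3)            # 5 observations as rows, 3 features as columns

Xlazy = X'                # lazy adjoint: a wrapper around X, no data is copied
Xcopy = permutedims(X)    # materialized transpose: allocates a new 3×5 matrix

Xlazy isa Adjoint         # the adjoint is a wrapper type, not a plain Matrix
parent(Xlazy) === X       # the wrapper shares memory with X
Xcopy isa Matrix          # the materialized version is an independent array
Xlazy == Xcopy            # for real-valued data the entries agree
```

For a large matrix the lazy wrapper avoids doubling memory use, at the cost of non-contiguous access for whatever consumes it.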