You are applying cor first, which returns missing because your data contain missing values. Only then are you applying skipmissing, but at that point all you have is a single missing value. In effect, you are calling skipmissing(missing).
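A minimal sketch of that order of evaluation, using made-up vectors in place of the dataframe columns (x, y and r are illustrative names only):

```julia
using Statistics

# Illustrative data: x has one missing entry, y has none.
x = [1.0, missing, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

# cor runs first and sees the missing entry, so the result propagates to missing.
r = cor(x, y)
println(ismissing(r))

# Wrapping r in skipmissing afterwards cannot help: r is one missing value,
# not the original columns, so there is nothing left to skip around.
```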
If both columns have the same missing values, I suppose you could do:
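But even if you applied skipmissing to the columns before calling cor, cor still wouldn’t work, since it does not accept skipmissing iterators. This is a long-standing annoyance.
Skipping missings in each column separately is not a good idea, since the observations are not guaranteed to stay matched.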
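To make the matching concern concrete, here is a small sketch with made-up data where the missings fall in different rows. The names sx, sy, keep, px and py are illustrative, and the row mask is just one possible workaround using only Base and Statistics, not an official solution:

```julia
using Statistics

# Illustrative data: y = 2x exactly, with missings in different rows.
x = [1.0, missing, 3.0, 4.0, 5.0]
y = [2.0, 4.0, missing, 8.0, 10.0]

# Dropping missings per column leaves vectors of equal length,
# so cor does not complain, but the pairs no longer line up:
# the second element of sx comes from row 3, the second of sy from row 2.
sx = collect(skipmissing(x))
sy = collect(skipmissing(y))
println(cor(sx, sy))

# Keeping only rows where both values are present preserves the pairing.
keep = [!ismissing(a) && !ismissing(b) for (a, b) in zip(x, y)]
px = Float64.(x[keep])
py = Float64.(y[keep])
println(cor(px, py))
```

Both calls run without error, which is what makes the per-column version risky: the mismatch is silent.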
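We don’t have a good solution for this at the moment. Missings.jl (which is re-exported by DataFrames) provides skipmissings, which skips every index where any of the given collections has a missing value, so the observations stay matched.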
julia> using Missings, Statistics
julia> x = [rand() < .2 ? missing : rand() for i in 1:10];
julia> y = [rand() < .2 ? missing : rand() for i in 1:10];
julia> sx, sy = collect.(skipmissings(x, y));
julia> cor(sx, sy)
-0.32257867573052007
But skipmissings is not guaranteed to exist in the future. It’s deliberately documented as such even though Missings.jl is past 1.0.