Ignoring NaNs when calculating means of columns of a dataframe

Hi all,
I have a dataframe with rows as dates and columns as data points. I need to calculate the mean of each column, but the problem is that some of the columns have “NaN” values.
Is there an efficient way to do this while ignoring the NaN values in each column? For example, if a column has 4 values and one of them is NaN, the mean should be computed from the remaining 3 values only.
It sounds like a simple problem, but I haven't found a solution yet.

Thank you!

There is a package SkipNan which might help. Look at the following example:

julia> using SkipNan, Statistics

julia> skipnan([1.0, NaN, 2.0])
skipnan([1.0, NaN, 2.0])

julia> mean(skipnan([1.0, NaN, 2.0]))
1.5
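
Since the question is about per-column means, here is a minimal sketch that uses only the Statistics standard library instead of an extra package. The helper name `nanmean` and the small matrix `data` are made up for illustration; the matrix stands in for the dataframe (rows as dates, columns as data points). With DataFrames.jl the same comprehension should work over `eachcol(df)`, assuming the columns hold floating-point numbers.

```julia
using Statistics

# Hypothetical helper (not part of any package): mean of the non-NaN entries.
nanmean(x) = mean(filter(!isnan, x))

# Illustrative data standing in for the dataframe.
data = [1.0  10.0;
        NaN  20.0;
        3.0  NaN;
        5.0  30.0]

# One mean per column, each computed only from that column's non-NaN values.
col_means = [nanmean(col) for col in eachcol(data)]
```

Note that `filter` allocates a copy of each column, so for very large data a lazy wrapper such as `skipnan` may be preferable. A column that contains only NaN values yields NaN with this helper.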