Hello,
I need to calculate the sum of the values in a column of a DataFrame, but the column can also be empty.
How can I do that?
sum([])
gives a long stack trace…
Both of these work:
julia> sum(Float64[])
0.0
julia> sum([], init=0.0)
0.0
As long as the column is typed, sum should just work:
julia> df = DataFrame(:x => Float64[])
0×1 DataFrame
 Row │ x
     │ Float64
─────┴─────────
julia> sum(df[!,:x])
0.0
julia> combine(df, :x => sum)
1×1 DataFrame
 Row │ x_sum
     │ Float64
─────┼─────────
   1 │     0.0
Well, the column type is Union{Missing, Bool}, so sum(skipmissing(df.x)) did not work, but sum(skipmissing(df.x), init=0) works…
Are you sure that's the problem?
julia> sum(skipmissing([true, missing]))
1
julia> typeof([true, missing])
Vector{Union{Missing, Bool}}
Yes, that was my problem. Try:
sum(skipmissing([missing]))
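A likely explanation for why this case differs from [true, missing] is the element type that sum sees after the missing values are skipped. The sketch below only inspects types, using Base functions; the reading that sum fails because it has no type to derive a zero from is an assumption, not something confirmed in this thread.

```julia
# With at least one Bool in the literal, the vector is Vector{Union{Missing, Bool}},
# so the skipmissing wrapper still knows it iterates Bool values.
eltype(skipmissing([true, missing]))   # Bool

# A literal containing only `missing` is a Vector{Missing}; once Missing is
# stripped, nothing is left, and the element type is the empty type Union{}.
eltype(skipmissing([missing]))         # Union{}
```

With Union{} as the element type there is no zero(T) to fall back on, which would explain why an explicit init is needed here.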
There is a recent PR for emptymissing in Missings.jl which solves this.
emptymissing(sum)(skipmissing([missing]))
Maybe not the most elegant syntax, but it will propagate missing in this scenario.
Also, with Julia 1.6 you can use init as a keyword argument.
julia> emptymissing(sum)(skipmissing([missing]))
missing
julia> sum(skipmissing([missing]); init = 0.0)
0.0
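To pull the init approach together, here is a minimal sketch of a small wrapper. The name safe_sum is hypothetical (it is not part of Base, DataFrames.jl, or Missings.jl), and it assumes Julia 1.6 or later so that sum accepts the init keyword. It uses plain vectors in place of a DataFrame column such as df.x.

```julia
# Hypothetical helper: sum a column that may contain missing values,
# may be empty, or may consist only of missing values.
# `init` is returned as-is when nothing is left to add.
safe_sum(col; init = 0) = sum(skipmissing(col); init = init)

safe_sum([true, missing, true])        # 2: the two `true` values are counted
safe_sum(Union{Missing, Bool}[])       # 0: empty column, falls back to init
safe_sum([missing])                    # 0: all-missing column, falls back to init
safe_sum([missing]; init = 0.0)        # 0.0: init also decides the result type
```

If an all-missing column should yield missing rather than a number, emptymissing(sum) from Missings.jl is the alternative mentioned above.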