Hi,
I have code that uses StatsBase.summarystats on a Vector{T} where T <: Real. Sometimes T can be Rational, and summarystats fails as it is not defined on Rational.
Is there an alternative I can use?
My current workaround is to use this block:
if eltype(data) <: Rational
    summstats = summarystats(float.(data))
    q1 = rationalize(summstats.q25)
    q3 = rationalize(summstats.q75)
else
    summstats = summarystats(data)
    q1 = summstats.q25
    q3 = summstats.q75
end
But I was wondering if there is a Julian way to do it without the if/else branching.
Thanks in advance.
nilshg
February 14, 2021, 10:23am
2
I guess you could locally define the method if that’s what you want it to do:
julia> import StatsBase: summarystats
julia> summarystats(x::Vector{<:Rational}) = summarystats(float.(x))
summarystats (generic function with 2 methods)
julia> data = Rational.(rand(10))
10-element Vector{Rational{Int64}}:
2128900019238815//2251799813685248
49740005378731//1125899906842624
334714768863577//1125899906842624
1998648540630853//4503599627370496
1894366034501649//4503599627370496
4093457228645821//4503599627370496
186443106130015//281474976710656
2069399109862347//2251799813685248
1119960256153467//1125899906842624
614936132122229//2251799813685248
julia> summarystats(data)
Summary Stats:
Length: 10
Missing Count: 0
Mean: 0.590943
Minimum: 0.044178
1st Quartile: 0.328123
Median: 0.553084
3rd Quartile: 0.916481
Maximum: 0.994725
but this is type piracy so proceed with caution and don’t do this in library code.
I haven’t thought about potential complications of defining this method in general but it might be worth opening an issue to discuss with maintainers whether this should maybe be added?
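One way to get the same effect without piracy is to put the conversion in a function you own and let dispatch choose the method, which also removes the if/else. A minimal sketch, assuming only the q25 and q75 fields used earlier in the thread; `quartiles` is a made-up helper name, not part of StatsBase:

```julia
using StatsBase

# Hypothetical helper owned by your own code, so adding methods to it is not piracy.
# Generic fallback: take the quartiles straight from summarystats.
function quartiles(data::AbstractVector{<:Real})
    s = summarystats(data)
    return s.q25, s.q75
end

# More specific method for Rational input: compute on floats, then convert back.
# Note that rationalize approximates the Float64 result, so the value may not be
# the exact rational quartile.
quartiles(data::AbstractVector{<:Rational}) = rationalize.(quartiles(float.(data)))

q1, q3 = quartiles([1//2, 1//3, 3//4, 5//6, 1//5])
```

The call site stays the same for any element type, and the Rational round trip lives in one place.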
yha
February 14, 2021, 11:35am
3
The quantile function seems to work fine with rationals, so you can implement your block simply as q1, q3 = quantile(data, [1//4, 3//4]).
julia> quantile(rand(50),[1//4,3//4])
2-element Array{Float64,1}:
0.2188261621212782
0.8143315574024831
julia> quantile(rand(1:20,50).//rand(1:20,50), [1//4,3//4])
2-element Array{Rational{Int64},1}:
11//24
23//12
An issue on StatsBase might be worthwhile anyway. I see no reason why summarystats shouldn't support rationals too.
It turns out that this only works if the percentile values are also Rational. If my data is Rational but the percentile values are Float64, then the result is also Float64. Still, this is a good idea.
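To make the dependence on the probability type concrete, here is a small sketch using only the Statistics standard library. The sample data is made up; based on the behaviour described above, the first call should give a Rational element type and the second a Float64 one:

```julia
using Statistics

data = [1//2, 1//3, 3//4, 5//6, 1//5]   # made-up Rational sample

q_rat = quantile(data, [1//4, 3//4])     # probabilities given as Rational
q_flt = quantile(data, [0.25, 0.75])     # probabilities given as Float64

# Print the element type of each result to compare them.
println(eltype(q_rat))
println(eltype(q_flt))
```

So writing the probabilities as 1//4 and 3//4 rather than 0.25 and 0.75 is what keeps the result in Rational.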