I got a Float64 p-value from a GLM model, and I could NOT show the fitted model because of the following error:
ERROR: p-values must be in [0; 1]
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] StatsBase.PValue(::Float64) at /Users/Thomas/.julia/packages/StatsBase/EA8Mh/src/statmodels.jl:430
After some investigation, I found that some of the p-value floats behave strangely:
julia> p = (GLM.coeftable(mod).cols[4])[5] # mod is a model from GLM.lm()
julia> typeof(p)
Float64
julia> p == 0
true
julia> p >= 0
true
julia> 1 >= p >= 0
true
julia> 0 <= p <= 1 # strange !!!
false
julia> p == 0.0 # strange !!!
false
In particular, that strange false from 0 <= p <= 1 is what triggers the error message inside StatsBase.PValue().
I don’t know whether this is a bug in GLM or a more general and subtle bug in Julia. I’m using v1.5.3. Thanks.
This is so far from reasonable that you don’t need to worry about the M (“minimal”) in MWE. Any complete, working example that reliably reproduces the output at the end of your first message would be good enough.
Pretty much the only ways to explain those results are:
It’s not actually the same p everywhere.
Someone has committed an absolutely heinous act of type piracy and redefined floating point comparisons.
Is it possible for me to save the data as a file and attach it here?
It IS the same p everywhere. That’s why it’s “strange”.
===============================================
I saved the design matrix X and the response vector y into a .jld file (using JLD).
Then, in a fresh session, I loaded them and called mod = GLM.lm(X, y, true); and now the problem is gone?! That means I cannot reproduce the bug by uploading the data…
I have no idea; I could not reproduce it here… So I will start saying random things.
Could you try running it with mod renamed to something else? mod is a function in Base. That is not supposed to cause any problem, but I am running out of ideas…
EDIT: Wait, I am not understanding, p is NaN in 1.6 and still isnan(p) is false?
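For reference, here is a minimal sketch of what these comparisons give for a NaN under default IEEE semantics, that is, in a session started without --math-mode=fast. The variable name q is only for illustration and is not the p from the original post.

```julia
# Default IEEE 754 behaviour (session started WITHOUT --math-mode=fast).
q = NaN

isnan(q)      # true
q == q        # false: NaN is not equal to anything, including itself
q >= 0        # false: every ordered comparison involving NaN is false
0 <= q <= 1   # false: both comparisons in the chain would have to hold
```

Any session where these results come out differently is not following IEEE rules for NaN.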
Yeah, unfortunately it seems the calculation of p-values is not quite right in a few key packages. We get the same error with:
a = [12,10,7,6,3,1]
b = [11,9,8,5,4,2]
MannWhitneyUTest(a,b)
Error showing value of type ExactMannWhitneyUTest{Float64}:
ERROR: p-values must be in [0; 1]
Stacktrace:
...
Now I know the cause of the problem: --math-mode=fast
I understand that any operation on NaN is unpredictable in fastmath mode, but we need one important exception: isnan(NaN).
Currently, in fastmath mode:
julia> isnan(NaN)
false
This failure to detect NaN is the cause of all the confusion. In this case, the isnan(v) check below fails to catch the NaN, so the constructor throws an error:
struct PValue <: Real
    v::Real
    function PValue(v::Real)
        0 <= v <= 1 || isnan(v) || error("p-values must be in [0; 1]")
        new(v)
    end
end
We often have no control over whether a package produces NaN, and on the other hand, the package has no control over whether the user is running in fastmath mode. I strongly recommend that isnan(NaN) return true even in fastmath mode!
Next thing you will get some other error somewhere else because of another assumption of IEEE math in another package. It’s just completely unsafe to use this globally for running code you do not control 100% yourself.
Well… it seems I can only override isnan() with my own version to cope with the issue…
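One possible shape for such a replacement is to inspect the raw bit pattern instead of using floating-point comparisons. This is only a sketch: bits_isnan is a hypothetical name, not something from Base or StatsBase, and I have not verified it under --math-mode=fast. The assumption is that fast-math flags affect floating-point instructions only, not the integer operations used here.

```julia
# Hypothetical helper (not part of Base or StatsBase): detect NaN from the raw
# IEEE 754 bit pattern, using integer operations only.
function bits_isnan(x::Float64)
    u = reinterpret(UInt64, x) & 0x7fffffffffffffff  # clear the sign bit
    # NaN has an all-ones exponent and a nonzero mantissa, so its magnitude
    # bits are strictly greater than those of Inf (0x7ff0000000000000).
    return u > 0x7ff0000000000000
end

bits_isnan(NaN)   # true
bits_isnan(Inf)   # false
bits_isnan(0.0)   # false
```

Note that this does not help with checks inside packages such as StatsBase.PValue, which still call Base.isnan.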
One more question: how can I detect whether the current session was started with --math-mode=fast? Thanks.
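One thing that may work is reading the parsed command-line options from Base.JLOptions(). This is a hedged sketch: Base.JLOptions() is internal rather than a documented API, so the field name fast_math and its encoding are assumptions that could differ between Julia versions. The function name fast_math_setting is hypothetical.

```julia
# Base.JLOptions() is internal, so the `fast_math` field and the meaning of
# its values are assumptions that may change between Julia versions.
function fast_math_setting()
    opts = Base.JLOptions()
    # Return `nothing` if this Julia version has no such field.
    return :fast_math in fieldnames(typeof(opts)) ? Int(opts.fast_math) : nothing
end

# Assumed encoding: a value of 1 means the session was started with
# --math-mode=fast; check this against your own Julia version.
setting = fast_math_setting()
```

Comparing the value between a normal session and one started with --math-mode=fast is the safest way to confirm what it means on a given version.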