Hi all!
I have a linear model created with GLM.jl. Since the formula interacts some variables with a categorical one, I get lots of NaN entries in the coefficient table wherever data are available for only some of the categorical levels:
julia> model
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}
Base power mean (W) ~ 1 + mnemonic + APSR (s flag) + Is conditional + Dest reg == source reg + Barrel shift amount + Has barrel shift + Has immediate operand + mnemonic & Binary weight + Barrel shift amount & Has barrel shift + mnemonic & APSR (s flag) + mnemonic & Is conditional + mnemonic & Dest reg == source reg + mnemonic & Barrel shift amount + mnemonic & Has barrel shift + mnemonic & Has immediate operand + mnemonic & Barrel shift amount & Has barrel shift + mnemonic & Binary weight & APSR (s flag) + mnemonic & Binary weight & Is conditional + mnemonic & Binary weight & Dest reg == source reg + mnemonic & Binary weight & Barrel shift amount + mnemonic & Binary weight & Has barrel shift + mnemonic & Binary weight & Has immediate operand + mnemonic & Binary weight & Barrel shift amount & Has barrel shift
Coefficients:
──────────────────────────────────────────────────────────────────────────────
                     Coef.  Std. Error      t  Pr(>|t|)   Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────────
(Intercept)      0.0804049   0.0305935   2.63    0.0086   0.0204363   0.140373
mnemonic: add   -0.0209991   0.0306201  -0.69    0.4929  -0.0810198  0.0390215
mnemonic: and  0.000384739   0.0309979   0.01    0.9901  -0.0603766  0.0611461
mnemonic: asr  -5.10902e-6   0.0385573  -0.00    0.9999  -0.0755842  0.0755739
mnemonic: b     0.00127954  0.00981586   0.13    0.8963  -0.0179612  0.0205203
mnemonic: bfc          0.0         NaN    NaN       NaN         NaN        NaN
mnemonic: bfi          0.0         NaN    NaN       NaN         NaN        NaN
mnemonic: bic          0.0         NaN    NaN       NaN         NaN        NaN
This means that when I make a prediction with values the model was not trained on, I get an error. Is there a way to substitute each NaN with 0.0, so that the model no longer throws an error and simply ignores the missing coefficients?
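To make the request concrete, here is a minimal sketch of the behaviour I have in mind, done by hand on plain vectors rather than on the fitted model. The names beta, x_new, beta_clean and y_hat are hypothetical stand-ins: beta plays the role of a coefficient vector (for example something like the result of coef(model)), and x_new plays the role of one row of the design matrix for a new observation. I am assuming that zeroing a NaN entry is the right way to make that term "not count".

```julia
using LinearAlgebra  # standard library, provides dot

# Hypothetical stand-ins, not taken from the real model:
# a coefficient vector containing one NaN, and one design-matrix row.
beta  = [0.0804049, -0.0209991, NaN, 0.0]
x_new = [1.0, 1.0, 0.0, 0.0]

# Replace every NaN with 0.0 so the affected term contributes nothing.
beta_clean = map(b -> isnan(b) ? 0.0 : b, beta)

# Manual linear prediction: dot product of the row and the cleaned coefficients.
y_hat = dot(x_new, beta_clean)
```

Without the replacement step the result would itself be NaN, because 0.0 * NaN is NaN, so the cleaning has to happen before the dot product.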
Thanks!