Hello,
I’m trying to build a model for the Store Sales Kaggle Competition.
I’ve tried to build a super-simple model that only takes two or three features and submitted my predictions to Kaggle. The numbers from evaluate! and Kaggle didn’t match at all, so I figured that my evaluation metric was wrong…
After consulting the competition’s documentation about evaluation, I switched from Root Mean Squared Error to Root Mean Squared Log Error (in fact, I included both in my evaluation).
My problem is that I get Inf as RMSLE. I’m also worried that my model is not optimized towards RMSLE but instead towards RMSE.
I’d like to get something that at least resembles my Kaggle score, which is 2.89.
train_timeseries = load_timeseries("./data/train.csv")
y, X = process_for_tree(train_timeseries, true)
tree = EvoTreeRegressor(
measure=rmsle
)
mach = machine(tree, X, y)
evaluate!(mach, measure=[rms, rmsle])
This gives me the following evaluation:
┌───────────────────────────┬───────────┬─────────────┬─────────┬───────────────────────────────────────────────┐
│ measure │ operation │ measurement │ 1.96*SE │ per_fold │
├───────────────────────────┼───────────┼─────────────┼─────────┼───────────────────────────────────────────────┤
│ RootMeanSquaredError() │ predict │ 978.0 │ 151.0 │ [704.0, 878.0, 892.0, 1070.0, 1070.0, 1170.0] │
│ RootMeanSquaredLogError() │ predict │ Inf │ NaN │ [Inf, Inf, Inf, Inf, Inf, Inf] │
How could I make evaluate! output something other than Inf?
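One possible source of the Inf, sketched in plain Julia rather than with MLJ itself: a log error of the form log(ŷ) - log(y) blows up as soon as a target (or prediction) is exactly zero, whereas the competition's metric compares log(1 + ŷ) with log(1 + y). The function names and the numbers below are made up for illustration, and whether MLJ's rmsle uses the plain-log form is an assumption worth checking against its docs (a measure named rmslp1, with a +1 offset, may be the closer match to Kaggle).

```julia
using Statistics  # stdlib, for mean

# Plain log error: log(ŷ) - log(y). A zero in y or ŷ gives log(0) = -Inf.
rmsle_plain(ŷ, y) = sqrt(mean((log.(ŷ) .- log.(y)) .^ 2))

# Offset version: log(1 + ŷ) - log(1 + y), finite for any value > -1.
rmsle_plus1(ŷ, y) = sqrt(mean((log1p.(ŷ) .- log1p.(y)) .^ 2))

# Hypothetical data; the first target stands in for a zero-sales day.
y = [0.0, 3.0, 10.0]
ŷ = [1.0, 2.5, 12.0]

println(rmsle_plain(ŷ, y))            # Inf, because log(0.0) == -Inf
println(isfinite(rmsle_plus1(ŷ, y)))  # true
```

If the training data contains zero sales, this alone would explain an Inf in every fold. Note that log1p still throws for predictions below -1, so clamping predictions at zero before scoring may also be needed.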
Notes about tuning EvoTrees for :logistic loss
I’ve tried setting loss=:logistic in my tree’s constructor (doc) but I got this error:
tree = EvoTreeRegressor(
loss=:logistic
)
┌ Error: Problem fitting the machine machine(EvoTrees.EvoTreeRegressor{EvoTrees.Logistic, Float32}
│ - nrounds: 10
│ - lambda: 0.0
│ - gamma: 0.0
│ - eta: 0.1
│ - max_depth: 5
│ - min_weight: 1.0
│ - rowsample: 1.0
│ - colsample: 1.0
│ - nbins: 32
│ - alpha: 0.5
│ - monotone_constraints: Dict{Int64, Int64}()
│ - rng: Random.TaskLocalRNG()
│ - device: cpu
│ , …).
└ @ MLJBase ~/.julia/packages/MLJBase/uxwHr/src/machines.jl:682
[ Info: Running type checks...
[ Info: Type checks okay.
ERROR: DomainError with -1.0025849:
log will only return a complex result if called with a complex argument. Try log(Complex(x)).
Stacktrace:
[1] throw_complex_domainerror(f::Symbol, x::Float32)
@ Base.Math ./math.jl:33
[2] _log(x::Float32, base::Val{:ℯ}, func::Symbol)
@ Base.Math ./special/log.jl:336
[3] log
@ ./special/log.jl:264 [inlined]
[4] log_fast
@ ./fastmath.jl:349 [inlined]
[5] logit
@ ~/.julia/packages/EvoTrees/RueBJ/src/loss.jl:134 [inlined]
[6] init_evotree(params::EvoTrees.EvoTreeRegressor{EvoTrees.Logistic, Float32}; x_train::SubArray{Float64, 2, Matrix{Float64}, Tuple{Vector{Int64}, Base.Slice{Base.OneTo{Int64}}}, false}, y_train::SubArray{Float64, 1, Vector{Float64}, Tuple{Vector{Int64}}, false}, w_train::Nothing, offset_train::Nothing, fnames::Nothing)
[7] fit(model::EvoTrees.EvoTreeRegressor{EvoTrees.Logistic, Float32}, verbosity::Int64, A::NamedTuple{(:matrix, :names), Tuple{SubArray{Float64, 2, Matrix{Float64}, Tuple{Vector{Int64}, Base.Slice{Base.OneTo{Int64}}}, false}, Vector{Symbol}}}, y::SubArray{Float64, 1, Vector{Float64}, Tuple{Vector{Int64}}, false}, w::Nothing)
[... irrelevant stacktraces below removed ...]
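A guess at what the trace is saying, as a minimal sketch: the failure happens in a logit call during init_evotree, which suggests the initial prediction is the logit of the mean target. The definition below is an assumed textbook form, not copied from EvoTrees, and the mean value is hypothetical. If that assumption holds, any target whose mean lies outside (0, 1) would fail in the same way.

```julia
# Assumed textbook logit; not taken from the EvoTrees source.
logit(μ) = log(μ / (1 - μ))

println(logit(0.3))  # fine, since 0 < μ < 1

# A mean target above 1 makes μ / (1 - μ) negative, so log throws.
μ = 387.86           # hypothetical mean of an unscaled sales target
println(μ / (1 - μ)) # about -1.0026, close to the value in the error above

err = try
    logit(μ)
catch e
    e
end
println(err isa DomainError)  # true
```

Under that reading, :logistic expects targets scaled into the unit interval, so it is probably not a drop-in way to optimize a log-based error on raw sales figures.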