Unfamiliar error

I am receiving an error that I can find no documentation on.

While running a GeneralizedLinearMixedModel using JuliaCall/RCall I am getting this error:

number of averaging steps > 10
with the following Stacktrace:
Stacktrace:
[1] pirls!(m::GeneralizedLinearMixedModel{Float64, Bernoulli{Float64}}, varyβ::Bool, verbose::Bool; maxiter::Int64)
@ MixedModels C:\Users\coverton.julia\packages\MixedModels\alVRh\src\generalizedlinearmixedmodel.jl:614
[2] pirls!
@ C:\Users\coverton.julia\packages\MixedModels\alVRh\src\generalizedlinearmixedmodel.jl:578 [inlined]
[3] (::MixedModels.var"#obj#100"{Bool, Bool, Int64, Bool, Int64, GeneralizedLinearMixedModel{Float64, Bernoulli{Float64}}, Vector{Tuple{Vector{Float64}, Float64}}, ProgressMeter.ProgressUnknown, typeof(MixedModels.setβθ!)})(x::Vector{Float64}, g::Vector{Float64})
@ MixedModels C:\Users\coverton.julia\packages\MixedModels\alVRh\src\generalizedlinearmixedmodel.jl:288
[4] nlopt_callback_wrapper(n::UInt32, x::Ptr{Float64}, grad::Ptr{Float64}, d_::Ptr{Nothing})
@ NLopt C:\Users\coverton.julia\packages\NLopt\w0c7n\src\NLopt.jl:388
[5] optimize!(o::NLopt.Opt, x

My logistic regression model is run on subsets of the data, and most subsets fit without problems. I therefore suspected the error had to do with either 1) inestimable (rank-deficient) parameters in some subsets or 2) levels of the random effects that may be rank-deficient.
I have checked the first case, and it does not seem to be the issue.

Is there a way to further diagnose this error?

Hi,

Welcome to the Julia community!

I don’t have any experience with RCall (and very little with R), so I doubt I’ll be able to help much directly, but it seems you didn’t post the actual error? :slight_smile:

(As an example

julia> x * 2
ERROR: UndefVarError: `x` not defined
Stacktrace:
 [1] top-level scope
   @ REPL[1]:1

in a new Julia REPL gives an UndefVarError.)

Also, your stacktrace ends abruptly. I’m not sure if it would help, but it certainly can’t hurt to post the complete stacktrace.
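If the model can be run from a plain Julia session, something like the sketch below would capture both the exception type and the untruncated stacktrace. It uses only base Julia. `failing_fit` and `capture_error` are hypothetical names, and the stand-in function simply raises the same message; the real fit call would go in its place.

```julia
# Hypothetical stand-in for the failing model fit; replace with the real call.
failing_fit() = error("number of averaging steps > 10")

# Run `f`; return `nothing` on success or an (exception type, report) pair on failure.
function capture_error(f)
    try
        f()
        return nothing
    catch err
        # `catch_backtrace()` has to be called inside the catch block.
        report = sprint(showerror, err, catch_backtrace())
        return (typeof(err), report)
    end
end

err_type, report = capture_error(failing_fit)
println(err_type)   # the exception type
println(report)     # the message followed by the full stacktrace
```

The `report` string can then be written to a file or returned to the calling environment in one piece.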

Finally, you are likely to get more responses if you provide a minimal working example, so that other people can (hopefully) replicate the error.

Thank you eldee for your reply.

Unfortunately, both the error notification and the stacktrace reach me through the RStudio interface with Julia, which doesn’t give me a way (that I know of) to print a more complete trace.

I was hoping to deduce the cause from the Julia error message returned at the top of my stacktrace, i.e. “number of averaging steps > 10”.

I think this error is a function of the vagaries of my dataset and random effect structure. Specifically, I am modeling nested random effects (month/individual), and I suspect that one or more individuals may only have a single fixed effect factor, resulting in identifiability problems. The error is not present in all subsets of the data, and the data I am processing has tens of millions of rows. As such, I am not sure I could create a minimal working example.
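One way to test this kind of suspicion is to look for individuals whose factor level or response never varies. The sketch below uses only base Julia and made-up toy vectors. `individual`, `treatment`, `outcome` and both helper functions are hypothetical names for illustration, not part of MixedModels.jl.

```julia
# Hypothetical toy data: one entry per observation.
individual = ["a", "a", "a", "b", "b", "c", "c", "c"]
treatment  = ["x", "y", "x", "x", "x", "y", "x", "y"]   # a fixed-effect factor
outcome    = [0, 1, 0, 1, 1, 0, 0, 0]                   # binary response

# Collect the distinct values of `values` observed within each group.
function levels_by_group(groups, values)
    seen = Dict{eltype(groups), Set{eltype(values)}}()
    for (g, v) in zip(groups, values)
        push!(get!(() -> Set{eltype(values)}(), seen, g), v)
    end
    return seen
end

# Groups in which `values` never varies.
constant_groups(groups, values) =
    sort([g for (g, s) in levels_by_group(groups, values) if length(s) == 1])

println(constant_groups(individual, treatment))  # individuals seen at one factor level
println(constant_groups(individual, outcome))    # individuals with an all-0 or all-1 response
```

The check is a single pass over the data, so it should remain practical for large subsets. Any individuals it flags are candidates for closer inspection, not proof of an identifiability problem.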

I am more interested in the source of the error message: I can’t find any documentation anywhere that references “number of averaging steps > 10”.
Short of hacking every package and searching every subroutine, I don’t know of another way to figure out what it means.

“I am more interested in the source of the error/error message I can’t find any documentation anywhere that references “number of averaging steps >10”.”

The stacktrace shows you that the error message comes from src/generalizedlinearmixedmodel.jl in JuliaStats/MixedModels.jl on GitHub (commit 19b90aa0384100fa9c3a38db631afbb89287b848). I can’t help much more than that; I’m just showing you how to find the source. Passing verbose=true might help, according to the function’s docstring.

Thank you @DanielVandH, that does indeed direct me to the error. Now I just need to interpret it and figure out what in my data is causing it.

Much appreciated!

Based on the source code, it seems that the error type would indeed have been uninformative, so that would then not have helped much anyway.

I’m not very familiar with PIRLS and the variable names in the code are not very descriptive in isolation. But from the general structure it seems to me there is an optimisation loop (iter), where in every step a subproblem (involving obj) needs to be solved. The error then gets thrown (only in the first iteration for some reason) when solving this subproblem takes too many steps, which consist of averaging. So basically, I’d interpret this error as saying that the process is not converging for your data.
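To make that reading concrete, here is a toy sketch of an iteration with step-halving. It is not the MixedModels.jl implementation: the function name, the scalar state, the toy objectives and the update rules are all assumptions chosen for illustration. Only the idea (average a bad proposal with the last good value, and give up after 10 attempts) follows the description above.

```julia
# Toy step-halving loop. `objective` stands in for the penalized deviance and
# `propose` for the update rule; both are hypothetical.
function iterate_with_halving(objective, propose, u0;
                              maxiter=50, maxhalvings=10, tol=1e-10)
    u, obj = u0, objective(u0)
    for iter in 1:maxiter
        unew = propose(u)
        objnew = objective(unew)
        nhalf = 0
        # If the proposal made things worse, average it with the last good value.
        while objnew > obj
            nhalf += 1
            nhalf > maxhalvings && error("number of averaging steps > $maxhalvings")
            unew = (unew + u) / 2
            objnew = objective(unew)
        end
        converged = obj - objnew < tol
        u, obj = unew, objnew
        converged && return (u, iter)
    end
    return (u, maxiter)
end

# An update that overshoots the minimum at 1, rescued by one averaging step per iteration.
u, iters = iterate_with_halving(u -> (u - 1.0)^2, u -> u + 3.0 * (1.0 - u), 0.0)
println("converged to ", u, " after ", iters, " iterations")

# A proposal that can never improve on the start: every averaged point is still worse.
try
    iterate_with_halving(abs, u -> u + 1.0, 0.0)
catch err
    println("failed: ", err.msg)
end
```

In the second call the starting value is already optimal, so no amount of averaging yields an improvement and the loop gives up, which is the toy analogue of the process not converging.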

@eldee, thank you. I was slowly coming to the same conclusion. Although I have submitted a question about the number of iteration steps and the optimization routine to further my own understanding, I expect you are correct: the sparse coverage of my fixed effect factors across my nested random effects probably resulted in convergence problems.
Better data cleaning would likely have prevented that.

Your insight is much appreciated.

I’m coming to this discussion late but, as one of the authors of MixedModels.jl, I hope I can provide some insight.

As indicated by @eldee the error occurs when trying to reduce the penalized deviance by a PIRLS (penalized iteratively re-weighted least squares) algorithm. Your diagnosis of sparsity in the observed data resulting in an inability to support estimation of a complex model is likely correct.

I am a bit unsure of how JuliaCall/RCall get into this. Are you running an R session and trying to fit the model by calling Julia through JuliaCall? I would be happy to look into the problem if you were able/willing to provide me with the data and the formula of the model you are trying to fit.

Bringing things back together: the step-halving bit was also discussed in a MixedModels issue. For that specific problem, let’s keep the conversation on GitHub. :smile:

Bringing context from the GH issue: RCall comes into this because this was being invoked from within R using JuliaCall, which uses RCall internally.

Thank you both, @palday and @dmbates,

I have included a script and dataset on the GH issue if you wish to review it.
I think my questions have been answered sufficiently for me to have a path forward. Some additional data curation is probably what I need to avoid the PIRLS error.