Common error: providing an Integer when a Float is expected


#1

Hi all,

Perhaps this should be taken as a comment, but I wanted to raise it nonetheless, as it is an issue I have run into time and time again, and I am beginning to suspect I am missing something.

The following code, taken from the Optim.jl docs, runs fine in Julia 1.0.2:

using Optim
rosenbrock(x) =  (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
result = optimize(rosenbrock, [0.0,0.0], BFGS())

However, when I change this to:

using Optim
rosenbrock(x) =  (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
result = optimize(rosenbrock, [0,0], BFGS())

I get the following error:

InexactError: Int64(Int64, NaN)

Stacktrace:
 [1] Type at ./float.jl:700 [inlined]
 [2] convert at ./number.jl:7 [inlined]
 [3] fill!(::Array{Int64,1}, ::Float64) at ./array.jl:386
 [4] x_of_nans at /Users/thomasmoore/.julia/packages/NLSolversBase/ylZ0n/src/NLSolversBase.jl:45 [inlined]
 [5] alloc_DF at /Users/thomasmoore/.julia/packages/NLSolversBase/ylZ0n/src/objective_types/abstract.jl:21 [inlined]
 [6] Type at ./none:0 [inlined]
 [7] promote_objtype at /Users/thomasmoore/.julia/packages/Optim/ULNLZ/src/multivariate/optimize/interface.jl:40 [inlined]
 [8] #optimize#87(::Bool, ::Symbol, ::Function, ::Function, ::Array{Int64,1}, ::BFGS{InitialStatic{Float64},HagerZhang{Float64,Base.RefValue{Bool}},getfield(Optim, Symbol("##17#19"))}, ::Optim.Options{Float64,Nothing}) at /Users/thomasmoore/.julia/packages/Optim/ULNLZ/src/multivariate/optimize/interface.jl:113
 [9] optimize(::Function, ::Array{Int64,1}, ::BFGS{InitialStatic{Float64},HagerZhang{Float64,Base.RefValue{Bool}},getfield(Optim, Symbol("##17#19"))}, ::Optim.Options{Float64,Nothing}) at /Users/thomasmoore/.julia/packages/Optim/ULNLZ/src/multivariate/optimize/interface.jl:113 (repeats 2 times)
 [10] top-level scope at In[33]:3

All I have done is change an input from a float to an integer, and the program has failed. It’s a little embarrassing, but in the year I’ve been playing around with Julia, this exact problem has cost me a lot of time, when working with ODE solvers, optimisation routines, and several other packages.

Now, part of me feels that, given that 0 == 0.0 returns true, this should simply never be an issue, but I’m sure there are technical reasons why Integers and Floats must be treated separately.

I suppose my main concern is that, to a novice like me, the error message is completely unintuitive. I don’t know if I’ve installed something incorrectly, made a mistake in my function definition, or done something else entirely wrong.

Are there any plans in the works to provide a more intuitive error message, even something like “Int64 provided where Float64 expected in test.jl, line 22”? Alternatively, perhaps it is possible to provide some guidance on how to interpret these error messages? For the novice user like me, it’s rather daunting diving straight into the Julia codebase, especially for ODE solvers, which interlink with so many different packages.

Thanks
Tom


#2

Now, part of me feels that, given that 0 == 0.0 returns true, this should simply never be an issue, but I’m sure there are technical reasons why Integers and Floats must be treated separately.

Having vectors with specific, concrete element types is critical to Julia’s performance and flexibility. This isn’t just about Ints vs Floats but about any types at all: an array which stores elements of a specific type can be stored efficiently and indexed without having to look up the type of each element. Julia wouldn’t be the same without this. So, while 1 and 1.0 compare as equal, their types are not, and an Array{Int, 1} and an Array{Float64, 1} are totally different types.
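To make the distinction between value equality and type equality concrete, here is a small sketch using only Base Julia. The variable names `ints` and `floats` are just for illustration:

```julia
# Values compare equal across numeric types...
println(0 == 0.0)                         # true

ints   = [0, 0]        # element type Int
floats = [0.0, 0.0]    # element type Float64

# ...and so do the arrays, element by element...
println(ints == floats)                   # true

# ...but the array types themselves are different.
println(typeof(ints) == typeof(floats))   # false
println(eltype(ints))
println(eltype(floats))

# float(...) gives a floating-point copy of an integer array.
println(eltype(float(ints)))
```

So passing `float([0, 0])` or simply `[0.0, 0.0]` hands the callee an array whose elements can hold any floating-point value.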

I suppose my main concern is that, to a novice like me, the error message is completely unintuitive. I don’t know if I’ve installed something incorrectly, made a mistake in my function definition, or done something else entirely wrong.

Totally understandable. It makes sense that you’d want a more informative error message, but the problem is that Julia has no idea what you’ve actually done wrong. The optimize function you’re calling takes an array x of any AbstractArray type with any kind of element type. It just happens that in this particular case, it tries to fill an array of that type with NaN later on, which is impossible for an Array{Int, 1}, since NaN cannot be converted to Int.
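The failing step can be reproduced without Optim at all. This is a minimal sketch of what the stack trace above shows (a `fill!` of an Int array with NaN), not the actual NLSolversBase code:

```julia
x = [0, 0]              # Vector of Int, like the failing call
buffer = similar(x)     # uninitialised array with the same element type (Int)

# Filling an Int array with NaN needs a conversion of NaN to Int, which throws.
err = try
    fill!(buffer, NaN)
    nothing
catch e
    e
end
println(err isa InexactError)   # true

# With a floating-point start vector the same operation succeeds.
xf = float(x)
nans = fill!(similar(xf), NaN)
println(all(isnan, nans))       # true
```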

You could argue that perhaps this is a bug in the method definition of optimize: perhaps it should have been defined as:

optimize(f, initial_x::AbstractArray{<:AbstractFloat}, ...)

in which case you would have gotten a much more useful error message. Unfortunately, that might also make it impossible to call optimize() with x set to some other exotic data type like ForwardDiff.Dual or SymPy.Sym. This is a real tradeoff when writing generic code: being very flexible often means allowing users to try things that won’t actually work.
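A rough sketch of that tradeoff, using a made-up function `strict_entry` as a stand-in for a solver entry point (it is not Optim's real signature) and a complex vector as a stand-in for the more exotic element types, so that no extra packages are needed:

```julia
# Hypothetical entry point restricted to floating-point element types.
strict_entry(f, initial_x::AbstractArray{<:AbstractFloat}) = f(initial_x)

rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

# Float input is accepted.
println(strict_entry(rosenbrock, [0.0, 0.0]))   # 1.0

# Int input now fails right at the call the user made, with a MethodError.
err_int = try
    strict_entry(rosenbrock, [0, 0])
catch e
    e
end
println(err_int isa MethodError)                # true

# The cost: other element types that are not AbstractFloat are rejected too.
err_complex = try
    strict_entry(x -> abs2(x[1]), [1.0 + 2.0im])
catch e
    e
end
println(err_complex isa MethodError)            # true
```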

One could imagine a future version of the language in which the definition of optimize() could specify a requirement that the element type of your initial_x is something to which NaN could be converted. Or you could treat this as a bug in NLSolversBase and change it to avoid trying to fill arrays with NaN when it’s not possible.
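As an illustration of the second option, one possible approach (an assumption on my part, not what NLSolversBase does) is to allocate the NaN buffer with a floating-point version of the input's element type. The helper name `nan_buffer` is hypothetical:

```julia
# Hypothetical helper: allocate a NaN-filled buffer shaped like x, but with
# an element type that float(...) says can hold floating-point values.
nan_buffer(x::AbstractArray) = fill!(similar(x, float(eltype(x))), NaN)

b = nan_buffer([0, 0])
println(eltype(b))          # Float64
println(all(isnan, b))      # true

# Floating-point inputs keep their element type.
println(eltype(nan_buffer(Float32[1, 2])))   # Float32
```

The catch is that the buffer's element type no longer matches the input's, so the rest of the solver would have to cope with that mismatch.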

One more thing that could easily be done is just to improve the way InexactErrors are printed to be a little more helpful. That would require a change to Julia itself, but that’s absolutely doable and would be a great way to make the language better for everyone.


#3

This actually speaks to an issue that deserves more visibility in the Julia world. We’ve made a significant correction to JuMP development practices to implement what I call the "MethodError principle": http://www.juliaopt.org/JuMP.jl/dev/style/#User-facing-MethodError-1. Basically, error messages at library boundaries should explain to users what they did wrong and not which internal assumptions were violated deep in the code. For the specific case of MethodErrors this implies that “A user should see a MethodError only for methods that they called directly.”

This is a goal that’s not always possible with generic code; however, there are small changes that could go a long way once we recognize this as a priority. I don’t have time to write an essay on this right now, but maybe I will later.


#4

There is a question of what the conversion rules should be. Optim clearly assumed a data type that was a superset of Int, i.e. somewhere the Array{Int,1} was promoted to Array{Float64,1}. This is questionable behavior, or at least suspect, since there is no guarantee that a demotion is possible, calling for a warning.


#5

No, that is not at all what that error message says. That error message says that there is no Integer representation of NaN - and that is exactly the problem here. The convert call happens because we fill an array similar to the input arrays with NaN. We could also have called x_out = fill!(similar(x), eltype(x)(NaN)) and you would have gotten the error message pointing to another line, but it would still be the convert(Int64, NaN) function call that failed.

I’ve actually spent quite some time loosening type signatures in Optim, which means that you can pass in different number types. One requirement is that the number type has a NaN value in its specification. Blame those who designed Int64 for not defining a NaN value, though it would soon fail somewhere else even if it did. No method in Optim is designed to optimize over the set of integers, so unsurprisingly the software fails if you provide an integer.

This issue is largely one of education in my view. There are so many functions across the ecosystem that are written to accept various inputs in a generic way. The whole point of that is that I don’t get to specify what numbers people input (to the extent that the algorithm of interest might make sense for such an input)!

One thing I could do, and I think this is something along the lines of what Miles is suggesting, is to say: okay, what are the functions that a novice user is expected to call in Optim? Really, it’s only optimize and then constructors. I could go in and make a signature that catches <:Integer element types, and then all those who put in 0 would get a custom warning. I actually like the general idea, but it’s not so simple. What types should I allow? Real?

But what about Rationals then? They’re <:Reals, but they’re based on Integers, so they won’t have a NaN either! Oh well, make a new signature for Rationals also? Honestly, I don’t want to maintain that, and in this case I guess the novice users pay the price so advanced users (potentially themselves once they learn) can have their cake. And even if Rationals had supertype <:AbstractRationals (which would probably make a lot more sense, but this is becoming a digression), it would also be wrong to require <:Real because we also support complex inputs! It may be true that 0==0.0 is true, but so is false==0.0==0, do we need a signature for <:Bool as well? :slight_smile: I’m (slightly) kidding, but these things can not be used to guarantee behavior of some random package code.
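The type relationships mentioned here can be checked directly in Base Julia. The helper `holds_nan` below is hypothetical, written only to probe which element types can take a NaN:

```julia
println(Rational{Int} <: Real)            # true
println(Rational{Int} <: AbstractFloat)   # false
println(Bool <: Integer)                  # true
println(ComplexF64 <: Real)               # false
println(false == 0.0 == 0)                # true

# Hypothetical probe: does converting NaN to T succeed and still give a NaN?
function holds_nan(::Type{T}) where {T}
    try
        return isnan(convert(T, NaN))
    catch
        return false
    end
end

for T in (Float64, Float32, ComplexF64, Int, Bool, Rational{Int})
    println(rpad(string(T), 18), holds_nan(T))
end
```

No single abstract supertype in that list separates the types that work from the ones that do not, which is why a signature-based filter gets awkward.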

So in conclusion, some of this is the price to pay for flexibility. The learning curve could be improved for newcomers without removing flexibility if you make MethodErrors along the lines of Miles’ suggestion, but remember that such a solution is not a Pareto improvement. Someone would have to write and maintain that extra code. I know I should specify 0.0 if I want a floating point zero, so I’m not going to spend any of my limited (almost non-existent) Julia time these days to implement this.

I should maybe mention that I’ve commented on and closed many issues around this “Integer” input that people seem to run into, and I’ve explained this issue of generic inputs to each person. I have yet to see any PRs from these people on either documentation, error handling, conversions, or whatever solution there might be out there. ¯\_(ツ)_/¯


#6

What about something like the following snippet at the right place in optimize!:

try
    convert(eltype(initial_x), NaN)
catch
    throw(ArgumentError("initial_x has element type $(eltype(initial_x)) which cannot represent NaN. Consider using Float64 instead."))
end

For the benefit of the hundreds (thousands?) of people who might have experienced the issue (assuming only a small fraction of users open issues when they see errors) and for the benefit of your time as a maintainer (educating users one-by-one can be rewarding but is inefficient), it seems like a potential win from my perspective.

These one-off fixes do add some noise to the otherwise pristine generic code, but IMO this is really about being upfront about your assumptions on the input instead of relying on Julia to blow up later on. When it’s easy and not time consuming to check that the assumptions are satisfied, then do so. It makes the code more readable for developers and it saves debugging time for users. You can also prioritize which assumptions to check by the stream of new issues.
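To show what the user would see with the guard proposed above, here is a self-contained sketch. Both `require_nan_eltype` and `checked_entry` are hypothetical names used for illustration and are not part of Optim:

```julia
# Hypothetical up-front check: fail early if the element type cannot hold NaN.
function require_nan_eltype(initial_x::AbstractArray)
    T = eltype(initial_x)
    try
        convert(T, NaN)
    catch
        throw(ArgumentError("initial_x has element type $T which cannot represent NaN. Consider using Float64 instead."))
    end
    return initial_x
end

# Hypothetical entry point that validates its input before doing any work.
function checked_entry(f, initial_x)
    require_nan_eltype(initial_x)
    return f(initial_x)   # the real solver work would go here
end

err = try
    checked_entry(sum, [0, 0])
catch e
    e
end

# Print the error the way the REPL would show it.
println(sprint(showerror, err))

# Float input passes the check untouched.
println(checked_entry(sum, [0.0, 0.0]))   # 0.0
```

The error now names the argument and suggests a fix, instead of pointing into float.jl.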


#7

Could be a solution, yes.

I am 100% on board with your POV that these errors should be caught earlier rather than later. It’s not useful for the user to look at float.jl in Base when they try to optimize and they get an error. So yes, I might include something like what you posted. Thanks.