Bug? parse(Integer, ...) calls typemax

Mark_Nahabedian · August 29, 2021, 12:26am

Is this a bug?

parse(Integer, "1234567890987654321234567890")
ERROR: MethodError: no method matching typemax(::Type{Integer})
Closest candidates are:
  typemax(!Matched::Union{Dates.DateTime, Type{Dates.DateTime}}) at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Dates\src\types.jl:426
  typemax(!Matched::Union{Dates.Date, Type{Dates.Date}}) at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Dates\src\types.jl:428
  typemax(!Matched::Union{Dates.Time, Type{Dates.Time}}) at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Dates\src\types.jl:430
  ...
Stacktrace:
 [1] tryparse_internal(#unused#::Type{Integer}, s::String, startpos::Int64, endpos::Int64, base_::Int64, raise::Bool)
   @ Base .\parse.jl:128
 [2] parse(::Type{Integer}, s::String; base::Nothing)
   @ Base .\parse.jl:241
 [3] parse(::Type{Integer}, s::String)
   @ Base .\parse.jl:241
 [4] top-level scope
   @ none:1

julia --version
julia version 1.6.0

rafael.guerra · August 29, 2021, 12:29am

Did you mean parse(Int,...)?
(which will overflow, btw)
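
A small sketch of the difference, for anyone following along: `Integer` is abstract, so `parse` has no single `typemax` to check against, while `Int` is concrete but too narrow for this particular string. The variable names here are just for illustration.

```julia
# Integer is abstract; Int is a concrete alias (Int64 on a 64-bit system).
@show isabstracttype(Integer)
@show isconcretetype(Int)

s = "1234567890987654321234567890"

# parse(Int, s) throws OverflowError because the value exceeds typemax(Int).
overflowed = try
    parse(Int, s)
    false
catch e
    e isa OverflowError
end

# tryparse reports the same failure as `nothing` instead of throwing.
fits = tryparse(Int, s)
```

`tryparse` is handy when an overflow is an expected outcome rather than an error.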

Mark_Nahabedian · August 29, 2021, 12:38am

Oops. Integer is an abstract type. I still think I should be able to parse to Integer and get the subtype that the value fits in.

This works

parse(BigInt, "1234567890987654321234567890")
1234567890987654321234567890

but this

typeof(parse(BigInt, "4"))
BigInt

seems wasteful.
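
One way to see what "wasteful" could mean here, as a rough sketch: `Int` is a plain bits type that is stored inline, whereas `BigInt` is an arbitrary-precision type that is not a bits type, so even a small value goes through a heap-allocated object.

```julia
small = parse(Int, "4")
big4  = parse(BigInt, "4")

# Same numeric value, different representations.
same_value = small == big4

# Int is a bits type (stored inline); BigInt is not.
int_is_bits = isbitstype(typeof(small))
big_is_bits = isbitstype(typeof(big4))
```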

jling · August 29, 2021, 1:13am

Integer can have infinitely many subtypes, so what should Julia do? Dynamically call subtypes, check the typemax of each, sort them, and find the smallest one that fits? Seems wasteful.
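
To illustrate the point about subtypes, here is a minimal sketch. `MyTinyInt` is a made-up type for this example; the idea is only that any package can add an `Integer` subtype, and such a type need not define `typemax` at all.

```julia
using InteractiveUtils  # provides subtypes (loaded automatically in the REPL)

# Direct subtypes of Integer known at this point.
base_subtypes = subtypes(Integer)

# A hypothetical user-defined integer type; nothing stops packages from adding more.
struct MyTinyInt <: Integer
    value::Int8
end

# The new type is an Integer, but has no typemax method unless someone writes one.
has_typemax = hasmethod(typemax, Tuple{Type{MyTinyInt}})
```

So a generic `parse(Integer, s)` could not even rely on `typemax` existing for every candidate type.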

Mark_Nahabedian · August 29, 2021, 1:34am

Yup. Good point.

gustaphe · August 29, 2021, 4:02am

And if that is what you want, you can of course do

T = Int8

while true
    try
        x = parse(T, s)
        break
    catch e
        e isa OverflowError || rethrow(e) # parse throws OverflowError when the value does not fit in T
        T = widen(T)
    end
end

DNF · August 29, 2021, 8:12am

A workaround to fix this is to write

parse(Int, "4")

instead of specifically asking for a BigInt.

gustaphe · August 29, 2021, 8:25am

Int isn’t less specific than BigInt (on a given system). I think OP’s point is they want the smallest Integer type which can represent the number in a string.

DNF · August 29, 2021, 8:31am

Read it again:

He’s suggesting that this should not return a BigInt.

gustaphe · August 29, 2021, 8:34am

I read that to mean that the more inclusive method parse(BigInt, stringvariable) is wasteful for small values of stringvariable.

DNF · August 29, 2021, 8:36am

Maybe, but that depends on the use case. One can easily imagine cases where combinations of small integers overflow.

Anyway, I read it as a request that the compiler should override parse(BigInt, "4") to return a smaller type. Can you clarify, @Mark_Nahabedian?

gustaphe · August 29, 2021, 9:22am

At my computer now, so I could test it out:

function minimalparse(s)
    T = Int8
    while true
        try
            return parse(T, s)
        catch e
            e isa OverflowError || rethrow(e)
            T = widen(T)
        end
    end
end

julia> typeof(minimalparse("123"))
Int8

julia> typeof(minimalparse("1234"))
Int16

julia> typeof(minimalparse("123456"))
Int32

julia> typeof(minimalparse("123456789012"))
Int64

julia> typeof(minimalparse("123456789012345678901234"))
Int128

julia> typeof(minimalparse("12345678901234567890123456789012345678901234567890"))
BigInt

Obviously this is not type stable. It may very well be more performant to use a “too large” type for many applications.
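
The same idea can be sketched without exceptions, assuming the candidate types are limited to the built-in signed integers plus BigInt. `tryparse` returns `nothing` when the value does not fit or the string is not an integer. The name `minimalparse_tryparse` is made up for this example.

```julia
# Try each candidate type from narrowest to widest and return the first fit.
function minimalparse_tryparse(s::AbstractString)
    for T in (Int8, Int16, Int32, Int64, Int128, BigInt)
        x = tryparse(T, s)
        x === nothing || return x
    end
    return nothing  # the string is not an integer literal at all
end
```

Like the try/catch version, the return type depends on the value in the string, so this is still not type stable.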

Mark_Nahabedian · August 29, 2021, 10:17am

I’m sorry. From the discussion, it’s clear I wasn’t really thinking.

A CommonLisp implementation would use FIXNUM until it overflowed into BIGNUM. The built-in types in CommonLisp aren’t extensible though.

Given that Integer is subtypable, the Julia implementation can’t guess which subtypes are most appropriate for a given application. I suppose if a developer really didn’t know the range of integers they were dealing with, and wanted a minimal size integer, they could implement their own function to do that, or specialize parse.

One possible behavior would be to start with Int until the number being parsed overflows and then resort to BigInt. That would be analogous to the CommonLisp implementation, would represent all integers, and those that fit would be represented in the manner most suited to the architecture. This is not consistent with Julia’s wrap-on-overflow behavior though. It’s been so long since I’ve dealt with anything near the machine level that I have no clue whether machines still allow trap on arithmetic overflow or whether any OS or language runtime exposes that to the programmer.
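
The Int-first, BigInt-on-overflow behavior can be approximated in user code. A minimal sketch, where `parse_int_or_big` is a hypothetical helper name and not anything in Base:

```julia
# Parse as the native Int when the value fits, otherwise fall back to BigInt.
function parse_int_or_big(s::AbstractString)
    x = tryparse(Int, s)
    # tryparse also returns nothing for malformed input; in that case
    # parse(BigInt, s) throws an ArgumentError, which seems like reasonable behavior.
    return x === nothing ? parse(BigInt, s) : x
end
```

This only covers parsing. Arithmetic on the resulting Int values still wraps on overflow rather than promoting to BigInt.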

I suppose at the hardware level the “extra” most negative two’s complement integer could be treated as an overflow indicator analogous to the various floating point NaNs. I doubt this could be done without adding gate delay to what is probably the most fundamental operation in computation.

Can you tell I’m having trouble falling asleep?


Mark_Nahabedian · August 29, 2021, 1:20pm

DNF: I don’t think I have a clear, coherent request.

gustaphe: Pretty neat.

I don’t understand how the existence of OverflowError and its behavior in gustaphe’s example is consistent with what I read in the Julia numbers doc about wrap-around. See Integers and Floating-Point Numbers · The Julia Language under “Overflow Behavior”. Can someone clarify this inconsistency?

gustaphe · August 29, 2021, 1:30pm

I think that applies specifically to arithmetic. When doing a + b, Julia doesn’t check for overflow. parse(T, s), on the other hand, is assumed to be rare and expensive enough that the overflow check adds little overhead. Another way to view it: there is a clear and consistent way to interpret the addition of two large numbers (wrap around), but there is no obvious answer to what Julia should do with parse(Int8, "1234"). It could only wrap the way arithmetic does if it parsed the entire number into a larger type first (or somehow split the parse into arithmetic operations). An addition, on the other hand, doesn’t need to know that it’s overflowing, which is why the choice was made not to error.
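
A quick sketch of the three behaviors side by side: plain arithmetic wraps, checked arithmetic (opt-in via `Base.Checked`) throws, and `parse` always throws on overflow.

```julia
# Arithmetic on fixed-width integers wraps silently.
wrapped = typemax(Int8) + Int8(1)   # wraps around to typemin(Int8)

# Checked arithmetic is opt-in and throws OverflowError instead.
checked_threw = try
    Base.Checked.checked_add(typemax(Int8), Int8(1))
    false
catch e
    e isa OverflowError
end

# parse always checks, since there is no sensible wrapped result to return.
parse_threw = try
    parse(Int8, "1234")
    false
catch e
    e isa OverflowError
end
```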

Elrod · August 29, 2021, 1:47pm

On x86, many instructions will set the CF (the carry flag) to indicate this. inc (increment) and dec (decrement) will not, for example, but add/sub will.
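
Julia exposes something similar to the programmer through `Base.Checked.add_with_overflow`, which returns the wrapped result together with a Bool saying whether the addition overflowed. A small sketch follows. I have not inspected the generated machine code, so how this maps onto the x86 flags is an assumption on my part.

```julia
using Base.Checked: add_with_overflow

# Signed: 127 + 1 does not fit in Int8, so the flag is true and the result wraps.
signed_result, signed_flag = add_with_overflow(typemax(Int8), Int8(1))

# Unsigned: 255 + 1 carries out of UInt8.
unsigned_result, unsigned_flag = add_with_overflow(typemax(UInt8), UInt8(1))

# No overflow: the flag stays false.
ok_result, ok_flag = add_with_overflow(Int8(1), Int8(2))
```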
