Why is `Base.return_types(String, Tuple{AbstractVector{UInt8}})` so pessimistic for this method?

julia> Base.return_types(String, Tuple{AbstractVector{UInt8}})
3-element Vector{Any}:
 String
 String
 Any

julia> methods(String, Tuple{AbstractVector{UInt8}})
# 3 methods for type constructor:
 [1] String(v::Vector{UInt8})
     @ Base strings/string.jl:67
 [2] String(s::Base.CodeUnits{UInt8, String})
     @ Base strings/string.jl:107
 [3] String(v::AbstractVector{UInt8})
     @ Base strings/string.jl:66

So Base.return_types thinks that String(::AbstractVector{UInt8}) may not always be inferrable, and it blames this method in base/strings/string.jl, at line 66:

But I don’t understand how that could ever fail to be perfectly inferrable. The copyto! return type is guaranteed to infer as Vector{UInt8}, and String(::Vector{UInt8}) also has perfect inference, according to Base.return_types:

julia> Base.return_types(copyto!, Tuple{Vector{UInt8},AbstractVector{UInt8}})
2-element Vector{Any}:
 Vector{UInt8} (alias for Array{UInt8, 1})
 Vector{UInt8} (alias for Array{UInt8, 1})
julia> Base.return_types(String, Tuple{Vector{UInt8}})
1-element Vector{Any}:
 String

Seems to me something here is inconsistent/buggy.

Pretty sure the problem is length:

julia> Base.return_types(length, Tuple{AbstractVector{UInt8}})
19-element Vector{Any}:
 Int64
 UInt8
 Any
 Any
 Any
 Any
 Any
 Int64
 Int64
 Integer
 Integer
 Int64
 Integer
 Any
 Int64
 Integer
 Any
 Union{}
 Any

Abstractly typed code in Julia is incredibly fragile.


OK, I guess we could add some type assertions to some of those length methods. EDIT: actually, the first thing to do is add an ::Integer type assertion into the method above. EDIT2: that doesn’t help for some reason, so I guess it’ll just be a ::String annotation.
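A minimal sketch of what such a return-type assertion could look like. The names `to_string` and `to_string_unasserted` are hypothetical stand-ins written for illustration, not the actual Base method, and the buffer is allocated with a plain `Vector{UInt8}(undef, n)` as an assumption to avoid relying on Base internals:

```julia
# Hypothetical helper without an assertion: its inferred return type depends on
# what inference can prove about `length(v)` for every AbstractVector{UInt8}.
to_string_unasserted(v::AbstractVector{UInt8}) =
    String(copyto!(Vector{UInt8}(undef, length(v)), v))

# Same body, but the trailing `::String` is a type assertion: inference can
# rely on the result being a String whenever the call returns normally.
to_string(v::AbstractVector{UInt8}) =
    String(copyto!(Vector{UInt8}(undef, length(v)), v))::String

# Compare what inference reports for the abstract argument type.
# The unasserted result may differ between Julia versions.
println(Base.return_types(to_string_unasserted, Tuple{AbstractVector{UInt8}}))
println(Base.return_types(to_string, Tuple{AbstractVector{UInt8}}))
```

The assertion does not make the intermediate calls any more inferrable; it only stops the uncertainty from leaking out to callers.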

As I explained here, I’m not sure you’ll get much support for this idea. Relying on world-splitting like this is generally considered a bad idea now, and attempts to fix it like this will just be an endless game of whack-a-mole.

More likely, people would support turning on max_methods=1, so that no world splitting occurs and no code needs to be recompiled when someone adds a wonky method to length or string.
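To illustrate why results that depend on world splitting are fragile, here is a small sketch. All types and functions in it (`Animal`, `Cat`, `Dog`, `Blob`, `Rock`, `legs`, `count_legs`) are made up for this example, and the exact inferred types may vary with the Julia version and inference settings:

```julia
abstract type Animal end
struct Cat <: Animal end
struct Dog <: Animal end

legs(::Cat) = 4
legs(::Dog) = 4

# The caller only knows it receives some Animal.
count_legs(a::Animal) = legs(a)

# With only a few matching `legs` methods, inference can look at each of them
# and combine their return types.
before = Base.return_types(count_legs, Tuple{Animal})
println(before)

# Someone later adds more subtypes with "wonky" methods.
struct Blob <: Animal end
struct Rock <: Animal end
legs(::Blob) = "none"
legs(::Rock) = nothing

# The same query now sees more matching methods with differing return types,
# so the result for the unchanged caller gets wider.
after = Base.return_types(count_legs, Tuple{Animal})
println(after)
```

Nothing in `count_legs` changed between the two queries; only the set of methods visible in the current world did, which is why patching individual methods to restore a narrow result tends to be a moving target.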


Ah, so getting a clean output from Base.return_types is an outdated concern when the input argument types are abstract?

EDIT: still, I don’t think it would hurt to add a ::String type assertion to the end of every String method per Require constructors and `convert` to return objects of stated type? · Issue #42372 · JuliaLang/julia · GitHub.

I wonder where these Any results come from. Getting different integer types is one thing, but this is unexpected. To paraphrase Michael Jackson:

Any, are you okay?
So, Any, are you okay?
Are you okay, Any?


It’s mostly caused by the various AbstractRange subtypes of AbstractVector, since their length is computed from the elements. e.g.

julia> code_typed(length, Tuple{UnitRange{Real}})
1-element Vector{Any}:
 CodeInfo(
1 ── %1  = Base.getfield(r, :start)::Real
│    %2  = Base.getfield(r, :stop)::Real
│    %3  = Base.zero(%2)::Real
│    %4  = Base.zero(%1)::Real
│    %5  = (%3 - %4)::Any
│    %6  = Base.oneunit(%5)::Any
│    %7  = (%6 isa Base.Signed)::Bool
└───       goto #3 if not %7
2 ──       goto #9
3 ── %10 = Base.:>=::typeof(>=)
│    %11 = (isa)(%2, BigFloat)::Bool
│    %12 = (isa)(%1, BigFloat)::Bool
│    %13 = (Core.Intrinsics.and_int)(%11, %12)::Bool
└───       goto #5 if not %13
4 ── %15 = π (%2, BigFloat)
│    %16 = π (%1, BigFloat)
│    %17 = invoke %10(%15::BigFloat, %16::BigFloat)::Any
└───       goto #6
5 ── %19 = (%2 >= %1)::Any
└───       goto #6
6 ┄─ %21 = φ (#4 => %17, #5 => %19)::Any
└───       goto #8 if not %21
7 ──       goto #9
8 ── %24 = Base.zero(%6)::Any
└───       goto #10
9 ┄─ %26 = (%2 - %1)::Any
└─── %27 = (%6 + %26)::Any
10 ┄ %28 = φ (#9 => %27, #8 => %24)::Any
│    %29 = Base.Integer(%28)::Any
└───       return %29
) => Any
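For comparison, a short sketch that runs the same kind of query for a range with a concrete element type and for one with an abstract element type. The variable names are only illustrative, and the result for the abstract case may differ between Julia versions:

```julia
# With a concrete element type, the arithmetic on the start and stop fields
# is fully known to inference.
concrete = Base.return_types(length, Tuple{UnitRange{Int}})
println(concrete)

# With an abstract element type, the fields are only known to be `Real`, so
# operations such as `stop - start` cannot be resolved to a single method.
abstract_eltype = Base.return_types(length, Tuple{UnitRange{Real}})
println(abstract_eltype)

# The difference comes from the element type, not from UnitRange itself.
println(isconcretetype(Int), " ", isconcretetype(Real))
```

Since `AbstractVector{UInt8}` fixes the element type, the wide results for `length` over that signature presumably come from range types whose other type parameters are left open, but that is a guess rather than something the output above proves.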