Why isn't `size` always inferred to be an Integer?

I have also spent countless hours in SnoopCompile to achieve improvements of 1-2 seconds in compile time. One thing that I really can’t understand is why taking the size or length of a variable that is typed as Any is inferred as Any.
Damn it, size can only be an Int.

@code_warntype GMT.common_plot_xyz("", [0.0 0.0; 1.0 1.0], "", true, false)
...
152 ─ %813  = GMT.length(val::AbstractArray)::Any
│     %814  = GMT.size(arg1@_125, 1)::Any
│     %815  = (%813 != %814)::Any

julia> struct Bad <: AbstractVector{Int} end

julia> Base.size(::Bad, args...) = missing

Where is your god now?
Joking aside, the problem is that the compiler can’t prove that someone didn’t do the equivalent of what I just did above. That’s why it needs to return Any.

I won’t go into discussing this, as it quickly dives into territory where I can’t follow, but it seems a weak reason. If it can’t, just error. The docs seem clear:

size(A::AbstractArray, [dim])

  Return a tuple containing the dimensions of A.

Not necessarily. Even with only builtin types:

julia> R = big(1):big(2)^400
1:2582249878086908589655919172003011874329705792829223512830659356540647622016841194629645353280137831435903171972747493376

julia> length(R)
2582249878086908589655919172003011874329705792829223512830659356540647622016841194629645353280137831435903171972747493376

julia> typeof(ans)
BigInt

julia> R isa AbstractArray
true

Fine, as long as it’s not an Any that keeps propagating. How can we be expected to prevent the propagation of Any with all this?

And I think I meant an Integer

julia> BigInt <: Integer
true

Converting it to an Int? I mean, if you know that is what it should be, let it be. :stuck_out_tongue:

Also, a function barrier may work? Don’t know the rest of the code so IDK.
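
A minimal sketch of the "convert it to an Int" suggestion. The helper name `nrows` and the `Any`-typed container are hypothetical and not taken from GMT.jl; the idea is only that a conversion plus a type assertion gives the compiler a concrete type to work with from that point on.

```julia
# Hypothetical helper: `x` arrives with an unknown (Any) type, so the
# compiler cannot infer what `size(x, 1)` returns on its own.
function nrows(x)
    # Convert, then assert. Code after this line sees a concrete Int.
    # This throws if the value cannot be represented as an Int.
    n = convert(Int, size(x, 1))::Int
    return n
end

vals = Any[[0.0 0.0; 1.0 1.0]]   # element type Any hides the Matrix type
n = nrows(vals[1])
```

The cost is a runtime check at the assertion, which is usually small compared with letting an unknown type flow through the rest of the function.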

I don’t think the compiler reads the docs. If it can’t prove it, it has to be Any. I mean, how else could it work?

I never fully grokked the concept, but isn’t this type piracy? And if so, why accommodate it?

No, if Bad is your own type, you actually have to make your own size.
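
For contrast with `Bad`, here is a sketch of a well-behaved custom array type that defines its own `size`. The type name `Squares` is made up for illustration; it assumes only the documented minimum for an `AbstractVector` subtype, namely `size` and `getindex`.

```julia
# Hypothetical read-only vector whose i-th element is i^2.
struct Squares <: AbstractVector{Int}
    n::Int
end

# Defining Base.size for a type you own is required, not piracy.
Base.size(s::Squares) = (s.n,)

# Scalar indexing; no bounds check, to keep the sketch short.
Base.getindex(s::Squares, i::Int) = i^2

s = Squares(3)
dims = size(s)       # tuple of Int, as the docs describe
elems = collect(s)   # generic AbstractArray machinery works on top
```

Nothing in the language forces `size` to return a tuple of `Int` here; it is a convention the method author chooses to follow, which is why the compiler cannot assume it for an unknown type.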

Knowing it’s gotta be an Integer is, really, not any better than Any for the purposes of the compiler and possible runtime speed. It still needs the costly indirection to figure out what instructions to use, and it can’t just shove it into a 64-bit register.
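
The difference between an abstract `Integer` and a concrete `Int` can be checked directly. The struct names `Loose` and `Tight` below are made up for illustration.

```julia
# Abstract types say what a value can do, not how it is laid out in memory.
abstract_is_concrete = isconcretetype(Integer)   # false
int_is_concrete      = isconcretetype(Int)       # true

# Hypothetical structs: same field name, different field annotations.
struct Loose
    n::Integer   # abstract: the field holds a reference to a boxed value
end
struct Tight
    n::Int       # concrete: the field is stored inline as a machine integer
end

loose_is_bits = isbitstype(Loose)   # false
tight_is_bits = isbitstype(Tight)   # true
```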

This is not type piracy. The type Bad is defined inside the current “module”, and you define Base.size for it.

IIUC, type piracy is when you define a method of a function that is external to your module on a type that is also external to it. Something like this:

module A
function f() end
end

module B
struct MyType end
end

module C
import ..A: f
import ..B: MyType

function f(x::MyType) end # type piracy
end

Thanks, this I can understand.

And in the same spirit as this thread, how can string return an Any?

└────         goto #157
155 ─ %823  = GMT.string(val)::Any
│     %824  = (%823 != "indata")::Any

Same: even though we “know” the interfaces, the compiler cannot always guarantee that they are respected.

I think you may be taking this output too seriously. Does it actually cause type instability to propagate in your package?

It’s not literally returning Any — the compiler just isn’t sure what it might return at runtime. And it’s because the compiler doesn’t know what type val will be, so it doesn’t know what method of string will be called.

Just as in the size example, the trouble isn’t so much string (or size), but upstream of that — sort out the stability of their arguments and they’ll then be stable.
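
A rough way to see this is to ask inference about the same call with and without a known argument type. The wrapper names `describe` and `first_dim` are hypothetical; `Base.return_types` is an introspection helper and its exact answer for the `Any` case may vary between Julia versions, so treat this as a sketch.

```julia
# Hypothetical wrappers around the two calls discussed in the thread.
describe(v)  = string(first(v))
first_dim(v) = size(first(v), 1)

# Element type known: a single method is selected and inference succeeds.
stable_string = Base.return_types(describe, (Vector{Int},))
stable_size   = Base.return_types(first_dim, (Vector{Matrix{Float64}},))

# Element type Any: the compiler does not know which method will run,
# so the inferred result is expected to be wider (the output quoted in
# this thread shows Any).
unstable_string = Base.return_types(describe, (Vector{Any},))
unstable_size   = Base.return_types(first_dim, (Vector{Any},))
```

The functions are identical in both cases; only the information about the argument changes.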

Maybe not, but what I do know is that the module in question is precompiled and it still takes a further ~6 seconds on the first run. I’m trying to reduce that, and the only thing I can tie it to is the potential instabilities … with ~zero success so far.

As @mbauman says, it’s something else in the function that’s unstable. You can fix that or, if that’s not possible, split the function into two functions to create a function barrier, so that as much of it as possible is stable.
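
A minimal sketch of a function barrier, assuming the unstable value comes out of an untyped container. The names `process` and `kernel` are hypothetical and unrelated to GMT.jl.

```julia
# Outer function: the element type of `args` is Any, so `val` is only
# known at runtime and this body is not fully inferable.
function process(args::Vector{Any})
    val = args[1]          # inferred as Any
    return kernel(val)     # one dynamic dispatch happens here
end

# Inner function (the barrier): compiled separately for each concrete
# argument type, so everything inside it is type-stable.
function kernel(A::AbstractMatrix)
    total = zero(eltype(A))
    for j in axes(A, 2), i in axes(A, 1)
        total += A[i, j]
    end
    return total
end

result = process(Any[[0.0 0.0; 1.0 1.0]])
```

The instability in `process` remains, but it is confined to a single call; the loop in `kernel` runs on concrete types.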

I know that, but in this case it’s an impossible task, because those are derived from input arguments to the function and they can have different types. That’s nothing really important for run time … as long as precompilation had worked well. But it didn’t, and I’m just trying to find out why, and the Any's are the beasts to chase, or so we are told.

And to give more context, this is the function I’m referring to: https://github.com/GenericMappingTools/GMT.jl/blob/master/src/psxy.jl#L7. The source of the Any's is the kwargs that are converted into a Dict{Symbol,Any}, and from there anything extracted from the Dict is an Any. I have no idea how to work around this.
But again, this does not seem to hurt runtime … after compilation of first run, hence latency.
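
One possible workaround, sketched under the assumption that each option has a known expected type: annotate values at the point where they leave the Dict. The function `plot_sketch` and its `first` and `width` options are hypothetical stand-ins, not the actual GMT.jl code.

```julia
# Hypothetical stand-in for a keyword-driven entry point.
function plot_sketch(; kwargs...)
    d = Dict{Symbol,Any}(kwargs)   # every value is now typed Any

    # Assert or convert at extraction, so the rest of the body works
    # with concrete types. A value of the wrong type throws here.
    is_first = get(d, :first, true)::Bool
    width    = convert(Float64, get(d, :width, 1.0))::Float64

    return is_first ? width : 2width
end

a = plot_sketch(width = 3)
b = plot_sketch(first = false, width = 3)
```

This narrows types for code that follows the extraction. Whether it helps first-run latency is a separate question that this sketch does not answer.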

Type instabilities in end-packages tend to impact runtime speed, not the (pre-)compile time. In upstream libraries, yes, they can indeed become one of the magnets for invalidation, but I don’t think they’re typically troublesome at the point where I think you are.

This conversation is now circling back to the original thread from which it was split; we can go back to “Taking TTFX seriously: Can we make common packages faster to load and use” for concrete tips for reducing that compile time.