Improving speed of runtime dispatch detector

I’m working on a small package, DispatchDoctor.jl (GitHub - MilesCranmer/DispatchDoctor.jl), which helps address some of the issues discussed in this thread.

This package provides the @stable macro as a more ergonomic way to use Test.@inferred within a codebase:

using DispatchDoctor: @stable

@stable function f(x)
    if x > 0
        return x
    else
        return 1.0
    end
end

which will then throw an error for any type instability:

julia> f(2.0)
2.0

julia> f(1)
ERROR: return type Int64 does not match inferred return type Union{Float64, Int64}
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] f(x::Int64)
   @ Main ~/PermaDocuments/DispatchDoctor.jl/src/DispatchDoctor.jl:18
 [3] top-level scope
   @ REPL[4]:1

I could see this being useful for maintaining type hygiene in a codebase – you see type instabilities early, rather than needing to fix things when code is already slow.

The @stable macro is pretty simple (using MacroTools):

using MacroTools: combinedef, splitdef
using Test

function _stable(fex::Expr)
    fdef = splitdef(fex)
    closure_func = gensym("closure_func")
    fdef[:body] = quote
        let $(closure_func)() = $(fdef[:body])
            $(Test).@inferred $(closure_func)()
        end
    end
    return combinedef(fdef)
end

However, this @inferred call is quite slow – a massive 400ns per call.

Is there anything I can do to only trigger the Test.@inferred on the first call with the given set of input types? (Is my only option to use @generated?)
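One direction I’ve considered (a hypothetical sketch, not what the package does) is to memoize the expensive inference check per argument-type signature, so it only runs on the first call with a given set of input types:

```julia
# Hypothetical sketch: cache which (function, argtypes...) signatures
# have already passed the stability check.
const _checked_signatures = Dict{Any,Bool}()

function checked_call(f::F, args...) where {F}
    sig = (F, map(typeof, args)...)
    get!(_checked_signatures, sig) do
        # only reached the first time this signature is seen
        T = Base.promote_op(f, map(typeof, args)...)
        isconcretetype(T) || error("type instability detected for $sig")
        true
    end
    return f(args...)
end
```

Though note the Dict lookup itself adds per-call overhead, so this only amortizes the inference cost rather than eliminating it.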


Here’s a benchmark:

julia> using BenchmarkTools

julia> using DispatchDoctor: @stable

julia> @stable f(x) = x > 0 ? x : 0.0;

julia> @btime f(1.0);
  567.568 ns (12 allocations: 752 bytes)

julia> g(x) = x > 0 ? x : 0.0;

julia> @btime g(1.0);
  0.875 ns (0 allocations: 0 bytes)

Any tricks I should try?

Ideally I would like to have the Test.@inferred completely compiled away by the second run… Not sure if that’s possible or not.


I’d wager the stated problem is solvable, even without metaprogramming; however, I don’t like the idea, as it would incur a heavy penalty on the first call, and it seems like it would make using a debugger less pleasant.

IMO using @inferred in the test suite is preferable.

What about something like

julia> function stable_wrap(f::F, args...) where {F}
           T = Base.promote_op(f, map(typeof, args)...)
           Base.isconcretetype(T) || error("Not stable!")
           f(args...)::T
       end
stable_wrap (generic function with 1 method)

julia> macro stable(ex::Expr)
           fdef = ex.args[1]
           (Base.sym_in(ex.head, (:function,:(=))) && Meta.isexpr(fdef, :call)) || error("not a function")
           args = @view(fdef.args[2:end])
           for a in args
               a isa Expr && Base.sym_in(a.head, (:kw,:parameters)) && error("need to implement kwarg support")
           end
           fname = fdef.args[1]
           fdef.args[1] = gname = gensym(fname)
           quote
               $ex
               $fname(args...) = $stable_wrap($gname, args...)
           end |> esc
       end
@stable (macro with 1 method)

julia> @stable f(x,y) = x * y / 3
f (generic function with 1 method)

julia> f(2, 3)
2.0

julia> @code_typed f(2, 3)
CodeInfo(
1 ── %1  = Core.getfield(args, 1)::Int64
│    %2  = Core.getfield(args, 2)::Int64
└───       goto #10 if not true
2 ┄─ %4  = φ (#1 => 2, #9 => %16)::Int64
│    %5  = Base.sle_int(1, %4)::Bool
└───       goto #4 if not %5
3 ── %7  = Base.sle_int(%4, 2)::Bool
└───       goto #5
4 ──       nothing::Nothing
5 ┄─ %10 = φ (#3 => %7, #4 => false)::Bool
└───       goto #7 if not %10
6 ──       Base.getfield((Int64, Int64), %4, true)::DataType
│    %13 = Base.add_int(%4, 1)::Int64
└───       goto #8
7 ──       goto #8
8 ┄─ %16 = φ (#6 => %13)::Int64
│    %17 = φ (#6 => false, #7 => true)::Bool
│    %18 = Base.not_int(%17)::Bool
└───       goto #10 if not %18
9 ──       goto #2
10 ┄       goto #11
11 ─       goto #12
12 ─       goto #13
13 ─       goto #14
14 ─ %25 = Base.mul_int(%1, %2)::Int64
│    %26 = Base.sitofp(Float64, %25)::Float64
│    %27 = Base.div_float(%26, 3.0)::Float64
└───       goto #15
15 ─       return %27
) => Float64

julia> @code_llvm f(2, 3)
;  @ REPL[2]:12 within `f`
define double @julia_f_427(i64 signext %0, i64 signext %1) #0 {
top:
; ┌ @ REPL[1]:4 within `stable_wrap`
; │┌ @ REPL[3]:1 within `##f#226`
; ││┌ @ int.jl:88 within `*`
     %2 = mul i64 %1, %0
; ││└
; ││┌ @ int.jl:97 within `/`
; │││┌ @ float.jl:294 within `float`
; ││││┌ @ float.jl:268 within `AbstractFloat`
; │││││┌ @ float.jl:159 within `Float64`
        %3 = sitofp i64 %2 to double
; │││└└└
; │││ @ int.jl:97 within `/` @ float.jl:412
     %4 = fdiv double %3, 3.000000e+00
     ret double %4
; └└└
}

The idea is that the check compiles away.


It would be quite tedious to explicitly test inference over all possible permutations of types for every internal function in a library, especially deeply nested functions, where a failed inference may not be picked up by a top-level @inferred. Methods that would require manual @descend work are not practical to automate. But tagging the function at the call site would let you automate this.

Anyway, I’m not looking to convince anyone of the utility at this stage. I hate type instabilities and I hate finding them, so I want to make this @stable faster so I can use it in my own projects.


Very nice!! Thanks!

I’ve always wanted something small and convenient like this! I’ve also seen a macro floating around for erroring on all allocations inside a macro-ed function, which could also live in such a package (combined into @stable?).


Sounds great. Let me know if you find that macro, I’d love to throw it in the package too


Hmm, you’re right. What about hiding this behavior behind a compile-time preference, with Preferences.jl? That way it could be turned off for production but turned on in the test suite.
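As a hedged sketch of that idea: read the flag into a `const` so the branch folds away by constant propagation. (With Preferences.jl the constant would come from `@load_preference`; it’s hard-coded here so the snippet is self-contained.)

```julia
# With Preferences.jl this would be:
#   const STABILITY_CHECKS = @load_preference("stability_checks", true)
const STABILITY_CHECKS = true

function stable_wrap(f::F, args...) where {F}
    if STABILITY_CHECKS
        # the promote_op check; since STABILITY_CHECKS is a compile-time
        # constant, this whole branch is removed when checks are disabled
        T = Base.promote_op(f, map(typeof, args)...)
        isconcretetype(T) || error("Not stable!")
        return f(args...)::T
    else
        return f(args...)
    end
end
```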


JuliaLang/AllocCheck.jl (github.com)? Or have I misunderstood.


How does this sound for working with keywords? The downside is that it has to call the internal function Core.kwcall, but it seems like promote_op doesn’t define a keyword-compatible method:

function stable_wrap(f::F, args...; kwargs...) where {F}
    T = if isempty(kwargs)
        Base.promote_op(f, map(typeof, args)...)
    else
        Base.promote_op(Core.kwcall, typeof(NamedTuple(kwargs)), F, map(typeof, args)...)
    end
    Base.isconcretetype(T) || error("...")
    return f(args...; kwargs...)::T
end

Full implementation here: DispatchDoctor.jl/src/DispatchDoctor.jl at main · MilesCranmer/DispatchDoctor.jl · GitHub.
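A quick self-contained usage sketch of the keyword version (repeating the wrapper from above so the snippet runs on its own; Core.kwcall requires Julia ≥ 1.9, and the functions h and g are made up for illustration):

```julia
function stable_wrap(f::F, args...; kwargs...) where {F}
    T = if isempty(kwargs)
        Base.promote_op(f, map(typeof, args)...)
    else
        Base.promote_op(Core.kwcall, typeof(NamedTuple(kwargs)), F, map(typeof, args)...)
    end
    Base.isconcretetype(T) || error("not stable")
    return f(args...; kwargs...)::T
end

h(x; a=1) = x + a
stable_wrap(h, 1; a=2)   # passes the check and returns 3

g(x; a=1) = x > 0 ? x : a * 1.0
# stable_wrap(g, 1)      # would error: inferred Union{Float64, Int64}
```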

It seems to work for a variety of scenarios too which is great:

@testitem "smoke test" begin
    using DispatchDoctor
    @stable f(x) = x
    @test f(1) == 1
end
@testitem "with error" begin
    using DispatchDoctor
    @stable f(x) = x > 0 ? x : 1.0

    # Will catch type instability:
    @test_throws TypeInstabilityError f(1)
    @test f(2.0) == 2.0
end
@testitem "with kwargs" begin
    using DispatchDoctor
    @stable f(x; a=1, b=2) = x + a + b
    @test f(1) == 4
    @stable g(; a=1) = a > 0 ? a : 1.0
    @test_throws TypeInstabilityError g()
    @test g(; a=2.0) == 2.0
end
@testitem "tuple args" begin
    using DispatchDoctor
    @stable f((x, y); a=1, b=2) = x + y + a + b
    @test f((1, 2)) == 6
    @test f((1, 2); b=3) == 7
    @stable g((x, y), z=1.0; c=2.0) = x > 0 ? y : c + z
    @test g((1, 2.0)) == 2.0
    @test_throws TypeInstabilityError g((1, 2))
end

Slightly related question… Does anybody know how to unit-test that the LLVM is as expected?

julia> using DispatchDoctor

julia> @stable f(x) = x
f (generic function with 1 method)

julia> @code_llvm f(1)
;  @ /Users/mcranmer/PermaDocuments/DispatchDoctor.jl/src/DispatchDoctor.jl:65 within `f`
define i64 @julia_f_460(i64 signext %0) #0 {
top:
  ret i64 %0
}

I can do this check manually but would prefer to have the CI scream at me when Julia no longer compiles away the check.

using InteractiveUtils: code_llvm
using Test

llvm_ir = sprint(code_llvm, f, (Int,))
@test !occursin(str, llvm_ir)

Replace str with some ir that shows up when the check isn’t compiled away.
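As one concrete (hedged) choice of marker: jl_apply_generic is the runtime’s dynamic-dispatch entry point, so its appearance in the IR is a rough signal that a call was not statically resolved.

```julia
using InteractiveUtils: code_llvm
using Test

# a trivially stable function for illustration
f(x) = x + 1

llvm_ir = sprint(code_llvm, f, (Int,))
# if `jl_apply_generic` shows up, some call fell back to dynamic dispatch
@test !occursin("jl_apply_generic", llvm_ir)
```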
Plenty of examples come up in a GitHub code search, and I imagine CUDA.jl, GPUCompiler.jl, and LLVM.jl have more examples too.


Amazing. Thanks!

(And btw do you foresee any issues with the use of Core.kwcall? I noticed it wasn’t compatible with earlier Julia, so I basically am just having @stable be a no-op on Julia earlier than 1.10)

See GitHub - JuliaTesting/PerformanceTestTools.jl. It takes care of getting rid of flags like --check-bounds=yes and code coverage.

Here is an example use from FastBroadcast.jl’s tests: vector.body is a name LLVM typically gives to vectorized loop bodies, so that code checks to make sure a gigantic broadcast vectorized.

You could do things like add the debuginfo=:none kwarg to code_llvm, and then check for number of lines.
Or for totally trivial cases, you could try comparing string distance with what the optimized IR is supposed to look like (with debuginfo=:none of course; e.g. we don’t care about LineNumberNode paths matching).

EDIT: should maybe replace the String(take!(io)) from FastBroadcast’s tests with sprint.


Nice! That worked. Thanks for the help, I think this is ready for the registry now.

It’s great that you’ve done this. Maybe you or someone else can be nerdsniped into making further improvements building on it, e.g. applying one or more macros globally, like a REPL mode that for your f would implicitly do:

stable> @stable @check_allocs f(x) = <my_function>

i.e. in that mode you wouldn’t need to specify those macros explicitly; only in the regular julia prompt, which you would no longer use most of the time.

[We already have a package for checked arithmetic, and a package for a REPL mode that enables it; the above REPL mode could include that too, and be called debug…]
Ideally all functions (you care about) would be type-stable (and non-allocating where important), but there’s a learning curve, and I think this can’t be checked at compile time for arbitrary types. Your example relu code isn’t type-stable, since it uses 0.0; it should use zero(x) to also work for e.g. Float32 (and one(x) where that applies). Division with / (and I guess \) gives Float64, another stability trap.
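To illustrate that point with a minimal sketch (hypothetical relu definitions, not the actual code from any package):

```julia
relu_literal(x) = max(x, 0.0)      # literal 0.0 promotes a Float32 input
relu_generic(x) = max(x, zero(x))  # stays in the input's type

typeof(relu_literal(1f0))  # Float64
typeof(relu_generic(1f0))  # Float32
```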

Would you want to check for such type stability at compile time, for most or all generic code? Often it’s enough to know the code is type-stable for the types I actually use at runtime. You merged a performance trick minutes ago; is there now no overhead if the code is type-stable (for some types; otherwise you get a type-instability error)?

Yeah, it should be zero overhead now. I have a unit test for this too.


One other thing that would be useful would be a module-wide version:

@stable module A
  
function f1(x)
    x
end
function f2(x, y)
    x * y
end

end

and it would add @stable to every function in-scope.

I’m assuming this isn’t possible though…

The hard part would be include.
At that point, it may be worth trying to play with Core.Compiler/inference instead, to see if you can create a module-level Base.Experimental.@ option like @optlevel or @max_methods.
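For what it’s worth, a toy sketch of just the expression-walking part (ignoring include, nested begin blocks, and everything else that makes this hard in practice; the helper name is made up):

```julia
# Toy sketch: wrap each top-level function definition in a module body
# with @stable. Handles only `function` and short-form definitions.
function apply_stable_to_body(body::Expr)
    for (i, ex) in enumerate(body.args)
        isdef = Meta.isexpr(ex, :function) ||
                (Meta.isexpr(ex, :(=)) && Meta.isexpr(ex.args[1], :call))
        if isdef
            # splice in `@stable <definition>` without evaluating it
            body.args[i] = Expr(:macrocall, Symbol("@stable"), LineNumberNode(0), ex)
        end
    end
    return body
end
```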


Thanks.

Btw, I found a weird case of Julia’s specialization rules interfering with this interface:

using DispatchDoctor

@stable f(a, t::Type{T}) where {T} = sum(a; init=zero(T))

f([1f0, 1f0], Float32)

Despite the normal function being type stable, this actually fails the type specialization test

ERROR: TypeInstabilityError: Instability detected in function `f`
with arguments `(Vector{Float32}, DataType)`. Inferred to be 
`Any`, which is not a concrete type.
Stacktrace:
 [1] #_stable_wrap#1
   @ ~/PermaDocuments/DispatchDoctor.jl/src/DispatchDoctor.jl:25 [inlined]

because of Julia’s type specialization rules:

As a heuristic, Julia avoids automatically specializing on argument type parameters in three specific cases: Type, Function, and Vararg.

Even if I modify _stable_wrap to be

_stable_wrap(f::F, caller::G, args::Vararg{Any,N}; kwargs...) where {F,G,N}

it still fails, because now there are multiple non-specializing cases (Vararg and Type) – it seems like Julia lacks logic to deal with this situation.
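A tiny sketch of the heuristic itself (function names made up for illustration): a bare ::Type argument that is merely passed through is a candidate for non-specialization, while writing ::Type{T} and using T forces specialization.

```julia
passthrough(t::Type) = t          # may be compiled without specializing on t
forced(::Type{T}) where {T} = T   # T is used, so Julia specializes
```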

I started a thread about this issue a while back, but it seems like that solution doesn’t work here.

Is there any way to force Julia to specialize no matter what?