Avoiding type instabilities (with StaticKernels)

I am trying to construct a Kernel from StaticKernels.jl dynamically, passing the window range and the function to apply as variables. The problem is that the axes of the Kernel are part of its type, so the obvious attempt is not type-stable:

julia> using StaticKernels

julia> mykernel(f, r::UnitRange{Int}) = Kernel{(r,)}(w -> f(Tuple(w)))
mykernel (generic function with 1 method)

julia> @code_warntype mykernel(sum,-10:0)
Variables
  #self#::Core.Compiler.Const(mykernel, false)
  f::Core.Compiler.Const(sum, false)
  r::UnitRange{Int64}
  #207::var"#207#208"{typeof(sum)}

Body::Kernel{_A,var"#207#208"{typeof(sum)}} where _A
1 ─ %1 = Core.tuple(r)::Tuple{UnitRange{Int64}}
│   %2 = Core.apply_type(Main.Kernel, %1)::Type{Kernel{_A,F} where F} where _A
│   %3 = Main.:(var"#207#208")::Core.Compiler.Const(var"#207#208", false)
│   %4 = Core.typeof(f)::Core.Compiler.Const(typeof(sum), false)
│   %5 = Core.apply_type(%3, %4)::Core.Compiler.Const(var"#207#208"{typeof(sum)}, false)
│        (#207 = %new(%5, f))
│   %7 = #207::Core.Compiler.Const(var"#207#208"{typeof(sum)}(sum), false)
│   %8 = (%2)(%7)::Kernel{_A,var"#207#208"{typeof(sum)}} where _A
└──      return %8

Is there a way to make it happen?

The type of the Kernel axes should be NTuple{<:Any,UnitRange{Int}}, but I am only interested in the 1D case (general solutions accepted of course). I am sure this must be possible, as there are similar examples throughout the community (StaticArrays.jl does this with generated functions I guess?).

I ended up brute-forcing a solution: I use code generation to create many different Kernel objects, one for each combination of window size in the range I need and function I might want to apply, and declare them all const. Not quite what I would prefer, but well worth the effort in terms of performance.

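This seems like a textbook case for a function barrier. You can construct the type-unstable Kernel{(r,)} and then pass it into a function which does the actual work. Julia dispatches once, at run time, on the concrete type of the Kernel, and the code inside that function is compiled specifically for it, so the instability costs a single dynamic dispatch instead of slowing down the inner loop.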

Your brute-force approach is essentially an attempt to do by hand what the compiler will do for you with a function barrier.
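
To make the pattern concrete, here is a minimal sketch that does not depend on StaticKernels at all. Window, make_window, apply_window and windowed are hypothetical names invented for this illustration; Window only imitates the way Kernel carries its range as a type parameter, and none of this is the StaticKernels API.

```julia
# Hypothetical stand-in for a Kernel: the window range R is a type parameter.
struct Window{R,F}
    f::F
end
Window{R}(f::F) where {R,F} = Window{R,F}(f)

# Type-unstable on purpose: the returned type depends on the runtime value of r.
make_window(f, r::UnitRange{Int}) = Window{r}(f)

# The function barrier. Julia compiles one specialization per concrete
# Window{R,F}, so R is a compile-time constant inside this method.
function apply_window(w::Window{R,F}, x::AbstractVector) where {R,F}
    lo, hi = first(R), last(R)
    # Only visit positions where the whole window fits inside x.
    idx = (firstindex(x) - lo):(lastindex(x) - hi)
    return [w.f(ntuple(j -> x[i + lo + j - 1], Val(length(R)))) for i in idx]
end

# The call to apply_window is the single dynamic dispatch.
windowed(f, r, x) = apply_window(make_window(f, r), x)
```

Calling windowed(sum, -1:1, collect(1:5)) sums every window of three neighbouring elements. Running @code_warntype on windowed should still show an abstract Window type, but apply_window itself is compiled for the concrete range, so the loop over the data does not pay for the instability.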


You are right. I did try that before, but I must have made a mistake in my benchmarking, because when I put together an MWE the timing difference was gone.