Avoiding type instabilities (with StaticKernels)

I am trying to construct a Kernel from StaticKernels.jl dynamically. I would like to pass the window range and the function to apply as variables. The problem is that the axes of the Kernel are part of its type, so the obvious attempt is not type stable:

julia> using StaticKernels
julia> mykernel(f, r::UnitRange{Int}) = Kernel{(r,)}(w -> f(Tuple(w)))
mykernel (generic function with 1 method)

julia> @code_warntype mykernel(sum,-10:0)
  #self#::Core.Compiler.Const(mykernel, false)
  f::Core.Compiler.Const(sum, false)

Body::Kernel{_A,var"#207#208"{typeof(sum)}} where _A
1 ─ %1 = Core.tuple(r)::Tuple{UnitRange{Int64}}
│   %2 = Core.apply_type(Main.Kernel, %1)::Type{Kernel{_A,F} where F} where _A
│   %3 = Main.:(var"#207#208")::Core.Compiler.Const(var"#207#208", false)
│   %4 = Core.typeof(f)::Core.Compiler.Const(typeof(sum), false)
│   %5 = Core.apply_type(%3, %4)::Core.Compiler.Const(var"#207#208"{typeof(sum)}, false)
│        (#207 = %new(%5, f))
│   %7 = #207::Core.Compiler.Const(var"#207#208"{typeof(sum)}(sum), false)
│   %8 = (%2)(%7)::Kernel{_A,var"#207#208"{typeof(sum)}} where _A
└──      return %8

Is there a way to make this type stable?

The type of the Kernel axes should be NTuple{<:Any,UnitRange{Int}}, but I am only interested in the 1D case (general solutions are welcome, of course). I am sure this must be possible, as there are similar examples throughout the community (StaticArrays.jl does this with generated functions, I guess?).

I ended up brute-forcing a solution: I used code generation to create many different Kernel objects, one for each window size in the range I'm interested in and each function I might want to apply, and declared them all const. Not quite what I would prefer, but well worth the effort in terms of performance.

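This seems like a textbook case for a function barrier. In your case, you can construct the type-unstable Kernel{(r,)} and then pass it into a function which does the actual work. Julia dispatches dynamically once at that call, on the concrete type of the Kernel, and the code inside the function is compiled specifically for that type.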

Your brute-force approach is essentially an attempt to do by hand what the compiler will do for you with a function barrier.
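
A minimal sketch of what I mean, reusing your mykernel definition. The names apply_kernel and windowed are hypothetical helpers made up for illustration, and I am assuming map(k, a) is how you apply the kernel in your real code:

```julia
using StaticKernels

# Type-unstable by design: `r` is a runtime value that ends up in the type
# parameter of `Kernel`, so the return type cannot be inferred.
mykernel(f, r::UnitRange{Int}) = Kernel{(r,)}(w -> f(Tuple(w)))

# Function barrier (hypothetical helper): inside this function the concrete
# type of `k` is known, so the body is compiled for that specific window.
apply_kernel(k, a) = map(k, a)

# Hypothetical entry point: builds the kernel, then crosses the barrier.
function windowed(f, r::UnitRange{Int}, a::AbstractVector)
    k = mykernel(f, r)         # only inferred as Kernel{_A,...} where _A
    return apply_kernel(k, a)  # one dynamic dispatch, then specialized code
end

a = rand(1_000)
out = windowed(sum, -10:0, a)
```

The cost of the instability is a single dynamic dispatch per call to windowed, which should be negligible as long as the array is not tiny. Each new window range still triggers a fresh compilation of apply_kernel the first time it is used.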


You are right. I did try that before, but I must have made a mistake in my benchmarking, because when I put together an MWE the timing difference was gone.