I raised this issue [on discourse](https://discourse.julialang.org/t/using-a-key…word-argument-leads-to-enormous-allocations/80354), where the consensus seems to be that this is at least very confusing, and possibly a bug.
If I *use* a kwarg for a function that was passed as an argument to another function, Julia does not specialize the latter function. If I don't use the kwarg for that very same function, it does specialize. Below, I'll include a less trivial example that's closer to my actual use case. But schematically, the idea is this:
```julia
f1(a; b=10) = a+b
f2(c, f) = c + f(20)
f3(d, f) = d + f(30; b=40)
```
Calling, for example, `f2(5, f1)` will result in a specialized `f2`; calling `f3(5, f1)` will not result in specialization. I imagine Julia is clever enough to optimize the problem away in this schematic. But in my code, I was seeing slowdowns of ~100x and allocations of multiple GiBs on each call to my core computation function.
As pointed out in discourse, it's possible to manually trigger specialization by adding a type parameter. But the failure to *automatically* specialize is a problem for a few reasons:
1. It's surprising. The [performance tip on specialization](https://docs.julialang.org/en/v1/manual/performance-tips/#Be-aware-of-when-Julia-avoids-specializing) says "Julia will always specialize when the argument is used within the method, but not if the argument is just passed through to another function." In this case, I *did* use the argument (the function with the kwarg). From the discussion on discourse, it looks like the problem is that Julia immediately lowers that to just pass the function through to `Core.kwfunc`. So technically the argument "is just passed through to another function" — but not by the programmer. (Gotta love passive voice!)
2. It's very hard to diagnose. None of the usual tools (profiling, allocation tracking, `@code_warntype`, JET, Traceur) pointed out any problem with the use of kwargs. In fact, profiling and allocation tracking actively focused my attention on other parts of the code that were not at all the source of the problem. Even the `(@which f(...)).specializations` trick from that section of the performance tips seemed to say the function *was* being specialized for my arguments. (See below.)
3. It seems to contradict the docs. If the goal when designing this heuristic is to detect when a function is "just passed through" so that it will "usually [have] no performance impact at runtime", surely the decision of how to arrange parameters in a function definition should not affect the result.
So, at the very least, I would think this is a documentation bug, because the kwarg wrinkle should be noted in that performance tip — rather than requiring the user to mentally combine disparate arcana from the most cryptic parts of the docs. It would also be nice if some standard tools could point toward the source of the problem. But maybe this is truly a bug in Julia, which should actually specialize even when a kwarg is used?
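For concreteness, here's a minimal sketch of the manual workaround mentioned above, applied to the schematic. The name `f3_spec` is hypothetical (it's just `f3` with a type parameter added), and I'm assuming the type parameter is what forces specialization, as suggested on discourse:

```julia
f1(a; b=10) = a + b

# Original form: `f` is only used through a kwarg call
f3(d, f) = d + f(30; b=40)

# Hypothetical variant: the type parameter `F` asks Julia to specialize on `f`
f3_spec(d, f::F) where {F} = d + f(30; b=40)

# Both compute the same value; only the specialization behavior should differ
r_plain = f3(5, f1)
r_spec = f3_spec(5, f1)
```

This only changes the signature of the outer function, which is why it's easy to apply once you know about it. The hard part is finding out that it's needed.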
---
For reference, here's a working example that's complicated enough that Julia doesn't just optimize the problem away, while still being a greatly simplified version of my actual use case:
```julia
using Profile
function index(n, mp, m; n_max=n)
    n + mp + m + n_max
end
function inplace!(a, n_max, index_func)
    i1 = index_func(1, 2, 3; n_max=n_max)  # Using this version leads to allocations below
    # i1 = index_func(1, 2, 3)  # Using this version leads to 0 allocations
    i2 = size(a, 1) - 2i1
    for i in 1:i2  # Allocates 3182688 B if using kwarg above
        a[i + i1] = a[i + i1 - 1]  # Allocates 9573120 B if using kwarg above
    end
    for i in 3:i2-4  # Allocates 3182576 B if using kwarg above
        a[i + i1] -= a[i + i1 - 2]  # Allocates 12771408 B if using kwarg above
    end
end
function compute_a(n_max::Int64)
    a = randn(Float64, 100_000)
    inplace!(a, n_max, index)
    Profile.clear_malloc_data()
    inplace!(a, n_max, index)
end
compute_a(10)
```
And yes, there are plenty of ways to improve the performance of this simplified code with function barriers and such. But my actual code is too complicated for that, with the kwarg func being used multiple times inside some loops.
If I look at the specializations of `inplace!(a, n_max, index)`, I get
```julia
svec(MethodInstance for inplace!(::Vector{Float64}, ::Int64, ::Function), MethodInstance for inplace!(::Vector{Float64}, ::Int64, ::typeof(index)), nothing, nothing, nothing, nothing, nothing, nothing)
```
That second element really looks to me like it specialized for my particular `index` function.
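If anyone wants to compare the two behaviors without running under `--track-allocation`, here's a rough sketch using `@allocated`. The names `inplace_kw!` and `inplace_spec!` are hypothetical: they are cut-down copies of `inplace!` above, the second with the type parameter added. I'm not listing expected numbers, because they will depend on the Julia version, and measuring at global scope can add a little noise of its own.

```julia
# Same `index` as in the example above, repeated so this snippet stands alone
index(n, mp, m; n_max=n) = n + mp + m + n_max

# Original form: the function argument is only used through a kwarg call
function inplace_kw!(a, n_max, index_func)
    i1 = index_func(1, 2, 3; n_max=n_max)
    i2 = size(a, 1) - 2i1
    for i in 1:i2
        a[i + i1] = a[i + i1 - 1]
    end
    return a
end

# Hypothetical variant: the type parameter `F` asks for specialization on `index_func`
function inplace_spec!(a, n_max, index_func::F) where {F}
    i1 = index_func(1, 2, 3; n_max=n_max)
    i2 = size(a, 1) - 2i1
    for i in 1:i2
        a[i + i1] = a[i + i1 - 1]
    end
    return a
end

a = randn(Float64, 100_000)

# Warm up both versions so compilation is not counted in the measurement
inplace_kw!(copy(a), 10, index)
inplace_spec!(copy(a), 10, index)

b = copy(a)
c = copy(a)
alloc_kw = @allocated inplace_kw!(b, 10, index)
alloc_spec = @allocated inplace_spec!(c, 10, index)

println("kwarg call, no type parameter:   ", alloc_kw, " bytes")
println("kwarg call, with type parameter: ", alloc_spec, " bytes")
```

If the type parameter really is what triggers specialization, the second number should be much smaller than the first on an affected version.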
<details>
<summary>Here's all my versioninfo</summary>
```
julia> versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin19.5.0)
CPU: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-12.0.1 (ORCJIT, haswell)
```
</details>