I believe so. The missing info you want is likely the "Memory layout of Julia Objects" page of the Julia documentation, which explains the layout of the jl_value_t*s that jl_value_t *jl_apply_generic(jl_value_t *F, jl_value_t **args, uint32_t nargs) expects. It's unfortunate that the unrelated Core.Box also exists to cause confusion, but thankfully that doesn't show up in any of these examples.
Yes, this is what I was alluding to with my first sentence about "requires boxing…" earlier. Array{Number} is, roughly speaking, an array of already-boxed jl_value_t*s: each slot holds a pointer to a separately allocated value rather than the value itself.
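To make "already boxed" a bit more concrete, here is a rough C sketch. It is a toy model, not Julia's actual definitions: every toy_* name is made up for illustration. The one idea it borrows from the linked page, as I understand it, is that a type tag sits in a header just before the address the pointer refers to.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy model for illustration; NOT Julia's real definitions. */
typedef enum { TOY_INT64 = 1, TOY_FLOAT64 = 2 } toy_tag_t;
typedef struct { uintptr_t tag; } toy_header_t;  /* sits just before the payload */
typedef void toy_value_t;                        /* points at the payload, not the header */

/* Allocate header + payload together and return a pointer to the payload. */
static toy_value_t *toy_box(toy_tag_t tag, const void *payload, size_t size) {
    toy_header_t *h = malloc(sizeof(toy_header_t) + size);
    assert(h != NULL);
    h->tag = (uintptr_t)tag;
    memcpy(h + 1, payload, size);
    return (toy_value_t *)(h + 1);
}

toy_value_t *toy_box_int64(int64_t x)  { return toy_box(TOY_INT64, &x, sizeof x); }
toy_value_t *toy_box_float64(double x) { return toy_box(TOY_FLOAT64, &x, sizeof x); }

/* Recover the tag by stepping back from the payload to the header. */
toy_tag_t toy_typeof(const toy_value_t *v) {
    return (toy_tag_t)((const toy_header_t *)v - 1)->tag;
}

/* Copy the payload out; memcpy avoids any alignment assumptions. */
double toy_unbox_float64(const toy_value_t *v) {
    double x;
    memcpy(&x, v, sizeof x);
    return x;
}

void toy_free(toy_value_t *v) {
    if (v) free((toy_header_t *)v - 1);
}

/* Rough analogue of walking an Array{Number}: every element costs a pointer
   chase plus a runtime tag check, unlike an inline double[] where the
   element type is known up front. */
double toy_sum_floats(toy_value_t **elems, size_t n) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (toy_typeof(elems[i]) == TOY_FLOAT64) {
            total += toy_unbox_float64(elems[i]);
        }
    }
    return total;
}
```

An array built as toy_value_t *elems[] = { toy_box_int64(1), toy_box_float64(2.5) } then plays the role of Array{Number}, while a plain double[] plays the role of Vector{Float64}.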
The signature of jl_apply_generic is rather restrictive and makes a number of assumptions about the lifetimes of the jl_value_t **args it receives. Part of the linked PR discussion is whether it'd be safe to "fake" jl_value_t*s that point to the stack instead of to GC-allocated and tracked memory. The answer appears to be no, not without non-trivial changes on the GC side.
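One generic way to picture the lifetime issue (my own illustration, not necessarily the specific concern raised in the PR): a callee that receives pointer arguments is free to stash one of them somewhere that outlives the call. The sketch below is a toy with made-up toy_* names that only mimics the shape of the signature above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy types; NOT Julia's real definitions. */
typedef struct { uintptr_t tag; int64_t payload; } toy_box_t;

/* Stands in for any place a callee might keep an argument around
   (a global, a field of another object, a task, ...). */
toy_box_t *toy_last_seen = NULL;

toy_box_t *toy_heap_box(int64_t x) {
    toy_box_t *b = malloc(sizeof *b);
    assert(b != NULL);
    b->tag = 1;
    b->payload = x;
    return b;
}

/* Same shape as the signature quoted above: callee, argument array, count. */
toy_box_t *toy_apply_generic(toy_box_t *f, toy_box_t **args, uint32_t nargs) {
    (void)f;  /* the toy ignores the callee and always sums its arguments */
    if (nargs > 0) {
        toy_last_seen = args[0];  /* the first argument escapes the call */
    }
    int64_t sum = 0;
    for (uint32_t i = 0; i < nargs; i++) {
        sum += args[i]->payload;
    }
    return toy_heap_box(sum);
}

int64_t toy_call_with_heap_boxes(int64_t a, int64_t b) {
    toy_box_t *args[2] = { toy_heap_box(a), toy_heap_box(b) };
    toy_box_t *result = toy_apply_generic(NULL, args, 2);
    int64_t sum = result->payload;
    free(result);
    free(args[1]);
    /* args[0] is deliberately not freed (the toy simply leaks it):
       toy_last_seen still refers to it. Had args[0] pointed at a local
       toy_box_t on this function's stack, toy_last_seen would dangle as
       soon as we return. */
    return sum;
}
```

With heap boxes the stashed pointer stays valid after the caller returns; with stack "boxes" nothing would keep it valid, and a collector that later follows that pointer would be reading dead stack memory.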
Regardless of what the best solution is, it'd likely require coordinated changes to codegen and the runtime, as noted in the PR "[WIP] dynamic dispatch optimization: avoid boxing stack-allocated inputs" by NHDaly (JuliaLang/julia Pull Request #50136). If I had to guess, this level of complexity is a big part of why it hasn't been done yet.