The way it currently works, this is unlikely to happen soon.
Array is actually not a “real” Julia type/object. It is implemented by the C runtime and mostly used via the foreign function interface. The relevant exceptions, such as arraysize, are special-cased by codegen.
In other words, LLVM cannot peek into Array internals for optimizations.
More relevant than allocation: push! / array resizing is not special-cased either, which is why push! is so damn slow (it cannot be inlined and costs ~10 cycles).
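A rough way to see the cost is to compare growing a vector with push! against filling preallocated storage. This is only a sketch: the function names are made up for illustration, the numbers depend heavily on the Julia version and machine, and a single @elapsed call is a crude measurement rather than a proper benchmark.

```julia
# Build a vector of squares by growing it one element at a time.
function squares_push(n::Int)
    out = Int[]
    for i in 1:n
        push!(out, i * i)   # each call may go through the array-growing path
    end
    return out
end

# Same result, but the storage is allocated once up front.
function squares_prealloc(n::Int)
    out = Vector{Int}(undef, n)
    for i in 1:n
        out[i] = i * i
    end
    return out
end

n = 1_000_000
squares_push(10); squares_prealloc(10)   # warm up so compilation is not timed
t_push = @elapsed squares_push(n)
t_prealloc = @elapsed squares_prealloc(n)
println("push!: ", t_push, " s, preallocated: ", t_prealloc, " s")
```

Both functions return the same vector; only the way the storage grows differs.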
I think the most realistic way is to compile some subset of the runtime library into both machine code and LLVM bitcode (clang -fembed-bitcode / -lto-embed-bitcode style), and to use some variant of link-time optimization when loading a dynamic library with bundled bitcode. The same applies to BigInt and String.
This is a quite general FFI problem:
You have some nice C / Rust / Fortran code and want to use it from Julia; this works nicely and conveniently, with OK perf, via the current method (compile the foreign code into a dynamic lib, load the lib, use a foreign function call).
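For reference, the simplest instance of that method is a ccall into a library that is already loaded, here strlen from the C library. The wrapper name c_strlen is made up for this sketch; for your own code you would compile a shared library and name it in the ccall instead.

```julia
# Call strlen from the C library that is already loaded into the Julia process.
# The String argument is converted to a NUL-terminated Cstring for the call.
c_strlen(s::String) = Int(ccall(:strlen, Csize_t, (Cstring,), s))

println(c_strlen("hello"))   # prints 5
```

The call is cheap, but from the optimizer's point of view it is an opaque function call: nothing inside strlen can be inlined into the Julia caller.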
When you want to get rid of the remaining overhead, e.g. want some foreign functions inlined, you need to embed bitcode into the dynamic library (by now super common and convenient on Apple; not too hard to do on Linux; no idea about Windows; but fundamentally this can only work with LLVM-based compilers, like clang, flang, and rustc, and won’t work with the likes of gcc, icc or msvc).
Then you need some ugly dance with Clang.jl that I forgot the details of. I’d prefer if this were part of Julia Core, just like loading of dynamic libs is; especially since embedded bitcode is becoming so prevalent in the Apple world, and a future of ThinLTO dynamic libraries for JIT and debug purposes would be awesome.
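To illustrate what "inlinable foreign code" means in practice, here is a tiny sketch using Base.llvmcall, which pastes hand-written LLVM IR straight into a Julia function. This is not the bitcode-loading dance mentioned above, just the same idea at toy scale; the name add_ir is made up, and Int64 is used explicitly so the IR types match.

```julia
# Hand-written LLVM IR embedded in a Julia function.
# The arguments arrive as %0 and %1; %3 is the first free value name.
add_ir(x::Int64, y::Int64) = Base.llvmcall("""
    %3 = add i64 %1, %0
    ret i64 %3
    """, Int64, Tuple{Int64, Int64}, x, y)

println(add_ir(Int64(2), Int64(3)))   # prints 5
```

Because the body is IR rather than a call into a shared library, LLVM can inline and optimize it together with the surrounding Julia code, which is exactly what a plain foreign function call cannot offer.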