Addressing the skill-intersection issue and realizing Julia's full potential

I think that’s exactly the promise of Julia: you can compose and it just works. [There’s at least one exception, if you use OffsetArrays incorrectly, but then don’t do that, or enable bounds-checking, and you will know, I think in all cases.]

A.
Let’s first take “realizing Julia’s full potential” to mean full speed, i.e. full runtime performance, and even tiny binaries.

One way would be to code in C style: type all your code, functions and structs, with concrete types (it’s not hard to avoid non-concrete, i.e. abstract, types).

This rules out type instabilities, and then, I believe, invalidations as well.
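A minimal sketch of what that "C-style", fully concretely typed Julia might look like (the `Point`/`norm2` names are just made up for illustration):

```julia
# Every field and variable has a concrete type, so the compiler can
# infer everything statically; no boxing, no dynamic dispatch at runtime.
struct Point            # fields are concrete (Float64), not abstract (Real)
    x::Float64
    y::Float64
end

function norm2(p::Point)::Float64
    return p.x * p.x + p.y * p.y
end

# `@code_warntype norm2(Point(1.0, 2.0))` shows no red (Any/Union) types,
# i.e. no type instability.
norm2(Point(3.0, 4.0))  # 25.0
```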

This will result in the same speed as C, though it will not be idiomatic Julia code(?), and it has one loophole.

C doesn’t have a GC, meaning its code is more complex, while sometimes (not always) faster. You can code in Julia without relying on the GC, using Libc.malloc and Libc.free, and then the languages are very comparable, and you can compile to tiny, fast binaries with StaticCompiler.jl. It seems unfair to me to complain about Julia being slow or hard to support on e.g. GPUs; that seems easier than in C or C++.
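A sketch of bypassing the GC with Libc.malloc/free, the style that GC-free, StaticCompiler.jl-friendly code tends to use (the `sum_manual` function is hypothetical, just to show the pattern):

```julia
# Manual memory management in Julia, exactly as you would do it in C.
function sum_manual(n::Int)
    p = Ptr{Float64}(Libc.malloc(n * sizeof(Float64)))  # raw heap memory, invisible to the GC
    for i in 1:n
        unsafe_store!(p, Float64(i), i)   # write 1.0, 2.0, ..., n into the buffer
    end
    s = 0.0
    for i in 1:n
        s += unsafe_load(p, i)
    end
    Libc.free(p)   # we are responsible for freeing, just as in C
    return s
end

sum_manual(10)  # 55.0
```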

You don’t need to know anything about LLVM (it’s an implementation detail, and is actually skipped after compilation for those small binaries).

Getting full speed out of CPUs is just very complex (in any language); the compiler takes care of a lot for you, e.g. loop unrolling and SIMD, though these can also be tuned manually.
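For example, the compiler will often unroll and vectorize a simple loop on its own, and Julia lets you nudge it with annotations like `@inbounds` and `@simd` (the `mysum` function here is just an illustrative sketch):

```julia
# @inbounds skips bounds checks; @simd tells the compiler it may
# reassociate the floating-point sum so the loop can vectorize.
function mysum(a::Vector{Float64})
    s = 0.0
    @inbounds @simd for i in eachindex(a)
        s += a[i]
    end
    return s
end

mysum(collect(1.0:100.0))  # 5050.0
```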

Potentially Julia should be taught this way? Some argue C or assembly should be taught first to learn the low-level details that really help you as a programmer. I think it’s a mistake to teach that way first, but you could do that even with Julia.

There is one other loophole here: C has globals and they are fast; in Julia, globals are only fast if declared const (or, since 1.8, with a declared type):
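A sketch of the difference (the names here are made up for illustration):

```julia
# A non-const, untyped global can change type at any time, so every access
# is dynamic; a const (or typed) global lets the compiler assume the type.
global_slow = 1.0                 # untyped global: accesses can't be inferred
const GLOBAL_FAST = 1.0           # const: type (and value) known to the compiler
global_typed::Float64 = 1.0       # typed global (Julia 1.8+): type is fixed

f_slow() = global_slow + 1        # compiler can't infer the result type
f_fast() = GLOBAL_FAST + 1        # effectively compiles to `return 2.0`

f_fast()  # 2.0
```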

B.
What Julia is trying to do is cater to the Python and MATLAB crowd, and work like other dynamic languages.

But Julia is, for speed reasons, less dynamic than e.g. Python; e.g. its integer types can overflow. Those are just defaults: Julia can be made as dynamic, and prevent overflows, e.g. by using BigInt as the new default. It could also be less strict regarding arrays, similar to MATLAB; that’s just thought to be bad engineering.
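The overflow trade-off in concrete terms: the default `Int` wraps around silently (for speed), while `BigInt` gives you Python-style arbitrary precision, and checked arithmetic is also available as an opt-in:

```julia
# The default machine integer wraps on overflow:
typemax(Int) + 1 == typemin(Int)        # true: silent wraparound

# BigInt is arbitrary precision, like Python's int:
big(typemax(Int)) + 1 == BigInt(2)^63   # true: no overflow

# Or opt into checked arithmetic, which throws instead of wrapping:
try
    Base.checked_add(typemax(Int), 1)
catch e
    e isa OverflowError                 # true
end
```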

It’s instructive to think about what needs to happen to get rid of invalidations. It’s entirely possible to do without them, as in A. It’s just that in Python, where you don’t have them, you don’t try as hard to compile and be generic. In C you can’t be generic (or couldn’t; it has since added some features for it, I think rarely used; Go has also added generics, less powerful than Julia’s).

Why do invalidations happen at all? Julia is recompiling for some reason (and while it’s an annoyance, it just works, though with annoyingly long “startup”); a type, or rather the set of applicable methods, changed, I think. Even with such changes, the recompilation is in theory avoidable; it happens because of a performance-obsessed compiler, which could opt into less inlining and no recompilation. When you program in Python you use fast C code with it, and Python doesn’t have this problem since the C code is precompiled. But the limitation is that the binary code for that C code is static; it basically can’t be recompiled (Julia could limit itself in the same way, and even still support generic code). Since the precompiled C code is static, it also can’t be generic, i.e. it only works for some types, usually machine floats, machine integers, and strings, no alternative floats or integers. It also breaks the illusion that Python will not overflow your integers: Python itself doesn’t, but when used with NumPy, NumPy overflows them for it.
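A minimal sketch of how an invalidation arises (hypothetical `f`/`g` functions): code compiled against the methods that existed at the time becomes stale when a more specific method is added later.

```julia
f(x) = "generic"
g(x) = f(x)            # g's compiled code assumes f(::Any) -> "generic",
                       # and may even have inlined it

g(1)                   # "generic"

f(x::Int) = "integer"  # a new, more specific method: g's compiled code is
                       # now wrong for Int, so Julia invalidates and
                       # recompiles it on the next call

g(1)                   # "integer"
```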

For Julia to stay generic (a good thing) it need NOT have a compiler at runtime; it just needs to inline less. That means not quite as fast code at runtime. Some features in Julia, like eval, require either a compiler or an interpreter (Julia has both).

The compiler knows exactly where and when it’s (heap) allocating (or e.g. using a non-const global). It could show you a WARNING when you define the function (if it knows, but it CAN’T know for all types, e.g. those not yet defined). I’m not sure we should be too concerned: heap (and stack) allocations are very natural in a program; it’s just that in some extreme cases we want to reduce or eliminate them completely. They are, however, often a symptom of other problems that go away if you address the allocations. Very possibly one code path allocates and another doesn’t, which is not a problem; the compiler sees the whole picture. What’s more problematic is that you can call code that allocates, and also, if your code is generic, some code path may allocate only for some type, e.g. none of the default types, or of what you have tested with. There’s probably no way around this except to go with A. and give up generic code.
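You can already ask the runtime where allocations happen today, e.g. with the `@allocated` macro (or `julia --track-allocation=user` for whole files); a sketch with made-up function names:

```julia
no_alloc(x::Float64) = x + 1.0    # pure arithmetic: no heap allocation
does_alloc(n::Int) = zeros(n)     # heap-allocates an array

# Call once first so compilation itself isn't measured:
no_alloc(1.0)
does_alloc(1)

@allocated no_alloc(1.0)    # 0 bytes
@allocated does_alloc(1)    # > 0 bytes (the array)
```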