major blockers at the moment are codegen and possibly GC support (if stack objects can reference heap objects). In retrospect, I think it would have been possible to start from a more dynamic approach where escapes of manually-declared stack objects are detected at run-time.
What about for GPU code and GPU AD? Allocations are an even larger problem there, where Julia often still lags behind mainstream systems. Forgive me for not knowing the details well, but does your mention of the GC, which cannot see GPU allocations, imply that this is only a CPU solution?
Julia tends to revolve around adding more stuff to the compiler/optimizer. This certainly is a very exciting part. But I think we should pay a similar amount of attention to the language/library features as well, so that Julia programs have more predictable performance while still being easy to write and maintain.
Yes, and those of us using Julia for ML feel this acutely, where the optimizer subset has far outpaced the language semantics. The workable semantics for fast GPU+AD are hard to meet, or even virtually non-existent for more complex models, which often require hand-rolled rrules.
The approach of designing new dynamic semantics for every new demand on the compiler (and even this is hard because of compatibility issues I guess) seems like it will always be playing catch-up because it’s comparatively easier to write passes.
This will be especially true with the upcoming compiler plugin work, which is hoped to be composable and easily accessible from user-land. Is there something analogous to this for language semantics or the type system, so that people who write passes can also co-design those with semantics or invariants? Do Haskell language extensions provide a useful guide here?
If Julia’s dynamic nature makes it hard, would it be feasible to explore dynamically and then declare locally static regions? Even with current optimizations like conventional devirtualization, hitting the right semantics can be hard for users, so making static behavior opt-in (if possible??), as opposed to adding different dynamic semantics, sounds like it might have additional benefits.
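To make the "declare locally static regions" idea a bit more concrete, here is a rough sketch of what an opt-in check could look like with today's reflection tools. The `assert_static` helper is hypothetical (it is not part of Base or any package I know of), and it only checks that inference finds a concrete return type, which is a much weaker property than a real static subset would guarantee.

```julia
# Hypothetical helper, for illustration only: ask inference for the return
# type(s) of `f` on the given argument types and refuse anything non-concrete.
function assert_static(f, argtypes::Type{<:Tuple})
    rts = Base.return_types(f, argtypes)
    if isempty(rts) || !all(isconcretetype, rts)
        error("$(f) is not statically typed for $(argtypes): inferred $(rts)")
    end
    return f
end

# Inference resolves this to a single concrete return type.
stable(x::Int) = x + 1

# The return type here is a Union of Int and String, so it is not concrete.
unstable(x::Int) = x > 0 ? x : "negative"

assert_static(stable, Tuple{Int})      # passes and returns `stable`
# assert_static(unstable, Tuple{Int})  # would throw an error
```

The point is that the user explores dynamically first and then opts in to the stricter check only for the region they care about, so the failure shows up at declaration time instead of as a silent performance cliff.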
There’s talk about finding a good static subset of Julia semantics, but unless it has prophetic foresight, does that just punt the problem to when we want to make new compiler demands of that subset? So maybe it’s better to think about how we can make that programmable. We’d have to give up certain global properties like decidability and soundness, but I think it’s worth the trade-off.
Edit: am I just describing macros?