Software (including Julia?) as crash-only systems

Your apps should be “crash-only systems” (currently that’s your responsibility), but perhaps Julia, as a “component” of your system, should also be “crash-only software”:

Crash-only software is software that crashes safely and recovers quickly.
[…]
The conclusion
Properly implemented, crash-only software produces higher quality, more reliable code; poorly understood it results in lazy programming.

Julia can crash (e.g. on OOM), and that is safe in one sense (the edit history is recovered). But Julia doesn’t checkpoint (most of) your code by default, so in that sense Julia isn’t safe, i.e. it doesn’t recover. Julia does have one non-default option in that direction:

--bug-report=rr-local
Run julia inside rr record but do not upload the recorded trace. Useful for local debugging.

Maybe that can and should be amended to be the default.

From the paper linked from the article:

“Crash-only systems are built from crash-only components”

Android is a system where OOM isn’t a problem: programs get restarted (recovery is up to each program, so not all do a good job). macOS has changed to a similar system (at least for new apps, and I believe for all the included ones).
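
Since recovery is left to the app, here is a minimal sketch of what “crash safely, recover quickly” could look like in your own Julia code. It is only an illustration under assumptions: the names `CHECKPOINT`, `save_checkpoint`, `load_checkpoint` and `run_job` are made up for this example, the state is a small NamedTuple, and it relies on renaming a file within one directory being atomic (true on typical POSIX filesystems, not guaranteed everywhere). It does not force the data to disk, so it protects against a process crash rather than a power loss.

```julia
using Serialization  # standard library

# Hypothetical checkpoint location; any durable path would do.
const CHECKPOINT = joinpath(tempdir(), "crash_only_demo.ckpt")

# Write the state to a temporary file, then rename it over the old checkpoint,
# so a crash in the middle of a write never leaves a half-written checkpoint.
function save_checkpoint(state, path = CHECKPOINT)
    tmp = path * ".tmp"
    serialize(tmp, state)
    mv(tmp, path; force = true)
    return nothing
end

# Startup and recovery share one code path: resume from the checkpoint if
# there is one, otherwise begin from the initial state.
load_checkpoint(initial, path = CHECKPOINT) =
    isfile(path) ? deserialize(path) : initial

# Toy job: sum the integers 1:n, checkpointing after every step.
function run_job(n; path = CHECKPOINT)
    state = load_checkpoint((next = 1, total = 0), path)
    for i in state.next:n
        state = (next = i + 1, total = state.total + i)
        save_checkpoint(state, path)
    end
    return state.total
end
```

If the process is killed partway through `run_job`, calling it again with the same `path` picks up at the last saved step instead of starting over. Making normal startup and crash recovery the same code path is the point: there is no separate, rarely exercised recovery routine.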
