Compiler work priorities

I know this problem is on very few people’s radar, but I believe having Cassette usage incur zero runtime overhead (and preferably reasonable compile time overhead) would in the long term have a more dramatic impact on the Julia ecosystem than any other feature I can think of.

3 Likes

By the same token, MATLAB is also not ready for prime time because it takes ~30 seconds to start.

If you don’t buy that argument, then ask yourself “why?” Is it because you don’t start MATLAB as many times in a day? If so, the obvious answer is to help make Revise.jl better. There are relatively few open bugs reported against Revise, and this makes it difficult to know what causes people to not use Revise, or for it to not work as they might hope. I confess to being personally skeptical that the list of changes Revise can’t handle accounts for the majority of restarts. If you agree, you can help by taking a few moments to figure out why you’re restarting, whether there is a good alternative, and at least upvoting an existing issue or creating a new one. Even better, submit PRs to provide a broken test case and/or fix problems.

Revise 1.0 will be coming out soon, based on a revamp in https://github.com/timholy/Revise.jl/pull/217 that may eliminate one of the major reasons for restarting Julia. (Since it is a fairly extensive redesign of the internals, I’m going to allow some real-world testing before I tag it.) If there are other “easy fixes,” let’s try to get them in too.

30 Likes

To be fair, Julia does seem to aim for a broader audience than MATLAB, no?

1 Like

Awesome, I agree that switching between free and dev is another substantial chunk of restarts. The other main one is type redefinition.

4 Likes

Just as an example: I’m using Julia in my algorithm course, with INGInious. Even small snippets submitted by the students for evaluation take, say, 20–30 seconds. I also write programs that I generally just run from the command-line – not suitable for iterative data analysis or the like, which I guess people do in the REPL. I could certainly write these in, say, Python instead (which would certainly have been my impulse earlier), but my main hope for Julia is to get away from the two-language problem – I’d rather not go from Python + C to Julia + Python. :slight_smile:

Don’t get me wrong: I believe things are on the right track and are going to be great. But there are those of us doing very different things with Julia than what we would have done with MATLAB. (In fact, I’ve never used MATLAB, precisely because it didn’t suit my needs at all.)

1 Like

Compilation time is not an issue if the user comes from a C++ or MATLAB background. C++ compilation can take very long even for small programs (see “‘Modern’ C++ Lamentations” on Aras' website), while MATLAB’s startup freezes my laptop quite often.

1 Like

For what it’s worth, I’ve found that Revise works beautifully and I’ve almost never encountered cases it can’t handle. It still feels like magic!! I’ve never filed a bug because I’ve never found one. Maybe I’m too amazed that it works in the first place and too aware of what will likely confuse it and why to be as demanding as other users. I am also fairly used to doing work that requires rebuilding Julia. Testing that you haven’t broken code loading and bootstrapping is unfortunately not a good use case for Revise :grimacing:

4 Likes

Thanks! Glad it’s useful.

But part of what I’m interested in is “why do people restart their sessions?” That is, not bugs per se in Revise, but perhaps limitations that force people to restart. The idea is that if you keep the same Julia session open for, say, a whole week, you will have already paid the compilation price for just about anything you actually use. At which point concerns over compile time disappear.
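As a rough sketch of what “paying the compilation price once” means within a session (sumsq is a made-up function used only for illustration, and the actual timings will vary by machine and Julia version):

```julia
# Hypothetical function, not from any package; it exists only to be compiled.
sumsq(xs) = sum(x -> x^2, xs)

data = rand(1_000)

# The first call triggers JIT compilation for this argument type (Vector{Float64}).
first_call = @elapsed sumsq(data)

# Later calls in the same session reuse the already compiled method instance.
second_call = @elapsed sumsq(data)

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

In a long-lived session, most of what you actually use ends up in the “second call” category; a restart puts everything back in the first.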

In practice, though, I don’t keep my active sessions open for a week. In Julia 1.0, package management has been the main reason for me to restart. Now that it’s “solved” (modulo Revise bugs and JuliaLang/julia issue #27418, “How precompile files are loaded need to change if using multiple projects are going to be pleasant”), I’m going to be curious what causes me and others to restart. For me personally, type redefinition isn’t a dominant reason; I usually get my types nailed down fairly quickly, but then can spend hours working on the methods.

I don’t mean to derail this into a Revise thread; simply to say that, given the repeated concerns about compile latency, it’s worth people knowing that by getting a bit engaged with the process of improving Revise, it’s a problem that should be mostly solvable today.

4 Likes

I agree that Revise is great if you’re thinking in terms of sessions (sort of foreign to me, as I don’t use the REPL for any serious work, though it’s probably something I should get used to) – it applies less when running programs, I guess.

From my perspective it’s less “Why do I restart my sessions?” and more “What sessions?” :slight_smile:

It’s definitely type redefinition for me. The practical result is that I typically start developing modules inside an IJulia cell so that I can keep re-evaluating the cell and replacing the module when the types change. When that gets unwieldy, I might refactor the code out into a file and include() it, again so that I can replace the module and its types. It would be great not to need to do that and to just start with a blank package and do using Foo from the beginning, but having to restart Julia any time a type changes is a bit awkward in the early stages of package development.
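A minimal sketch of that cell-based workflow, assuming a hypothetical module named Scratch with a made-up Point type (neither name comes from a real package):

```julia
# Everything lives inside one module, so re-evaluating this whole cell
# (or re-including the file that holds it) replaces the module and its types.
module Scratch

# Hypothetical type; edit its fields and re-run the cell to "redefine" it.
struct Point
    x::Float64
    y::Float64
end

# Squared distance from the origin.
sqnorm(p::Point) = p.x^2 + p.y^2

end # module Scratch

p = Scratch.Point(3.0, 4.0)
println(Scratch.sqnorm(p))
```

Re-running the cell after changing Point makes Julia print a “replacing module” warning instead of a type-redefinition error. Values created before the re-run still belong to the old module’s Point, so they need to be constructed again.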

Of course, I should still say that Revise.jl is magical and I have no idea how I ever lived without it :slightly_smiling_face:

17 Likes

But, yeah, tricking myself somehow might be useful. E.g., running a REPL in the background, and have “fake” shell commands communicate with it, or something :wink:

(Or something along the lines of my proposal in issue #203.)

1 Like

For me there are two main reasons to restart Julia. The first is when switching a package version, as was already mentioned here. The second is to get a clean “workspace” when prototyping inside Juno cells. If I make some non-trivial changes to the code I want to make sure it still works “from scratch”, rather than accidentally due to some global variables hanging around from previous iterations. In MATLAB this is clear all. Julia briefly had workspace(), but it had trouble with reloading packages (I’m guessing it was dropped because these issues were considered not worth fixing, perhaps because Revise came along).

11 Likes

I’ve been working on a data analysis project entirely out of Jupyter notebooks. One reason for restarts has been that I “delete” a function but Julia doesn’t know about it, due to the Jupyter model, which is (I think) that code just gets sent to the kernel for evaluation when a cell is run, and that’s it. I also sometimes lose track of what got evaluated in which order and what random values are floating around in memory, and find it easier to restart everything to restore a consistent state and minimal memory use.

Do people usually include a script from their notebook and switch back and forth between editor and notebook for interactive workflows, or use Juno, or something else?

3 Likes

I restart because I hit segfaults and (what looks like) memory leaks :worried: . I wish there were better tools to explore where my memory went.

1 Like

It may be an age related thing, euphemistically called “having a senior moment.” :smirk:

5 Likes

I think in general if one uses IJulia notebooks, the typical workflow will be to open and close them like one opens and closes other documents, and that always triggers a restart of the kernel.

1 Like

Most of my restarts are accidental: I try to interrupt a long-running computation with Ctrl-C and Julia crashes instead of interrupting. I mostly work in Juno; the problem occurs less often when I work in the terminal directly.

7 Likes

It depends heavily on the package. I find that for numerical code Revise is fully sufficient, but there are cases where it’s trickier. In particular, libraries that take a while to precompile and hit segfaults with some frequency are a worst-case scenario (early development of Makie was an example of this).

I find that fixing compiler bugs (especially those that can cause Julia to “core dump”) can alleviate the latency issue, because I agree that if Julia doesn’t crash there is little reason to restart it.

In terms of Revise limitations, at some point I had the impression that it had some trouble with method deletion for complicated generated functions, but I don’t have an MWE. Other than that it has been working beautifully on Julia 1.0 and I’m very grateful for it.

I restart mainly because of my workflow: I tend to work on lots of projects in parallel, keeping all windows associated with a certain task on their own desktop and using desktops as my to-do list. I’ll have maybe 4-5 projects at a time involving Julia, sometimes only having time to touch each a few times per month. So it isn’t practical to have a single session running indefinitely.

3 Likes

I have three machines I work on regularly. I never remember where I left off on which machine, so I normally just restart things to make sure in any given session I start from a known state.

1 Like