Julia 1.8.1 pkg.update() crashes OS / software during precompilation phase

Ever since Julia 1.8-rc4 I’ve had very weird issues when trying to add or update packages via the package manager. It started with precompilation failing with a LoadError when adding a package, and has now progressed to either freezing my OS or crashing other software (Firefox, for example) with random errors while Julia is precompiling packages.

I went so far as to wipe my Julia installs and all Julia-related registry info from my computer, yet the issue persists. Julia 1.7.3, which I also have installed, does not suffer from this issue. IIRC 1.8.0-rc3 did not have it either.

I wish I could give you any error messages along with this post to help determine the reason, but right now I don’t get any error messages, and unfortunately did not save any of the previous ones either.

versioninfo()
Julia Version 1.8.1
Commit afb6c60d69 (2022-09-06 15:09 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 32 × AMD Ryzen 9 3950X 16-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, znver2)
Threads: 1 on 32 virtual cores
Environment:
JULIA_DEPOT_PATH = J:\Julia\Packages

Have you seen this thread? I posted there additional links to related issues.

In short: I have a Ryzen setup and had similar problems (including Firefox crashing). Limiting the number of precompilation threads solved the problem, but slowed down package installation etc.
Weirdly, I then started to see similar crashes in Julia 1.7.3 and on my other PC with an Intel processor, so I cannot say for sure that this mitigates it.
But you can try it and see if it helps.


Huh… Okay, limiting the tasks to 1 solved the issue. I did not try different values just yet, I’m just really glad it doesn’t crash my pc anymore. Thank you!
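For anyone finding this later, here is a minimal sketch of that workaround, assuming the limit is applied through the JULIA_NUM_PRECOMPILE_TASKS environment variable and set from inside the Julia session before any package operation. The update call itself is left commented out, so the snippet only sets the variable.

```julia
# Limit parallel precompilation to a single task for this Julia session.
# The variable has to be set before the package operation starts.
ENV["JULIA_NUM_PRECOMPILE_TASKS"] = "1"

using Pkg
# Pkg.update()  # uncomment to run the update with the limit in place
```

Setting the same variable in the Windows environment settings instead would apply it to every new Julia session.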

So, I played around with JULIA_NUM_PRECOMPILE_TASKS, and at the very least values <= 8 work. Setting it to 16 gave this error:

(@v1.8) pkg> update
    Updating registry at `J:\Julia\Packages\registries\General.toml`
  No Changes to `J:\Julia\Packages\environments\v1.8\Project.toml`
  No Changes to `J:\Julia\Packages\environments\v1.8\Manifest.toml`
Precompiling project...
  Progress [>                                        ]  0/238
  ? Combinatorics
  ? FunctionWrappers
  ? SnoopPrecompile
  ? GroupsCore
  ? Calculus
  ? LaTeXStrings
  ? PDMats
  ? SignedDistanceFields
  ? IndirectArrays
  ? PolygonOps
  ? ExprTools
  ? IteratorInterfaceExtensions
  ? ModernGL
  ? TensorCore
  ? StatsAPI
  ? Contour
fatal: error thrown and no exception handler available.
ErrorException("schedule: Task not runnable")
error at .\error.jl:35
#schedule#613 at .\task.jl:791
schedule##kw at .\task.jl:789 [inlined]
notify at .\condition.jl:148
#notify#586 at .\condition.jl:142 [inlined]
notify at .\condition.jl:142 [inlined]
notify at .\condition.jl:142 [inlined]
_uv_hook_close at .\stream.jl:719
jfptr__uv_hook_close_56165.clone_1 at J:\Julia\Julia-1.8.1\lib\julia\sys.dll (unknown line)
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1838 [inlined]
jl_uv_call_close_callback at /cygdrive/c/buildbot/worker/package_win64/build/src\jl_uv.c:88 [inlined]
jl_uv_closeHandle at /cygdrive/c/buildbot/worker/package_win64/build/src\jl_uv.c:107
uv_pipe_endgame at /workspace/srcdir/libuv\src/win\pipe.c:669
uv_process_endgames at /workspace/srcdir/libuv\src/win\handle-inl.h:112
uv_run at /workspace/srcdir/libuv\src/win\core.c:501
ijl_task_get_next at /cygdrive/c/buildbot/worker/package_win64/build/src\partr.c:563
poptask at .\task.jl:921
wait at .\task.jl:930
task_done_hook at .\task.jl:634
jfptr_task_done_hook_45281.clone_1 at J:\Julia\Julia-1.8.1\lib\julia\sys.dll (unknown line)
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1838 [inlined]
jl_finish_task at /cygdrive/c/buildbot/worker/package_win64/build/src\task.c:254
start_task at /cygdrive/c/buildbot/worker/package_win64/build/src\task.c:942

I think this is fixed in Julia 1.8.2, which was just released. The defaults should then just work.

There’s a tradeoff between parallel and serial installs: a parallel install will always take more RAM while it runs (roughly multiplied by the number of CPUs used). Julia now tries to limit its use to the available RAM in general, and I believe that applies to installs too. Someone was complaining about it not working with 2 GB (in a container). Tell me if you have that little, or if you run on Linux. I see now that you’re on Windows; I thought you would just run out of memory and not crash the OS (Linux crashing on excessive RAM use is a known problem, OOM, and not specific to Julia).

Are you sure Windows really crashed? If it was thrashing because of virtual memory use, it will feel like a crash but eventually recover. That might take a long time if you use a regular hard disk (and maybe even with an SSD). I would consider running with virtual memory turned off, or with a pagefile of at most 2 GB, rather than a pagefile as large as the RAM, in case the RAM is huge.


Thanks for the info, will try 1.8.2 today and see how it goes.
Could you point to the pull requests that made the changes? The issues I found on this topic on GitHub were closed as solved some time ago.

Also, for me it didn’t seem to be directly related to RAM usage: my systems have 32 and 64 GB, and Task Manager did not report extensive usage. It rather seemed like one of the threads spawned during precompilation tried to access memory outside of Julia.
Hopefully it is now fixed either way.

That seems like a lot of memory to eat up just by installing packages (but you likely had some other big stuff consuming RAM). Did you install unusual packages, or ones with many dependencies, like SciML?

Don’t trust me (or the other guy) that the fix is in. It’s annoying as hell if your OS crashes, so you may want to play it safe by setting the ENV variable, and try without it when convenient:
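A minimal sketch of one way to do that, assuming the variable in question is JULIA_NUM_PRECOMPILE_TASKS. The helper name `with_precompile_limit` is hypothetical (it is not part of Pkg); it only wraps Base's `withenv`, which sets the variable for the duration of one call and then restores the previous environment.

```julia
using Pkg

# Hypothetical helper, not part of Pkg: run `f` with the precompile task
# limit set to `ntasks`, then restore the previous environment.
function with_precompile_limit(f, ntasks::Integer)
    withenv(f, "JULIA_NUM_PRECOMPILE_TASKS" => string(ntasks))
end

# Safe variant: a single precompile task, only for this one call.
# with_precompile_limit(Pkg.update, 1)

# Later, when a crash would be less painful, try the defaults again.
# Pkg.update()
```

Scoping the limit to a single call makes it easy to compare runs with and without it, without touching the system-wide environment.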

Like mkoculak, I have 32 GB of RAM, and Julia was not using anywhere close to that amount. The symptoms were much more like a data race or illegal memory access than an out-of-memory condition.

Firefox would crash with an error. It also managed to freeze my GPU driver: the screen would freeze or turn black until the driver reset itself to some backup version, where my desktop resolution dropped from native 4K to 1080p, and only a hardware reset would fix it. In other instances it simply froze my OS, and I had to press the reset button on the computer to get it responding again.

That shouldn’t happen in Julia (or elsewhere). But under memory pressure some other process, such as Firefox or your GPU driver, might try to allocate and fail. Not checking for that failure is sloppy programming wherever it happens (on Windows at least; on Linux I think it isn’t needed, since you will always get some virtual memory pages back), unless it is handled some other way:

Notice that in Julia you never test whether you ran out of RAM when you allocate, because you can’t rely on getting a NULL pointer back when memory is full (nor would you want one). On systems where malloc would do that, I believe you get an exception instead, a much safer alternative than having to check.

Back in the Windows XP days I ditched Windows because it couldn’t handle running out of memory well (or maybe it was window handles; it is not easy for a user to know, or to describe the GUI redraw issues). I went to Linux and haven’t really looked back, though OOM there is annoying too, in a slightly different way, and I believe it is getting better with software that is now installed by default in, e.g., recent Ubuntu versions.

I agree it shouldn’t, but I’m trying to tell you the symptoms really make it seem that’s exactly what’s happening. Which is why this also seems so concerning to me.

I use this same system for rendering and have gone close to and over the RAM and VRAM capacity. While the system crawls to a stop in that case, I’ve never encountered random, hard errors like the ones happening during precompilation. And no, I was not doing anything taxing while the precompilation was running.


So it seems the issue is fixed for me.

Do I understand correctly that Julia now estimates how much memory is available and adjusts the number of precompilation threads? The approach seems conservative: my Windows reports 16 GB of free memory, but without any env variables set Julia spins up only two threads.
When I explicitly bump the number up, it works as expected (while not using more than a couple hundred MB of RAM).
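To compare what Windows reports with what Julia itself sees, a small sketch like the following can be run in the REPL. It only prints values from Base's `Sys` module and the current setting of the environment variable; the `gib` helper is made up for illustration, and nothing here shows how Pkg actually picks its default.

```julia
# Illustrative helper: convert a byte count to GiB, rounded to one decimal.
gib(bytes) = round(bytes / 2^30; digits = 1)

println("Total memory: ", gib(Sys.total_memory()), " GiB")
println("Free memory:  ", gib(Sys.free_memory()), " GiB")
println("CPU threads:  ", Sys.CPU_THREADS)
println("JULIA_NUM_PRECOMPILE_TASKS = ",
        get(ENV, "JULIA_NUM_PRECOMPILE_TASKS", "(unset, Pkg default applies)"))
```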

The changes to the memory/GC stuff were only relevant if you are running inside containers.

The pull request "set number of openblas threads to 1 while precompiling" (by KristofferC, Pull Request #46792, JuliaLang/julia on GitHub) might possibly have helped with the specific issue here.

Good to know. But although that PR is merged, is it not effective until the similarly named open PR is also merged?

Yeah, that sounds reasonable.
I have not tested changing the number of openblas threads and precompilation threads separately, but maybe there will be no need for that anymore.
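For anyone who does want to vary the two settings separately, here is a rough sketch. It assumes that OpenBLAS honors the OPENBLAS_NUM_THREADS environment variable at startup and relies on the precompile workers being separate Julia processes that inherit the environment. Whether this reproduces what the PR does internally has not been verified here.

```julia
using LinearAlgebra

# BLAS threads in the current session; this alone does not affect the
# separate worker processes that do the precompilation.
println("BLAS threads in this session: ", BLAS.get_num_threads())

# Child processes inherit environment variables, so set both knobs before
# starting a Pkg operation and vary them independently between runs.
ENV["OPENBLAS_NUM_THREADS"] = "1"          # assumption: read by OpenBLAS on load
ENV["JULIA_NUM_PRECOMPILE_TASKS"] = "16"   # the value that crashed on 1.8.1

# using Pkg; Pkg.update()  # uncomment to run with these settings
```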

What is curious is that the issue seemed to primarily affect Ryzen CPUs. Maybe it is some vendor-specific issue with OpenBLAS threads.

Julia defaults to more BLAS threads if you have more cores, so given that AMD CPUs have more cores than most, it makes sense that AMD users had the most problems. Otherwise memory use is unrelated to the CPU, or at least to its frequency etc.

Right, makes sense.

Are there many instances in the codebase where OpenBLAS threads are useful for precompilation?
I wouldn’t expect a lot of linear algebra there.

It is effective.

Tested 1.8.2, and JULIA_NUM_PRECOMPILE_TASKS = 16 worked without any issues.
