With native code caching in upcoming Julia versions, these import times will increase further (see left column in image below)
So, I’m a bit worried. (Especially because, I presume, most core devs use Linux. And thus, loading more small files might feel like a relatively cheap operation – correct me if I’m wrong).
Should I switch to Linux? I’ll miss PowerPoint.
Should Julia package loading do something different on Windows? (A shot in the dark: maybe bundling many small cache files together in one big tarball would help, to pay the I/O tax only once, instead of many times)
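To make the "pay the I/O tax only once" idea a bit more concrete, here is a rough sketch that compares reading many small files with reading one file of the same total size. It is only an illustration: the helper `read_many`, the file names, and the sizes are made up, the "bundle" is a plain concatenation rather than a real tarball, and it says nothing about how Julia actually lays out its cache files.

```julia
# Compare reading many small files with reading one file of the same total size.
# All names here (read_many, the file layout, the sizes) are made up for illustration.
function read_many(paths)
    total = 0
    for p in paths
        total += length(read(p))    # one open + read + close per file
    end
    return total
end

dir = mktempdir()                   # removed automatically when Julia exits
nfiles, nbytes = 500, 4096
paths = [joinpath(dir, "part_$i.bin") for i in 1:nfiles]
for p in paths
    write(p, rand(UInt8, nbytes))
end

# Concatenate all the small files into a single "bundle" file.
bundle = joinpath(dir, "bundle.bin")
open(bundle, "w") do io
    for p in paths
        write(io, read(p))
    end
end

read_many(paths); read(bundle)      # warm-up: compile and fill the OS file cache
t_many = @elapsed read_many(paths)
t_one = @elapsed read(bundle)
println("many small files: ", t_many, " s; one bundle: ", t_one, " s")
```

Because of the warm-up pass, this mostly measures per-file overhead (open, stat, close) rather than raw disk speed. Running it on Windows and on Linux or WSL2 on the same machine would show whether that overhead differs much.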
Whatever else may be true, one must also not forget that the Windows Julia binaries are cross-compiled on Unix instead of being built natively on Windows. I’ve seen many times DLLs built with the MinGW toolchain being 2, 3, 5, even ~10 times bigger than those built with Visual Studio.
There was a time when Octave had a native Windows build. The main DLL was ~20 MB when built with VS, versus 200 MB in the official cross-compiled distribution. Unfortunately it’s easy to point fingers and forget this little detail.
That might make the difference between Linux and Windows smaller, not bigger. It seems to me likely that the change in loading time will be OS-independent.
Also, get the profiler out and start measuring things before speculating too much about performance numbers. If it actually is I/O that is slow, then which part of I/O? Once that is known, then there can be discussions about how to fix it.
Just for reference and anyone stumbling upon this:
I am now doing my Julia work in WSL2 (windows subsystem for linux), and…
I’m absolutely baffled. Package load times are amazingly fast, in comparison.
So much so that I feel WSL should be the official recommendation for Julia on Windows.
Edit: see post below. Seems like most of this speedup was due to something else, like upgrading Julia from 1.8 to 1.9, and not due to WSL.
My general sense is that Windows performance may get less attention than Linux performance. It seems much more likely that I’m the first one to recognize an issue on Windows. This could be merely because many of the developers and users of Julia tend to primarily work on Linux.
The exchange that Kristoffer linked above revealed an issue that was affecting Windows users more than Linux users. stat is regularly used when accessing a file, since it returns basic information such as whether the file exists and whether you have permission to access it. In particular, a basic stat operation on Linux is about 100x faster than its Windows analog (~100 nanoseconds versus ~10 microseconds). At a small scale a few microseconds is not noticeable, but over many file accesses this adds up, and it becomes noticeable on Windows well before it does on Linux. In that case, Julia was accessing the same file repeatedly. On Linux the cost of doing so is negligible, but on Windows it is noticeable.
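If you want to check the per-call cost of stat on your own machine, a minimal sketch could look like the following. `time_stat` is a hypothetical helper, not something from Julia or from the linked exchange, and the loop count is arbitrary.

```julia
# Rough per-call cost of `stat` on the current platform.
# `time_stat` is a hypothetical helper, not part of Julia.
function time_stat(path::AbstractString; n::Integer = 10_000)
    stat(path)                      # warm-up call, so compilation is not timed
    t0 = time_ns()
    for _ in 1:n
        stat(path)
    end
    return (time_ns() - t0) / n     # average nanoseconds per call
end

path = tempname()
touch(path)                         # create an empty file to stat
ns_per_call = time_stat(path)
println("stat: ", round(ns_per_call / 1000; digits = 2), " µs per call")
rm(path)
```

Running the same script in native Windows and in WSL2 gives a quick comparison. Note that it stats one file over and over, so it shows the best case for any caching the OS does.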
Fortunately due to Windows Subsystem for Linux 2, this is quite easy to evaluate now. Simply profile the operations in Windows and WSL2 and report the output of the following.
using Profile
Profile.clear()
@profile begin
<task that is slower on Windows than WSL2>
end
Profile.print(; C = true, format = :flat, sortedby = :count, mincount = 4)
The difference between Windows and WSL2 does point to something having gone awry in the Win32 API or NTFS file system rather than a fundamental issue in the Windows NT kernel.
Ok I tried a quick benchmark (not yet with any profiling), and it seems WSL is not actually that much faster here.
Seems like the big package load time speedup I observed was rather due to upgrading from Julia 1.8 to 1.9. My apologies
mkdir timetest
julia --startup-file=no --project=timetest -e "using Pkg; pkg\"add https://github.com/tfiers/MyToolbox.jl#11629da\"; @time using MyToolbox"
(warning, this takes a while in a fresh julia install: 152 packages will be precompiled. In parallel, so your cpu is occupied for a bit)
Windows:
4.416453 seconds (2.47 M allocations: 161.221 MiB, 3.22% gc time, 0.59% compilation time)
WSL2:
3.383194 seconds (2.35 M allocations: 154.908 MiB, 5.67% gc time, 0.75% compilation time)
These times vary a bit from run to run, but WSL is consistently about one second faster.
Julia 1.9.0-beta3 on both.
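Since the numbers move around between runs, one way to compare more fairly is to time the import several times, each in a fresh Julia process. A sketch, assuming the package is already installed and precompiled: `load_times` is a made-up helper, and "Dates" is only a stand-in for the package you actually care about. The child process does not get a `--project` flag here, so it uses whatever environment it inherits.

```julia
# Time `using <pkg>` in a fresh Julia process several times, since single runs vary.
# `load_times` is a hypothetical helper; "Dates" stands in for the package of interest.
function load_times(pkg::AbstractString; runs::Integer = 5)
    code = "t = @elapsed @eval using $pkg; print(t)"
    cmd = `$(Base.julia_cmd()) --startup-file=no -e $code`
    # Each run starts a new process, so nothing is already loaded.
    return [parse(Float64, read(cmd, String)) for _ in 1:runs]
end

times = load_times("Dates"; runs = 3)
println("min: ", minimum(times), " s, max: ", maximum(times), " s")
```

Reporting the minimum and maximum over a handful of runs makes it easier to see whether a one-second gap between Windows and WSL2 is larger than the run-to-run noise.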
I don’t have the previous Julia version installed anymore, but this import used to take 40+ seconds until just a day ago (in an IJulia notebook, on Julia 1.8, with other packages in the same environment too). Hence my enthusiasm in the above post.
The experience from IJulia is sometimes strange. I notice that precompilation caching sometimes does not work with IJulia, but it does pick up the cache once I have precompiled with the plain Julia REPL first.
Not to derail this thread too much, but how do you actually use Julia in WSL2? I set up WSL2 and installed Julia, but given that WSL2 is just a headless Ubuntu by default it can’t plot, there’s no GUI applications like VSCode, no browser to run Pluto in etc. I looked into setting things up such that GUI applications can be run but it all seemed pretty cumbersome.
Seems like you need either win 11 or win 10 build 19044+, I recently got a high enough version on my old win 10 and tried it out, worked quite nicely
I also have a vague feeling I have been able to run servers on localhost and access them from windows, but not sure about this and not at that computer right now. A quick google seems to claim this is how windows/wsl should work
but apparently wsl2 lost that ability. Though it seems to be coming back if you have a recent enough build