Package load speed: Windows vs Linux

I have a feeling that Julia package import times are significantly faster on Linux than on Windows.
See e.g. this thread where:

  • @giordano’s seven-year-old Linux laptop loads Unitful in 0.44 seconds
  • my two-year-old Windows laptop (i7, 8-core, SSD) loads Unitful in 1.38 seconds (3x as slow)

(Both measurements were made in a fresh tmp env, in a --startup-file=no session. My Julia version was 1.8.1)
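
A minimal sketch of how such a measurement could be scripted, so that it runs the same way on both machines. The helper name `fresh_import_time` and its `project` keyword are hypothetical (not from the thread); it assumes the package is already installed and precompiled in the given project, and it times only the `using` statement inside a fresh child process, not process startup.

```julia
# Hypothetical helper: time `using <pkg>` in a fresh Julia process.
function fresh_import_time(pkg::AbstractString;
                           project::AbstractString = dirname(Base.active_project()))
    # The timer runs inside the child process, so Julia startup itself is excluded.
    code = "t0 = time_ns(); using $pkg; print((time_ns() - t0) / 1e9)"
    # Same flags as in the post: no startup file, explicit project.
    cmd = `$(Base.julia_cmd()) --startup-file=no --project=$project -e $code`
    # The child prints the elapsed seconds; parse them back into a Float64.
    return parse(Float64, read(cmd, String))
end
```

Calling `fresh_import_time("Unitful")` a few times and comparing the results across operating systems would give numbers comparable to the ones above, provided the project has Unitful installed.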

Quoting Mosè Giordano in that same thread, “I/O is pretty terrible on Windows”

(For an explanation of why Windows I/O is slower than Linux, see e.g. these two reddit threads: one, two. The second links to "I Contribute to the Windows Kernel. We Are Slower Than Other Operating Systems. Here Is Why.")


With native code caching in upcoming Julia versions, these import times will increase further (see left column in image below)

So, I’m a bit worried. (Especially because, I presume, most core devs use Linux. And thus, loading more small files might feel like a relatively cheap operation – correct me if I’m wrong).

Should I switch to Linux? I’ll miss PowerPoint.
Should Julia package loading do something different on Windows? (A shot in the dark: maybe bundling many small cache files together in one big tarball would help, to pay the I/O tax only once, instead of many times)
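
To check whether the "many small files" hypothesis holds on a given machine, here is a rough sketch that reads the same bytes once as many small files and once as a single concatenated file. This is not how Julia's package loader works; the function name, file names and sizes are made up for illustration, and the OS file cache will affect the numbers.

```julia
# Hypothetical experiment: N small files versus one bundle with the same content.
function compare_small_vs_bundled(; nfiles::Integer = 200, nbytes::Integer = 4096)
    mktempdir() do dir
        payload = rand(UInt8, nbytes)
        paths = [joinpath(dir, "part_$i.bin") for i in 1:nfiles]
        foreach(p -> write(p, payload), paths)

        # One file holding the concatenation of all the small ones.
        bundle = joinpath(dir, "bundle.bin")
        open(bundle, "w") do io
            for _ in 1:nfiles
                write(io, payload)
            end
        end

        # Warm-up reads, so compilation is not part of the timings.
        read(first(paths)); read(bundle)

        total_small = 0
        t_small = @elapsed for p in paths
            total_small += length(read(p))   # open + read + close per file
        end
        total_bundle = 0
        t_bundle = @elapsed begin
            total_bundle = length(read(bundle))  # a single open + read + close
        end
        return (; t_small, t_bundle, total_small, total_bundle)
    end
end
```

Running `compare_small_vs_bundled()` on Windows and on Linux (or WSL2) on the same machine would show whether the per-file overhead differs enough to matter.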



Nice plot by @msjgriffiths (source) of Tim Holy’s data in #47184:

Other things being possibly true, one must also not forget that Windows Julia binaries are built on Unix instead of being natively Windows. I’ve seen many times dlls built with the MinGW tool-chain being 2, 3, 5, ~10 times bigger than those built with VisualStudio.

There was a time when Octave had a native Windows build. The main VS DLL was ~20 MB, versus 200 MB in the official cross-compiled distribution. Unfortunately it’s easy to point fingers and forget this little detail.

That’s not true. And even if that were true, it doesn’t address the fact I/O on Windows is intrinsically slow also outside of the Julia world.

What part is not true?

The part that I quoted: the official Julia release is compiled natively on Windows.

With VisualStudio?

I think you’re getting off-topic.

That might make the difference between Linux and Windows smaller, not bigger. It seems to me likely that the change in loading time will be OS-independent.

Also, get the profiler out and start measuring things before speculating too much about performance numbers. If it actually is I/O that is slow, then which part of I/O? Once that is known, then there can be discussions about how to fix it.

I’d love to but am not well acquainted with low-level systems software. How’d you go about that?

You can learn :).

As a reference, see the productive discussion in “Loading is 10x slower under Windows than WSL on the same machine · Issue #40570 · JuliaLang/julia · GitHub”, which led to e.g. “prevent doing excessive file system checks in require calls by KristofferC · Pull Request #40890 · JuliaLang/julia · GitHub”.

Thanks for the pointers, hadn’t seen those yet (I only searched the discourse for keywords)

Just for reference and anyone stumbling upon this:
I am now doing my Julia work in WSL2 (Windows Subsystem for Linux), and…

I’m absolutely baffled. Package load times are amazingly fast, in comparison.
So much so that I feel WSL should be the official recommendation for Julia on Windows.

Edit: see post below. Seems like most of this speedup was due to something else, like upgrading Julia from 1.8 to 1.9, and not due to WSL.

Can you give some more concrete examples? I’m interested in looking at this, but e.g. the report in “Windows is slower than WSL2 in loading GR, Preferences, Julia 1.9.0-beta2 - #5 by kristoffer.carlsson” failed to repro for me.

My general sense is that Windows performance may get less attention than Linux performance. It seems much more likely that I’m the first one to recognize an issue on Windows. This could be merely because many of the developers and users of Julia tend to primarily work on Linux.

The exchange that Kristoffer linked above revealed an issue that was affecting Windows users more than Linux users. stat is regularly used when accessing a file, since it determines basic information such as whether the file exists and whether you have permission to access it. In particular, a basic stat operation on Linux is about 100x faster than its Windows analog (~100 nanoseconds versus ~10 microseconds). A few microseconds are not noticeable at a small scale, but over many file accesses the cost becomes noticeable on Windows long before it does on Linux. In that case, Julia was accessing the same file repeatedly. On Linux the cost of doing so is negligible, but on Windows it is noticeable.
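
To put those figures in perspective: 100,000 stat calls would cost about 1 second at ~10 microseconds each, versus about 10 milliseconds at ~100 nanoseconds each. A minimal sketch for measuring the per-call cost on your own system follows; the helper name `stat_cost` is made up for illustration, and the result will depend on the file system and on caching.

```julia
# Hypothetical helper: average wall-clock cost of one `stat` call on `path`.
function stat_cost(path::AbstractString; n::Integer = 10_000)
    stat(path)  # warm-up call, so compilation is not included in the timing
    t = @elapsed for _ in 1:n
        stat(path)
    end
    return t / n  # seconds per call
end
```

For example, `stat_cost(tempdir()) * 1e6` gives the cost in microseconds; comparing that number between Windows and WSL2 on the same machine is a quick check of the claim above.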

Fortunately due to Windows Subsystem for Linux 2, this is quite easy to evaluate now. Simply profile the operations in Windows and WSL2 and report the output of the following.

using Profile
Profile.clear()
@profile begin
   <task that is slower on Windows than WSL2>
end
Profile.print(; C = true, format = :flat, sortedby = :count, mincount = 4)

The difference between Windows and WSL2 does point to something having gone awry in the Win32 API or NTFS file system rather than a fundamental issue in the Windows NT kernel.

Ok, I tried a quick benchmark (not yet with any profiling), and it seems WSL is not actually that much faster here.
Seems like the big package load time speedup I observed was rather due to upgrading from Julia 1.8 to 1.9. My apologies :sweat_smile:

In this test I use a kind-of-messy collection of re-exports and tools, tfiers/MyToolbox.jl on GitHub (“To import at the start of an interactive session. Re-exports useful libraries and defines miscellaneous utility functions that don't fit in a proper package.”)

Protocol:

mkdir timetest
julia --startup-file=no --project=timetest -e "using Pkg; pkg\"add https://github.com/tfiers/MyToolbox.jl#11629da\"; @time using MyToolbox"

(warning, this takes a while in a fresh julia install: 152 packages will be precompiled. In parallel, so your cpu is occupied for a bit)

Windows:

  4.416453 seconds (2.47 M allocations: 161.221 MiB, 3.22% gc time, 0.59% compilation time)

WSL2:

  3.383194 seconds (2.35 M allocations: 154.908 MiB, 5.67% gc time, 0.75% compilation time)

These times vary a bit from run to run, but WSL is consistently about one second faster.
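
As a quick check of the two single-run timings reported above (only those two numbers are used; no other runs are assumed):

```julia
# Timings copied from the two `@time using MyToolbox` outputs above, in seconds.
windows = 4.416453
wsl2 = 3.383194

gap = windows - wsl2     # absolute difference in seconds
ratio = windows / wsl2   # how many times slower Windows was in this run

println("Windows is ", round(gap; digits = 2), " s slower (",
        round(100 * (ratio - 1); digits = 0), "% more time)")
```

So in this run the gap is about one second, or roughly 30% more time on Windows, far from the 3x difference seen for Unitful on Julia 1.8.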

Julia 1.9.0-beta3 on both.


I don’t have the previous Julia version installed anymore, but this import used to take 40+ seconds until just a day ago (in an IJulia notebook, on Julia 1.8, with other packages in the same environment too). Hence my enthusiasm in the above post.

The experience from IJulia is sometimes strange. I notice that precompilation caching sometimes does not work with IJulia, but it does pick up the cache once I have precompiled with the plain Julia REPL first.

Not to derail this thread too much, but how do you actually use Julia in WSL2? I set up WSL2 and installed Julia, but given that WSL2 is just a headless Ubuntu by default it can’t plot, there’s no GUI applications like VSCode, no browser to run Pluto in etc. I looked into setting things up such that GUI applications can be run but it all seemed pretty cumbersome.

Seems like you need either win 11 or win 10 build 19044+, I recently got a high enough version on my old win 10 and tried it out, worked quite nicely

I also have a vague feeling I have been able to run servers on localhost and access them from Windows, but I’m not sure about this and not at that computer right now. A quick google seems to claim this is how Windows/WSL should work, but apparently WSL2 lost that ability. Though it seems to be coming back if you have a recent enough build.

You can use VS Code and WSL together: see “Developing in the Windows Subsystem for Linux with Visual Studio Code”.

This has been around for several years.

VS Code remote indeed, and for plotting: jupyter notebooks (I use Mambaforge conda), with IJulia.

The notebook server is simply available on localhost on Windows, through some magic.
I.e. I can confirm what @albheim researched (I have Win11 and WSL2).