Is there a way to prevent Julia from updating the registries almost every time?

Just a further comment about the size of registry transfers. You are talking about hundreds of megabytes, which suggests to me that it is the Git version of the registry that is being downloaded (with 1.7, the tarball is only 3 MB). If you are on 1.6, you could try to change the registry type with registry rm General followed by registry add General, which should change the transfer to a tarball of the data only, if I am not mistaken.

My 2 cents, hoping I am not too wrong…

Not really an edge case, since systems where antivirus software slowly checks every file update have similar characteristics. That is why Julia 1.7 introduces, and defaults to, the option of not unpacking the registry at all. I recommend that you try the Julia 1.7 beta to see if it works better for you. You should make sure to remove your unpacked registry first though, e.g. pkg> registry rm General.
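For anyone who prefers to do this from a script rather than the Pkg REPL, here is a minimal sketch of the remove-and-re-add step. The name readd_general is made up for this example; the Pkg.Registry calls are the functional counterparts of the pkg> commands. Whether the registry comes back as a Git clone, unpacked files, or a packed tarball depends on your Julia version and Pkg server settings, so treat this as a sketch rather than a guaranteed fix.

```julia
using Pkg

# Hypothetical helper (not part of Pkg): remove the local copy of the
# General registry and install it again. Needs network access when called.
function readd_general()
    Pkg.Registry.rm("General")    # same as `pkg> registry rm General`
    Pkg.Registry.add("General")   # same as `pkg> registry add General`
    Pkg.Registry.status()         # print the registries that are now installed
end
```

Nothing happens until you call readd_general() yourself, so you can paste the definition into a session without touching your depot.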

Where does this 100 MB number come from? It should be about 3.5 MB / update.

For reference, on Windows 10 and a fairly new notebook:

EDIT: this is on Julia 1.5.4

julia> @time Pkg.Registry.update()
   Updating registry at `C:\Users\bernhard.konig\.julia\registries\General`
   Updating git-repo `https://github.com/JuliaRegistries/General.git`
  8.320919 seconds (1.89 M allocations: 102.974 MiB, 1.16% gc time)

That’s the amount of memory allocated by Julia via GC during the timed operation, not the size of anything transferred. Git updates transfer less data than tarball updates because git only sends the changes since the last update (with history, so it’s a bit larger than just sending a patch, but still). The negotiation can take a while but it should not be transferring hundreds of megabytes of data.
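To illustrate that the allocation figure is unrelated to the network, here is a small sketch that times a purely local computation. It assumes Julia 1.5 or later, where @timed returns a named tuple with time and bytes fields; the array size is an arbitrary choice for the example.

```julia
# Purely local work: build a random vector and sum it. No network involved.
stats = @timed sum(rand(1_000_000))

# `bytes` is the memory allocated on the Julia heap while the expression ran,
# i.e. the same quantity that `@time` prints as "MiB".
println("elapsed:   ", stats.time, " s")
println("allocated: ", round(stats.bytes / 2^20, digits = 2), " MiB")
```

This reports several MiB of allocations although nothing was downloaded, so the 102.974 MiB above says nothing about the size of the registry transfer.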

Simple: I checked the network data usage on my mobile phone (network is shared via wifi hotspot) before and after the registry update. And it increased by roughly 100 MB. It is good to know that it shouldn’t take that much.

Thanks, @jdad. My registry copy was inherited from Julia v1.0; I wasn’t aware that this could cause an issue. I tried what you suggested, and I will check the network usage for future registry updates.

Do you have the registry as a Git repository? 100+ MB is the size of the entire cloned repository.
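In case it helps to check this, here is a rough sketch that guesses how a registry is stored by looking at the files in the depot. The helper name registry_kind and the returned symbols are made up for this example, and the layout it checks (a .git directory for a clone, a Registry.toml for unpacked files, a General.toml next to the registry folder for the packed 1.7 format) is my assumption about how Pkg lays things out, not an official API.

```julia
# Hypothetical helper: guess how the registry `name` is stored in `depot`.
function registry_kind(depot::AbstractString, name::AbstractString = "General")
    regs = joinpath(depot, "registries")
    dir = joinpath(regs, name)
    if isdir(joinpath(dir, ".git"))
        return :git_clone          # full Git repository with history
    elseif isfile(joinpath(dir, "Registry.toml"))
        return :unpacked_files     # plain files, no Git metadata
    elseif isfile(joinpath(regs, name * ".toml"))
        return :packed_tarball     # assumed Julia 1.7+ packed layout
    else
        return :not_found
    end
end

# Inspect the first depot (usually ~/.julia), if there is one.
isempty(DEPOT_PATH) || println(registry_kind(first(DEPOT_PATH)))
```

If this prints :git_clone, the 100+ MB figure matches a full clone sitting on disk, although a normal update should still only fetch the changes.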

Even so, a registry update should only do git fetch and not reclone the whole repo…

Yes, I have an unpacked Git repo, and I’m also surprised that the update used so much traffic. I agree it shouldn’t. I’m hoping that it was just a local quirk, and that the solution suggested by @jdad, i.e., removing and re-adding the General registry, will solve it.

I don’t think that should do anything.

  • If you are using the PkgServer, it does not do incremental updates anyway, so it will download the full 3.5 MB file every time.
  • If you are using git, it will just need to reclone the whole registry before it goes back to incrementally fetching.

My guess is something was off with your measurement.

If the last update was a very long time in the past then the incremental update might have been large. Frequent incremental updates should not be that large.

Haha, last time I did this it took an hour or more to delete. I will grab the 1.7 beta!

Yikes. The things they call file systems these days.

My post was mainly about the time (about 8 seconds compared to frederik’s 0.04 seconds, albeit on a different Julia version). I think my last fetch was only a few days ago.
Personally I don’t see why I should care at all about the data transferred.

Offtopic: is there a good technical explanation somewhere online for why Windows filesystem operations are so slow? I don’t understand why they’ve tolerated this being so bad for so long.

I think the basic answer is that NTFS is 25 years old. Mac got HFS+ in 1998 and APFS in 2017; Linux got ext3, ext4, and now ZFS. Windows just never got a good file system (probably for backwards compatibility reasons). OpenZFS runs on Windows though, so I have some hope they will switch to it over time, but that’s fairly optimistic on my part.

See also https://github.com/JuliaLang/Pkg.jl/issues/2014#issuecomment-690936584 for slow tar unpacking on Windows.

For the registry, we hit one of the points mentioned in: Gregory Szorc's Digital Home | Surprisingly Slow

Closing File Handles on Windows


While I didn’t realize it at the time, the cause for this was/is Windows Defender. Windows Defender (and other anti-virus / scanning software) typically work on Windows by installing what’s called a filesystem filter driver. This is a kernel driver that essentially hooks itself into the kernel and receives callbacks on I/O and filesystem events. It turns out the close file callback triggers scanning of written data. And this scanning appears to occur synchronously, blocking CloseHandle() from returning. This adds milliseconds of overhead. The net effect is that file mutation I/O on Windows is drastically reduced by Windows Defender and other A/V scanners.
As far as I can tell, as long as Windows Defender (and presumably other A/V scanners) are running, there’s no way to make the Windows I/O APIs consistently fast.
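To get a feeling for this per-file overhead on your own machine, here is a minimal sketch that times creating many tiny files, which is roughly the access pattern of unpacking a registry. The function name time_small_writes and the default file count are made up for this example, and this is not a rigorous benchmark.

```julia
# Hypothetical micro-benchmark: create `n` one-byte files in a temporary
# directory and return the elapsed time in seconds. Each `write` call opens,
# writes, and closes one file, so the close overhead is paid `n` times.
function time_small_writes(n::Integer = 1_000)
    mktempdir() do dir
        @elapsed for i in 1:n
            write(joinpath(dir, "f$i.txt"), "x")
        end
    end
end

println("seconds for 1000 small files: ", time_small_writes())
```

Comparing the result with and without real-time scanning enabled (or between Windows and Linux on similar hardware) should give a rough idea of how much the scanner adds per file.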

I presume (hope) this also applies to 1.8?