I see noticeably higher latency in 1.11-rc3 than in 1.10.5. When working locally it isn’t noticeable, but when using Julia through SSH it is quite pronounced.
More info: it’s worst right after starting the REPL and gets smoother over time, as if functions were compiling in the background.
Starting Julia with --trace-compile=stderr prints a message for each method that gets compiled; replacing stderr with a file name writes that output to the file instead. Could you try comparing what gets compiled on REPL startup with and without SSH?
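Something like this, for example (the file names are just placeholders):

```
# locally
julia --trace-compile=compiles_local.jl

# over SSH
julia --trace-compile=compiles_ssh.jl
```

Diffing the two files afterwards should show whether the SSH session triggers extra compilation.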
Julia takes a bit longer to start, and there’s a lag when I start typing and when I press enter. It’s also present when printing to stdout.
There’s also a substantial slowdown, especially when precompiling packages: it takes longer to precompile on the HPC cluster than it takes to precompile and train the model on my laptop. The only explanation I can think of is file I/O, since that may be the only place where there’s a big difference between the cluster and my laptop.
Another cluster has a Xeon Silver (probably slower single-core performance than my laptop), but it’s always slow whether I’m using the more powerful EPYC or the Xeon.
Maybe Julia somehow fails to recognize the architecture precisely? If so, it should be possible to regain (most of) the performance using the --cpu-target option to julia, I think.
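Something along these lines, for example (the znver2 name below is just a guess for an EPYC node; native is usually the first thing to try, and I think the same value can also be set via the JULIA_CPU_TARGET environment variable so package precompilation picks it up):

```
# ask Julia to target the host CPU's full feature set
julia --cpu-target=native

# or name the microarchitecture explicitly, e.g. for a Zen 2 EPYC node
julia --cpu-target=znver2
```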
So it seems that the performance problem disappeared after I deleted the cache. Even without specifying the target CPU it’s still fast. No idea why that might be; maybe I was still using package caches compiled under rc2/rc1.
Is there a way to know which target CPU was detected by Julia?
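(In case it’s useful: I believe Sys.CPU_NAME reports the CPU name Julia/LLVM detected for the host, and versioninfo(verbose=true) prints detailed CPU information as well; I’m not sure either is exactly what codegen targets, but it’s a start.)

```julia
# CPU name as detected for the host machine
println(Sys.CPU_NAME)

# verbose versioninfo prints detailed CPU information as well
using InteractiveUtils
versioninfo(verbose=true)
```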