Latency with v1.11-rc3 REPL over SSH

I see noticeably more latency in 1.11-rc3 than in 1.10.5. When working locally it isn’t noticeable, but when using Julia through SSH it is quite noticeable.

More info: it’s worst right after starting the REPL and gets smoother over time, as if functions were compiling in the background.

Starting julia with --trace-compile=stderr should print a message for each method that gets compiled. Replacing stderr with a file name should output to that file. Could you try comparing what gets compiled on REPL startup with and without SSH?
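One way to make that comparison, as a rough sketch: record one trace without SSH and one over SSH, then list the statements that appear only in the SSH trace. The file names trace_local.log and trace_ssh.log and the helper compare_traces are made up for illustration; only the --trace-compile flag comes from the post above.

```shell
# Record the traces first (interactive: type a few characters, then exit()):
#   julia --trace-compile=trace_local.log    # without SSH
#   julia --trace-compile=trace_ssh.log      # in an SSH session

# compare_traces LOCAL SSH
# Hypothetical helper: prints the lines that occur only in the SSH trace.
compare_traces() {
    local_sorted=$(mktemp)
    ssh_sorted=$(mktemp)
    # comm needs sorted input; LC_ALL=C keeps sort and comm in agreement
    LC_ALL=C sort -u "$1" > "$local_sorted"
    LC_ALL=C sort -u "$2" > "$ssh_sorted"
    # -1 drops lines unique to LOCAL, -3 drops lines common to both
    LC_ALL=C comm -13 "$local_sorted" "$ssh_sorted"
    rm -f "$local_sorted" "$ssh_sorted"
}

# Usage: compare_traces trace_local.log trace_ssh.log
```

An empty result would suggest the same methods get compiled in both sessions, which would point away from compilation as the cause of the SSH lag.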

Could you describe the latency in more detail? Do you experience it when you are typing? After you push enter?


Julia takes a bit longer to start, and there is lag both while I type and after I press enter. It’s also present when printing to stdout.

There’s also a substantial slowdown, especially when precompiling packages: it takes me longer to precompile on the HPC cluster than it takes to precompile and train the model on my laptop. The only explanation I can think of is heavy file I/O, since that may be the only place where the cluster and my laptop differ much.

There’s no difference between the two (with and without SSH).


Can you share which CPU you use on the cluster and which on the laptop?
You can find out with the command:

cat /proc/cpuinfo | grep 'name' | uniq

Often server CPUs have much lower clock speeds than desktop (and even laptop) CPUs.

Cluster: AMD EPYC 7662 64 cores
Laptop: i7 13700H

Another cluster has a Xeon Silver (probably slower single-core than my laptop), but it’s always slow whether I’m using the powerful EPYC or the Xeon.

In 1.10 there isn’t much slowdown.

The EPYC might be powerful, but it still has much lower single-core performance than your laptop.

In addition, 1.11 moved a lot of code out of the default system image into packages, which slows down start-up if they are not correctly precompiled.

Maybe Julia somehow fails to recognize the architecture precisely? If so, it should be possible to regain (most of the) performance using the --cpu-target option to julia, I think:

https://docs.julialang.org/en/v1.11-dev/manual/command-line-interface/


It appears to be faster when I set the CPU target to znver2 (Zen 2). I will test some more.


With Julia 1.11 you also have znver3 and znver4 available.


I thought that since this is a Zen 2 CPU I should set it to znver2. Or is it irrelevant?

Actually, speaking of target, I often get the message “cache misses: target mismatch” while precompiling packages, even now with -C znver2.


It is relevant: you should choose the highest CPU version that matches your target machine.

Did you clear your cache after adding the parameter znver2?

No, which is probably why I get the cache misses.

You could delete the .julia folder if you are brave.

As long as you are using juliaup everything gets re-installed automatically.

To clear out the compiled cache, deleting only the .julia/compiled/v1.11 directory should be enough. There is no need to delete the entire .julia directory.
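As a minimal sketch of that narrower cleanup: the helper name clear_compiled_cache is made up, and it assumes the default depot location ~/.julia unless another path is passed in. It removes only the compiled directory for one Julia version and leaves everything else in the depot alone.

```shell
# clear_compiled_cache [DEPOT] [VERSION]
# Hypothetical helper: removes only the precompile cache for one Julia version.
# Assumes the default depot ~/.julia and version v1.11 unless arguments are given.
clear_compiled_cache() {
    depot="${1:-$HOME/.julia}"
    version="${2:-v1.11}"
    cache_dir="$depot/compiled/$version"
    if [ -d "$cache_dir" ]; then
        rm -rf -- "$cache_dir"
        echo "removed $cache_dir"
    else
        echo "nothing to remove at $cache_dir"
    fi
}

# Usage: clear_compiled_cache                        # ~/.julia/compiled/v1.11
#        clear_compiled_cache /path/to/depot v1.11   # placeholder depot path
```

Packages get precompiled again the next time they are loaded, so the first session after the cleanup will be slower.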


It’s probably better in the latest RC4. I got a fix in that might apply.

So it seems that the performance problem disappeared after I deleted the cache. Even without specifying the target CPU it’s still good. No idea why that might be. Maybe I was using packages compiled with rc2/rc1.

Is there a way to know which target cpu was detected by Julia?
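One thing that may help here, offered as a sketch rather than a definitive answer: Sys.CPU_NAME holds the CPU name reported by the LLVM bundled with Julia, so printing it shows what Julia sees on a given node. On a Zen 2 machine the expectation would be znver2, though that is an assumption and not something checked in this thread. The wrapper name show_julia_cpu is made up for illustration.

```shell
# show_julia_cpu: hypothetical wrapper around a one-line Julia call.
show_julia_cpu() {
    if command -v julia >/dev/null 2>&1; then
        # Sys.CPU_NAME: CPU name string as detected by Julia's bundled LLVM
        julia -e 'println(Sys.CPU_NAME)'
    else
        echo "julia not found on PATH"
    fi
}

# Usage: show_julia_cpu
```

Running it on both the cluster node and the laptop gives a quick way to compare what each Julia installation detects.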
