Hello,
I tried a simple benchmark: printing an array of a million numbers.
Surprisingly, it takes a lot of time. I suspect this is related to how Julia communicates with the system, printing the output piece by piece rather than as one chunk. Is that the case? How do I make it faster?
I have modified the Julia example to show that most of the time is spent printing, but similar times occur when using the Python equivalent:
a = []
function populate()
    global a
    for i = 1:1000000
        push!(a, i)
    end
end
function printit()
    println(a)
end
populate()
@time printit()
$ time julia gen.jl
....
17.392193 seconds (6.53 M allocations: 213.003 MiB, 0.39% gc time)
real 0m17.653s
user 0m10.038s
sys 0m7.569s
a = []
for i in range(1, 1000001):
    a.append(i)
print(a)
$ time python3 gen.py
....
real 0m0.399s
user 0m0.187s
sys 0m0.051s
I can reproduce that by installing and using alacritty. There are a couple of natural follow-on questions:
what is Python doing to make printing to alacritty this fast?
why are you printing that much data to a terminal?
Printing to a file, these are the same general speed:
python3 print_array.py > out.py.txt 0.19s user 0.03s system 97% cpu 0.230 total
julia print_array.jl > out.jl.txt 0.75s user 0.15s system 134% cpu 0.671 total
It would be good to get to the bottom of the first question and see if there’s something Julia can do to be faster in this case, but there’s also a reason it hasn’t come up before: printing a lot of data to a terminal is not generally something one wants to do.
I am sorry, I edited my comment previously as I didn’t expect it to have any significance - I have tried with Uxterm and Urxvt instead of alacritty and the performance was the same. I am curious what terminal emulator you, @StefanKarpinski , use? I have no idea as to why it might work well with some terminals and not the others.
I am learning Julia and was just fiddling around. However, I feel like this is something worth looking at anyway. Sometimes you may want to just run a script multiple times while debugging, without modifying it to stop it from printing the huge data.
Sure. Definitely good to look into, as I said. (Issue opened: #36639)
Sometimes you may want to just run a script multiple times while debugging, without modifying it to stop it from printing the huge data.
But why print huge data to the terminal in the first place? To a file, ok, that makes sense (and is fast). To a terminal, what’s the point? You can’t look at it all since it’s huge and you’re not saving it to a file… so why print it at all?
Something similar has happened to me many times. Let’s say I regularly work with an array of 10 elements. I print each item, and at the end I print a value calculated in the loop. I realize I have a mistake in my algorithm and test with a large input to get a feeling for what might have gone wrong. In that case I am only concerned with the final print and just ignore the preceding mess. But sure, you’re right that this is not the typical use case, @StefanKarpinski.
I want to add that on Windows this is deadly slow. I actually expected this (before I tried it).
The script from the first post takes 122 seconds (in cmd).
PowerShell did not seem faster (EDIT: 400s).
I also tried Windows Terminal (wt): it took ages as well (EDIT: 280s).
I am not sure what terminal VS Code uses by default, but it was definitely the slowest (EDIT: 1414s).
Note that I ran some of these in parallel.
I wonder if the size of the terminal (full screen, minimized, ‘normal’) has any impact…
Maybe someone with Windows could challenge/confirm my numbers?
The size of the terminal window should have minimal impact. Either the terminal supports fast rendering of large texts regardless of window size or it doesn’t.
Does the Python version run faster? If so, it is most likely the aforementioned buffering.
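To make the buffering idea concrete, here is a minimal sketch in Python. It is only an illustration of the general mechanism, not a description of what CPython or Julia does internally when talking to a terminal. `CountingRaw` is a made-up class that stands in for a slow terminal and counts how many writes actually reach it, with and without a buffering layer in between.

```python
import io


class CountingRaw(io.RawIOBase):
    """Raw byte sink that counts write() calls; stands in for a slow terminal."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.nbytes = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1
        self.nbytes += len(data)
        return len(data)  # report that everything was accepted


# One small chunk per number, similar to printing element by element.
chunks = [f"{i}, ".encode() for i in range(1, 1001)]

# Unbuffered: every small chunk goes straight to the sink.
direct = CountingRaw()
for chunk in chunks:
    direct.write(chunk)

# Buffered: chunks accumulate in memory and are passed on in large blocks.
sink = CountingRaw()
buffered = io.BufferedWriter(sink, buffer_size=64 * 1024)
for chunk in chunks:
    buffered.write(chunk)
buffered.flush()

print("direct writes:", direct.calls)
print("buffered writes reaching the sink:", sink.calls)
```

Both sinks receive the same bytes. Only the number of calls that reach the sink differs, which is the part that would matter if each call to the terminal is expensive.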
I have found that using this script produces the same output but in hundreds of milliseconds:
a = []
for i in 1:1000000
    push!(a, i)
end
println("Any[" * join(a, ", ") * "]")
Thus, it seems like an issue with printing arrays to me rather than IO in general. I am currently going through the code to try to narrow it down. Should I continue on GitHub or here?
I believe the problem is that when the numbers are manually joined into a string, the content of the string is printed with a single print → write call in [1], whereas when printing a vector, you end up calling write in [1] for each number from [2].
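The difference between the two code paths can be illustrated with a small counting experiment. This is a sketch in Python rather than Julia, and it does not reproduce Julia's actual print/write code in [1] and [2]. `CountingWriter`, `print_per_element`, and `print_joined` are hypothetical names made up for this example. It only shows how the number of write calls differs for the same output.

```python
class CountingWriter:
    """Text sink that records how many times write() is called."""

    def __init__(self):
        self.calls = 0
        self.chars = 0

    def write(self, text):
        self.calls += 1
        self.chars += len(text)
        return len(text)


def print_per_element(values, out):
    """Write brackets, separators, and each element with separate calls."""
    out.write("[")
    for i, value in enumerate(values):
        if i:
            out.write(", ")  # separator before every element except the first
        out.write(str(value))
    out.write("]\n")


def print_joined(values, out):
    """Assemble the whole line in memory, then write it with one call."""
    out.write("[" + ", ".join(map(str, values)) + "]\n")


values = list(range(1, 1001))
per_element = CountingWriter()
joined = CountingWriter()
print_per_element(values, per_element)
print_joined(values, joined)

print(per_element.calls, joined.calls)  # prints "2001 1"
```

If each of those calls reaches a slow terminal separately, the per-call overhead would add up, which would be consistent with the timings reported in this thread. Whether that fully explains the gap is still an open question here.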