Why is printing to a terminal slow?

asmar · July 13, 2020, 4:34pm

Hello,
I tried just a simple benchmark of printing a million numbers in an array.

Surprisingly, it takes a lot of time. I imagine this might be connected to how Julia communicates with the system, writing the output in many small parts rather than as one chunk. Is that the case? How do I make it faster?

I have modified the Julia example to show that most of the time is spent printing, but similar times occur when using the Python equivalent:

a = []

function populate()
	global a
	for i = 1:1000000
		push!(a, i)
	end
end

function printit()
	println(a)
end

populate()
@time printit()

$ time julia gen.jl
....
 17.392193 seconds (6.53 M allocations: 213.003 MiB, 0.39% gc time)

real	0m17.653s
user	0m10.038s
sys	0m7.569s

a = []
for i in range(1, 1000001):
	a.append(i)

print(a)

$ time python3 gen.py
....
real	0m0.399s
user	0m0.187s
sys	0m0.051s

StefanKarpinski · July 13, 2020, 4:42pm

The Python version takes 46 seconds in my terminal. Are you redirecting the output of the Python program to /dev/null or something?

asmar · July 13, 2020, 4:44pm

No, I see the numbers in my terminal. I use the Alacritty terminal. EDIT: I just tried with Urxvt and Uxterm and it made no difference.

asmar · July 13, 2020, 4:46pm

$ uname -srm
Linux 5.6.16-1-MANJARO x86_64
$ python3 -V
Python 3.8.3
$ julia -v
julia version 1.4.2

I ran the examples exactly as given above.

StefanKarpinski · July 13, 2020, 5:01pm

I can reproduce that by installing and using alacritty. There are a couple of natural follow-up questions:

What is Python doing to make printing to alacritty this fast?
Why are you printing that much data to a terminal?

When printing to a file, the two are in the same general speed range:

python3 print_array.py > out.py.txt   0.19s user 0.03s system  97% cpu 0.230 total
julia   print_array.jl > out.jl.txt   0.75s user 0.15s system 134% cpu 0.671 total

It would be good to get to the bottom of the first question and see if there’s something Julia can do to be faster in this case, but there’s also a reason it hasn’t come up before: printing a lot of data to a terminal is not generally something one wants to do.

rdeits · July 13, 2020, 5:03pm

I can reproduce @asmar’s results using the built-in terminal on Ubuntu 18.04. For time python3 gen.py I get:

real	0m0.882s
user	0m0.181s
sys	0m0.040s

and just doing @time println(collect(1:1000000)) in Julia 1.4.2 gives: 13.133670 seconds (6.13 M allocations: 200.272 MiB, 0.35% gc time)

asmar · July 13, 2020, 5:06pm

I am sorry, I edited my earlier comment because I didn’t expect it to matter: I have tried Uxterm and Urxvt instead of alacritty and the performance was the same. I am curious which terminal emulator you use, @StefanKarpinski? I have no idea why it might work well with some terminals and not with others.
I am learning Julia and was just fiddling around. However, I feel like this is something worth looking at anyway. Sometimes you may want to just run a script multiple times while debugging without modifying it to suppress the huge output.

StefanKarpinski · July 13, 2020, 5:10pm

I use iTerm2.

Sure. Definitely good to look into, as I said. (Issue opened: #36639)

"Sometimes you may want to just run a script multiple times while debugging without modifying it to suppress the huge output."

But why print huge data to the terminal in the first place? To a file, ok, that makes sense (and is fast). To a terminal, what’s the point? You can’t look at it all since it’s huge and you’re not saving it to a file… so why print it at all?

rfourquet · July 13, 2020, 5:11pm

There might be some buffering optimization in Python: doing

for i in range(1, 1000001):
        print(i)

instead of printing the array takes 5.5 s instead of 0.8 s in my terminal (still faster than Julia at 16 s, though).
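
To illustrate the buffering hypothesis, here is a minimal Python sketch. It assumes stdout is line-buffered, as CPython typically makes it when attached to a terminal, and counts how many writes reach the underlying raw stream. `CountingRaw` and `count_raw_writes` are hypothetical helper names, not part of any library, and the list is shortened to 1,000 items to keep it quick.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream: discards data but counts write() calls."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def writable(self):
        return True

    def write(self, b):
        # Each call here stands in for one write reaching the terminal.
        self.calls += 1
        return len(b)


def count_raw_writes(emit):
    """Run emit(out) on a line-buffered text stream and return the raw write count."""
    raw = CountingRaw()
    out = io.TextIOWrapper(
        io.BufferedWriter(raw), encoding="utf-8", line_buffering=True
    )
    emit(out)
    out.flush()
    return raw.calls


def per_item(out):
    # One print per number: every newline triggers a flush.
    for i in range(1, 1001):
        print(i, file=out)


def whole_list(out):
    # One print for the whole list: a single newline at the very end.
    print(list(range(1, 1001)), file=out)


loop_calls = count_raw_writes(per_item)
list_calls = count_raw_writes(whole_list)
print("per-item raw writes:", loop_calls)
print("whole-list raw writes:", list_calls)
```

Under these assumptions the per-item loop reaches the raw stream at least once per line, while the whole-list print gets there in very few writes, which would be consistent with the timing gap above.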

ToucheSir · July 13, 2020, 5:15pm

Makes sense given how the first script’s output shows up all at once. Terminal emulators are really not optimized for large quantities of text output…

asmar · July 13, 2020, 5:17pm

I cannot test that, unfortunately.

A similar situation has happened to me many times. Let’s say I regularly work with an array of 10 elements. I print each item, and at the end I print a value calculated in the loop. I realize I have a mistake in my algorithm and test with a large input to get a feel for what might have gone wrong. In that case I am only concerned with the final print and just ignore the preceding mess. But sure, you’re right that this is not the typical use case, @StefanKarpinski.

asmar · July 13, 2020, 5:19pm

Yes, I have also tried this. It was my first hint about what might be happening in Julia.

StefanKarpinski · July 13, 2020, 5:23pm

Thanks for bringing this up. If you encounter any other issues, don’t hesitate to post.

asmar · July 13, 2020, 5:35pm

https://github.com/JuliaLang/julia/issues/36639

bernhard · July 13, 2020, 6:35pm

I want to add that on Windows this is deadly slow. I actually expected this (before I tried it).
The script from the first post takes 122 seconds (in cmd).
PowerShell did not seem faster (EDIT: 400s).
I also tried wt (Windows Terminal): it took ages as well (EDIT: 280s).
I am not sure what terminal VS Code uses by default, but it was definitely the slowest (EDIT: 1414s).

Note that I ran some of these in parallel.
I wonder if the size of the terminal (full screen, minimized, ‘normal’) has any impact…

Maybe someone with Windows could challenge/confirm my numbers?

asmar · July 13, 2020, 7:48pm

VS Code uses a modified version of Xterm.js.

The size of the terminal window should have minimal impact. Either the terminal supports fast rendering of large texts regardless of window size or it doesn’t.

Does the Python version run faster? If so, it is most likely the aforementioned buffering.

asmar · July 13, 2020, 8:34pm

I have found that this script produces the same output, but in hundreds of milliseconds:

a = []
for i in 1:1000000
	push!(a, i)
end

println("Any[" * join(a, ", ") * "]")

Thus, it seems to me like an issue with printing arrays rather than with IO in general. I am currently going through the code to try to narrow it down. Should I continue on GitHub or here?

asmar · July 13, 2020, 10:18pm

I believe the problem is this: when the numbers are manually joined into a string, the whole string is printed with a single print → write call in [1], whereas when printing a vector, write in [1] ends up being called once for each number from [2].

[1] https://github.com/JuliaLang/julia/blob/5f2bb1d194d220894d130db5be06bec68c726f0c/base/strings/io.jl#L184-L187
[2] https://github.com/JuliaLang/julia/blob/5f2bb1d194d220894d130db5be06bec68c726f0c/base/show.jl#L973
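
The difference between the two paths can be sketched in Python. This is only an analogy, not Julia's actual code: `CountingSink` and `show_vector` are hypothetical names, and each `write` on the sink stands in for one unbuffered write to the terminal. In Julia, rendering into an `IOBuffer` first would play the role of `buf` here.

```python
import io


class CountingSink:
    """Hypothetical unbuffered sink: each write() stands in for one terminal write."""

    def __init__(self):
        self.calls = 0
        self.chars = 0

    def write(self, text):
        self.calls += 1
        self.chars += len(text)
        return len(text)


def show_vector(out, items):
    """Emit a vector piece by piece: one write per bracket, element and separator."""
    out.write("[")
    for k, item in enumerate(items):
        if k:
            out.write(", ")
        out.write(str(item))
    out.write("]\n")


items = list(range(1, 1001))

# Path 1: every small piece goes straight to the sink.
direct = CountingSink()
show_vector(direct, items)

# Path 2: render into memory first, then hand the sink one big string.
staged = CountingSink()
buf = io.StringIO()
show_vector(buf, items)
staged.write(buf.getvalue())

print(direct.calls, staged.calls)  # 2001 1
```

Both paths deliver the same characters; only the number of writes differs, which is the effect the join workaround exploits.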

bernhard · July 14, 2020, 7:11am

Yes. Python is actually blazing fast: about seven seconds in cmd according to my watch (my Python know-how is near zero).

The code you posted above takes 8 seconds in Julia/cmd. So this is comparable to Python for me.

Tamas_Papp · July 14, 2020, 8:35am

The only time I print tons of data to the REPL is by accident. When I am impatient, I just reach for good old Ctrl-C.

So if this can be fixed trivially, that would be nice; otherwise I don’t think this is something that needs to be optimized.
