Why is printing to a terminal slow?

Hello,
I tried just a simple benchmark of printing a million numbers in an array.

Surprisingly, it takes a lot of time. I imagine this might be connected to how julia communicates with the system, printing parts not the whole chunk. Is that the case? How do I make it faster?

I have modified the julia example to show that most of the time is spent printing but similar times occur when using the Python equivalent:

a = []

function populate()
	global a
	for i = 1:1000000
		push!(a, i)
	end
end

function printit()
	println(a)
end

populate()
@time printit()
$ time julia gen.jl
....
 17.392193 seconds (6.53 M allocations: 213.003 MiB, 0.39% gc time)

real	0m17.653s
user	0m10.038s
sys	0m7.569s
a = []
for i in range(1, 1000001):
	a.append(i)

print(a)
$ time python3 gen.py
....
real	0m0.399s
user	0m0.187s
sys	0m0.051s
2 Likes

The python version takes 46 seconds in my terminal. Are you redirecting the output of the Python program to /dev/null or something?

No, I see the numbers in my terminal. I use alacritty terminal. EDIT: I just tried with Urxvt and Uxterm and it made no difference.

$ uname -srm
Linux 5.6.16-1-MANJARO x86_64
$ python3 -V
Python 3.8.3
$ julia -v
julia version 1.4.2

I run the examples as given above.

I can reproduce that by installing and using alacritty. There’s a couple of natural follow on questions:

  1. what is Python doing to make printing to alacritty this fast?
  2. why are you printing that much data to a terminal?

Printing to a file, these are the same general speed:

python3 print_array.py > out.py.txt   0.19s user 0.03s system  97% cpu 0.230 total
julia   print_array.jl > out.jl.txt   0.75s user 0.15s system 134% cpu 0.671 total

It would be good to get to the bottom of the first question and see if there’s something Julia can do to be faster in this case, but there’s also a reason it hasn’t come up before: printing a lot of data to a terminal is not generally something one wants to do.

7 Likes

I can reproduce @asmar’s results using the built-in terminal on Ubuntu 18.04. For time python3 gen.py I get:

real	0m0.882s
user	0m0.181s
sys	0m0.040s

and just doing @time println(collect(1:1000000)) in Julia 1.4.2 gives: 13.133670 seconds (6.13 M allocations: 200.272 MiB, 0.35% gc time)

1 Like
  1. I am sorry, I edited my comment previously as I didn’t expect it to have any significance - I have tried with Uxterm and Urxvt instead of alacritty and the performance was the same. I am curious what terminal emulator you, @StefanKarpinski , use? I have no idea as to why it might work well with some terminals and not the others.

  2. I am learning julia and was just fiddling around. :wink: However, I feel like this is something worth looking at anyway. Sometimes you may want to just run a script multiple times while debugging and not modify the script not to print the huge data.

I use iTerm2.

Sure. Definitely good to look into, as I said. (Issue opened: #36639)

Sometimes you may want to just run a script multiple times while debugging and not modify the script not to print the huge data.

But why print huge data to the terminal in the first place? To a file, ok, that makes sense (and is fast). To a terminal, what’s the point? You can’t look at it all since it’s huge and you’re not saving it to a file… so why print it at all?

3 Likes

There might be some buffering optimization in Python: doing

for i in range(1, 1000001):
        print(i)

instead of printing the array uses 5.5s instead of 0.8s in my terminal (still faster than Julia with 16s though)

Makes sense given how the first script’s output shows up all at once. Terminal emulators are really not optimized for large quantities of text output…

I cannot test that, unfortunately.

A similar example has happened to me many times. Let’s say I am working with an array of 10 elements regularly. I am printing each item and in the end I print a value calculated in the loop. I realize I have a mistake in my computation algorithm and test with a large input to get a feeling of what might have happened wrong - thus I am only concerned about the final print and just ignore the preceding mess. :slight_smile: But sure, you’re right that this is not the typical use case, @StefanKarpinski.

1 Like

Yes, I have also tried this. This has been my initial inspiration of what might be happening in Julia.

Thanks for bringing this up. If you encounter any other issues, don’t hesitate to post.

2 Likes

https://github.com/JuliaLang/julia/issues/36639

I want to add, that on windows, this is deadly slow. I actually expected this (before I tried it).
The script from the first post takes 122 seconds (in cmd).
PowerShell did not seem faster (EDIT: 400s).
I also tried wt: it took ages as well (EDIT: 280s)
I am not sure what terminal vscode uses by default, but it was definitely the slowest (EDIT: 1414s)

Notably I did run some of these in parallel.
I wonder if the size of the terminal (full screen, minimized, ‘normal’) has any impact…

Maybe someone with Windows could challenge/confirm my numbers?

VS Code uses a modified version of Xterm.js.

The size of the terminal window should have minimal impact. Either the terminal supports fast rendering of large texts regardless of window size or it doesn’t.

Does Python version run faster? If so, it is most likely the aforementioned buffering.

I have found that using this script produces the same output but in hundreds of milliseconds:

a = []
for i in 1:1000000
	push!(a, i)
end

println("Any[" * join(a, ", ") * "]")

Thus, it seems like an issue with printing arrays to me rather than IO in general. I am currently going through the code to try to narrow it down. Should I continue on GitHub or here?

I believe that the problem is that when manually joining numbers to a string then the content of the string is printed with single printwrite call in [1] whereas when printing a vector, you end up calling write in [1] for each number from [2].

[1] https://github.com/JuliaLang/julia/blob/5f2bb1d194d220894d130db5be06bec68c726f0c/base/strings/io.jl#L184-L187
[2] https://github.com/JuliaLang/julia/blob/5f2bb1d194d220894d130db5be06bec68c726f0c/base/show.jl#L973

2 Likes

Yes. Python is actually blazing fast :slight_smile: (about seven seconds (cmd) according to my watch (my Python know-how is near zero)).

The code you posted above is 8 seconds in Julia/cmd. So this is comparable to Python for me.

1 Like

The only time I print tons of data to the REPL is by accident. When I am impatient, I just reach for good old Ctrl-C.

So if this can be fixed trivially, then that would be nice, otherwise I don’t think this is something that needs to be optimized.

2 Likes