Bad performance of eachline() on STDIN

I’m not sure I’d call that significant. I suspect Perl is doing some trickery, since you are not using the line being read. Try these two programs; they both report their elapsed time in seconds, so hopefully we are comparing apples to apples:

Perl:

#!/bin/perl -w
use strict;
use warnings;
use utf8;
use Time::HiRes qw(time);

my $start   = time();
my $counter = 0;
while(<>) {
    $counter = $counter + length($_);
}
my $total = time() - $start;
print "STDIN read time ($counter chars): $total\n";

Julia:

using Dates

function from_stdin()
    start = now()
    counter = 0
    for line in eachline(stdin; keep=true)
        counter += length(line)
    end
    total = Dates.value(now() - start)/1000
    println("STDIN read time ($counter chars): $total")
end
from_stdin()

For me, on a 100-million-line file, I get:

$ ./ptest < 100_000_000.txt 
STDIN read time (1388888898 chars): 11.8912749290466

$ julia test.jl < 100_000_000.txt 
STDIN read time (1388888898 chars): 12.91

So that is a difference of about 1 second over a runtime of about 12 seconds (roughly 8%). Perl still runs faster on this small test, and I have no clue where the overhead is. I suspect Julia might be copying bytes out of the buffer when it converts them to a string, and Perl might not be, but that’s just a wild guess.
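
One thing that might be worth ruling out before chasing the buffer-copy guess: the two programs may not be counting the same thing. Perl's length($_) on input read without an encoding layer returns bytes, while Julia's length(line) counts characters and has to walk the UTF-8 data to do so. Here is a minimal sketch of a byte-counting variant. The name count_bytes and the io argument are my own additions for illustration, and I have not measured whether this closes the gap:

```julia
# Hypothetical variant of from_stdin(), not from the original post.
# It counts bytes instead of characters and accepts any IO object,
# so it can be tried on an IOBuffer as well as on stdin.
function count_bytes(io::IO=stdin)
    start = time()                  # wall-clock seconds as a Float64
    counter = 0
    for line in eachline(io; keep=true)
        counter += sizeof(line)     # constant-time byte count of the String
    end
    return counter, time() - start
end
```

Calling count_bytes() with no argument reads from stdin like the original; println("STDIN read time ($(counter) bytes): $total") after unpacking the returned tuple gives comparable output. It also uses time() rather than now(), since now() only has millisecond resolution.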
