I’m not sure I can help, but when I run the exact code you posted, including println, it prints a 152 MB file, 1.9 M lines, in 780 milliseconds.
Can you try making a Julia script that contains only this line:
open(x -> foreach(println, eachline(x)), ARGS[1])
And then call it from the shell on a plaintext file, redirecting output:
$ julia my_script.jl my_fasta.fna > /dev/null
And compare it to a similar Python script:
import sys
with open(sys.argv[1]) as file:
    for line in file:
        print(line, end="")
On my computer, when I run both on my 1.9 M line (80 chars per line) FASTA file, Python takes 0.85 seconds and Julia takes 1.2 seconds (including precompilation). If your Julia script is significantly slower than that, then we have narrowed down the problem:
- There are no dependencies in the script, so it’s not LibDeflate.jl or any other package
- It doesn’t print to the terminal, so it’s not the terminal being slow
- It can’t be the OS or the filesystem, because then Python wouldn’t be much faster.
In that case it would be a good idea to open an issue on the Julia GitHub to get to the bottom of this.
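If it helps to make the comparison repeatable, here is a rough sketch of a timing harness. It assumes both scripts sit in the current directory; `my_script.py` is a name I made up for the Python script above, and `time_command` and `compare` are just illustrative helpers. It discards stdout, like the `> /dev/null` redirect, and reports the best of a few runs.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs of cmd."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard stdout so terminal speed does not affect the measurement.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
        best = min(best, time.perf_counter() - start)
    return best


def compare(fasta_path):
    """Time the Julia and Python scripts on the same input file."""
    # Assumption: my_script.py is a hypothetical name for the Python script.
    commands = {
        "julia": ["julia", "my_script.jl", fasta_path],
        "python": [sys.executable, "my_script.py", fasta_path],
    }
    return {name: time_command(cmd) for name, cmd in commands.items()}
```

Calling `compare("my_fasta.fna")` returns a dict of seconds per language. The Julia number includes startup and compilation on every run, same as the shell timing.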
Edit: @brendanofallon, the fact that your time spent roughly halves when you replace println with write might suggest that you are running into trouble with IO locking on the operating system. In Julia, println(x) simply calls print(x, '\n'), which does two write operations instead of one. For each write operation, the file is locked and unlocked. Python doesn’t spend time doing that, because it has the Global Interpreter Lock, which prevents threads from running Python code in parallel, and so there is no worry about thread safety.
To check this, please also try the following program:
open(x -> foreach(println, eachline(x)), ARGS[1], lock=false)
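To make the locking hypothesis concrete, here is a toy model in Python. It is not Julia's actual implementation: `LockingWriter` is a made-up class that takes a lock on every write call, so we can count lock acquisitions when each line goes out as two writes (content, then newline) versus one.

```python
import io
import threading


class LockingWriter:
    """Toy stand-in for a stream that locks on every write call."""

    def __init__(self, sink):
        self.sink = sink
        self.lock = threading.Lock()
        self.lock_acquisitions = 0

    def write(self, text):
        with self.lock:
            self.lock_acquisitions += 1
            self.sink.write(text)


def emit_two_writes(writer, lines):
    # Models the println-style path described above: content, then newline.
    for line in lines:
        writer.write(line)
        writer.write("\n")


def emit_one_write(writer, lines):
    # Models the write-style path: content and newline in a single call.
    for line in lines:
        writer.write(line + "\n")


def count_locks(emit, lines):
    """Run one strategy and return (lock acquisitions, text written)."""
    writer = LockingWriter(io.StringIO())
    emit(writer, lines)
    return writer.lock_acquisitions, writer.sink.getvalue()
```

Both strategies produce the same output, but the two-write version takes the lock twice as often. If the per-write locking cost dominates, that would be consistent with the runtime roughly halving.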