Indeed - that becomes a much more involved discussion than pure performance, and it can’t easily be resolved with tables and measurements. Nonetheless, I do think it’s nice to have the foundation of Julia bioinfo packages be so fast that people are not turned away from implementing highly performance-sensitive tools in it, like a short-read aligner or an assembler.
W.r.t. the broader implications for Julia, first notice the absolute timings. On my laptop, Julia crunches 3.6 million reads/sec uncompressed and something like 1.6 million reads/sec compressed. It’s hard to think of a real-life application where any useful work on the reads could be done so quickly that parsing at these speeds becomes the bottleneck.
Second, I think it’s very instructive to compare the approaches to FASTQ parsing represented by klib.h and FASTX.jl. Have a look at the source. FASTX implements its parser through a high-level description of the FASTQ format, thanks to Automa, and lets the machine figure out most of the gritty details. In contrast, in the C code the entire parser is written by hand. I know which parser I’d rather write myself, and which parser’s results I’d trust (assuming Automa itself is well tested). Also, I had some fun trying to break the parsers by seeing what kinds of broken FASTQ files they would accept. Unsurprisingly, FASTX’s parser is more robust (because it’s failsafe by design!).
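
To make the "describe the format, let the machine do the rest" idea concrete, here is a rough sketch in Python using the standard re module. To be clear, this is not Automa and not how FASTX.jl is actually implemented; the names FASTQ_RECORD and parse_fastq are made up for illustration, and it assumes single-line sequences and a trailing newline after every record. The point is only that when the format is written down declaratively, anything that does not match the description is rejected by default instead of being silently accepted.

```python
import re

# Declarative description of one four-line FASTQ record (illustrative only).
FASTQ_RECORD = re.compile(
    r"@(?P<header>[^\n]+)\n"    # line 1: '@' followed by a non-empty header
    r"(?P<seq>[A-Za-z]+)\n"     # line 2: sequence letters
    r"\+(?P<header2>[^\n]*)\n"  # line 3: '+' optionally repeating the header
    r"(?P<qual>[!-~]+)\n"       # line 4: printable ASCII quality characters
)


def parse_fastq(text):
    """Return a list of (header, sequence, quality) tuples, or raise ValueError."""
    records = []
    pos = 0
    while pos < len(text):
        m = FASTQ_RECORD.match(text, pos)
        if m is None:
            # Anything that does not fit the description is an error.
            raise ValueError(f"malformed FASTQ record at offset {pos}")
        if len(m["seq"]) != len(m["qual"]):
            raise ValueError(f"sequence/quality length mismatch at offset {pos}")
        if m["header2"] and m["header2"] != m["header"]:
            raise ValueError(f"second header does not match first at offset {pos}")
        records.append((m["header"], m["seq"], m["qual"]))
        pos = m.end()
    return records
```

A hand-written parser has to remember to perform each of these checks explicitly; here the pattern itself rules out most broken inputs, and only the cross-field checks (equal lengths, matching headers) need extra code.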