From the community news: "Biofast is a small benchmark for evaluating the performance of programming languages and implementations on a few common tasks in the field of Bioinformatics. It currently includes two benchmarks: FASTQ parsing and interval query."
I recreated the C, Julia and Python part of the first benchmark. A few observations:
The library Klib.jl is a bit obscure, seemingly just used by one guy, and not very well maintained. I don't think it's useful to pick that code apart; it's just one random guy's Julia code. It's much more interesting to look at the FASTQ benchmark using FASTX.jl from BioJulia.
I don't see the same relative numbers between C, Python, and Julia that he does - for me, Julia does about 30% better on non-zipped data. It might be a matter of hardware, or the fact that I ran FASTX v1.1, not v1.0.
Looking at his Julia code, it looks great. There's some type instability, but union splitting takes care of that, so it's no performance concern.
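For anyone unfamiliar with union splitting, here's a toy sketch (my example, not his code): when a variable's type is a small Union, the compiler emits a fast branch per member instead of falling back to dynamic dispatch.

```julia
# Toy example of union splitting (not from the benchmark code).
# `pos` is Union{Int, Nothing}; the compiler generates one fast
# branch per union member, so the instability is essentially free.
function first_newline(data::Vector{UInt8})
    pos = findfirst(==(UInt8('\n')), data)  # ::Union{Int, Nothing}
    return pos === nothing ? 0 : pos
end
```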
For the FASTX code, I noticed the FASTQ parser does not use @inbounds. Since it reads the input byte by byte, this has a rather large effect: adding it takes another 30% off the time for non-zipped input. With this optimization, Julia's running time is about 1.25-1.3 times that of his C code. Not bad! We probably shouldn't actually remove the bounds checks, though, because IMO the extra safety is worth a little extra time, especially considering that FASTQ files are essentially always gzipped.
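To make the effect concrete, here's a minimal sketch (my illustration, not the actual FASTX parser loop) of the kind of byte-by-byte hot loop where removing bounds checks pays off:

```julia
# Illustrative byte loop, not FASTX's actual parser. @inbounds removes
# the bounds check on data[i]; in real parsers the cursor moves in
# data-dependent ways, so the compiler cannot always elide the check.
function count_lines(data::Vector{UInt8})
    n = 0
    @inbounds for i in 1:length(data)
        data[i] == UInt8('\n') && (n += 1)
    end
    return n
end
```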
For the gzipped data, it appears that CodecZlib is very slow, something like 4x slower than zlib itself. That's very strange, considering it just calls zlib directly. Profiling confirms that almost all time is spent in the ccall line. I created this issue - Jaakko Ruohio on Slack discovered that about half the problem was that zlib was not compiled with -O3, but there are still more gains to be had; I just don't know where to find them. After this fix, Julia is about 1.4x slower than C on gzipped data - whereas we ought to be very near C speed here, something like a factor of 1.1. If anyone can fix the CodecZlib issue, that would be great.
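For reference, here is roughly how the gzipped path can be timed in isolation (a sketch; the file name is made up). GzipDecompressorStream forwards each read to zlib's inflate via ccall, which is where the profile points:

```julia
using CodecZlib

# Drain a gzipped file through CodecZlib and count the bytes, to time
# decompression separately from parsing. The path is hypothetical.
function drain(path::AbstractString)
    io = GzipDecompressorStream(open(path))
    buf = Vector{UInt8}(undef, 2^16)
    nbytes = 0
    while !eof(io)
        nbytes += readbytes!(io, buf)
    end
    close(io)
    return nbytes
end

# @time drain("reads.fastq.gz")
```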
FASTX's implementation of the FASTQ format and its parsing is very efficient. Apart from the boundscheck issue and the zlib issue, the only place left I can find to optimize is Automa.jl itself - which would always be welcome, but I don't know how.
Edit: Oh yes, and the elephant in the room: the Julia code goes through the 5 million FASTQ reads in 5 seconds, whereas it takes around 11 when called from the command line, due to compile-time latency. That puts us way behind C speed, just a tad ahead of Python.
Edit 2: It seems the zlib source code that CodecZlib obtains from zlib.net is not as optimized as the zlib library that ships with macOS. The remaining difference in performance is therefore due to an upstream inefficiency. I don't know if we can find a faster implementation of zlib to use in CodecZlib - presumably macOS can make certain assumptions about what OS and CPU its users have, which Julia can't.
The other elephant in the room: how much does any of this actually matter vs. other, perhaps less benchmarkable, aspects of the productivity gained by using Julia and BioJulia?
Indeed - that becomes a much more involved discussion than pure performance, and can't easily be resolved with tables and measurements. Nonetheless, I do think it's nice to have the foundation of Julia bioinfo packages be so fast that people would not be turned away from implementing highly performance-sensitive tools in it, like a short-read aligner or an assembler.
W.r.t. the broader implications for Julia, first notice the absolute timings. On my laptop, Julia crunches 3.6 million reads/sec uncompressed and something like 1.6 million reads/sec compressed. It's hard to think of a real-life application where any useful work could be done so quickly that these timings matter much.
Second, I think it's very instructive to compare the approaches to FASTQ parsing represented by klib.h vs FASTX.jl. Have a look at the source. FASTX implements its parser through a high-level description of the FASTQ format, thanks to Automa, and lets the machine figure out most of the gritty details. In contrast, in the C code the entire parser is written by hand. I know which parser I'd rather write myself, and which parser I'd trust results from (assuming Automa itself is well tested). Also, I had some fun trying to break the parsers by seeing what kind of broken FASTQ files they would accept. Unsurprisingly, FASTX's parser is more robust (because it's fail-safe by design!).
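To give a flavour of the difference, here is a heavily simplified sketch in the Automa.jl style (the pattern below is mine and much cruder than FASTX's actual machine, which also handles multi-line records and validation actions):

```julia
import Automa
import Automa.RegExp: @re_str
const re = Automa.RegExp

# Declarative description of a (simplified!) four-line FASTQ record.
# Automa compiles this into a state machine, so the byte-level parsing
# code is machine-generated rather than written by hand.
fastq = let
    header  = re"@[^\n]*\n"
    seq     = re"[A-Za-z]*\n"
    plus    = re"\+[^\n]*\n"
    quality = re"[!-~]*\n"
    re.rep(header * seq * plus * quality)
end
machine = Automa.compile(fastq)
```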
At the risk of sounding catty, I also despair at people STILL loading Bio.jl and complaining about it. It has been unsupported for long enough, the header in the readme of the repo is quite clear, and the status badge says "inactive".
Also, reading:
"Probably my Julia implementations here will get most slaps. I have seen quite a few you-are-holding-the-phone-wrong type of responses from Julia supporters."
Maybe I'm crazy, but essentially saying "oh, I knew my implementation would be criticised" … when we can see the implementation IS affecting the benchmark … is not a valid defence of the implementation.
Hm, maybe there's actually something actionable there, like adding a large red banner to the Bio.jl GitHub page, or printing a warning when you import Bio.jl.
I think we should release a new patch version where the only change is that it prints a warning saying it's deprecated when people try to load it.
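Something like this would do it (a sketch of the idea, not an actual PR): Julia runs a module's __init__ on load, so the warning would fire exactly when someone does `using Bio`.

```julia
module Bio

# Fire a deprecation warning the moment the package is loaded.
function __init__()
    @warn "Bio.jl is unmaintained. Please use the individual BioJulia " *
          "packages instead (e.g. FASTX.jl for FASTA/FASTQ parsing)."
end

# ... existing package code unchanged ...

end # module
```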
This line is the most interesting one I see in the post:
"Also importantly, the Julia developers do not value backward compatibility. There may be a python2-to-3 like transition in several years if they still hold their views by then. I wouldn't take the risk."
Yeah, it's not like we run the tests for all registered packages on every release, fix the problems in Julia they reveal, make patches to keep the internals more backward compatible when we see people relying on them, open PRs and issues on packages that used internals that have now changed, etc. etc.
And it's not like I have a spreadsheet with e.g. all the test regressions for packages on 1.5 that we are looking into (even though 90% of the time the test errors are due to bad tests in packages or reliance on Julia internals). Oh, wait, I do; it's here: PkgEval 1.5 - Google Sheets.
OK, I forgot about that. It's about expectations, then. I never expected 0.6 to 1.0 to have no breaking changes. Massive effort was put into 0.7 to make the transition easier. So yeah.
Actually, I expected things to break from 0.6 to 1.0.
Surely things broke really badly in the Python 1 → 2 transition, but maybe Python wasn't as big when that happened, hence it wasn't the huge drama that 2 → 3 was.
Didn't dig through the spreadsheet (I'm on mobile), but I'm guessing you're not pulling in packages from registries other than General, right?
To be clear, I think that would be a totally reasonable choice, but it's another thing us BioJulia folks need to consider with our separate registry. I don't know if you've ever publicized that spreadsheet before, but it seems likely that there are other "unofficial" community support things that we don't have access to. @Ward9250 @bicycle1885 something to think about.
It's interesting to see how fast the Julia code is (after tuning), and also how condensed it is (it could be a bit more so) - and the same goes for the Crystal code.
Author here. This is a benchmark: we would like to compare different implementations of the same algorithm, so Klib.jl is the most relevant here. I want to thank @bicycle1885 again for improving this Julia implementation. I really appreciate it.
My main concern with Julia is its silent performance traps: correct results but poor performance. The poor performance here is caused by small details like length vs sizeof and type conversion. These are not apparent to new users, and I guess even experienced devs may need to pay extra attention. Currently, Julia is still slow on bedcov. I probably messed up the typing somewhere, but I am not sure where, given that the code is just doing array accesses.
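For readers who haven't hit this particular trap, a small illustration (not the benchmark code): on a String, length counts characters while sizeof counts bytes, so both are correct for ASCII data but have very different costs.

```julia
s = "ACGTACGT"

# Both return 8 for ASCII data, but length must decode UTF-8 (O(n))
# while sizeof just reads a stored byte count (O(1)).
length(s)   # character count
sizeof(s)   # byte count

# Byte-level access without constructing a Char:
codeunit(s, 1) == UInt8('A')  # true
```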
I am more impressed by Crystal, to be honest. My first working Crystal implementation is close to the reported speed, and the Nim FASTQ parser was also fast on my first try. A good language should set few traps.
As to other points in this thread:
This is not a zlib benchmark. The reported number now comes from the system zlib, thanks to @bicycle1885's suggestion. I can understand why Julia ships its own zlib, but this did catch me off guard, again.
The klib fastx parser is more flexible: it parses multi-line fasta and fastq at the same time, and you don't need to tell klib the input format. At least to me, this is an important feature. With a regex parser, Fastx.jl can't do it. In addition, klib has no Julia dependencies and is more than 10 times faster to compile. Given Julia's long startup, lightweight libraries are more appreciated.
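For what it's worth, the autodetection part is usually done by dispatching on the first byte of the input (a sketch of the general idea only, not klib's actual logic):

```julia
# '>' starts a FASTA header, '@' a FASTQ header; dispatch on the
# leading byte. Illustrative sketch, not klib's implementation.
function sniff_format(data::Vector{UInt8})
    isempty(data) && error("empty input")
    data[1] == UInt8('>') && return :fasta
    data[1] == UInt8('@') && return :fastq
    error("unrecognized format")
end
```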
Productivity: on these two tasks, Crystal and Nim are also expressive. I spent less time in Crystal and Nim to achieve decent performance, whereas I still have performance issues with Julia on bedcov.
When you read through a 100 GB gzipped fastq file, performance matters: 30 min vs 1 hr is a huge difference. High-performance tools often put fastq reading in a separate thread because it is too slow otherwise. Zlib is the main bottleneck here, but parsing time should be minimized as well.
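In Julia terms, that separate-thread pattern might look something like this (a sketch with made-up names, file path, and chunk size; the C tools mentioned do this with dedicated I/O threads):

```julia
using CodecZlib

# Decompress on a background task, handing fixed-size chunks to the
# parser through a bounded Channel. Names and sizes are illustrative.
function chunked_reader(path::AbstractString; chunksize::Int=2^20)
    return Channel{Vector{UInt8}}(4; spawn=true) do ch
        io = GzipDecompressorStream(open(path))
        while !eof(io)
            put!(ch, read(io, chunksize))
        end
        close(io)
    end
end

# for chunk in chunked_reader("reads.fastq.gz")  # hypothetical file
#     # parse FASTQ records out of `chunk` on the consumer side
# end
```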
Fastx.jl. I installed Fastx.jl following the instructions on its website and got 1.0.0. It didn't work with CodecZlib; I manually fixed that.
Bio.jl. I didn't know about Fastx.jl. I googled "biojulia" and went to biojulia.net. The website has no documentation and doesn't tell me Fastx.jl is the right way to parse fastq. I thought BioJulia was like all the other Bio* projects, with a single module, so I ended up with a Bio.jl script first. Also, Bio.jl is often the top Google result; for example, you can try "biojulia interval overlap". By the way, I wanted to implement bedcov with BioJulia, but I couldn't find the right API and gave up.