"A Comparison of Three Programming Languages for a Full-Fledged Next-Generation Sequencing Tool" selects...Go?

Was wondering if anyone has had a chance to look at this paper posted to bioRxiv yesterday? I’ve only briefly skimmed the paper and associated code, but I find the results…interesting. The authors appear to have taken a tool written in Common Lisp and reimplemented it three times, in Java, Go, and C++. In the end, they conclude that Java is faster than Go, but Go uses significantly less memory. Most surprisingly, they claim that the C++17 version was slower and also consumed more memory.

It seems the authors never even considered Julia. I would be curious, though, to see how Julia could stack up against their reimplementation in Go.

Probably they tested before Julia 1.0 (released on Aug 9, 2018):

  • as far as I can see, the posted paper tested with Go 1.9.5;
    • and the next Go version, Go 1.9.6, was released on Apr 30, 2018.

The Go implementation was started in 07/2014.

Given the large number of programming languages and finite resources, we should not be surprised when a benchmark/comparison does not consider Julia.

I think that in the short run, especially in commercial settings, familiarity is an important factor. If someone already knows language X, and X is reasonably fast, it may make sense to use it instead of Y, even if Y could potentially be faster but would require months to learn.

In fact, I am more surprised that the original was written in CL in 2014. But Pascal Costanza is a well-known Lisp programmer.

I guess the only way to see that is to reimplement it in Julia.

I agree that it’s not surprising the authors didn’t consider Julia (especially given the ever-increasing landscape of languages available). It seems, though, like it would’ve fit their overall criteria with the possible exception of multi-threading capabilities. That said, this result does seem to give some credence to the plan to give Julia “Go-like” multithreading.

I would guess that Julia would actually be a rather good fit for the authors’ problem. I’m not that familiar with how Go optimizes math-heavy code or with the particulars of its type system, but I would venture to guess that equivalent, type-stable Julia code should be able to do at least as well, if not better. I think the only area where Julia might lose out to Go is the GC, since Go has had more time and more resources to optimize its GC implementation, but I would expect that eventually Julia would win out even here.

By my crude count the project seems to be only around 14kloc of Go, and around 4kloc of that is for the SAM parser (which BioJulia conveniently already provides)…seems doable. :slight_smile:
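For anyone who wants to redo a crude count like this on their own checkout, here is a minimal sketch in Python. It only counts non-blank lines in `.go` files, so comments are included and the totals will differ from what a dedicated tool reports. The function name `count_loc` and the directory used in the usage note are my own placeholders, not anything from the paper or its repository.

```python
import os


def count_loc(root, ext=".go"):
    """Return (total, per_file) counts of non-blank lines.

    Walks `root` recursively and looks only at files whose name ends in `ext`.
    Comment lines are counted too, so this is a rough upper bound on code size.
    """
    per_file = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(ext):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as fh:
                    # A line counts if it has any non-whitespace character.
                    per_file[path] = sum(1 for line in fh if line.strip())
    return sum(per_file.values()), per_file
```

Calling `count_loc("path/to/checkout")` (a hypothetical path) gives the total plus a per-file breakdown, which is enough to see how much of the total sits in one subdirectory such as a parser.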

The question is not whether this is doable (it trivially is, since it has been done in another language), but whether someone will do it.

IMO writing and optimizing ~10 kLOC is not something one does just to prove a point about Julia; so unless someone has an interest in the actual tool itself, it is very unlikely that it will be done.

That said, I think we should learn to wrap our heads around the fact that Julia is becoming a mature language that already has a track record of being fast and convenient, so there is not much pressure to prove this in every context if there are no other payoffs. Because of this, I find it kind of heartening that people are not stressing too much about the shootout benchmarks, because I think it means that they are just busy using Julia productively, or improving the language or the libraries.

not stressing too much about the benchmarks game

otoh I usually take lack of interest in contributing better programs as an opportunity to remove a language implementation and try something else :wink: