Precompilation not speeding up startup

question

#1

I was trying to improve the Julia score on the following basic benchmark: https://github.com/Microbiology/JuliaPerlBenchmark (where they conclude that Perl is faster than Julia for processing FASTA files).

I already spotted one error in the benchmark: the data file used was not big enough to provide data for the larger benchmarks. After correcting this, Julia is clearly faster than Perl for larger files (Go is still a lot faster, though), but slowest for smaller files. This is due to the quite significant startup/compilation time of the Julia version (0.6 seconds on my system for one small function).

Since many of my workflows need to run commands implemented in Julia (rather than opening a REPL and typing the commands there), I tried turning the benchmark into a module and applying precompilation to remove or reduce the startup cost:

__precompile__()

module Medlength

export medlength

function medlength(file)
    open(file, "r") do fasta_in
        length_array = Int[]
        seq_length = -1
        for line in eachline(fasta_in)
            if startswith(line, '>')
                if seq_length != -1
                    push!(length_array, seq_length)
                end
                seq_length = 0
            else
                seq_length += length(chomp(line))
            end
        end
        if seq_length != -1
            push!(length_array, seq_length)
        end
        println(median(length_array))
    end
end

precompile(medlength,(String,))

end

On the first run the module does seem to be precompiled to the cache (a message is shown, and precompiling to the cache takes a lot longer than a normal startup: more than 1 second).
However, on subsequent runs the program using the module is not faster at all; the startup still takes 0.6 seconds …
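
One way to see how much of that time is compilation rather than actual work is to time the first and the second call of the function within a single session: the first call includes JIT compilation, the second reuses the compiled code. The following is only a minimal sketch, assuming Julia 1.x (where `median` lives in the `Statistics` standard library); `seq_lengths` and the in-memory sample data are hypothetical stand-ins for the function and FASTA file above.

```julia
using Statistics  # assumption: Julia 1.x, where `median` is in the Statistics stdlib

# Hypothetical variant of `medlength`: reads from any IO and returns the
# sequence lengths instead of printing, so it can be timed on in-memory data.
function seq_lengths(io::IO)
    lengths = Int[]
    seq_length = -1                    # -1 means "no header seen yet"
    for line in eachline(io)           # on Julia 1.x, eachline drops the newline
        if startswith(line, '>')
            seq_length != -1 && push!(lengths, seq_length)
            seq_length = 0
        else
            seq_length += length(line)
        end
    end
    seq_length != -1 && push!(lengths, seq_length)
    return lengths
end

# Tiny made-up FASTA sample, for illustration only.
fasta = ">seq1\nACGT\nAC\n>seq2\nACGTACGT\n>seq3\nA\n"

# First call: the timing includes compiling `seq_lengths` for IOBuffer input.
t_first = @elapsed seq_lengths(IOBuffer(fasta))
# Second call: the compiled code is reused, so this is mostly run time.
t_second = @elapsed seq_lengths(IOBuffer(fasta))

println("median length: ", median(seq_lengths(IOBuffer(fasta))))
println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

The difference between the two timings gives a rough idea of the per-function compilation cost; the time to start the Julia runtime itself comes on top of that when the code is run as a command.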

Is there still something wrong with the module definition?
In searching for solutions, I read somewhere that precompilation only does the parsing and does not compile to machine code. Is this correct?
If it is, can this be changed? It seems that precompiling to machine code should be possible for a local cache.
Finally, if this is solved for modules, would it be feasible to precompile/cache functions in scripts (not modules)? That would avoid potential users writing the typical small benchmark script and concluding that Julia is slow.


#2

I believe I said that and I don’t think it’s correct :slight_smile:. @TotalVerb should clarify why I was wrong and what precompilation actually does since I’m still not totally sure.

Add the module to the system image?

Why not just make it a module? Any script can be a module with a single function.
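
As a rough illustration of that suggestion, the body of a script can be moved into a function inside a module, with a small guard at the bottom so the file still works as a command. This is only a sketch, assuming Julia 1.x; the module name `CountHeaders`, its functions, and the header-counting task are hypothetical and merely stand in for a real script.

```julia
module CountHeaders   # hypothetical module name

export main

# Count FASTA header lines (lines starting with '>') in the given file.
function count_headers(path::AbstractString)
    n = 0
    for line in eachline(path)
        startswith(line, '>') && (n += 1)
    end
    return n
end

# Entry point: takes the command-line arguments as a vector of strings.
function main(args::Vector{String})
    if isempty(args)
        println(stderr, "usage: julia countheaders.jl <fasta-file>")
        return nothing
    end
    n = count_headers(args[1])
    println(n)
    return n
end

end # module

# Call `main` only when this file is executed as a script, not when it is included.
if abspath(PROGRAM_FILE) == @__FILE__
    CountHeaders.main(ARGS)
end
```

This only shows the structure; wrapping a script this way does not by itself remove the startup cost discussed in this thread.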

But…

This benchmark is unbelievably… bad.

If you disregard the zero, the graphs are flat. The zeros were probably only added to hide this fact. So essentially 100% of what is being measured is not the sequence reading… for Julia it’s probably the startup time. For Go, since it’s compiled, it’s probably bounded by how long it takes for the OS to open a file or something silly. The number of sequences needs to be much higher for this to benchmark anything. I copied the picture here in case they ninja-edit it.
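
To get an input with a much higher number of sequences, a synthetic FASTA file could be generated along these lines. This is a sketch, assuming Julia 1.x and its `Random` standard library; the function name `write_random_fasta`, the default sequence length, and the seed are arbitrary choices for illustration and are not taken from the benchmark repository.

```julia
using Random

# Write `n` random DNA records to `path`.
# `seqlen` (length of each sequence) and `seed` are arbitrary illustrative defaults.
function write_random_fasta(path::AbstractString, n::Integer;
                            seqlen::Integer=200, seed::Integer=1)
    rng = MersenneTwister(seed)          # fixed seed so the file is reproducible
    open(path, "w") do io
        for i in 1:n
            println(io, ">seq", i)                          # header line
            println(io, randstring(rng, "ACGT", seqlen))    # one sequence line
        end
    end
    return path
end

# Small demonstration; raise `n` substantially for an actual benchmark.
demo_path = write_random_fasta(tempname(), 1_000)
n_records = count(l -> startswith(l, '>'), eachline(demo_path))
println(n_records, " records written to ", demo_path)
```

With a file that is large enough, the run time should be dominated by the sequence reading rather than by the fixed startup cost.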


#3

It would maybe be interesting to compare this to Bio.jl’s FASTA.Reader:

https://biojulia.github.io/Bio.jl/latest/man/reading/#fasta


#4

The flat graphs were due to the error I found (data set too small). Here is the corrected version (run on my PC):


The Julia2 version uses the precompiled module. Using precompilation (by using a module) should cut down the startup time, but it doesn’t. The question is: why?

There are good reasons to want precompilation caching to work (actually reducing startup time on the second run of a script) and to be easy (making a module takes extra work, and adding modules to the system image is beyond most users). The net is littered with “bad” benchmarks like this one: people try this supposedly fast language on a small problem, see that it is a lot “slower” than their current language, blog about it (we have seen a lot of these “Julia is slow” posts), and move on.
The criticism is not entirely unwarranted either. Keeping a REPL open (the usual suggestion) is not possible in all workflows. When users run many scripts on small as well as large files, taking one second to process a small file (due to startup) does leave a bad impression.