Startup Speed

Hello everybody! I’ve been learning Julia and I’ve been loving it!!

I’m curious about startup times…

I have this super simple code, but I’m baffled at how it takes about 13 seconds to start! The 13 s is on my Apple M1, where Julia support still doesn’t seem to be fully stable, so I also tested it on an i9, and it took 18 s there.

All the code is doing is reading a CSV file and nothing else. After googling the issue, it seems like it’s a known issue due to Julia compiling at startup.

I was wondering how can that be helped? Some places mentioned a --precompile flag that seems to be deprecated, other talk about sysimages…

Is there a Julia way to deal with this issue? I’ve also heard about keeping a long-running Julia process and using it interactively from there, but then how do I run a file after making code changes? Would I need to worry about variables declared before? Is there a Julia client that, when run, just sends commands to the long-running Julia process?

For reference, this is the code I’m running:

@time using CSV
@time using DataFrames
@time using Dates

dateformat = "yyyy-mm-dd HH:MM:SS"
types = [DateTime, Float64, Float64, Float64, Float64, Int64]

@time data = CSV.read("data.txt", DataFrame; dateformat = dateformat, types = types)

and I’m running it by calling

julia code.jl

and these are the times that I’m getting

1.911461 seconds (5.97 M allocations: 363.832 MiB, 5.30% gc time, 88.35% compilation time)
0.745739 seconds (1.82 M allocations: 124.596 MiB, 4.18% gc time)
0.001233 seconds (293 allocations: 28.188 KiB)
10.758393 seconds (39.59 M allocations: 1.697 GiB, 4.92% gc time, 99.87% compilation time)

Thanks a lot for your help!

As far as I understand, DataFrames.jl and CSV.jl are heavily optimized for the benchmarks. This could (still?) have some repercussions if you try to work with small datasets…

Some tips here: Development workflow · JuliaNotes.jl

And you may be interested in GitHub - dmolina/DaemonMode.jl: Client-Daemon workflow to run faster scripts in Julia
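In case it helps, a minimal sketch of the DaemonMode.jl workflow, based on its README (flags and paths are just for illustration):

```shell
# Terminal 1: start a long-running Julia daemon that keeps compiled code in memory
julia --startup-file=no -e 'using DaemonMode; serve()' &

# Terminal 2: send your script to the daemon instead of starting a fresh Julia process
julia --startup-file=no -e 'using DaemonMode; runargs()' code.jl
```

Only the first run pays the compilation cost; later runs reuse the daemon’s already-compiled code.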

But basically: put the code inside functions, keep the session alive, and use Revise.
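For example, a sketch of that workflow — assuming the script is saved as analysis.jl (file and function names are just for illustration):

```julia
# analysis.jl — the script's logic wrapped in a function so that
# compiled code is reused across calls within one session.
using CSV, DataFrames, Dates

function load_data(path)
    fmt = dateformat"yyyy-mm-dd HH:MM:SS"   # DateFormat built once
    types = [DateTime, Float64, Float64, Float64, Float64, Int64]
    CSV.read(path, DataFrame; dateformat = fmt, types = types)
end

# In a long-lived REPL session you would then do something like:
#   using Revise
#   includet("analysis.jl")        # tracked include: edits are picked up automatically
#   data = load_data("data.txt")   # fast after the first call — no restart needed
```

Because `includet` tracks the file, you can edit `load_data`, call it again, and Revise reloads the changes without restarting Julia.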

2 Likes

Hi @Imiq,

Good tips as usual, but do you think this is it? As far as I remember, @Raf did some remarkable work on startup time here, and I’m not sure if it is in production already.

The Parsers.jl fix that sped up CSV.jl was merged, but DateTime precompilation was rolled back due to some bugs on Windows during precompilation that no one really understands.

That might be what you are hitting here? You can try pinning Parsers.jl to v2.2.2 to see if it was any faster before.
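If you want to try that, something along these lines should do it (a sketch using the standard Pkg API):

```julia
# Pin Parsers.jl at v2.2.2 in the active environment.
# Equivalent to typing `pin Parsers@2.2.2` in Pkg mode (press ] in the REPL).
using Pkg
Pkg.pin(name = "Parsers", version = "2.2.2")
```

Use `Pkg.free("Parsers")` afterwards to undo the pin.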

CSV.read will still take some time the first time, but it should be faster than 10 s on any newish machine.

1 Like

I recommend using either the REPL or Jupyter to keep the compiled libraries in memory, so new runs are instantaneous. You can even have a script that you include the first time you open the REPL (or a Jupyter notebook) so that everything you need gets compiled up front.
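A hypothetical warm-up script along those lines (the tiny in-memory sample CSV is just an illustration — match it to your real column types):

```julia
# warmup.jl — include this once per REPL/Jupyter session so the expensive
# first-call compilation happens up front rather than during real work.
using CSV, DataFrames, Dates

# A tiny in-memory CSV with the same column layout as the real data,
# just enough to exercise the parsing code paths.
df = CSV.read(IOBuffer("t,x\n2020-01-01 00:00:00,1.0\n"), DataFrame;
              dateformat = "yyyy-mm-dd HH:MM:SS", types = [DateTime, Float64])
```

After `include("warmup.jl")`, subsequent `CSV.read` calls on similarly-typed files should skip most of the compilation cost.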

1 Like

Completely unrelated to the timing question:

If you have variables with the same names as the keyword arguments, you can save some typing by doing this:

data = CSV.read("data.txt", DataFrame; dateformat, types)

and

dateformat strings have a dedicated macro, which prevents repeated conversion from a string to a ::DateFormat:

dateformat = dateformat"yyyy-mm-dd HH:MM:SS"

This reduces running time in some situations, although I doubt that is the case here & I haven’t checked.

see:
https://docs.julialang.org/en/v1/stdlib/Dates/#Dates.DateFormat
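A quick stdlib-only illustration of the difference:

```julia
using Dates

fmt = dateformat"yyyy-mm-dd HH:MM:SS"   # the DateFormat is built once, at parse time
dt  = DateTime("2021-06-01 12:30:00", fmt)
# Passing the raw string "yyyy-mm-dd HH:MM:SS" instead would rebuild
# the DateFormat on every call.
```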