Are there any updates on solving the “time to first” problem?
At what Julia version could we expect load-time/compiler latency improvements?
Are there some obstacles?
A few months ago many were very optimistic that compile speed could improve by “an order of magnitude”, as written in some posts.
PackageCompiler is nice, but it works well with only a few packages. That includes plotting packages, so it is a fine workaround for the time-to-first-plot problem, but some other packages do not compile so well. Also, snooping based on runtests.jl doesn’t cover particular use cases. There is also the Fezzik package, which uses the PackageCompiler API to do more practical snooping based on what the user runs in the REPL, but it doesn’t seem to speed packages up as well as PackageCompiler (in some cases; probably not all functions get compiled AOT?). In my case the startup time of a simple application using Makie, Blink, and Interact is still about 20 s: https://github.com/TsurHerman/Fezzik/issues/4
According to the compiler priorities, time-to-first-plot is the next thing after multithreading, which is expected to be released in 1.3 (we just got an alpha version of 1.3, and 1.2 is very close to release, so … maybe 4-6 months?). Currently much work is being done on that front, and I can imagine that this might continue in some form even after 1.3. After that, there should be more movement on this issue, but I am in no position to say anything about potential timeframes.
That seems about right. It’s already gotten much better in each release since 1.0 and now it’s the top priority for compiler work.
Not sure if this is orthogonal, but I think TTFP is just shorthand for start-up times, right? A lot of people in bioinformatics are used to building tools / writing scripts that are invoked from the command line, and this is my one remaining pain point in julia. Time to first plot itself doesn’t really bother me anymore because (a) it’s much faster than it used to be and (b) I’ve embraced the workflow of just leaving my julia session open. But developing a command line script is a bit annoying, since every invocation may take 10-15 sec to even get started.
Even with this issue, I still advocate julia to everyone. Once this gets solved, 98% of the objections I encounter will go away. Really looking forward to it - thanks to everyone adding to the effort!
For my part, I totally agree that scripts with quick startup are useful. As a stopgap I am currently using the Julia command-line option « --compile=min » in the script header; it cuts the « using » time somewhat. But I worry about possible side effects, so could somebody comment on whether this is a good or bad idea?
The side effect is that it makes everything slow, and by a potentially very large factor. Compare the performance of sin with normal Julia:

    julia> using BenchmarkTools

    julia> @btime sin(x[]) setup = x = Ref(1.0)
      6.217 ns (0 allocations: 0 bytes)
    0.8414709848078965

and again with --compile=min:

    julia> using BenchmarkTools

    julia> @btime sin(x[]) setup = x = Ref(1.0)
      18.946 μs (54 allocations: 1.05 KiB)
    0.8414709848078965

sin(x) is ~3000 times slower.
Yes, I understand, but for scripts with very short run times and small datasets (in the 100-1000 range), the « using » time takes up half of the time (about 25 secs); with « compile=min » it drops to half of that, so much better. I understand that with larger computations it would be much worse. My concern was about consequences for precision and/or precompilation issues: you get the Recompiling message in the REPL, but not with a batch script … And what if it should be recompiled and is not, or is recompiled in « min » mode?
Ah, I see. As far as I know, the answers you get should be precisely the same, so if using compile=min helps in your case, then I don’t see any problem with it.
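For readers wondering how a flag like this can live in a script header: a `#!/usr/bin/env julia` line generally cannot carry extra options portably, so one approach described in the Julia FAQ is a small bash/Julia polyglot header. The sketch below assumes `bash` and `julia` are on the `PATH`; the `summarize` function is a hypothetical stand-in for the real script body.

```julia
#!/bin/bash
#=
exec julia --compile=min --startup-file=no "${BASH_SOURCE[0]}" "$@"
=#

# Bash executes the `exec` line above and never reads further.
# Julia treats the first line as a comment and the #= ... =# block
# as a multi-line comment, so it starts executing here.

# Hypothetical script body: summarize numbers given on the command line.
function summarize(values)
    return (n = length(values), total = sum(values))
end

println(summarize(parse.(Float64, ARGS)))
```

Saved as an executable file, this runs the body under `--compile=min` without the caller having to remember the flag. As discussed above, that trades startup latency for much slower execution, so it is probably only worthwhile for short-running scripts.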
That’s useful to know, esp for development (where I’m often re-running stuff on test data over and over).
OT, it looks like you’re trying to quote code using «». Instead, try using back-ticks (`):
Thank you for the reassurance. So, except for possible precompilation woes (I happen to also use the same modules in the REPL, so without min), it seems ok for dev/quick scripts.
Thank you for the quoting tip then.
Yup. I develop things as command-line tools taking command-line arguments. Startup time is about 2x my run time. A bit painful when debugging.
However, this seems to indicate my debugging method is probably flawed. My reading of this thread makes it sound like maybe I should
- open a julia shell, using Revise.
- Put together a “command line” as an intermediate function taking direct arguments, and run that …
- hoping Revise will keep recompiles to a minimum.
Does that sound about right ?
I’ve done something similar in the past - I mostly use ArgParse.jl for getting the arguments, and it essentially returns a dictionary, so I just hard-code the test dictionary and do essentially what you describe.
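A minimal sketch of that pattern, with the argument handling written by hand so the example stays self-contained (with ArgParse.jl, the parsed dictionary would take the place of the hand-built one). The tool itself, `run_tool`, and its option names are hypothetical.

```julia
# Core logic takes a plain dictionary of options, so it can be called
# from a long-lived REPL session (with Revise tracking the file) without
# paying the startup cost on every run.
function run_tool(opts::Dict{String,Any})
    lines = opts["lines"]
    minlen = opts["minlen"]
    # Hypothetical task: count lines with at least `minlen` characters.
    return count(l -> length(l) >= minlen, lines)
end

# Thin command-line wrapper: turn ARGS into the same dictionary.
function main(args::Vector{String})
    if length(args) != 2
        println("usage: tool.jl MINLEN FILE")
        return nothing
    end
    opts = Dict{String,Any}(
        "minlen" => parse(Int, args[1]),
        "lines"  => readlines(args[2]),
    )
    println(run_tool(opts))
    return nothing
end

# Only run the wrapper when the file is executed as a script,
# not when it is included from the REPL.
if abspath(PROGRAM_FILE) == @__FILE__
    main(ARGS)
end
```

From the REPL, a hard-coded test dictionary then stands in for the command line, e.g. `run_tool(Dict{String,Any}("minlen" => 3, "lines" => ["ab", "abc", "abcd"]))`.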
Can someone explain why the plotting library needs to be written in Julia? Why not consider building a plotting library in C (optimized for use in Julia) or using an already available library (which I believe is what GR.jl does)? If features are missing, the Julia community can try to fill in the gaps.
That’s right. I usually use VegaLite.jl; however, I find the problem with VegaLite is the initial data processing to get it into a “long” format, as well as the plot setup. Sometimes this formatting takes longer than the 30 seconds using
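To make the “long” format concrete, here is a small sketch in plain Julia that reshapes a wide table (one column per series) into one row per observation. The table contents and the helper name `to_long` are hypothetical, and whether a given plotting package accepts a vector of named tuples directly is an assumption to check against its documentation.

```julia
# Hypothetical wide table: one id column `t` and one column per series.
wide = (t = [1, 2, 3], a = [0.1, 0.2, 0.3], b = [1.0, 2.0, 3.0])

# Reshape into long format: one row per (id, series, value) observation.
function to_long(tbl::NamedTuple, id::Symbol)
    value_cols = [k for k in keys(tbl) if k != id]
    return [(id = tbl[id][i], series = String(col), value = tbl[col][i])
            for col in value_cols for i in eachindex(tbl[id])]
end

long = to_long(wide, :t)
foreach(println, long)
```

Each row now carries the series name as data, which is the shape grammar-of-graphics style tools usually want for mapping a column to color or facet.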
I think building a new plotting library is about 1000x more complicated than just getting compilation caching working…
Existing libraries like GR are ok but still not Matlab quality. (PyPlot is pretty good but horrendously slow.) I think it’s just not realistic that there will ever be an open source C-based plotting library; it’s too tedious to develop, so it only makes sense if a company is funding it.
Makie seems much more promising. It’s very rough around the edges but on the high performance side it feels light years ahead of more well established open source plotting frameworks. If people have time and energy to throw at fast plotting it seems to me best to use it for (1) compilation caching and (2) improving Makie.
Also, how come packages like ggplot in R are so fast and powerful? Is it because R is an interpreted language (which means the library for ggplot is already compiled)? Sorry for my dumb questions.
It’s more that R is interpreted, so it doesn’t have to compile anything ahead of time, and the low-level graphics code that ggplot builds on is written in C, so it has good performance.
Would it help if you could directly pass arrays in, say like @vlplot(:point, x=[1,2,3]), instead of having to format it as a table first? I’ve been wanting to add that feature for a while.
Another idea is to create another package like SimpleVega.jl that essentially implements the old API from Vega.jl (http://johnmyleswhite.github.io/Vega.jl/), or something like it, but uses the infrastructure from VegaLite.jl.
Might make sense to split this discussion into a separate thread, given that this is really more about VegaLite.jl than compile time.