Roadmap for a faster time-to-first-plot?

Just to clarify, I did not claim that it was. I just included it as an example that (mostly) does what I want at the moment.

Thanks for the explanation. I will follow AbstractPlotting with interest, although the dependency on FileIO makes me wonder if it can be made fast, as AFAIK that package had a lot of issues with compilation times.

I think they got resolved:

julia> @time using FileIO
  0.119187 seconds (145.57 k allocations: 9.097 MiB)

Have you looked into Compose for this? As a side note, I was actually curious whether there could be a Compose backend to Makie for SVG (to have a pure Julia solution).


Specifically, the Compose native Julia SVG writer is in this file: https://github.com/GiovineItalia/Compose.jl/blob/master/src/svg.jl. If you wanted to avoid whatever overhead might be added by the Compose package, you could rip that out directly and lightly modify it to work independently. That said, if the only reason for doing so is to reduce time-to-first-plot, I don’t think you’ll be very successful. First SVG output is still slow in Gadfly.


Are there any updates on solving the “time to first” problem?
In which Julia version can we expect load-time/compiler latency improvements?
Are there any obstacles?
A few months ago many people were very optimistic that compile speed could improve by “an order of magnitude”, as written in some posts.
PackageCompiler is nice, but it only works well with a few packages. Those include plotting packages, so it is a fine workaround for the time-to-first-plot problem, but some other packages do not compile so well, and snooping based on runtests.jl doesn’t cover particular use cases. There is also the Fezzik package, which uses the PackageCompiler API to do more practical snooping based on what the user runs in the REPL, but it doesn’t seem to speed up packages as well as PackageCompiler does (in some cases; probably not all functions get compiled AOT?). In my case, the startup time of a simple application using Makie, Blink and Interact is still about 20 s: https://github.com/TsurHerman/Fezzik/issues/4


According to the compiler priorities, time-to-first-plot is the next thing after multithreading, which is expected to be released in 1.3 (of which we just got an alpha version, and 1.2 is very close to release, so … maybe 4-6 months?). Currently much work is being done on that front and I can imagine that this might continue in some form even after 1.3. After that, there should be more movement on this issue, but I am in no position to say anything about potential timeframes.


That seems about right. It’s already gotten much better in each release since 1.0 and now it’s the top priority for compiler work.


Not sure if this is orthogonal, but I think TTFP is just a shorthand for start-up times, right? A lot of people in bioinformatics are used to building tools / writing scripts that are invoked from the command line, and this is my one remaining pain point in julia. Time to first plot itself doesn’t really bother me anymore because (a) it’s much faster than it used to be and (b) I’ve embraced the workflow of just leaving my julia session open. But developing a command line script is a bit annoying, since every invocation may take 10-15 sec to even get started.

Even with this issue, I still advocate julia to everyone. Once this gets solved, 98% of the objections I encounter will go away. Really looking forward to it - thanks to everyone adding to the effort!


For my part I totally agree with the usefulness of scripts having quick startup. As a stopgap I am currently using the Julia command option « --compile=min » inside the script header; it cuts down the « using » time somewhat. But I worry about possible side effects, so could somebody comment on whether this is a good or bad idea?

The side effect is that it makes everything slow, and by a potentially very large factor. Compare the performance of sin with normal Julia:

julia> using BenchmarkTools

julia> @btime sin(x[]) setup = x = Ref(1.0)
  6.217 ns (0 allocations: 0 bytes)
0.8414709848078965

and again with julia --compile=min:

julia> using BenchmarkTools

julia> @btime sin(x[]) setup = x = Ref(1.0)
  18.946 μs (54 allocations: 1.05 KiB)
0.8414709848078965

With --compile=min, computing sin(x) is ~3000 times slower.


Yes, I understand that, but for scripts with very short run times and small datasets (in the 100-1000 range) the « using » time takes up half of the time (about 25 secs); with « compile=min » it drops to half of that, so much better. I understand that with larger computations it would be much worse. My concern was about consequences for precision and/or precompilation issues: you get the Recompiling message with the REPL, but not with a batch script … And what if it should be recompiled and is not, or is recompiled in « min » mode?

Ah, I see. As far as I know, the answers you get should be precisely the same, so if using compile=min helps in your case, then I don’t see any problem with it.
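
For reference, one way to put the option in the script header is the polyglot pattern described in the Julia manual FAQ: the file starts as a bash script that immediately re-executes itself with julia and the desired flags. The sketch below is only an illustration; the `summarize` function and the sample data are made up, and `--startup-file=no` is an extra flag added here on the assumption that you also want to skip the startup file.

```julia
#!/bin/bash
#=
exec julia --compile=min --startup-file=no "${BASH_SOURCE[0]}" "$@"
=#

# Bash only sees the `exec` line above; Julia treats that whole header as a
# comment and runs everything below as an ordinary script.

# Hypothetical workload standing in for a short script on a small dataset
function summarize(xs)
    return (n = length(xs), mean = sum(xs) / length(xs))
end

# Use the command-line arguments as data if given, else a tiny sample
data = isempty(ARGS) ? [1.0, 2.0, 3.0] : parse.(Float64, ARGS)
println(summarize(data))
```

Removing `--compile=min` from the `exec` line switches back to normal compilation, so it is easy to compare both modes on the same script.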


That’s useful to know, esp for development (where I’m often re-running stuff on test data over and over).

OT, it looks like you’re trying to quote code using «». Instead, try using back-ticks (`):

`like this`

Gives you like this.

Thank you for the reassurance. So, except for possible precompilation woes (I happen to also use the same modules with the REPL, so without min), it seems ok for dev/quick scripts.

Thank you for the quoting tip then.

Yup. I develop things as command line tools taking command line arguments. Startup time is about 2x my run time. A bit painful when debugging.

However, this seems to indicate my debugging method is probably flawed. My reading of this thread makes it sound like maybe I should

  • open a julia shell, using Revise.
  • Put together a “command line” as an intermediate function taking direct arguments, and run that …
  • hoping Revise will keep recompiles to a minimum.

Does that sound about right?

I’ve done something similar in the past - I mostly use ArgParse.jl for getting the arguments, and it essentially returns a dictionary, so I just hard-code the test dictionary and do essentially what you describe.
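
A minimal sketch of that pattern, using only Base so it stands on its own: the real work lives in a function that takes an options dictionary (the same general shape that ArgParse.jl's `parse_args` returns), and the command-line entry point is a thin wrapper. The names `run_tool` and `parse_cli`, the `--scale`/`--values` options, and the file name `tool.jl` are all hypothetical, and `parse_cli` is just a stand-in for a real argument parser.

```julia
# Core of the tool: a plain function of an options dictionary, so it can be
# called directly from a long-running REPL session.
function run_tool(opts::Dict{String,Any})
    values = parse.(Float64, split(opts["values"], ","))
    scale = Float64(opts["scale"])
    return sum(values .* scale)
end

# Stand-in for a real argument parser (assumption for this sketch):
# turns `--key value` pairs into a Dict, starting from some defaults.
function parse_cli(args::Vector{String})
    opts = Dict{String,Any}("scale" => 1.0, "values" => "0")
    i = 1
    while i < length(args)
        key = replace(args[i], r"^--" => "")
        opts[key] = key == "scale" ? parse(Float64, args[i + 1]) : args[i + 1]
        i += 2
    end
    return opts
end

# Runs only when the file is executed as a script, not when it is included
if abspath(PROGRAM_FILE) == @__FILE__
    println(run_tool(parse_cli(ARGS)))
end
```

In a REPL session with Revise loaded, something like `includet("tool.jl")` followed by `run_tool(Dict{String,Any}("values" => "1,2,3", "scale" => 2.0))` exercises the same code path with a hard-coded test dictionary, without paying the startup cost on every run.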

Can someone explain why the plotting library needs to be written in Julia? Why not consider building a plotting library in C (optimized for use in Julia) or using an already available library (which I believe is what GR.jl does)? If features are missing, the Julia community can try to fill in the gaps.


VegaLite.jl is an example of a plotting package that wraps a plotting library written in a different language (JavaScript). It has pretty good startup times and a huge feature space (because there is a whole team working on the underlying plotting library). I think there are similar packages for other third party plotting packages.


That’s right. I usually use VegaLite.jl; however, I find the problem with VegaLite is the initial data processing to get the data into a “long” format, as well as the plot setup. Sometimes this formatting takes longer than the 30 seconds of using Plots. :sweat_smile:
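
To make the “long” format concrete, here is a rough Base-only sketch of the reshaping step, assuming the data starts as a wide table with one column per series. The `to_long` helper and the fruit data are invented for illustration; in practice a table package would normally do this for you.

```julia
# Wide table: one column per series (made-up example data)
wide = (year = [2017, 2018, 2019], apples = [3, 5, 4], pears = [2, 2, 6])

# Convert to long form: one row per (id, variable, value) observation.
# `id` names the column that identifies each row of the wide table.
function to_long(table::NamedTuple, id::Symbol)
    rows = NamedTuple{(:id, :variable, :value)}[]
    for col in keys(table)
        col == id && continue          # the id column is not a series
        for (key, val) in zip(table[id], table[col])
            push!(rows, (id = key, variable = col, value = val))
        end
    end
    return rows
end

long = to_long(wide, :year)
foreach(println, long)
```

Each row of `long` then carries the series name in `variable`, which is the kind of column a grammar-of-graphics style plot typically maps to color or facets.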