Roadmap for a faster time-to-first-plot?

Nobody said that?! It should work perfectly fine, minus the normal issues with static compilation in Julia, which hopefully get fully resolved.

Let me summarize what was brought up here, plus a few corrections:

  1. @Tamas_Papp, a pure Julia plotting package will be slow depending on the number of features. Something as simple as GR.jl is low-level and fast, but the more higher-level API features you add to it, the slower compilation will get. PGFPlotsX.jl is not actually a pure Julia plotting package, but still just a thin wrapper around another plotting API that actually does the heavy lifting. I haven’t looked thoroughly at PGFPlotsX, so there is a good chance that it does more than I think, and is just written in a way that the compiler digests more easily. If that is the case, your dream of a pure lightweight Julia plotting package is one PR to AbstractPlotting away, to improve compilation times - it is exactly what you describe. Anyone could try to e.g. remove type parameters or add @nospecialize to speed up compilation and see how it goes. CairoMakie is exactly the “just put things on an SVG canvas” library, and should compile lightning fast, and will happily draw your AbstractPlotting plots :wink: (sorry for the confusing package names, but I decided that Makie should work out of the box, so it’s taking on all heavy dependencies, while AbstractPlotting is the actual implementation of a backend independent plotting library).

  2. Makie is targeting WebGL, publication-ready SVGs & PDFs for 2D/3D, and fast interactive 3D/2D plotting. The status of those depends purely on the backends: the fast 3D/2D OpenGL backend is the most developed, followed by the Cairo backend for SVGs & PDFs. The WebGL backend doesn’t exist yet, but just got a great boost forward, since I figured out a way to wrap THREE.js in a much simpler fashion.

  3. Creating a backend for Makie is not complicated. A simple prototype can be as little as 100 lines of code written in less than one day. This is because every plot is composed of just a few primitive graphic types - as of now these are just Mesh, Lines, LineSegments, Scatter and Image. If you can draw those, your static backend is done. If you want to add interactivity, you also need to redraw those and hook up window signals, but that’s all there is to a backend :wink: Of course covering 100% of Makie’s features will take longer, depending on the state of the drawing framework you use. That’s also why the WebGL backend is taking so long: there simply hasn’t been something like WebGL.jl, that I could just use to draw my primitives.
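
To make the "draw a few primitives and you are done" idea concrete, here is a minimal sketch of a static backend that emits SVG text. Everything in it is an assumption for illustration: the `Primitive`, `Lines` and `Scatter` types and the `draw` and `render` functions are hypothetical stand-ins, not Makie's or AbstractPlotting's actual types or API, and only two of the five primitives are covered.

```julia
# Hypothetical primitive types for illustration; these are NOT Makie's types.
abstract type Primitive end

struct Lines <: Primitive
    points::Vector{NTuple{2,Float64}}
end

struct Scatter <: Primitive
    points::Vector{NTuple{2,Float64}}
    radius::Float64
end

# In this sketch the whole backend contract is one `draw` method per primitive.
function draw(p::Lines)
    coords = join(("$(x),$(y)" for (x, y) in p.points), " ")
    return "<polyline points=\"$(coords)\" fill=\"none\" stroke=\"black\"/>"
end

function draw(p::Scatter)
    circles = ("<circle cx=\"$(x)\" cy=\"$(y)\" r=\"$(p.radius)\"/>" for (x, y) in p.points)
    return join(circles, "\n")
end

# Wrap the drawn primitives in an SVG document.
function render(primitives::Vector{<:Primitive}; width = 100, height = 100)
    body = join((draw(p) for p in primitives), "\n")
    return "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"$(width)\" height=\"$(height)\">\n$(body)\n</svg>"
end

scene = Primitive[Lines([(0.0, 0.0), (50.0, 80.0)]), Scatter([(50.0, 80.0)], 3.0)]
svg = render(scene)
println(svg)
```

A real backend would also have to handle attributes such as colors and transformations, plus the remaining primitives, which is where most of the work beyond a prototype would go.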

Just to clarify, I did not claim that it was. I just included it as an example that (mostly) does what I want at the moment.

Thanks for the explanation. I will follow AbstractPlotting with interest, although the dependency on FileIO makes me wonder if it can be made fast, as AFAIK that package had a lot of issues with compilation times.

I think they got resolved:

julia> @time using FileIO
  0.119187 seconds (145.57 k allocations: 9.097 MiB)

Have you looked into Compose for this? As a side note, I was actually curious whether there could be a Compose backend to Makie for SVG (to have a pure Julia solution).

Specifically, the Compose native Julia SVG writer is in this file: https://github.com/GiovineItalia/Compose.jl/blob/master/src/svg.jl. If you wanted to avoid whatever overhead might be added by the Compose package, you could rip that out directly and lightly modify it to work independently. That said, if the only reason for doing so is to reduce time-to-first-plot, I don’t think you’ll be very successful. First SVG output is still slow in Gadfly.

Are there any updates on solving the “time to first” problem?
In which Julia version can we expect load-time/compiler latency improvements?
Are there any obstacles?
A few months ago many were very optimistic that compile speed could increase by “an order of magnitude”, as written in some posts.
PackageCompiler is nice, but it works well with only a few packages. Those include the plotting packages, so it really is a fine workaround for the time-to-first-plot problem, but some other packages do not compile so well, and snooping based on runtests.jl doesn’t cover particular use cases. There is also the Fezzik package, which uses the PackageCompiler API to do more practical snooping based on what the user runs in the REPL, but it doesn’t seem to speed up packages as well as PackageCompiler does (in some cases; probably not all functions get compiled AOT?). In my case, the startup time of a simple application using Makie, Blink and Interact is still about 20 s: https://github.com/TsurHerman/Fezzik/issues/4

According to the compiler priorities, time-to-first-plot is the next thing after multithreading, which is expected to be released in 1.3 (of which we just got an alpha version, and 1.2 is very close to release, so … maybe 4-6 months?). Currently much work is being done on that front, and I can imagine that this might continue in some form even after 1.3. After that, there should be more movement on this issue, but I am in no position to say anything about potential timeframes.

That seems about right. It’s already gotten much better in each release since 1.0 and now it’s the top priority for compiler work.

Not sure if this is orthogonal, but I think TTFP is just a shorthand for start-up times, right? A lot of people in bioinformatics are used to building tools / writing scripts that are invoked from the command line, and this is my one remaining pain point in Julia. Time to first plot itself doesn’t really bother me anymore because (a) it’s much faster than it used to be and (b) I’ve embraced the workflow of just leaving my Julia session open. But developing a command line script is a bit annoying, since every invocation may take 10-15 sec to even get started.

Even with this issue, I still advocate Julia to everyone. Once this gets solved, 98% of the objections I encounter will go away. Really looking forward to it - thanks to everyone adding to the effort!

For my part, I totally agree about the usefulness of scripts with quick startup. As a stopgap I am currently using the Julia command-line option « --compile=min » in the script header; it cuts the « using » time somewhat. But I worry about possible side effects, so could somebody comment on whether this is a good or bad idea?

The side effect is that it makes everything slow, and by a potentially very large factor. Compare the performance of sin with normal Julia:

julia> using BenchmarkTools

julia> @btime sin(x[]) setup = x = Ref(1.0)
  6.217 ns (0 allocations: 0 bytes)
0.8414709848078965

and again with julia --compile=min:

julia> using BenchmarkTools

julia> @btime sin(x[]) setup = x = Ref(1.0)
  18.946 μs (54 allocations: 1.05 KiB)
0.8414709848078965

With --compile=min, computing sin(x) is ~3000 times slower.

Yes, I understand that, but for scripts with very short run times and small datasets (on the order of 100-1000) the « using » time takes up half of the time (about 25 s); with « compile=min » it drops to half of that, so much better. I understand that with larger computations it would be much worse. My concern was about consequences for precision and/or precompilation issues: you get the Recompiling message in the REPL, but not with a batch script… And what if it should be recompiled and is not, or is recompiled in « min » mode?

Ah, I see. As far as I know, the answers you get should be precisely the same, so if using compile=min helps in your case, then I don’t see any problem with it.

That’s useful to know, esp for development (where I’m often re-running stuff on test data over and over).

OT, it looks like you’re trying to quote code using «». Instead, try using back-ticks (`):

`like this`

Gives you like this.

Thank you for the reassurance. So, except for possible precompilation woes (I also happen to use the same modules in the REPL, so without min), it seems OK for dev/quick scripts.

Thank you for the quoting tip then.

yup. i develop things as command line tools taking command line arguments. Startup time is about 2x my run time. A bit painful when debugging.

However this seems to indicate my debugging method is probably flawed. My reading of this thread makes it sound like maybe i should

  • open a julia shell, using Revise.
  • Put together a “command line” as an intermediate function taking direct arguments, and run that …
  • hoping Revise will keep recompiles to a minimum.

Does that sound about right ?

I’ve done something similar in the past - I mostly use ArgParse.jl for getting the arguments, and it essentially returns a dictionary, so I just hard-code the test dictionary and do essentially what you describe.
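
A minimal sketch of that pattern, under stated assumptions: `run_tool` and `parse_cli` are hypothetical names, and the hand-rolled `parse_cli` only stands in for a real parser such as ArgParse.jl, so that the example has no dependencies. The point is that the real work takes a dictionary, so it can be called from a long-lived REPL session without going through the command line at all.

```julia
# Hypothetical script layout: every name here is illustrative.

# The real work takes an options dictionary (similar in shape to what an
# argument parser returns), so it can be called directly from the REPL.
function run_tool(opts::Dict{String,Any})
    n = opts["count"]
    return sum(1:n) * opts["scale"]
end

# Minimal stand-in for a real argument parser: expects `count scale`.
function parse_cli(args::Vector{String})
    return Dict{String,Any}(
        "count" => parse(Int, args[1]),
        "scale" => parse(Float64, args[2]),
    )
end

# Only touch ARGS when the file is run as a script with both arguments,
# not when it is included from a REPL session.
if abspath(PROGRAM_FILE) == @__FILE__
    if length(ARGS) >= 2
        println(run_tool(parse_cli(ARGS)))
    end
end

# During development, skip the parser and hard-code the test dictionary:
test_opts = Dict{String,Any}("count" => 10, "scale" => 2.0)
result = run_tool(test_opts)
println(result)
```

If the file is loaded with Revise's `includet`, edits to `run_tool` should be picked up in the open session, so only the changed methods are recompiled rather than paying the full startup cost on every run.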

Can someone explain why the plotting library needs to be written in Julia? Why not consider building a plotting library in C (optimized for use in Julia), or using an already available library (which I believe is what GR.jl does)? If features are missing, could the Julia community try to fill in the gaps?

VegaLite.jl is an example of a plotting package that wraps a plotting library written in a different language (JavaScript). It has pretty good startup times and a huge feature space (because there is a whole team working on the underlying plotting library). I think there are similar packages for other third party plotting packages.
