Roadmap for a faster time-to-first-plot?

I think Plots.jl is just especially bad at load times; as far as I can tell, it is littered with design decisions that make it load slowly. A bit of a shame that the main plotting package for Julia is such a collection of bad patterns in terms of load times :slight_smile:

That would be the perfect addition, at least for me. Might it even be possible and feasible to interface it with Plots? For me this essentially solves the time-to-plot problem. I have no issues with using an external library.

Going even further, if ggplot is written in C, would it not be possible to use it directly from Julia? That way, we’d only have to maintain the wrapper, have access to a robust library, and solve the time-to-plot issue. It even has the added benefit that R users switching to Julia might have an easier and more comfortable transition.

The problem with just using ggplot is that it’s actually not that versatile. When I stopped using R, I read a comment from the developers that they were postponing interactive features until they could make time to focus on them. So production work would still need to be done somewhere, if not in Julia. Also, multidimensional geometric support is critical here, and ggplot is not really known for it.

That’s precisely what happens when you do something like using GR and GR’s API directly instead of the unified Plots meta-package. Of course, this isn’t ggplot, but it’s the same idea: it uses an external binary to do most of the heavy lifting. GR can also be used through the Plots ecosystem, but by using it directly you can skip a lot of code. Time to first plot (TTFP) with GR is on the order of 5s, even if you need to recompile the GR.jl package.

The Plots ecosystem is powerful with its backend-agnostic API, but currently that power comes at a cost.
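For anyone who wants to put a number on that cost, here is a minimal sketch of how first-call latency can be measured. It uses only Base so it runs anywhere; `fake_plot` is a hypothetical stand-in for a real plotting call and is not part of GR or Plots.

```julia
# `fake_plot` is a hypothetical placeholder for a real plotting call.
fake_plot(x, y) = sum(abs2, x .- y) / length(x)

x = collect(range(0, stop=2pi, length=100))
y = sin.(x)

# Time at top level so the first measurement includes JIT compilation
# of `fake_plot` for these argument types.
first_call  = @elapsed fake_plot(x, y)
second_call = @elapsed fake_plot(x, y)

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

Assuming GR.jl is installed, the same pattern should apply after `using GR` with `fake_plot(x, y)` swapped for `GR.plot(x, y)`. The load time of the package itself would need to be measured separately.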

I think at some point the plan had been ggplot2 would be replaced with ggvis (i.e. that might have been ggplot3), which is actually based on vega under the hood. I think that plan is dormant right now. But at least in general I think Hadley had planned to also utilize vega as the backend engine for the future of tidyverse plotting.

Yeah but those early prototypes were really limited and looked awful. Then plotly came along with a ggplot wrapper API and that was it.

There are two related questions here:

  1. Having a fully native plotting library has a lot of advantages: better language integration, fast startup times, easier maintenance. Such a plotting library could be pretty fast with current Julia, but writing a nice plotting library is an enormous amount of work.

  2. Plots.jl is not slow to start up because it is written in Julia, but because it contains a lot of conditionally loaded glue code.

It’s not really the conditionally loaded glue code but all of the function-barriers and inference failures in the pipeline code. GR isn’t even conditionally loaded since about 8 months ago?
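To make those two terms concrete, here is a minimal sketch of an inference failure and a function barrier. This is not code from Plots.jl; the attribute dictionary and the names `unstable`, `kernel`, and `barrier` are hypothetical and only illustrate the pattern.

```julia
# Hypothetical attribute store with values of type Any.
attrs = Dict{Symbol,Any}(:linewidth => 2.0, :n => 100)

# Inference failure: the compiler only knows the values are `Any`,
# so it cannot infer the result type of the multiplication.
unstable(d) = d[:linewidth] * d[:n]

# Function barrier: the inner kernel is compiled for concrete types,
# and the type assertions tell inference what comes out of the Dict.
kernel(lw::Float64, n::Int) = lw * n
barrier(d) = kernel(d[:linewidth]::Float64, d[:n]::Int)

unstable_type = only(Base.return_types(unstable, (Dict{Symbol,Any},)))
barrier_type  = only(Base.return_types(barrier, (Dict{Symbol,Any},)))

println("unstable infers as: ", unstable_type)
println("barrier infers as:  ", barrier_type)
```

A kernel behind a barrier gets compiled separately for each combination of concrete argument types it sees, which is one plausible way a long pipeline of them could add up to noticeable first-call latency.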

GMT is not as feature rich as matplotlib but competes (and many times wins, especially in mapping) in quality with any other plotting library. Unfortunately the GMT.jl wrapper still has a TTFP only slightly less than 15 s. I wish I understood more about why, so it could be cut by a factor of 2 or 3.

I’m pretty limited when it comes to the details of programmatic graphics, so I apologize if this is completely off and feel free to correct me. But I do think this summarizes the problem space:

I think the second point will be solved as compilation latency continues to improve (which we’ve already seen) and the plotting library back end is really figured out. There are a lot of smart people here so I have no doubt that it will get solved.

However, I’m worried about the first point because what it really comes down to is some sort of compromise. I’ve sat in on discussions like this for smaller projects and the conversation typically goes through these options.

  1. Use a dedicated plotting library (matplotlib, ggplot, plotly, vega, etc.)
  2. Use a bulky, well developed, and specialized graphics system that caters to data science (e.g, VTK)
  3. Use a graphics library that’s not data science specific at all (e.g., OpenGL, Unity, etc.)

The first one is what we currently have in a couple of forms. We obviously want it to do more, faster, but there are people actively making that happen. If we stop here then we are pretty limited in being able to do high performance 3D graphs (I know there are implementations of these but I’ve always gotten a lot of lag using them).

The second one is what R and Python did when they got the first one figured out. It works but is ultimately a highly segmented ecosystem that doesn’t communicate together and requires large complicated libraries (also, if you’ve ever had to download VTK and ITK on Windows, your soul will die a little).

The third option is the best if you can get to a final product. There are major commercial efforts to contribute to graphics libraries and the burden of installing and maintaining a large graphics system on multiple operating systems is typically already handled by other parties. However, people that actually know what they are doing have to write a lot of well thought out glue code to make these work. This is what Makie is doing and seems like an ambitious but exciting effort.

In a perfect world some company trying to break into the market (like Google’s Stadia efforts) would pour tons of funding into a pure Julia graphics system. But I wouldn’t hold out for that.

Yes I think that summarises it perfectly. I imagine the situation for people looking for 2D data visualisation is not too bad apart from time to first plot, in that there are several full featured plotting packages. But for 3D scientific computing plotting the situation is miles behind Matlab (or Mathematica). I know of a lot of potential Julia users that are sticking to Matlab for this reason.

A good example is, say, plotting a live movie of a heat map on a sphere. This is easy and pretty fast in Matlab. I don’t think any package apart from Makie can do it in Julia. But Makie is still not as easy as Matlab, and its time to first plot is horrendous.

Tell your friends to do this with Matlab :smile:

https://docs.generic-mapping-tools.org/dev/gallery/anim07.html#anim-07

https://docs.generic-mapping-tools.org/dev/gallery/anim08.html#anim-08

Just use the MATLAB Wrapper?

If you want to know what your data looks like, just use UnicodePlots. If you need to save a plot, use Gaston.

I completely rely on Gaston and UnicodePlots for visualization. They work very well.

This question finally has a solution, just use MATLAB :rofl:

EDIT: Wait, that’s a wrapper for using GMT inside MATLAB…

Yes, there is a wrapper to use GMT inside Matlab but there is also one to use it from Julia. That’s the GMT.jl that I referred to above.

I thought the suggestion was a wrapper to use Matlab inside Julia, which might actually be reasonable. The “tell your friends to do this in Matlab” statement misses the point: they don’t want to do what Matlab can’t do, they want to do what Matlab can do, as easily as it does it, in Julia. Pointing to what appears to be a command line based plotting package specifically for maps with confusing docs is not going to cut it.

Well, my answer was mostly to this comment. Yes it can be done with GMT from Julia. But for this particular module, which internally uses many self-generated scripts, it’s going to be tough not to use GMT’s terse syntax.

As far as I can tell, the only strategy that will provide the experience you describe of interpreter-like latency for first-time users is using an interpreter and doing compilation in the background.

I would argue that caching of binary code (if feasible) would also provide that kind of experience. Like most of us, I’m used to first-time Julia users complaining that Python gets them the first plot faster, but I think they really mean “in a new session”. I think most users would be fine with things taking a long time for the very first plot (i.e. in the first session). This applies particularly to the kind of user who will not do heavy package development, so cached code could stay valid for quite a while (a few days between waiting a long time for a first plot should make most people happy).

I do have such a use case right now. It’s about running some Julia script (simulation etc.) from an external test framework. The simplest way to automate this is to have a single command such as julia my_script.jl. Precompiling packages into the system image helped a lot in reducing startup and JIT overhead. But trying to also compile the function calls used in these scripts into the image (using compile_incremental) always fails, and resolving one error just leads to the next one. If this worked, one could use all “cases” for snooping once, compile everything into the sysimage, then run repeated calls to those cases with zero startup+JIT overhead.
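The snooping step on its own can be sketched with Julia's `--trace-compile` command-line flag, which writes a `precompile(...)` statement for each method compiled during a run. This is a minimal sketch, not the `compile_incremental` workflow described above: the script contents, `simulate`, and the file names are hypothetical stand-ins for a real case.

```julia
# Hypothetical stand-in for one "case" script; in practice this
# would be something like my_script.jl.
workdir = mktempdir()
script = joinpath(workdir, "case.jl")
write(script, """
simulate(n) = sum(abs2, range(0, stop=1, length=n))
println(simulate(1000))
""")

trace_file = joinpath(workdir, "precompile_statements.jl")

# Run the case in a fresh Julia process and record what gets compiled.
cmd = `$(Base.julia_cmd()) --startup-file=no --trace-compile=$trace_file $script`
run(cmd)

statements = readlines(trace_file)
println("recorded ", length(statements), " precompile statements")
```

As far as I know, a file like this can then be handed to a sysimage builder, for example through the `precompile_statements_file` keyword of PackageCompiler's `create_sysimage`. Whether that avoids the errors described above is not something this sketch can show.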

One idea here was to just have a Julia session (or multiple sessions?) running and somehow call into them to trigger stuff (the simulations, i.e. test cases). What would be a good way of doing this? (I know, quite off topic, but where to put it? Usage > Performance?)
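One possible shape for this, as a minimal sketch using only the `Sockets` standard library: a warm Julia session listens on a local port, and the test framework sends the name of a case and reads back the result. The case registry, the port hint, and the helper names `serve_once` and `request` are all hypothetical.

```julia
using Sockets

# Hypothetical registry mapping case names to zero-argument functions.
cases = Dict{String,Function}(
    "sum_of_squares" => () -> sum(abs2, 1:10),
)

# Handle a single request: read a case name, run it, write the result.
function serve_once(server, cases)
    sock = accept(server)
    name = readline(sock)
    reply = haskey(cases, name) ? string(cases[name]()) : "unknown case: $name"
    println(sock, reply)
    close(sock)
end

# Client side: what the external test framework would do.
function request(port, name)
    sock = connect(port)
    println(sock, name)
    reply = readline(sock)
    close(sock)
    return reply
end

# Pick a free local port near the (hypothetical) hint 8123.
port, server = listenany(8123)
task = @async serve_once(server, cases)
reply = request(port, "sum_of_squares")
wait(task)
close(server)
println(reply)
```

A real setup would loop over `accept` instead of serving one request, and would keep the server in its own long-running process so that repeated calls reuse already compiled code.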