Suggested minimal requirements for short-term plotting solution

For consideration (as discussed on Slack), here is a proposed list of requirements to help find the best short-term (i.e. until stabilization of Makie/Plots/etc.) solution for teaching plotting to new users and students, and for exploratory tutorials. Implicit in this are the assumptions that: (1) the long-term solution will not be stable before a v1.0 launch; (2) there is a benefit in trying to coordinate those teaching new users towards the same first plotting library; (3) this is a starting point for building familiarity with Julia in a pain-free way (potentially as a student or evaluator) and does not need to be the one (of many) they stick with down the road.

Suggested Minimal Requirements

The following are what I think are the bare minimum requirements:

  1. Completely idiot-proof setup on Windows/OSX/Linux desktops with minimal installation steps required and no flakiness
  2. Plots do not disappear when closing and reopening Jupyter notebooks
  3. Works and is supported on JuliaBox
  4. Can Print Preview and then print to PDF from Jupyter (for submitting assignments or printing results)
  5. Plotting works in the REPL and in Juno
  6. Can do multiple plots in the REPL and Juno. Acceptable if it has to open up new windows for now instead of using the plot pane
  7. Can copy/paste the same code between Jupyter, the REPL, and Juno and expect it to work. e.g. calling display(...) in one and not in the other is extremely confusing for new users
  8. Basic LaTeX label/title support on the REPL, Juno, and Jupyter (again, with a minimal setup…)
  9. Some way to download an image (note: right-click, Download as png fulfills the requirement)
  10. Starting point for tutorials and user documentation exists

Nice to Have Requirements

The following are nice but not deal-killers

  1. Fast to first plot is nice, but the bugginess issues dominate until that point.
  2. A large amount of existing code and tutorials can be adapted with little change (e.g. a backend change from pyplot() to plotly() could be taught easily)
  3. Basic ability to programmatically save an image without a fragile setup. If users can download manually, this is not required immediately.
  4. Works in Jupyterlab (including not having disappearing graphs)
  5. VSCode support, including plotting
  6. Nice if multiple plots show in the Juno plot pane

Here is a summary collecting some of the thoughts on existing packages and whether they fulfill the requirements:

PyPlot and Plots.jl with pyplot

  • Easy to reject: fragile local installation, especially on OSX
  • Using PyPlot directly has almost no user-centric tutorials, and all the fragile installation problems.

Plots.jl with GR

  • Pretty close and stable installation
  • I don’t believe it fulfills the multiple plots in REPL/Juno requirement.
  • It has some trouble/inconsistency with LaTeX strings (https://github.com/jheinen/GR.jl/issues/93)
  • Anything else that breaks requirements?

PlotlyJS.jl

  • Yesterday we would have rejected it because of the disappearing plots in Jupyter and the inability to print preview…
  • However, the following seems to work and solves both the disappearing plots and the print preview issues
  • Add the following in a cell in the notebook:
using PlotlyJS
init_notebook(true)
  • Put your plotting code in a separate cell, e.g.
plot(rand(3))
  • There is a small tweak to Blink that would make the installation idiot proof (https://github.com/JunoLab/Blink.jl/pull/117)
  • The docs have a good starting point for tutorials: http://spencerlyon.com/PlotlyJS.jl/
  • Saving figures is fulfilled by right clicking on the image and letting the plotly.js code “save as png”
  • It is very fast, especially compared to Plots.jl based solutions.
  • While there are some good examples and documents, there is much less code out there than Plots.jl based solutions.

So, at this point, I see none of the requirements missing?

Plots.jl with Plotly

  • At this point, the disappearing plots in notebooks and the print preview issues would preclude a plotly() based solution. However, perhaps the same tricks that helped PlotlyJS.jl solve them could be used with Plots.jl?
  • If so, and if there are no other issues, then this has the advantage of a large amount of pre-existing tutorials and code which could be easily adapted (e.g. could just swap out the gr() and pyplot() backends and users get a lot of material)

Great points! I have never been able to get PyPlot working on my Linux machine. I prefer something pure Julia.
I have tried Gadfly, and I really like it. However, it takes seconds to save an image as PNG. Also, compared to ggplot2, I did not get much feeling of grammar of graphics from using Gadfly.

Plotly has some good features; as long as the performance is OK, it should be a good choice. I used plotly in R, but it seems I need to register on the website to use some functions, which is weird.

Yeah, after telling my students to use pyplot and getting burned, I can say that it is definitely not the right intro solution… for creating publication quality, it may be, but that is not for the intro users.

Gadfly is something that several people have brought up, and I have absolutely no opinion on whether it fulfills the constraints above. I haven’t used it because none of the example code I have found recently uses it (which I took as a signal…)

I think the recipes are something that should not be ignored. They are the developer target for plotting, and if you’re using packages you want to make use of pre-existing tools. DiffEq, ApproxFun, ControlSystems, etc. all have pre-defined plot commands via recipes that allow one to then customize using the full attribute system of Plots.jl. While Plots.jl has startup time issues, the extensibility of this library is unparalleled. Then you have all of StatPlots.jl and PlotRecipes.jl which create plots that I would not be able to do myself, yet are essential. Users may not be writing recipes, but they are great to use.


Agreed. My general inclination is to try to patch Plots.jl backends to fulfill the requirements…

Even if both PlotlyJS and Plots with a plotly backend work, I think the recipes are enough to make the extra Plots.jl time worth it (for intro users). More advanced users can be pointed to PlotlyJS and others as they get familiar with Julia.

PlotlyJS does not work at all on Fedora (or is at least very hard to get working); see https://github.com/JuliaPlots/Plots.jl/issues/345#issuecomment-227107155

Does the plotly backend for Plots work?

Yes

I can mention that we make heavy use of Interact.jl/InteractNext.jl sliders in our education, with very positive reactions from both students and teachers. GR is more or less the only backend to Plots.jl I have tried that is performant enough to make the resulting graphics truly interactive and smooth. GR has some drawbacks, but it installs without any issues for us, and the fact that it enables use of Interact has made it our default backend in student-facing code.

I seem to remember that PlotlyJS, due to the way it displays the plot in Jupyter (which is a way to solve the disappear-on-reload issue), causes heavy flickering when used in combination with Interact; I’m not sure how that can be fixed. It’s not a performance problem, though: it would be possible to display the plot in a different way, which would solve the flickering but bring back the disappearing issue.

I don’t know for sure, but if PlotlyJS was migrated to HTTP.jl, things might be better on Fedora.

Edit: I take this back. PlotlyJS does not depend on HttpServer so I’m confused what the issue is and whether it is even an issue anymore.

Edit: I’m guessing the issue is with Blink (if it is still an issue). It would be great to add a BrowserDisplay.

I was very enthusiastic about recipes first, but now I see them as an artifact inspired by the fact that Plots.jl is a meta-plotting package. In theory they are an amazing solution, but in practice, they inherit the shortcomings of Plots.jl.

IMO most plotting problems can be decomposed into simple elements: get a bunch of coordinates and map that to visual output. So as long as I can convert arbitrary objects to, say, a vector of coordinates and metadata (which I then map to marker size/color/style, etc), I find it conceptually very easy to just build plots up from elements. That said, I am the kind of person who struggles with GoG too.
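
To make the "coordinates plus metadata" idea concrete, here is a minimal sketch in plain Julia. All names in it (`Marker`, `Measurement`, `to_marker`) are hypothetical and invented for illustration; they are not part of any plotting package, and a real package would still need something to draw the resulting elements.

```julia
# Hypothetical names throughout; this is not the API of any plotting package.

# One visual element: a position plus the metadata that controls its appearance.
struct Marker
    x::Float64
    y::Float64
    size::Float64
    color::Symbol
end

# An arbitrary domain object that we would like to plot.
struct Measurement
    time::Float64
    value::Float64
    weight::Float64
    ok::Bool
end

# Map one domain object to one visual element:
# time/value become coordinates, weight becomes marker size, ok becomes color.
to_marker(m::Measurement) =
    Marker(m.time, m.value, 2 + 4 * m.weight, m.ok ? :blue : :red)

data = [Measurement(0.0, 1.0, 0.5, true),
        Measurement(1.0, 3.0, 1.0, false)]

# Broadcasting converts the whole collection into drawable elements.
markers = to_marker.(data)
```

Under this view, supporting a new data type only means writing another `to_marker` method; the drawing code never needs to know about `Measurement`.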

Regarding the requirements of the topic: I don’t think they are minimal at all; I would rather use the term “utopian”. E.g. a non-flaky idiot-proof installation on Windows: many users struggle with that even for a basic Julia installation. The other issues (Jupyter/Juno/VSCode/png) are orthogonal, mostly a matter of supporting svg / pdf / png output, which is easy to solve, and integrating that into the IDE, which can be tricky but just needs to be done once for all plotting packages.

In the long run, for static plots I think the best solution would be to use a backend that can emit vector output (Luxor.jl comes to mind), then “simply” implement a high-level plotting package based on that. Which is still a lot of work, but would be native Julia and have low overhead and startup times. But that’s not an answer to the topic’s question, of course.


You’re right, PlotlyJS seems to work from Atom now. Plotting from the REPL or Jupyter still fails with

ERROR: connect: connection refused (ECONNREFUSED)

That’s where Makie is at, with GLVisualize being the graphics backend it’s built on. Basically Makie is a high-level plotting API for GLVisualize, but its high level constructs are backend-generic so there’s going to be some work plugging in other routes as well, but privileging the GLVisualize one as the “standard”. I hope we still have things like an easy switch to Plotly for interactivity, but now have a more in-depth reference backend implementation to rely on when we want the full (interactive) GUI.

Concerning your remark about recipes, I believe a “meta” plotting package should provide two things:

  1. A way to go from vector of coordinates + fully detailed metadata (ticks, labels, transparency values, line widths, colors etc…) to a displayed plot. This should first go from the metadata to basic “atomic” operations (draw a line, draw a shape etc) which would in turn be implemented by the backend. This set of atomic operations could be referred to as a “backend interface”: once a backend overloads these methods, it can be used to make any plot

  2. A smart way to fill in defaults: the full specification of a plot has too many attributes for the user to explicitly give all of them. The “meta” package should be able to generalize from “vector of coordinates plus some metadata” to “vector of coordinates plus all metadata” in a way that makes sense most of the time, to simplify the user experience.
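
As a rough illustration of these two pieces, here is a toy sketch in plain Julia. Everything in it (`AbstractBackend`, `draw_line!`, `RecordingBackend`, `default_attributes`, `fill_defaults`, `plot_path!`) is a hypothetical name made up for this example; it is not how Plots.jl or Makie are actually structured, just one way the split could look.

```julia
# Hypothetical sketch; none of these names belong to Plots.jl or Makie.

abstract type AbstractBackend end

# The "backend interface": an atomic operation every backend must implement.
function draw_line! end

# A toy backend that only records the atomic calls it receives.
struct RecordingBackend <: AbstractBackend
    log::Vector{String}
end
RecordingBackend() = RecordingBackend(String[])

function draw_line!(b::RecordingBackend, x1, y1, x2, y2; color, linewidth)
    push!(b.log, "line ($x1, $y1) -> ($x2, $y2) color=$color linewidth=$linewidth")
    return b
end

# Point 2: go from "some metadata" to "all metadata".
default_attributes() = (color = :black, linewidth = 1.0)
fill_defaults(attrs::NamedTuple) = merge(default_attributes(), attrs)

# Point 1: lower a fully specified path into atomic backend operations.
function plot_path!(b::AbstractBackend, xs, ys; attrs...)
    full = fill_defaults((; attrs...))
    for i in 1:length(xs)-1
        draw_line!(b, xs[i], ys[i], xs[i+1], ys[i+1];
                   color = full.color, linewidth = full.linewidth)
    end
    return b
end

backend = RecordingBackend()
plot_path!(backend, [0, 1, 2], [0, 1, 0]; color = :red)
```

Here the user only specified `color`; `linewidth` came from the defaults, and the backend saw nothing but `draw_line!` calls. A new backend in this sketch would only need its own `draw_line!` method.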

Plots.jl is pretty good at doing 2) and can often successfully “read my mind” and do things the way I want them without me having to specify much, though some may say that the pipeline to achieve this is overly complicated.
OTOH I agree with you that at the moment Plots.jl does not fully accomplish 1): the backend interface is not optimally designed and operates at a very high level (rather than at the level of the atomic components). This, in my view, is the source of many bugs and makes it challenging to maintain all the backends.

Regarding point 1), AFAIU it will be addressed in Makie but even before that we could maybe try to formalize the backend interface in Plots and reduce the amount of things that need to be implemented by every backend.


One thing I’m still not convinced about is the idea that swapping backends is really desirable or even necessary? The amount of difficulty trying to make this work seems to overshadow whatever benefit comes from being able to switch backends.

It would make sense if the two backends were either Cairo or GPU based, because those are low-level. But with the number of JavaScript libraries available, the JavaScript library itself ends up dictating how things get rendered by virtue of their API.

What am I missing?


The problem with supporting recipes that translate to atomic operations with a package that uses various backends is that the latter have non-overlapping functionality in practice (at least that’s what happened for Plots.jl).

IMO

  1. using a meta-package that delegates the work to various backends as an interim solution, and
  2. supporting recipes

are both great ideas, it just turned out that they don’t mix well, despite a lot of work.

I see your point. Indeed, in my view some of the issues with, say, the PlotlyJS backend stem from the fact that its API is very high-level: for example, we delegate the creation of bar plots to Plot.ly, but then it becomes impossible to fix this kind of issue. In principle I don’t think that the big gains of Plots’ design come from adding a different syntax to a well-developed independent plotting package (PlotlyJS native syntax works just fine and, had I started using Julia at a moment when I knew PlotlyJS would work reliably and export to vector format reliably, I would’ve continued using it). It turned out that it was useful to do things with Plots because when one backend stopped working, I could move to another…

The real gains from this solution would come from low-level backends: Cairo, OpenGL, maybe WebGL or Luxor. Even in early comments by tbreloff he mentioned that he developed the pyplot and gadfly backends to lure people in with something familiar, hoping to discontinue these backends eventually, in favor of something more low-level (like Compose).

Once a “backend interface” is defined and widely accepted, there are two main advantages:

  1. Every package for data visualization can simply rely on that interface, even if the preferred backend changes over time (a new one comes along that is faster or more interactive or what not)
  2. Developing a backend becomes much more feasible. I have the impression that now developing a plotting package is a herculean task (and indeed the most used plotting package ever, matplotlib, is quite old and not many equivalently popular alternatives have been developed since). If developing a plotting package became equivalent to “implementing a backend interface”, this task would be much simpler.

In my mind the big question is how feasible it is to change in this direction in Plots, focusing more on low-level backends like GR rather than high-level ones like pyplot or plotlyjs. Should it not turn out to be feasible at all, one would need to wait for Makie to implement this strategy (from what I understand, Makie plans, in the best case scenario, OpenGL, WebGL and Cairo backends).

I’m not commenting on the difficulty of getting this working, but the ability to select multiple backends looks appealing to me. With the same code you can have multiple outputs (PNG, PDF, LaTeX, etc.) that you may need for different things. This is similar to what gnuplot can do, where you have multiple “terminals” and can usually reuse the same code, with few changes, to get multiple outputs. I like this feature very much.
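
A small sketch of the "one description, several outputs" idea, using Julia's built-in MIME-based `show` mechanism rather than any plotting package. The `Polyline` type is hypothetical, the SVG is deliberately bare (no axes, no scaling, y pointing down as in raw SVG), and this is only an analogy to gnuplot terminals, not a description of how Plots.jl backends work.

```julia
# Hypothetical type; a sketch only, not how Plots.jl or gnuplot work internally.
struct Polyline
    xs::Vector{Float64}
    ys::Vector{Float64}
end

# "Terminal" 1: a plain-text summary.
Base.show(io::IO, ::MIME"text/plain", p::Polyline) =
    print(io, "Polyline with ", length(p.xs), " points")

# "Terminal" 2: a bare SVG rendering of the same object.
function Base.show(io::IO, ::MIME"image/svg+xml", p::Polyline)
    points = join(("$x,$y" for (x, y) in zip(p.xs, p.ys)), " ")
    print(io, "<svg xmlns=\"http://www.w3.org/2000/svg\">",
              "<polyline fill=\"none\" stroke=\"black\" points=\"", points, "\"/>",
              "</svg>")
end

p = Polyline([0.0, 1.0, 2.0], [0.0, 1.0, 0.0])

# The same object, rendered through two different output formats.
as_text = sprint(show, MIME("text/plain"), p)
as_svg  = sprint(show, MIME("image/svg+xml"), p)
```

The code that builds `p` does not change between outputs; only the requested MIME type does, which is roughly the convenience the gnuplot "terminal" model offers.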