No working GUI backend found for matplotlib

I don’t see the purpose of this discussion becoming personal. Julia has a great number of plotting packages, and quite many that are very good, including PyPlot.
In this specific case, if PyPlot does not work on clusters, that does point to a context where being able to switch plotting packages easily is an advantage. Of course, as @tknopp states, it would be even better for all use cases if a good solution existed for PyPlot.

This problem with PyPlot has been known for a long time. About half a year ago I posted a thread where I was also trying to run PyPlot on a similar machine, and I haven’t gotten that working yet. @stevengj linked to a known issue from a year and a half ago, which still doesn’t have an “automatic solution”. The solution in that thread is to modify LD_LIBRARY_PATH, but it can be difficult to know what to set it to when linking against a module on a cluster (I’m still not sure how to proceed there), and even in that thread it’s mentioned that doing so will break other things like RCall.

However, there is a ready 1-line solution if the user was using Plots.jl: switch to the GR backend using the command gr(). This definitely “provides a solution to the mentioned problem”, because you can just open up Plots.jl, use the GR backend instead of the PyPlot backend, and plot on remote machines.

So I think it’s very safe to say that if the user was directed to start off by using Plots.jl, they would be in a better spot because it has a direct and immediate answer to the problem. It just so happens that this occurs for most plotting questions on the Discourse and on the Gitter. And that’s why I would always recommend it.

That said, I do understand that there is a development issue of “sidestepping bugs” by just switching backends. My solution does not fix the PyPlot problem, it just offers a way around it. But when I am a user, I tend to not want to deal with issues, and so just sidestepping them is a good option.

The OP specifically mentioned:

If I use some packages like Gadfly, each run can take up to 1 minute or even longer.

Of course, when I read that, I was not surprised, because I had startup issues with Gadfly as well (it’s not half as bad as before, but it was still noticeable last time I checked; one thing I do remember is that timing it is a pain, because it takes a while to actually open the plot after the timing is already posted, so the times it gives are off). Plots.jl also has long first-time-to-plot times due to its lazy loading, but when doing complicated plots (say 10^6 points?) I found Plots+GR to be quicker than Gadfly, and the startup time problems will compound if you are repeatedly opening up a new session. So from the information that was given, one thing I know would help is to ditch Gadfly. Of course, the fastest option here would likely be to use GR directly, and using PyPlot directly would cost slightly more startup time (but still less than Plots.jl). But most of the answer to the Reddit question wasn’t related to this; it’s just one small thing that would help.

Again, this is off topic and we could have a separate thread to discuss this if you like.

I use PyPlot directly, and I would say that it is a good solution for anyone who is familiar with Matplotlib/Python and/or comfortable reading the Matplotlib manual. And I like the fact that, with Matplotlib, essentially anything I might want to do with 2d plots is generally possible and is described somewhere that Google can find (as long as you are able to translate the syntax from Python, which is not a problem for me).

But it is also true that using PyPlot (or another backend) via Plots.jl provides a more Julian syntax (and Julia documentation) for a large subset of the Matplotlib features. It is certainly a good package for many users and I don’t hesitate to recommend it.


There’s a basic problem with binary dependencies and debugging installation issues: there’s a huge variety of potential configurations that can arise on user machines. Also, relatively few users have the multi-language skills or the understanding of build systems that is required to debug issues with non-Julia dependencies.

This particular problem (with the runtime linker finding an old libz.so) hasn’t shown up on any machine that I or my students use, and none of the people who encountered it on their own machines have had the time or inclination to dig into it themselves.

I have only noticed this problem on clusters where you have to load standard software as modules, that is, clusters with schedulers like SGE and Slurm. But I have noticed it on every cluster like this. These include UC Irvine’s cluster, SDSC Comet, Stampede, and PSC Bridges. As noted in the issue thread, XSEDE gives out trial allocations on Comet, so that could serve as a test machine.

I do think we should get this all worked out when 1.0 comes out, and get standard installations on the XSEDE + Blue Waters clusters.

Fully agree. I still think that this does not justify actively recommending against a well-established library that works in 99% of cases to the full satisfaction of most Julia users. This is something I have not really seen so far in this community.

@stevengj The computer cluster on which I am having the issue is the Engaging cluster at MIT, where I am a graduate student. I use SLURM to submit jobs to it.

I am a beginner to Julia and to programming in general and I don’t think that I can solve this problem in a useful amount of time. However, let me know if there is any information that I can provide for any of your students to tackle this issue.

Let me clarify a little bit. I don’t think the issue is that it’s using a job scheduler; I think the issue is likely the interaction with the way clusters with job schedulers use “modules”. Usually when using Python on the clusters I mentioned, you need to load system libraries via commands like module load ... Often, fairly standard libraries like GTK need to be loaded this way. Is the cluster you’re talking about like this?

My first few tries were just combinations of loading these extra modules (along with some compilers just to be safe), but that never worked. In fact, PyCall didn’t work with the system Python modules when I tried it that way, but it all worked via its own Conda installation using Conda.jl. The libraries that installed themselves via Julia tended to work, avoiding the whole module business entirely.

GR works when I load the compiler module because it did its own installation like this and avoided system libraries. PyPlot, on the other hand, I think tries to use system libraries, but I believe the paths are different on this kind of cluster setup (I’m not too familiar with the details). So the fact that @stevengj linked to an issue about library paths makes me fairly confident that it’s on the right track, though I’m not sure how to find the right way to set up the paths in this setup.
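To see which libz a module-based environment puts first, one rough check is to walk LD_LIBRARY_PATH in order and list every libz.so candidate. This is only a sketch: list_libz_candidates is a hypothetical helper name, and it ignores the other places the runtime linker searches (rpath entries, the ld.so cache, the default system directories).

```shell
# Hypothetical helper: print the libz.so* files found in each directory of a
# colon-separated search path, in search order. Defaults to LD_LIBRARY_PATH.
list_libz_candidates() {
    search_path="${1:-$LD_LIBRARY_PATH}"
    old_ifs="$IFS"
    IFS=':'
    for dir in $search_path; do
        # Skip empty entries such as a leading or doubled colon.
        [ -n "$dir" ] || continue
        for f in "$dir"/libz.so*; do
            # An unmatched glob stays literal, so only print files that exist.
            [ -e "$f" ] && printf '%s\n' "$f"
        done
    done
    IFS="$old_ifs"
}
```

Running list_libz_candidates before and after a module load command would show whether the module pushed a different libz to the front of the path.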

I’ll get in contact with the support teams and see if changing LD_LIBRARY_PATH can fix the issue. Though one thing I would like to double-check is:

version `ZLIB_1.2.3.4' not found 

I wonder if the clusters just have an older version? I had an issue with GR before where it wasn’t able to compile on the clusters, because the compilers on CentOS 6 were too old.
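One rough way to check whether a given libz is too old is to look for the version tag from the error message inside the library file itself. This is a heuristic sketch, not a proper ELF inspection: zlib_version_tags and has_zlib_version are hypothetical helper names, and they only search the file's bytes for strings of the form ZLIB_x.y.z.

```shell
# Hypothetical helper: list the distinct ZLIB_x.y.z tags embedded in a file.
# -a treats the binary as text, -o prints only the matched tags.
zlib_version_tags() {
    grep -aoE 'ZLIB_[0-9]+(\.[0-9]+)*' "$1" | sort -u
}

# Hypothetical helper: succeed only if the file carries exactly the given tag.
has_zlib_version() {
    zlib_version_tags "$1" | grep -qxF "$2"
}
```

For example, has_zlib_version /path/to/libz.so ZLIB_1.2.3.4 (with the path replaced by whichever library the linker is picking up) would fail on a library that predates that tag.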

I’m sure if the solution is found for one of these clusters, it can just be mimicked on all of them.

I think there is a better fix for this: having PyPlot manually dlopen the correct libz library before loading Matplotlib. But I would need someone who actually sees this issue on their machine to help out with implementing/testing a PR. (Comment on the GitHub issue if you are interested.)

@stevengj I can help if you want. Let me know the best way to do it. I will send you an e-mail so that you can have my contact.

Thank you.

I just wanted to report here that using the following command on the cluster terminal that I am using solved my problem:

LD_PRELOAD=${HOME}/.julia/v0.5/Conda/deps/usr/lib/libz.so julia
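For anyone who wants to wrap this in a login script, here is a minimal sketch that only sets LD_PRELOAD when the Conda libz is actually present. The path is the one from the command above (the v0.5 directory depends on the installed Julia version), and libz_preload_value is a hypothetical helper name.

```shell
# Path taken from the command above; adjust v0.5 to the installed Julia version.
conda_libz="${HOME}/.julia/v0.5/Conda/deps/usr/lib/libz.so"

# Hypothetical helper: print the LD_PRELOAD value to use for the given library,
# keeping any libraries already preloaded, or fail if the file is missing.
libz_preload_value() {
    lib="$1"
    if [ -e "$lib" ]; then
        printf '%s\n' "${lib}${LD_PRELOAD:+:${LD_PRELOAD}}"
    else
        echo "libz not found at $lib" >&2
        return 1
    fi
}

# Usage (commented out so that sourcing this file does not start Julia):
# LD_PRELOAD="$(libz_preload_value "$conda_libz")" julia
```

Setting the variable only for the julia command, as in the original one-liner, keeps the preload from affecting other programs in the session.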

This was found in the following discussion.


I have been unable to resolve this problem using the proposed fix(es).

This is affecting both PyPlot and Pandas. Although I work with Plots, I am currently locked into PyPlot and Pandas due to dependencies on domain-specific Python packages that make use of both (my Julia code imports these Python packages and calls functionality that either displays plots or returns Pandas dataframes).

How do you know you have the same problem? Not all installation failures have the same source.

What is versioninfo(true), what Python distro are you trying to use, and what is the exact error message?