Error: Error during initialization of module CHOLMOD on HPC running Scientific Linux 7

Hello all,

I tried to load Julia on the HPC I use (in the Netherlands, managed by Wageningen). I get this error after loading Julia and initializing the REPL.

I guess I don’t even know what to do, any ideas? Has anyone else seen this and can anyone advise me on what to tell my HPC support? They don’t seem to do a ton of Julia and I have no idea why this error came up. They are running Scientific Linux 7 I guess. Anything else that could interfere here?

How was Julia installed?

I’m sorry, I really don’t know. The admins on the HPC would have installed it some time ago. I just don’t know how to help them with this, and I’m not sure what could have caused it. Thanks. Just looking for general help.

When I last used Julia on a cluster (a while ago now), I downloaded it directly to my own workspace and then used that local installation. If you’re able to do that, it may be less work than debugging the issue via support ticket with a sysadmin who knows less about Julia than you do…

Hopefully this helps:

  1. Use wget to download the newest Julia from the downloads page (for Linux, choose Generic Linux on x86, and check in the terminal that your HPC machine is x86).

In the terminal, type:

wget https://julialang-s3.julialang.org/bin/linux/x64/1.8/julia-1.8.5-linux-x86_64.tar.gz
  2. Extract it to a designated directory, then adjust the environment variables:
cd ~
vim .bashrc

Add:

  • export JULIA_DIR="/path-to-JULIA/julia-1.8.5"
  • $JULIA_DIR/bin to PATH

If that is unclear, here is my .bashrc as an example. In the terminal I type:
(base) browni@browni:~$ cat .bashrc
This is my .bashrc:

export JULIA_DIR="/home/browni/julia-1.7.3"
export PATH="$PATH:$JULIA_DIR/bin:$CUDA_DIR/bin:/home/browni/.local/bin"

See… I am still using Julia 1.7.3. If you want to stay with this version I can give you the tar.gz from my computer; otherwise use wget or another method to download a newer Julia to your HPC.
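The download, extract, and PATH steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: `INSTALL_ROOT` and the function names `julia_url` and `install_julia` are hypothetical, and the URL pattern is inferred from the single wget link quoted above, so check it against the downloads page for other versions.

```shell
# Sketch: install a Julia tarball into a user-writable directory.
# INSTALL_ROOT, julia_url and install_julia are illustrative names, not part of any tool.
INSTALL_ROOT="${INSTALL_ROOT:-$HOME/opt}"

# Build the download URL for a version such as 1.8.5 (release series 1.8).
# The pattern is assumed from the one link quoted in this thread.
julia_url() {
    local version="$1"
    local series="${version%.*}"
    printf 'https://julialang-s3.julialang.org/bin/linux/x64/%s/julia-%s-linux-x86_64.tar.gz\n' \
        "$series" "$version"
}

# Download and unpack; the tarball is expected to unpack to julia-<version>/.
install_julia() {
    local version="$1"
    mkdir -p "$INSTALL_ROOT" || return 1
    wget -O "$INSTALL_ROOT/julia-$version.tar.gz" "$(julia_url "$version")" || return 1
    tar -xzf "$INSTALL_ROOT/julia-$version.tar.gz" -C "$INSTALL_ROOT"
}

# Usage (nothing above runs until you call it):
#   uname -m                 # should print x86_64 for this tarball
#   install_julia 1.8.5
#   export JULIA_DIR="$INSTALL_ROOT/julia-1.8.5"
#   export PATH="$JULIA_DIR/bin:$PATH"
```

Putting `$JULIA_DIR/bin` at the front of PATH, rather than at the end as in the .bashrc above, is a deliberate choice here: it lets the freshly downloaded copy take precedence over any other julia already on PATH.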

SuiteSparse’s CHOLMOD is really troublesome; I haven’t even succeeded in installing it on my own Linux OS… Find a replacement for SuiteSparse if getting it to work takes too long.

Reading your screenshot again, you are missing libcholmod. Look in /usr/lib; if the library is not there, you need to install again so that libsatlas.so.3 and libcholmod are available. It is a dynamic library, so it needs no extra work the way a static library would.

cd /usr/lib
cd libsatlas (press Tab for autocompletion and suggestions)

If no suggestion appears, there is no libsatlas in that directory. Install SuiteSparse properly this time, and make sure the library is installed in the default library location.

Alternatively, the admin or someone else might have installed the library in another place, from which it can be moved later. To search the whole computer, type in the terminal:

grep -nrl libsatlas

It might take a while, so enjoy a nice tea or cappuccino. If such a file exists, go to its directory and move the file to the default library directory.
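Note that `grep -r` searches file contents starting from the current directory, so a search by file name may be quicker. A minimal sketch, assuming the library file is named like `libsatlas*`; the helper name `find_lib` and the directories in the usage line are illustrative guesses, not known locations on this cluster.

```shell
# Search for files by name under the given directories.
# find_lib is a hypothetical helper name.
#   $1        file-name pattern, e.g. 'libsatlas*'
#   $2...     directories to search
find_lib() {
    local pattern="$1"
    shift
    # Permission errors on unreadable directories are discarded.
    find "$@" -name "$pattern" 2>/dev/null
}

# Usage (directories are assumptions; adjust to your system):
#   find_lib 'libsatlas*' /usr/lib /usr/lib64 /opt
#   find_lib 'libcholmod*' /usr/lib /usr/lib64 /opt
```

Quoting the pattern matters: without the quotes the shell may expand `libsatlas*` against the current directory before `find` ever sees it.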

Thank you so much. Your wget command worked great, I moved it into a folder and yes I’m familiar with the PATH variables and exporting. Did all that just fine. Opened Julia and got the same error message still.

As for the rest of it, I’m still pretty lost. I tried pkg> add CHOLMOD but it didn’t work… I also tried libcholmod but it wasn’t found. I guess these aren’t Julia packages, which is confusing to me.

When I go into /usr/lib/ there is nothing (with autocomplete).

I was able to go into pkg and add the SuiteSparse package, but it still gives me this error at startup. Is there a different way to install SuiteSparse?

I’ll post my grep when it finishes. I’m still pretty lost on what I need to do to fix this… I can’t run any batch scripts until this is fixed. Don’t have any issues on my Ubuntu desktop or my Macbook Pro…

@giordano - maybe you have seen such problems before?

When I go into /usr/lib/ there is nothing (with autocomplete).

That means the library isn’t installed in the default library folder or, even worse, isn’t installed at all. Better to try installing SuiteSparse independently. SuiteSparse, CHOLMOD, BLAS, LAPACK: they are a world of their own…

This is the official SuiteSparse website:
https://people.engr.tamu.edu/davis/suitesparse.html

the github:

download SuiteSparse source code:

wget https://github.com/DrTimothyAldenDavis/SuiteSparse/archive/refs/tags/v6.0.3.tar.gz

extract it:

tar -xvf [name-of-file]

On GitHub, the creator of SuiteSparse explains how to build it on Linux or Mac using the Makefile; it is easy to follow, so just use the terminal. You install the whole of SuiteSparse together with its packages, such as CHOLMOD. Then link them and check again whether the library is installed; hopefully there will be no error after that.

My own problem is probably due to mistakes I made when installing BLAS and LAPACK first; I then got this error when compiling SuiteSparse:

“Compile SuiteSparse get ‘.rodata.str1.1’ can not be used when making a shared object”

It came from changing the installation prefix, but I got busy with Julia and calculus; I might go back and fix the SuiteSparse installation on my computer later.

I like the motivation on SuiteSparse’s research page:

The large matrices that arise in real-world problems in science, engineering, and mathematics tend to be mostly zero, or sparse. Sparse matrix algorithms lie in the intersection of graph theory and numerical linear algebra. A graph represents the connections between variables in the mathematical model, such as the voltage across a circuit component, a link from one web page to another, the physical forces between two points in a mechanical structure, and so on, depending on the problem at hand. The numerical linear algebra arises because these matrices represent systems of equations whose solution tells us something about how the real-world problem behaves. Google’s page rank algorithm, for example, requires the computation of an eigenvector for a matrix with as many rows and columns as there are pages on the web.

Start Julia with

LD_LIBRARY_PATH="" julia

Thank you very much for your help on this.

So going into Pkg and adding SuiteSparse won’t do it? I downloaded through Pkg, but still had the same error.

I will forward this to our system admin on the HPC as I don’t have much control on this HPC of course. Hopefully they can get this done.

Isn’t this a little involved for just using base Julia? I’ve never downloaded any other language and had to compile code from source like this… Not in R or Python or Go or several others at least… Seems like a major issue to me that the devs need to fix. Are you that person, or do I need to contact someone who may be responsible for this? Or maybe Julia just isn’t tested thoroughly enough on enough Linux systems, I suspect… I don’t have this issue on Ubuntu and others…

@Freya, some of your suggestions might be a bit misleading. A lot of what you are saying is reasonable for a single-user Linux PC, but what Austin is describing is a cluster environment where a modules manager is used to bring in various pre-installed modules. And either way, that level of complicated custom compilation is certainly unnecessary for setting up Julia.

Austin, if you are relying on the modules system, this is really something your sysadmin should address. There are a lot of intricacies in how that system is set up, and it can certainly break things. I am certain this forum would be full of people who would try to help, but a lot of system details would need to be known to fix that setup.

If you are instead completely disregarding the modules system (which I believe is a good idea with more modern languages like Rust, Go, and Julia), I think you will have better luck finding help here.

Could you confirm the following: In a clear terminal, without any loaded modules, after downloading julia (e.g. with wget) and running the downloaded version (e.g. by launching it directly or checking that which julia indeed points to your freshly downloaded one), you still get that error? Even after following giordano’s instructions of launching with LD_LIBRARY_PATH="" /path/to/your/newly/downloaded/julia?
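The checks requested above can be sketched as follows. This is a hedged illustration: `is_expected_julia` is a hypothetical helper, the path in the usage lines is a placeholder, and `module purge` is the usual Environment Modules/Lmod command for unloading everything, though whether it is available on this particular cluster is an assumption.

```shell
# Succeeds only if the julia found on PATH is exactly the expected file.
# is_expected_julia is a hypothetical helper name.
is_expected_julia() {
    local expected="$1"
    local found
    found="$(command -v julia 2>/dev/null)" || return 1
    [ "$found" = "$expected" ]
}

# Usage in a fresh terminal (paths are placeholders):
#   module purge        # assumption: a modules system providing this command
#   module list         # should report no loaded modules
#   is_expected_julia /path/to/your/newly/downloaded/julia && echo "PATH is fine"
#   LD_LIBRARY_PATH="" /path/to/your/newly/downloaded/julia
```

Launching by absolute path, as in the last line, sidesteps PATH entirely, so it is the most direct way to be sure which binary is running.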

I would be extremely surprised if the above example does not work, for the same reasons you already mentioned: modern languages like Rust, Go, and Julia are fairly self-contained and do not require modules imports and local compilations.

To your comment about testing: Julia on Linux is extremely well tested, including testing of the entire ecosystem, not just the base language, whenever there is a non-negligible change to the language. The problem here is almost certainly a weirdly set-up modules system, which is why you have already gotten several pieces of advice in the direction of just downloading Julia yourself.

Edit: And to the comment about whom to contact: If you are using the modules system, that would be your sysadmin. If you are downloading the precompiled executable (you only need to download and unzip, nothing more), then there are a lot of volunteers here and on the github issue tracker that would be happy to help. If you want something more than volunteers, that would be hiring a consultant and paying them to help with setup (like with any other open source project).

@Krastanov Thank you so much for your reply.

Yes, you are correct, it’s a cluster (I call it an HPC because that is what they call it at my university). I will forward all of this to our admin; they have not responded to either of my emails so far. I just wanted to help them along. Perhaps they will post on here for help too, I don’t know.

At first I was using the module, but now I have downloaded Julia myself and am still having issues.

Did I do your command correctly?

Or did I mess up?

I’m not mad or demanding anything; I just think maybe the core devs should know this came up, as I’ve never run into anything like this with any other major language… Of course I’m not going to pay a consultant. I understand everyone here is a volunteer; I’m not yelling at anyone, and I’m very glad for any help. It’s just a very strange error and I have no idea what is going on. I guess the only outcome is that people can’t use Julia on this cluster until we fix it somehow. I would like to write some pipelines with Julia. Hopefully we can get this solved soon; perhaps it’s just this OS, I’ve never run Scientific Linux…

Also, where would I send my sysadmin for help? Here? Thanks in advance.

There should only be one “Y” in LD_LIBRARY_PATH.

To add a bit of explanation to what others have suggested: a version of the binary CHOLMOD library compatible with Julia is shipped as part of the Julia distribution (at least up through Julia 1.8). But using the environment modules (or other “helpful” configuration by the sysadmins) which set LD_LIBRARY_PATH forces the system to select a (probably) incompatible substitute. Evidence for the latter is that the foreign libcholmod links to libsatlas (see the error messages above) whereas Julia’s links to conflicting alternatives.
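To see whether such an override is active, and to run a single command without it, something like the following could be used. It is a sketch only; the function names `show_ld_library_path` and `run_clean` are hypothetical.

```shell
# Print each LD_LIBRARY_PATH entry on its own line, or say that it is empty.
# Both function names here are illustrative.
show_ld_library_path() {
    if [ -z "${LD_LIBRARY_PATH:-}" ]; then
        echo "LD_LIBRARY_PATH is empty or unset"
    else
        printf '%s\n' "$LD_LIBRARY_PATH" | tr ':' '\n'
    fi
}

# Run one external command with LD_LIBRARY_PATH emptied for that command only;
# the variable in the calling shell is left untouched.
run_clean() {
    env LD_LIBRARY_PATH="" "$@"
}

# Usage:
#   show_ld_library_path
#   run_clean julia
```

Using `env` keeps the change scoped to the launched process, which is the same effect as typing `LD_LIBRARY_PATH="" julia` by hand.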

No, you are right (except for a typo, the double Y). I would have been just as surprised and as frustrated as you if I saw what you are seeing! Except for the minor typo in the (esoteric and not well known) environment variable, you are doing everything right. Hopefully it will work after fixing the typo.

As to why you as a user have to bother with even knowing about this esoteric variable: I am seconding Ralph’s suggestion and explanation. Julia is expected to obey linker overrides like LD_LIBRARY_PATH (and probably a bunch of others that I do not know about), so when such an override is set to point to an incompatible library version, all Julia can do is complain and shut down.

Given the prevalence of well-meaning but misleadingly configured module systems on HPC services, it might be worth having better error messages when things like this happen. After all, it would be really nice if the error message were “For some reason, probably related to how your computing environment is set up, Julia was asked to use an incompatible version of so-and-so. If this was a misconfiguration, please stop commanding Julia to use an incompatible version by removing the SOME_ENV_VARIABLE override.”
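Until something like that exists in Julia itself, a user-side check could give a similar hint before launching. This is not Julia’s behavior, just a hypothetical shell helper (`warn_if_ld_override` is an invented name) that only looks at LD_LIBRARY_PATH and cannot tell whether the libraries it points to are actually incompatible.

```shell
# Warn on stderr when LD_LIBRARY_PATH is non-empty.
# Returns 0 when no override is present, 1 otherwise.
# warn_if_ld_override is a hypothetical helper, not part of Julia.
warn_if_ld_override() {
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        printf 'warning: LD_LIBRARY_PATH is set to "%s"\n' "$LD_LIBRARY_PATH" >&2
        printf 'warning: Julia may pick up incompatible system libraries from it\n' >&2
        printf 'hint: try launching with LD_LIBRARY_PATH="" julia\n' >&2
        return 1
    fi
    return 0
}

# Usage, for example at the top of a batch script:
#   warn_if_ld_override || echo "continuing anyway" >&2
```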

Edit: Pointing the sysadmin to this thread and Ralph’s comment would probably help them with fixing the issue. Unless some specific module is loaded, there really should be no active LD_LIBRARY_PATH overrides. Hopefully they will not be too annoyed at our dislike for module systems (we know that everyone has different, slightly contradictory needs).

Edit 2: I submitted an issue on GitHub about having clearer error messages in such a situation. I doubt it is something that would be considered a high priority or addressed anytime soon (after all, it is a misconfigured system, not a problem with Julia proper), but it is probably a configuration issue that Julia can detect and warn about.

Perfect, you are correct. I was just playing around and decided to try again a few minutes ago, before I saw this, and it worked all of a sudden… Thank you guys so much. That was concerning and I really wanted to get to work on Julia (I need the much larger GPUs).

Ahh, I had no idea about any of this. Never had this message before and googling didn’t get me too far.

I will point them to this thread and hopefully they can find a solution here this week.

Thank you for submitting, seems like I’m not the only one… I appreciate all your help here!

Thank you, yes I messed up… Got it working finally. Got it, I’ll pass along to them. Thanks!

Sorry, I didn’t know that an HPC is not a single-user Linux PC like a personal computer. I think you have more expertise in this. Thanks for helping with this.

No problem, thanks for your help.

Hi,

I’m somehow having a similar problem.

but the fix does not help:

It seems the library it finds is missing something, is that possible?
Can anyone help me with this please?

David