Mention of Julia for the re-implementation of PETSC

If you have ideas and opinions, go see: https://gitlab.com/petsc/petsc/-/issues/645

1 Like

The excitement amount the PETSC developers for that idea was, to say the most, “low”.

It would be very good for the long-term stability of the language and growth of the community if a DOE lab moved one of their major projects to Julia. Doesn’t look likely in this case.

I think Julia would be a good “wrapper” language for the redesigned library, IMO. Better than Python.

Agreed, but I don’t see any traction for that idea.

2 Likes

I mean do we really need PETSc in Julia? Seems like we just need to port any distributed data structures and functions they have to DistributedArrays.jl if they are not already there. Then we get the rest of the features for free because of the generic nature of the relevant Julia packages, e.g. IterativeSolvers/KrylovKit, DifferentialEquations, Optim/NLSolvers, etc. Minimal plumbing work may be needed to make dispatch happy but it should be significantly less work than doing everything from scratch in a Julia implementation of PETSc. This looks like one or two GSoC projects and it should be done.

1 Like

There is a lot of the distributed-computing plumbing in PETSc. I am not sure we need all of it, but my guess the majority is relevant to sci-comp.

The DOE codes (PETSC, Trilinos, Sundials) have 100s of person-years behind them and are proven to be scalable on top-5 computers. Having these things easily accessible from Juila can’t be bad.

5 Likes

Agreed here. If there was a massive benefit to having PETSc in Julia someone would have started maintaining a nice wrapper by now (like there is for thousands of other packages), but there’s a reason why SciML, CLIMA, etc. somewhat care but only in passing. Having a monorepo come in and do all of those things half well wouldn’t really be a great idea. PETSc would have to be very much reconstructed to be useful. It would be an organization with a a very strong distributed array package that has matching distributed linear algebra, and the rest would just come from the package ecosystem. Then it would need to make sure that distributed MPI array integrated well with DiffEq, Optim, etc.

The Julia package ecosystem has 10,000’s of person-years behind it in a language where people tend to move 100x faster. That isn’t to say that we’ve done everything better, we definitely haven’t. Their bread and butter of how to scale basic linear algebra to 10,000+ cores is fantastic and we haven’t matched that. The issue is that their designs tend to not be too composable, so you get these monorepos like PETSc where they have some fantastic basic tooling for the underlying operations, but when you get to “the things that call the linear algebra” (optimization, differential equations, etc.), not only does the offering tend to get a bit meh at best (you can tell it’s not where the vast majority of time is spent) or wrappers of things already wrapped and widely used in Julia. None of that stuff even needs to be specific to PETSc since you’d have the same speed from the algorithm if it’s generically written and uses the right array primitives.

So a PETSc in Julia wouldn’t be one repo but instead some very fast primitives that connect to everything else. What we need is someone to build a very nice distributed array package with extremely fast distributed linear algebra on it. The ecosystem could take it from there. Maybe the best way to do that is to just wrap PETSc or Trilinos as it is.

4 Likes

Wrapping the general routines for easily building parallel matrices (the core Petsc vec/mat functionality) and handling distributed meshes / automatic partitioning of unknowns is the feature that I miss. I agree that if that were done and robustly maintained we might see other Julia libraries be built on top of it in a more composable manner. That said, I haven’t looked at the Krylov solver implementations in Petsc; are well-optimized parallel Krylov solvers/preconditioners really completely identical to a serial solver except in the array / matrix type, or do they require further algorithmic changes to work optimally in the distributed setting? If the latter, that would seem to require much more work in Julia to catch up to Petsc, and I can imagine for some things at least like domain decomposition and/or multigrid, more specialized implementations are needed. Petsc’s integration with other numerical libraries is also fantastic, each of which would seemingly require separate Julia bindings to make available…

The perspective of the Petsc developers in that thread makes sense; they want to be as widely available to be used in other languages as possible, in which case keeping the core in C/C++/Fortran seems reasonable. Those are all easily callable from most major scientific computing languages these days, and avoid some of the drawbacks of Julia (I have the impression Julia still needs some work for building compiled shared libraries).

I would love to see a benchmark of all the parallel linear algebra options available in Julia including PETSc. Then a list of which packages are incompatible with which types of parallelism can be made which the community can then address one by one. A benchmark is a good place to start to identify where we stand. This is a good GSoC/JSoC/MLH project idea with a clear extension: fixing the parallelism support in different packages.

2 Likes

I see this comment there (also they’re really thinking ahead with “Fortran 20008” :slight_smile: Fortran 2018 already added some TS 29113: Further Interoperability of Fortran with C):

Julia has a lot going for it, but I’m not aware of successful packages written in Julia to be called from C/C++/Fortran. The runtime is relatively heavy and load time is nontrivial.

The former objection, on the runtime, is correct.* I just note, I see quite a few Julia packages being called from Python (e.g. for SciML packages) and R (convexjlr, and as an example wrapping in other direction: Alpaca.jl). At least 2+ for each language (I’m not sure about adoption), while I forget the names, and wouldn’t know about adoption. So that’s certainly possible. It all works because Julia can be embedded in C/C++, so should be usable from those languages, but who cares, I wouldn’t want to use those as the main language. [There’s also a thread here now on using Julia from Java.]

* About load time, Julia startup is 0.2 sec, but if you’re thinking about the packages, then I suppose (eventually) the packages you need could be put in the system image.

So, I agree about Julia the better wrapper language (than Python, at least when a bit more work has been put into latency/precompilation), or only language, as who cares with the eventual Julia world domination.

What was missing to make this a student project was the PETSc_jll, since the binary isn’t easy to vender. But @jdlara-berkeley finished that so that we could ship it with Sundials_jll, so now it’s automatically built on all operating systems. At this point, a new PETSc.jl should be a simple exercise, so I plan to promote it for a GSoC next summer. I only want the array stuff though, and maybe some TS stuff so we can benchmark against it again and just show the advantages of distributed development (again, back when PETSc.jl worked we did a few things so we want that reproducible), but the array and linear solver portion is what is really useful and a great GSoC.

4 Likes

I’d focus on the following comment by Jed Brown’s comment:

Julia has a lot going for it, but I’m not aware of successful packages written in Julia to be called from C/C++/Fortran.

Does anyone has a non trivial example how to integrate Julia into C / C++ which works across Windows, Linux and macOS?
Something like passing to Julia an array of data and a struct of parameters and apply some non trivial hand written function?

The documentation is really sparse on this. Some community made examples will be great.

1 Like

I guess something such does exist, but I don’t see any advantage of doing that. Julia already works on all three operating systems and C/C++ hardly simplifies any operations you’d use Julia for. It’s a wrong idea that author of the quote made.

1 Like

I think people miss how crucial this is.
In many places there is an existing code in C / C++. No one will write in from scratch in Julia.
The question is what will be done with new modules?
If there is an easy way to embed Julia in new modules it means it is a candidate.
Since there is none, it won’t.

In practice people make pragmatic choices, not ideal ones.
It means if there is an existing code, your path in is to be able to integrate into it.

Julia is missing that. It is not a coincident it is mentioned in a C / C++ / FORTRAN project as PETSC.
Julia needs this to penetrate to this market.

2 Likes

I don’t think this is the case at all. As you can see from the Python ecosystem, it’s C/C++ that gets wrapped into a high-level language. Julia is a suitable choice for that. It doesn’t require rewriting any C/C++, just wrapping it.

1 Like

That’s because Python’s performance is crap for anything numerical. C/C++/Fortran have the benefit of already being fast (which is why Python and R wrap routines in those languages). No one in their right mind writes heavily-numerical code in an interpreted language like Python or R, or if they do, they use a JIT like Numba or Cython.

As @RoyiAvital says, there is a lot of existing numerical code written in C/C++/Fortran that is still actively developed. Those projects would stand to gain a lot by writing new modules of functionality in Julia, but unfortunately we just have an absence of tooling for making that easy. PackageCompiler isn’t an answer here either, because it relies on a full-fat Julia runtime for codes that may never even call into the runtime. Julia is not lightweight, and existing projects don’t want to deal with the added bloat just to get a few benefits from writing new code in Julia.

So let’s please stop assuming that just because Julia is cross-platform and PackageCompiler exists that projects will start writing new codes in Julia; they won’t until we fix our stack to better fit into their workflow and requirements.

11 Likes

That’s because Python’s performance is crap for anything numerical.

That explains why Python projects wrap C/Fortran. But, not why Python is adopted as the user-facing language in the first place. Essentially no big project with a Python front end is going to switch to Julia. And almost no new project that wants users with a broad background will choose Julia as the user-facing language; they’ll choose Python. What a project might do is use Julia in places as a substitute for C/Fortran (or Rust, which some projects are willing to do), for example to write Python extensions. But, not until Julia can produce slim, linkable object code with no runtime, like C or Rust. All this is widely understood in the Julia community. I’m just restating this for this thread. But, I see the latter, when it is possible, as a wedge to drive Julia adoption in more contexts.

1 Like

No one would wrap an existing code base in Julia.
Let’s say I have a system which is doing some analysis of signals and now I want a new module to address new model of a signal.

The whole system is written in C / C++. Yet if there is a language which can be more efficient (In development time) it could be great to use it as long as:

  1. It offers similar run time to C / C++.
  2. It can be embedded easily into the existing code base.

As you can see in the PETSC they consider Rust for those reasons though I am not sure it indeed offers better productivity.
In other places (Commercial businesses) they use MATLAB + MATLAB Coder exactly for that.
MATLAB Coder solves the problem by generating a portable C code (Instead of the MATLAB Run Time).

For first step I’d even embed the whole Julia Run Time in the projects, but I can’t find documentation / example for that for the non trivial case. The documentation shows how to evaluate simple string. It is also oriented to Linux.

Is there an example to embed the Julia Run Time in case of non trivial usage?
Like passing a big array of data and some structs, process few Julia functions from several modules and getting the results back?

My recommendation is to only ever interact with the Julia code via @cfunction pointers. Then your Julia code will behave just like a C library from the embedder’s point of view. Obviously you still need to start the runtime and extract the @cfunction pointers but that requires a relatively small amount of embedding code. The documentation in https://github.com/JuliaLang/julia/pull/28886 might be helpful but notice that it does the setup with a dynamically loaded libjulia. That adds some complexity in exchange for making Julia an optional runtime dependency instead of a build dependency.

1 Like