You are right. Let me address your points.
Motivation: We got a project to evaluate the potential of GPUs for a type of power system simulation on the millisecond timescale. These problems are very large differential-algebraic equation (DAE) systems comprising hundreds of models, each instantiated thousands of times and coupled through nonlinear algebraic equations. I want to investigate the upper limit of computational performance for such problems. To be convincing, a systematic approach is needed, rather than rewriting a few bottlenecks.
Julia seems to be the right language: it has sparse matrix support, DiffEq solvers, GPU support, plus automatic differentiation. Equally important, I am somewhat confident that the project is doable in Julia in a couple of years, but I am not confident I could do it in Rust, which still seems to need substantial development of its scientific computing infrastructure. Before starting the project, I implemented a prototype library; Julia’s ergonomics and performance were impressive.
The initial plan was to generate kernel code in Julia (since my Python library does modeling in SymPy), so that I would only need to write a thin wrapper for DiffEq. Later, architectural issues from the Python library surfaced, and I decided a rewrite might be worthwhile. As of now, I haven’t gotten the GPU part done: sparse differentiation on the GPU is not within reach without more research and implementation. But I would say I now have a better platform for research and publication.
Performance: The new library is very promising. With a proper variable-step integrator for stiff ODEs, simulation is roughly 100x faster than my Python version, both in small test cases (e.g., one with ~100 variables takes 1 s in Python and 10 ms in Julia) and large ones (e.g., one with ~300k variables takes over an hour in Python and 60 s in Julia). This is very impressive.
But it took a lot of effort to get there, specifically to reach near-zero allocation in hot loops with AD. The program is inherently hierarchical, and as I mentioned in the post, I used NamedTuples a lot and still have JIT delays to deal with. I still don’t think that, in a multi-layer program, I can reason about the type interfaces well enough up front to minimize allocation. For me, it has to be done through profiling, isolation, and trial and error.
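For anyone fighting the same battle, here is a minimal sketch of the isolate-and-measure step I mean, using only stdlib tools (the function names are made up for illustration, not from my library):

```julia
# Sketch: isolate one layer's hot function and measure its allocations.
# The in-place signature plus a warm-up call is what gets it to zero.
function residual!(out::Vector{Float64}, x::Vector{Float64})
    @inbounds for i in eachindex(x)
        out[i] = x[i]^2 - 1.0     # write in place; no temporaries
    end
    return nothing
end

function measure_allocs()
    x = rand(100)
    out = similar(x)
    residual!(out, x)             # warm up, so JIT compilation is excluded
    return @allocated residual!(out, x)
end
```

Once a layer measures zero in isolation like this, I move one layer up the hierarchy; doing it bottom-up is basically the “profiling, isolation, and trial and error” I described.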
Usability: I am struggling a bit with the usability, but not as much as with the tooling. Currently, precompilation takes about 50 seconds, and JIT time is down to a decent 10 seconds (for a program with AD, multiple models, and select OrdinaryDiffEqBase solvers). But for most researchers, who use small test cases, this is much longer than the total run time of the Python version.
Another uncertainty is distribution. My community uses Python heavily (@ufechner7, you may comment on this, and I welcome disagreement :D). I am still thinking about how to attract researchers to Julia. It should probably be distributed with a Python binding (like diffeqpy or PySR) and a sysimage, I guess. At the end of the day, most of the researchers are graduate students, and Python skills are more marketable.
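For the sysimage route, PackageCompiler.jl is the standard tool. A rough build-script sketch (the package name `MyPkg` and file names are placeholders, not my actual setup):

```julia
# Build-time script, not part of the library. Requires PackageCompiler.jl
# in the build environment; "MyPkg" and the file names are placeholders.
using PackageCompiler

create_sysimage(
    ["MyPkg", "OrdinaryDiffEq"];                  # packages baked into the image
    sysimage_path = "mypkg_sys.so",               # load with: julia -Jmypkg_sys.so
    precompile_execution_file = "precompile.jl",  # script exercising typical workloads
)
```

This cuts the startup and JIT cost for end users, at the price of shipping a platform-specific binary blob.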
The Python version takes one line to install (`pip install andes`), and the CLI (`andes`) is snappy. For scripting, it is more ergonomic in VS Code or Jupyter: dot + Tab can discover methods easily. I really hope @xgdgsc 's demo on autocompletion will move forward.
A strong claim without much reasoning: to make Julia more marketable, it has to be more distributable (short scripts should be quick to run, and large programs should be installable in one line), and that means becoming more static. BTW, why do we not have a `pip` equivalent, instead of having to run `julia`, activate an environment, and install? We need more “Apps”.
Another usability issue for developers: I noticed that packages in Julia are quiet. Even with JULIA_DEBUG on, debug output is limited. This may have to do with the deliberate choice that log messages at @info and above are loud (shown by default, with line info, during precompilation), so to reduce noise for end users, packages choose not to use them (related and still open: Provide a clean way to add custom LogLevels · Issue #33418 · JuliaLang/julia · GitHub). For example, I wanted to see intermediate Jacobian matrices while debugging a four-variable system, but DiffEq does not have @debug statements there, so I had to find the proper places to add print statements. The bugs are almost always in my user code, but more debug output would help.
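Until #33418 lands, one workaround is to fake a custom level with `Logging.LogLevel`. A sketch (the `Trace` name and the Jacobian are just for illustration):

```julia
using Logging

# Anything below Debug (-1000) is hidden by default loggers, so a
# hypothetical "Trace" level won't spam end users:
const Trace = Logging.LogLevel(-2000)

J = [2.0 -1.0; -1.0 2.0]    # stand-in for an intermediate Jacobian

# Developers opt in explicitly when debugging:
with_logger(ConsoleLogger(stderr, Trace)) do
    @logmsg Trace "intermediate Jacobian" J
end
```

The catch, as the issue discusses, is that the custom level has no first-class name, so it prints as `LogLevel(-2000)` rather than something readable.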
Another issue, in a monorepo structure, is `Pkg.resolve()`. When I add a package to a sub-repo, the program gives a dependency error, which can be fixed by running `Pkg.resolve()` in the root repo. But `Pkg.resolve()` says there is no change to the top-level Manifest.toml file, which is technically correct and at the same time concealing, because it does modify the sub-repo’s Manifest.toml file. I didn’t dig deep enough to find the root cause, but explicit should always be preferred over implicit. Below is an AI-generated summary of the issue:
1. You add a dependency to a sub-package (e.g., `SparseMatrixColorings` to `SubPkg/Project.toml`)
2. Precompilation fails with: "Package SubPkg does not have SparseMatrixColorings in its dependencies"
3. You run `Pkg.resolve()` from the root environment
4. It reports "No Changes" but the dependency issue is fixed
The message only reports changes to Project.toml and Manifest.toml files. It does **not** report:
- Internal dependency graph updates
- Resolution of dev'd package dependencies
- Cross-package dependency linking
In a monorepo with dev'd packages:
- The root Manifest.toml tracks all dependencies
- Each sub-package has its own Project.toml declaring its dependencies
- When you add a dep to a sub-package, the root Manifest may already have it
- `resolve` updates the internal linking without modifying files
This is documented in [Pkg.jl Issue #3066](https://github.com/JuliaLang/Pkg.jl/issues/3066).