Analysis of large-scale Python vs. large-scale Julia?

Does anyone have recommendations for articles comparing large-scale Python codebases with large-scale Julia codebases?

I’m fairly convinced Julia is unlikely to replace OCaml / Scala for me, but I’m curious to read about managing large-scale Julia codebases vs. large-scale Python codebases.

The main issues I have with large-scale Python codebases are (1) dynamic typing and (2) the GIL.

For Julia, the two issues I’m curious about are:

  1. I’m quite convinced JET.jl is not going to match OCaml / Scala typing any time soon. I’m curious whether Julia is better than Python for managing large-scale codebases.

  2. I’m also curious how Julia’s multi-process / async support compares to what Python can do under the GIL.

Thanks!

Julia has a number of advantages over Python for large codebases.

  1. The package manager. Good package management makes it a lot easier to use external dependencies, and large projects can often be effectively organized as a set of independent packages.
  2. No two-language problem. Most big Python codebases have to call out to C at some point if they want high speed. This adds a bunch of complexity.
  3. Built-in multithreading. There’s no GIL. Julia supports async, multithreaded, and multiprocess code.
  4. A lot faster. If your programming language is much faster (~50x faster than Python is roughly correct), you often don’t need multithreading/multiprocessing at all.

Python has a static typing system and static checkers with huge corporate investment. Julia’s is new and marginal.

  • For IO-intensive work, Python’s async/await is fine.
  • For process-based CPU-intensive parallelism, tools like Dask work well, similar to Dagger.jl.
  • For thread-based parallelism in CPU-intensive code, Python libraries written in C/C++/Fortran work well; pure-Python code does not work well for CPU-intensive tasks, because of the GIL.
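To make the first bullet concrete, here is a minimal sketch of IO-style concurrency with async/await. The `fetch` coroutine and its delays are made up for illustration, and `asyncio.sleep` just stands in for a real network or disk wait.

```python
import asyncio


async def fetch(name, delay):
    # Hypothetical IO-bound task: asyncio.sleep stands in for a network/disk wait.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # gather() runs the coroutines concurrently on a single thread.
    # Results come back in argument order, not completion order.
    return await asyncio.gather(
        fetch("a", 0.02),
        fetch("b", 0.01),
        fetch("c", 0.03),
    )


results = asyncio.run(main())
print(results)
```

All three waits overlap even though only one thread is running, which is why the GIL is not much of a problem for this kind of work.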

IMO process based parallelism is almost always a mistake unless you’re doing multi-computer HPC. Threads are a ton lighter weight, and sharing memory is a lot more efficient than copying data around.
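A rough Python sketch of the sharing-vs-copying point (the `bump` helper and the sizes are invented for illustration): threads write into the same list object, whereas process pools typically ship arguments by pickling, which produces an independent copy. The pickle round-trip below only imitates that copy; no worker process is actually started.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

data = list(range(1_000))


def bump(chunk_start):
    # Every thread sees the very same list object, so in-place writes
    # are visible to all of them without copying anything.
    for i in range(chunk_start, chunk_start + 250):
        data[i] += 1


with ThreadPoolExecutor(max_workers=4) as pool:
    # Each worker handles a disjoint 250-element slice of the shared list.
    list(pool.map(bump, range(0, 1_000, 250)))

# Imitate what sending `data` to another process involves: serialize, then rebuild.
shipped = pickle.loads(pickle.dumps(data))
shipped[0] = -1  # changes the copy only; `data` is untouched
```

Note this only shows the memory model. In CPython the threads above still take turns under the GIL, so it is not a speedup demo.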

Ehm, the Python language itself has only dynamic typing (and optional type annotations, but those are ignored by CPython)? Perhaps you mean the optional static type-checkers, like mypy?
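A small sketch of what "ignored by CPython" means in practice (the `add` function is just an illustration): the annotations are stored on the function, but nothing checks them when the function is called.

```python
def add(x: int, y: int) -> int:
    return x + y


# The annotations say int, but CPython does not enforce them at call time,
# so passing strings simply runs the body with strings.
result = add("a", "b")

# The annotations are still recorded and available for tools to inspect.
annotations = add.__annotations__
```

A static checker such as mypy would be expected to flag the `add("a", "b")` call, but only if you actually run the checker; the interpreter itself does not.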

Unless you’re GC-limited. Julia’s GC performs poorly in some edge cases, and going multi-process is a way to get parallel GC for free, since each process collects its own heap.

I can imagine them removing it.

Yes, static type checkers (Mypy, Pytype, Pyright, Pyre) are optional, but they are widely used and well supported.