Why is Python, not Julia, still used for most state-of-the-art AI research?

I totally agree. You can write Julia like MATLAB and it will probably be just as readable.
The problem is when you want people to engage: the bar is high. That is not a shortcoming of Julia, it is a property of it. The dynamic range of complexity in Julia is very wide, and package developers who work at the upper end of the language's capabilities end up targeting a smaller part of the Julia user base.
The whole question was why Python gets more engagement, so this could be a factor.

2 Likes

Don’t forget Chainer! PyTorch has a rich heritage and very much started off as an evolution of those two libraries.

I would go so far as to claim (controversially) that Flux’s continued emphasis on experimentation and “big ideas” to the detriment of stability and polish negatively impacted adoption for the entire Julia DL ecosystem after the initial hype subsided. Thankfully, Flux has found more of a middle ground now and Knet is gradually gaining name recognition, so I’m hopeful that some of the stereotypical criticisms around stability, completeness, docs, etc. can be assuaged.

If anything, this is testament to how subjective “easy” is. As someone who has no “hard science”/engineering background outside of CS, I found MATLAB bizarre, unintuitive and extremely frustrating to use. For example, array indexing, slicing and function/class definitions are completely different from what you’d see in other programming languages, whereas I’d be hard pressed to find some part of Julia syntax that doesn’t exist in some mainstream programming language. Again, past experience matters a lot here.

6 Likes

Then I suspect you haven’t run into too much code that has been vectorized ad absurdum to get the last bit of performance without resorting to mex files, or attempts to implement algorithms for arbitrary dimensionality. And you never know what some crazy PhD student might come up with. This is maybe somewhat challenging to understand: https://github.com/GunnarFarneback/spatial_domain_toolbox/blob/master/polyexp.m

9 Likes

Coming from Matlab I have the opposite opinion. The hoops you have to jump through to make Matlab ‘packages’ performant tend to make them very hard to read. Everything must be contorted into maximal vectorization, while Julia library code can simply and straightforwardly implement the concepts in a natural and expressive way.

I have gone back and read my own ‘performant’ Matlab libraries, and it is like looking into a nightmare.

8 Likes

So far Flux has given me the best experience I have ever had doing DL, barring one issue: stability. Has that gotten better, or does everyone still lock package versions? If it has, I would be willing to take another crack at it.

4 Likes

Same for me. My PhD supervisor was an absolute wizard at MATLAB; every time I got stuck just waiting for MATLAB to solve something, I’d pop into his office and he would do some arcane fix to speed it up 1000x, with about 200 more lines of code (slight exaggeration).

I switched to C++ because it was easier to get working in realistic time than optimized MATLAB.

I can sort of see why a MATLAB refugee might like Python’s under-the-hood style: your eyes get used to skimming over reams of vectorisation and performance tweaks and lighting on the good stuff.

1 Like

To be honest, even though people keep saying there is no need to avoid for loops in Julia, I still prefer to vectorize my code as much as possible. Vectorization feels more natural to me.

2 Likes

IMO it depends a lot on the problem. Some problems are obviously vectorized problems, and for those, vectorization is great. The problem is that lots of problems are technically expressible in a vectorized form, but doing so isn’t obvious. The nice thing about Julia is that it gives you fast versions either way, whereas low-level languages like C only have loops, and older high-level languages only have vectorization (if you care about speed).
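To make that concrete, here is a minimal sketch in Julia (a hypothetical example, not from any package) of the same reduction written both ways; both are fast, so the choice is mostly about which reads better:

```julia
# Loop version: explicit iteration, no temporary arrays.
function sumsq_loop(xs)
    s = zero(eltype(xs))
    for x in xs
        s += x^2
    end
    return s
end

# "Vectorized" version: broadcasting plus a reduction.
sumsq_vec(xs) = sum(abs2.(xs))

xs = rand(10_000)
sumsq_loop(xs) ≈ sumsq_vec(xs)   # true; pick whichever you find clearer
```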

8 Likes

There is no contradiction here: since “vectorized” code is not special, you are free to do what you like.

Generally this point comes up when comparing with languages that provide “fast” vectorized kernels (usually written in C) alongside “slow” control flow of any kind (eg R). In those languages, someone who would prefer for loops in some context is still forced to write vectorized code if they want performance.

Julia simply gives you back the choice between the two alternatives. Use what you prefer.

7 Likes

Absolutely this.

I’m still glad for my MATLAB days because nothing makes you so absolutely rock solid on the basic operations of linear algebra as using MATLAB for anything!

With Julia I tend to vectorise if the code represents a mathematical object that you’d naturally write as a vector, and not if it doesn’t.

1 Like

It’s a choice, but the alternatives aren’t always equal. In some (maybe many) cases there is no efficient way to express your computation in terms of a set of prepackaged vectorized kernels. Then, you either write loops, or the code will be slow.
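As a hypothetical illustration, a first-order recurrence is the classic case: each output depends on the previous one, so there is no obvious prepackaged elementwise kernel for it, but the plain Julia loop is both natural and fast:

```julia
# Exponential smoothing: ys[i] depends on ys[i-1], so it can't be written as a
# simple elementwise/broadcast operation over prepackaged kernels.
function ema(xs, α)
    ys = similar(xs, float(eltype(xs)))
    ys[1] = xs[1]
    for i in 2:length(xs)
        ys[i] = α * xs[i] + (1 - α) * ys[i-1]
    end
    return ys
end

ema([1.0, 2.0, 3.0, 4.0], 0.5)   # 4-element Vector{Float64}
```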

5 Likes

That’s of course true, but Julia’s type system puts an extra wrinkle into this. Specifically, with a for loop you usually need to allocate a container for the result, which requires an element type. In a lot of cases it is trivially known, but in general it may be tricky to obtain. Using a functional style (map, (map)reduce, (map)foldl/foldr) can sidestep this issue, since the element type is inferred from the results.
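A small sketch of the difference (hypothetical toy functions, not from any package):

```julia
# Loop style: the result container is allocated up front, so its element type
# has to be chosen in advance (here hard-coded to Float64).
function squares_loop(xs)
    ys = Vector{Float64}(undef, length(xs))
    for i in eachindex(xs)
        ys[i] = xs[i]^2
    end
    return ys
end

# Functional style: `map` determines the element type from what the function
# actually returns, so the same code works for integers, complex numbers, etc.
squares_map(xs) = map(x -> x^2, xs)

squares_map(1:4)               # Vector{Int}
squares_map([1.0im, 2.0im])    # Vector{ComplexF64}
```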

A lot of exciting experiments are happening on this front, in packages like

https://github.com/JuliaFolds/Transducers.jl

for loops of course remain a valuable tool in the Julia programmer’s toolbox.

2 Likes

Microsoft already seems to be using Julia for neural networks; at least one of their employees does so in the “slimgroup” collaboration work (and a paper building on this work names him, Microsoft, and Julia):


[IBM also has lots of ML Julia packages, the latest I’ve seen: GitHub - IBM/Lale.jl, “a Julia wrapper of Python’s lale automl package”.]

I decided to look into this after seeing a new package (which is not yet registered, while Microsoft’s older OptimSim.jl package is, and is pure Julia): GitHub - microsoft/AzureClusterlessHPC.jl: A Julia package for clusterless distributed computing on Azure

AzureClusterlessHPC.jl is a package for simplified parallel computing on Azure.
[…]

Applications

[…]

  • Generic batch, map-reduce and iterative map-reduce examples
  • Deep learning with AzureClusterlessHPC.jl and Flux.jl
  • Seismic imaging and inversion with COFII.jl and JUDI.jl

https://arxiv.org/pdf/2101.03709.pdf

Our implementation relies on InvertibleNetworks.jl (P. Witte et al., 2020), a recently-developed memory-efficient framework for training invertible networks in the Julia programming language.

Here, we heavily rely on InvertibleNetworks.jl, a recently-developed, memory-efficient framework for training invertible networks in Julia.

Memory efficient convolution layer via matrix sketching

[…] This package contains two implementations:

  • A Julia implementation that overloads NNlib for the computation of ∇conv_filter.
  • A PyTorch implementation that defines new convolution layers Xconv2D and Xconv3D.

15 Likes

At this stage in Julia’s life cycle, not having any support and/or experiments from large players that make money from HPC & ML would be surprising, not the opposite.

5 Likes

When I started this thread in September, it was puzzling why Julia was not mentioned much in ML/AI papers.

At JuliaComputing.com, under “JULIA USERS AND JULIA COMPUTING CUSTOMERS”, you see the logos of all the major (software) companies, even Apple, but in most cases I don’t know what that means. I was curious if it relates to ML, and I thought people might be intrigued. There’s also some Julia here:

I think academic and industry use may be different beasts here. Certainly for ML research, most of the operational pain points in real world implementation do not apply because of e.g. clean benchmark datasets, while there’s arguably more focus on getting people up to speed quickly and using as much existing code (regardless of quality or performance) as possible to save time. This is pure speculation, but I think lack of mature distributed training functionality is also a blocker for larger research groups (which drive much of what gets adopted) to get involved with the ecosystem.

1 Like

People spend a lot of time and effort learning a programming language in depth, and then they stick with it. It’s hard to recognize that what you’ve done isn’t as good as it used to be, or as good as you thought, and it’s hard to start from scratch again.
People also prefer to use what the majority uses: you can find most things you need already done.

4 Likes

Programming languages, together with their ecosystems, are more or less isolated silos; it’s rather hard to switch.

Consider a single professional who has worked in language A for five years, learning the ecosystem and gaining work experience. You have bills to pay, children to feed, whatever.

The professional sees language B, which is a little better in some ways. Will she spend the time to learn it and get a new job in that language with zero work experience? Or maybe convince the whole company to switch?

2 Likes

Especially when the majority of job opportunities will be based on experience in language P… ython

3 Likes

Well, and there is also a very popular J… … …ava language on the job market.