Julia motivation for machine learning

tk3369 · April 18, 2019, 2:13am

Out of curiosity, in the domain of machine learning and NLP -

What would be the reasons to choose Julia over Python?
What would be the reasons to choose Python over Julia?
What are some of the gaps in Julia as compared to Python?

baggepinnen · April 18, 2019, 4:41am

1. When coding in Julia, your machine learning library is likely written in Julia too. You can inspect the code, understand it, modify it, and debug it. You do not have to write your code in a DSL using a subset of the host language; you can write pretty much any valid Julia code and use it together with AD. This holds to such an extent that learning works even if it’s an afterthought, using a model that was originally built for simulation only.

2. You can write fast code easily.
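
To make the point about AD on ordinary code concrete, here is a minimal sketch of forward-mode differentiation with dual numbers. It is written in Python only for illustration and is not the API of any Julia or Python library; `Dual`, `derivative`, and `simulate` are hypothetical names. The idea is that `simulate` is plain code with a loop and a branch, written with no AD in mind, and it can still be differentiated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative (hypothetical helper, illustration only)."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

    def __gt__(self, other):
        # Comparisons look only at the value, so branches behave as usual.
        return self.value > _lift(other).value


def _lift(x):
    """Wrap a plain number as a constant Dual (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def derivative(f, x):
    """Differentiate f at x by seeding the input derivative with 1."""
    return f(Dual(float(x), 1.0)).deriv


def simulate(x, steps=3):
    """An ordinary 'simulation' with a loop and a branch, unaware of AD."""
    state = x
    for _ in range(steps):
        if state > 10:
            state = 0.5 * state
        else:
            state = state * state + 1
    return state


print(simulate(2.0))              # 13.0
print(derivative(simulate, 2.0))  # 20.0
```

The same `simulate` function runs unchanged on plain floats and on `Dual` inputs, which is the property being described: the model does not need to be rewritten in a restricted sub-language to be differentiated.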

oxinabox · April 18, 2019, 7:29am

If we consider a traditional NLP pipeline:

Sentence Segment / Tokenize (WordTokenizers.jl, though it is useless if you need to handle Chinese etc.)
POS Tag (WIP: https://github.com/JuliaText/TextAnalysis.jl/pull/131)
Parse (we've got nothing; PyCalling NLTK at least works)
Named Entity Recognize (we've got nothing)
Word Sense Disambiguate (we've got nothing, but neither does Python; WSD remains an open problem)
Coreference Resolution, i.e. linking multiple references to the same named entity, e.g. pronouns (we've got nothing; Python has a few things)

So you can see we have a number of gaps.
This GSoC, @avik, I, and others will be mentoring a few students to try to close some of those gaps.

Of course, in a deep learning, throw-out-the-last-90-years-of-linguistics approach,
you actually need very little of the standard pipeline.
You can be happy enough with your tokenization,
plus some pretrained embeddings (Embeddings.jl),
a kickass deep learning library (strong vote for Flux.jl),
and a useful set of data tools (MLDataUtils.jl).
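
As a rough illustration of how little that reduced pipeline needs, here is a minimal sketch in Python (chosen only for illustration; it does not use or mirror the API of Embeddings.jl, Flux.jl, or any other package). The tiny `EMBEDDINGS` table is made-up data standing in for real pretrained vectors, and `tokenize` and `embed_sentence` are hypothetical helper names.

```python
import re

# Made-up 2-dimensional vectors standing in for real pretrained embeddings.
EMBEDDINGS = {
    "julia": [1.0, 0.0],
    "is": [0.0, 1.0],
    "fast": [1.0, 1.0],
}
UNKNOWN = [0.0, 0.0]  # fallback for out-of-vocabulary tokens


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def embed_sentence(text):
    """Look up each token's vector and average them into one sentence vector."""
    vectors = [EMBEDDINGS.get(token, UNKNOWN) for token in tokenize(text)]
    if not vectors:
        return list(UNKNOWN)
    dim = len(UNKNOWN)
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


print(tokenize("Julia is fast!"))    # ['julia', 'is', 'fast']
print(embed_sentence("Julia fast"))  # [1.0, 0.5]
```

The resulting fixed-length vector is what would be handed to the deep learning model. A regex tokenizer like this one splits on whitespace and punctuation, so it shares the limitation noted above for languages such as Chinese.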

Just to throw out a few more packages

TextAnalysis.jl has a number of things, including LSI, various text cleaning tools, and sentiment analysis
WordNet.jl is a fairly reasonable WordNet front end
CorpusLoaders.jl has some loaders and predefined datadeps for some data (so does MLDatasets.jl)
MultiResolutionIterators.jl is how I think text data should be represented

anon92994695 · April 18, 2019, 9:47am

Look at all these opportunities for someone to write useful libraries! If I weren’t up to my armpits in my package/work/research I know I’d be looking at some of these gaps.

Tamas_Papp · April 18, 2019, 10:37am

I think that most Julia packages were written because the author had a problem to solve (in work/research), not because they wanted to make a contribution to fill a gap.

So I imagine these gaps will be filled when someone needs the functionality badly enough.

anon92994695 · April 18, 2019, 10:49am

You make a good point. My package was half born out of the fact that I needed random forests, regression, etc. IN JULIA. And yeah, Julia is currently my research base for good reason. I do ML research in general though, so pretty much any gap in ML capabilities is of interest to me.

Making a package to make a package doesn’t have the same heart and soul in it. Unless someone is genuinely interested in the problems contained therein, there’s a good chance they won’t maintain/improve it as time rolls on.

To be fair, most of the things available in Python could readily be improved in Julia; that’s motivation enough, in my opinion, to get cracking :D.

oxinabox · April 18, 2019, 1:29pm

Many of those packages I listed are me building things after I needed them.
Like, I will complete a project, and some time later be like:
“That was a hell of a hack, let’s make the stuff I wish I had when I started”
