I'm not sure precisely where this discussion should sit, but this seems like a good place to start.
Kaggle.com is a neat site for data science, hosting competitions in data analysis and machine learning, as well as instructional material and learning resources. Users can also post code and descriptive text in Jupyter notebooks and run them as “kernels” on Kaggle’s servers.
The site hosts predominantly Python and R code, but in principle it supports Julia code. That said, it’s clear this isn’t actively supported: the Jupyter notebooks don’t offer a Julia option, and even scripts that run their example code fail with an error when using DataFrames (see here for the Kaggle discussion).
I’m wondering if it would be worth it for someone at Julia Computing, or someone with more knowledge than me, to reach out to Kaggle and offer some support. They don’t currently have the user demand to work hard on their end, but it seems like it could be a useful avenue to increase interest in using Julia for data science.
Just went on http://kaggle.com/contact and sent this:
Dear Kaggle team,
What can the Julia community do to support you in making Julia available as a language for playing with your amazing data sets in your wonderful competitions?
Julia is a young language taking aim at a lot of the strengths of Python, MATLAB and R, with a focus on fast, simple technical computing. It supports notebooks, too!
Any answer is welcome, including here: Getting julia support to Kaggle. Julia would love to be part of Kaggle!
I don’t think I would know enough to support them directly, but hopefully someone here would, and it would be cool if they reply!
That’s great, thanks! Someone (from Jupyter, it looks like) also responded in that Kaggle thread suggesting that they file an issue about their problems with notebooks, so they’re getting hit from multiple fronts.
For what it’s worth, there is a “Getting Started” competition titled “First Steps With Julia”. However, as I recall, even when I went through it a year or so ago (around the time of Julia v0.4), their tutorials weren’t exactly “canonical” Julia. If someone wanted to provide an updated version of this tutorial, that might be a good place to start.
Time is passing. No updates on this?
Now that Julia 1.0 is out, maybe it would be worth revisiting this? Is there anyone at JC that is responsible for this kind of outreach effort?
I think Kaggle would be a good place to promote Julia in the ML community, too.
Maybe the way to do it is to upvote the following:
If they see massive support maybe they will reconsider.
Definitely not a priority, which is really sad since it goes back to the chicken-and-egg problem.
OK, I’ve posted a reply. Just upvote mine or add your own.
Would be nice to have more people comment on that post.
I posted. Hope they add it, though I’m sure the backend infrastructure for this is challenging.
Given that it’s Jupyter-based (and Julia put the Ju in Jupyter), I’d think it would be pretty easy, actually 🤷
I think some work would need to be done to make the user experience as smooth as the Python version and not spend so much time on compilation, like using PackageCompiler(X) to bake the standard packages into the sysimg. But what’s standard, and how would it handle using different package versions? There might be other reasons I’m missing. The Kaggle team member did post “Supporting multiple languages adds a lot of work for us,” in that thread.
I’d wait until Julia has a major ML interface. MLJ looks promising. As soon as MLJ incorporates Knet, Flux, and options for hyper-parameter tuning, I expect a lot more widespread use of Julia for ML. Kaggle support will naturally follow.
What if Julia Computing were to host an ML competition on Kaggle? Maybe that could work.
I think that would be cool, but I’m not sure it’s something Julia Computing is in a position to spend money on. It might be cheaper to offer something like a year of free support to the Kaggle DevOps team…
Or maybe a company using Julia (Invenia, looking at you!) could sponsor a competition?
It would be good if we could get as many people as possible to chime in on the thread here:
I met the Kaggle CEO - Anthony Goldbloom - last year and he reiterated what the Kaggle staff said in that thread. They did not see sufficient Julia usage when they had support. After their acquisition by Google, they had to redo a lot of their infrastructure and ended up dropping Julia. When the time is right, and there is significant community demand, he said they would certainly revisit it.
A lot has changed in the last 2 years and the Julia community is at least 4-5x larger now.
From what I know, there are quite a few folks there who champion Julia. So it would greatly help for folks here to chime in there. Ideally with a somewhat detailed comment on why you would like to see Julia on Kaggle. Certainly upvote the topic and other comments you agree with at the very least.
The 2021 Kaggle Machine Learning and Data Science survey will close on Monday. It would be great for Julia users to mention their use of Julia in that survey. The latest Stack Overflow survey put the size of the Julia community at about a fourth of that of R. If Kaggle sees the size of the Julia community trending toward that of R, that might be a good argument for including it in the future.