Developing a Beginner's Roadmap to Learn Julia High Performance Computing for Data Science

One way to proceed is to model your roadmap on training that already works. One of the best examples is fast.ai, which Jeremy Howard produces and gives away for free. In his interview on the Lex Fridman podcast, he makes a few observations:

  • most analysts don’t process huge datasets that require networks of computers
  • most analysts are just working on their single workstation with a single GPU
  • deep learning is the most advanced technique available, and it’s not too hard to create state-of-the-art models today

The very first exercise has you train a model to recognize cat pictures (historical fun fact: the internet was built to share cat pictures).

The course currently uses Python. I believe he is also developing a Swift version, and there is an effort to create a Julia version. See also the forum post at fast.ai.

So one thing you could do is contribute to the Julia version.

Besides that, I think a “traditional” introduction to deep learning would start with tree models (CART), boosted trees, random forests, and support vector machines. All of those and more are available in the MLJ toolbox. One approach, and a good one, would be to write tutorials that walk people through the MLJ toolbox. You might expect that this has already been done, and it has: Data Science Tutorials in Julia. You could add to those in either breadth or depth. Or you might find the tutorials too advanced and create simpler, baby-step tutorials for absolute beginners in data science. You need to define your audience, find your niche, and go for it!
