Hi everyone,
Here’s my brief question:
What do you all recommend as a good post-v1.0 “Learn Julia” book for a non-software-developer / data analyst? By “non-software-developer” I mean someone who lacks both the patience and the natural talent for low-level, “low productivity” languages like C, C++, and Rust (Java and C# are only slightly better). By “data analyst” I mean someone who regularly deals with large amounts of data (tens of GB to tens of TB, usually unstructured or semi-structured), but has little to no need for ML/DL/AI-related algorithms. Huge bonus points if said book includes practice problems/exercises (I learn best by doing), and if it covers both using built-in functionality/modules and developing new functionality (including wrapping existing C/C++ code).
Does such a book exist? If not, how do people like me learn Julia?
Here’s the longer version:
My computing experience started with being primarily taught Matlab as an undergraduate engineering student. Though I took a couple of computer science courses, C and C++ always seemed too low-level for my liking; pointers and indirection required more thought than I wanted to give them, and it always took much longer than I expected to accomplish something useful. So Matlab was where it was at.
As a graduate student, I rather quickly ran headlong into some of Matlab’s issues/weaknesses, chief among them its cost. I also rather quickly found that there’s often only one way to make Matlab code perform well (i.e., figure out how to vectorize your code), which caused problems. In looking for an alternative, I discovered Python, which I fell in love with, built much of my dissertation on, and have done most of my subsequent “computing” work with. For the most part, it’s a great language: well designed, strongly biased towards the human side of computing (rather than the machine side), huge community, lots of libraries/modules, easily found threads with solutions to similar problems, etc. And I love the interactive nature of Jupyter Notebooks / IPython - I do a lot of my work in those two environments.
I’ve occasionally run into Python’s weaknesses over the years, chief among them the GIL and its implications for multi-threaded performance in CPU-intensive algorithms. I’ve always either stuck with single-threaded code and eaten the performance hit, or gone multi-process and taken less of a performance hit.
Then, several months ago, I needed to run a fast multi-threaded regex search, and hit a very hard wall. So I’ve been looking for an alternative to Python that has most of its strengths but also good performance and strong multi-threading support out of the box. Of everything I’ve looked at (Rust, Nim, D, Go, and revisiting C/C++), Julia looks like it checks the most boxes. The community isn’t as large, but that makes sense; it’s still a relatively new language. Jupyter Notebook support is fantastic, there’s out-of-the-box support for modules, performance looks pretty darn good (compared to Python), etc.
That being said, I have yet to find a good resource for learning Julia. I’ve played around a little with Julia’s regex capability, but as I recall it’s based on libpcre, which is… quite slow. And the library/module that wraps RE2 (I think it’s RE2) hasn’t been maintained and doesn’t work with Julia 1.5. So I’m rather quickly finding that, to do the thing I want to do, I’m gonna have to leave the realm of “use what’s in the box” and move into the realm of “develop something new” or “fix an unmaintained thing.”
Thus my question about how best to learn Julia.
Thanks.