Learning Julia for scientists who are beginning programmers

Many of us coming from a biological field have taken 0 programming courses. So we will need to learn from books and online manuals that teach programming with Julia as the only language to understand. Most of us never deal with types in R. We understand the basic numeric or character but that’s it. I’m still not sure how to optimize my code and I’ve read the manual and still I struggle with no programming background. It’s hard to even understand the language used.

8 Likes

At that level, I wouldn’t worry about optimizing your code. Learn to write code that is correct and flexible. Call mature libraries for your core computations, and trust them to do your optimization for you — Julia’s performance potential has enticed many people to develop high-performance libraries. Over time, as you get more used to programming and learn what the bottlenecks in your code are, you can learn how to improve performance in small chunks if you want/need to.

Most code is not performance critical, computers are fast and suboptimal code is often fast enough, and premature optimization is a well known pitfall in software engineering.
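
To make "learn what the bottlenecks are" a bit more concrete: Julia's built-in `@elapsed` macro returns how many seconds an expression took to run. Below is a rough sketch, not a benchmarking recipe. `sum_of_squares` is a made-up stand-in for whatever function of yours you suspect is slow.

```julia
# Hypothetical workload, invented for illustration only.
function sum_of_squares(n)
    total = 0.0
    for i in 1:n
        total += i^2
    end
    return total
end

# The first call to a function also compiles it, so run it once
# before timing to avoid measuring compilation.
sum_of_squares(10)

# @elapsed returns the run time of the expression in seconds.
t = @elapsed sum_of_squares(1_000_000)
println("elapsed seconds: ", t)
```

If a number like this is small compared with the rest of your analysis, that function is probably not worth optimizing.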

25 Likes

I agree, I really rely on other packages to do most for me. However there are novel algorithms that do exist and only within our field, so at times we do need code that is high performing and memory efficient. Most of our good programmers are terrible about building and maintaining these packages. Many have but moved to Julia over Fortran (animal breeding is the technical field name I’m in). All packages are historic and built in Fortran in the 90’s or early 2000s.

Do you have a favorite “learn programming with Julia from the ground up” resource? (Book or other)

My biggest complaint learning Julia is right off the bat they say something like “this is like C, lisp, Fortran”, well that’s great but I’ve not used any of those lol. I’ve slowly picked up where things like structs come from in other languages and how to use them but it’s very slow for me. Feel like I should just go learn other languages first and come back…
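
For readers in the same position, here is a minimal sketch of what a struct is in Julia: a type of your own that bundles a few named values. The `Animal` type, its fields, and the `describe` function are all invented for illustration and do not come from any package.

```julia
# A hypothetical record for one animal; the field names are made up.
struct Animal
    id::Int
    breed::String
    weight_kg::Float64
end

# A function that only accepts values of the Animal type defined above.
describe(a::Animal) = "Animal $(a.id) ($(a.breed)) weighs $(a.weight_kg) kg"

cow = Animal(1, "Holstein", 650.0)

println(typeof(cow))    # shows that `cow` has the new type
println(describe(cow))
```

Fields are read with a dot, as in `cow.weight_kg`, much like `$` on a list or data frame in R.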

2 Likes

Considering “best” is fairly subjective, I like the style of Think Julia. It’s possible to buy a hardcopy, but it is open source so the PDF and web versions are free.

(added edit) If you prefer tutorials to follow along, check out the JuliaLang YouTube playlists. Before every (maybe?) JuliaCon, there are workshops, some are very general and introductory, while some are more focused (on DataFrames.jl for example). Or just search for “juliacon workshops” perhaps?

5 Likes

I’m also a biologist without any formal education in computer programming. At most, I took a class that taught me to use R beyond the usual quick analysis. So I learned a bit about code reuse, scripts, things like that, but it was very basic. I’ve been using Julia for years now, but very sparingly, so I’d consider myself an intermediate user, not striving to be an expert anytime soon. And I learned by doing. I’m sure there are more tools, tutorials, videos and other resources now, but when Julia was at 0.5 or 0.6, the hurdles for newcomers were even higher than now, I believe. However, I was tasked with a project, decided to use Julia for it just because I wanted to and was allowed to use whatever I wanted, and did just that. I’m sure my code is still messy, unoptimized, and maybe not general enough for public use, but if I need to program something now, I go for Julia. I haven’t touched R in years, and I feel I can read other people’s code and get at least the gist of it. But the best part, at least for me, is that I know I can play with any repo locally, break it if I want to, and try to understand how it works, and things are usually in Julia, so I don’t end up having to dig into C++ or C. That was never the case in R. I realize that this doesn’t apply to everyone and we do need more tutorials, better onboarding docs, and better tooling for teaching the language, but Julia really offers a nice playground to get into it, break things, explore, learn and do.

7 Likes

Imho, Structure and Interpretation of Computer Programs is still great for learning programming. While the book uses Scheme – a simple Lisp dialect – the presented concepts are language agnostic and generally useful.

2 Likes

I’m working on a (free) book that aims to fill this gap. However, a preliminary first version won’t be available until later in the year.

You can skim through it here, although I emphasize that the book is not even close to being finished in terms of topics, writing, code, etc.

Any feedback is more than welcomed!

13 Likes

That’s great! I can see you already have quite a start though, looks very helpful to people like me. I will keep following along and will get through it. Even the basic stuff on VS Code is very helpful as I struggled when I first started. I’ll share with students as well. Thanks!

We were all beginners at some point, though I wasn’t a scientist at the time! Nor am I now.

I can recommend: https://www.amazon.com/Tanmay-Teaches-Julia-Beginners-Springboard/dp/1260456633

It assumes no programming experience. The machine learning part, which gives the book its subtitle “Springboard to Machine Learning for All Ages”, is only about a third of the book, if I recall, and it doesn’t assume any ML/AI knowledge either.

Tanmay was only 15 years old when he wrote the book, far from a beginner, actually a child genius. I admit I only scanned the book, and all the Julia books I’ve bought, because I’m not really a beginner, don’t need the books, kind of bought for others at the company:

From the Publisher

Tanmay Bakshi is a 15 year-old author, AI/ML expert, TED and keynote speaker […] His goal is to help at least 100,000 aspiring coders in learning how to code through his workshops and seminars. Tanmay’s YouTube channel called “Tanmay Teaches” has earned him numerous recognitions like IBM Champion for Cloud and Google Developer Expert for Machine Learning.

He has also written a book on another language, “Hello Swift!: iOS app programming for kids and other beginners”, and one on Go (with another author), and he is a coauthor of “Cognitive Computing with IBM Watson: Build smart applications using artificial intelligence as a service”. From that book:

This kid coder’s journey began at the early age of five, when he developed with DOS batch files, C, FoxPro, and Visual Basic and then created iOS apps. When he was nine, his first app, tTables, was published to the iOS App Store. This news inspired many kids who always wanted to code but never knew how to take the first step.

Now, are there books for scientists, for programming, in Julia or not? I see many domain-specific Julia books already, but I’m not sure any of them are beginners books or best for that, maybe for e.g. beginners in linear algebra (I forget how much programming experience that book assumed).

I feel like Julia can be a great first language. That’s not based on my own experience, though I have read people confirming it worked well for them that way. But to quote: “we want Julia to be your final language”! Most computer programmers and computer scientists learn many languages. It’s unrealistic to learn just one and expect that others will never replace it, at least in your own practice.

While people have started with Fortran, C, Pascal and many others, I would avoid those and others that are not interactive (with a REPL). About “this is like C, lisp, Fortran”, I would also avoid Lisp, for other reasons.

Some languages are meant to be educational languages, more than languages that are actually used. Scheme, a variant of Lisp, is one such. The book “Structure and Interpretation of Computer Programs” is, I understand, great and uses Scheme, though note that “MIT Press published the JavaScript edition in 2022.”

It’s well known that JavaScript has (good and) bad parts, so I guess it’s a sign of the times, since Scheme is becoming relatively unknown/unpopular and JavaScript is extremely popular. I’m very skeptical that scientists should use JavaScript, even as such a learning experience. This book should be rewritten for Julia… :slight_smile: Which is almost perfect, with (almost) no bad parts.

The world is very different from when I was 9 years old, didn’t know English, and was saving for my first computer. I learned BASIC at 10 by learning enough English to read the manual of the Oric-1, which, like many 8-bit home computers, came with BASIC.

There was value in learning BASIC then, since everyone had it and nothing else, and there was no web or community like Stack Overflow. Sometimes it helps to go with the flow. If I were learning machine learning I would be tempted to start with Python and not go against the grain, though I’m conflicted: should I rather use Julia? Eventually, very possibly. Python is a very plausible starting language, which is why Keith Packard made the Snek language (previously named Newt), a simplified Python-like language/implementation for embedded programming, for teaching kids robotics on Arduino.

Python helps to learn Pandas… and it has excellent docs, and I guess also other learning materials.

You want to learn dataframes, but people argue that Julia’s DataFrames.jl is already better than Pandas: more logical to learn and use.

For dataframes, R is a very plausible language to use or learn. Many concepts translate, and with Tidier.jl even the same syntax, but then you need that package for dataframes, instead of using the more common packages and “native” syntax.

https://pythonlang.org/

They use triple quotes which doesn’t make sense for a comment.

[This applies to Julia too. I thought it was clever; well, it’s an obscure hack that I like. Note that some of the criticism [there] will be subjective.]

Wide Spread Because Of The Wrong Reason


  • Python is seen as “the beginner’s language”, and it really should not be. As said earlier on this website, Python has numerous issues that stop the newbie from quickly getting used to other PLs, by lacking basic functions.
  • Python should only be used if you wanna handicap yourself into an inferior PL, just to see what you can do. No more, no less.

If you must use Lisp, or want to learn it, consider Clojure. It may or may not help biologists and other scientists. It can’t be as fast as Julia because of immutability (at least by default; I’m just not sure it’s easy to opt out, but it IS better to default to immutability for correctness of code). Its designer, Rich Hickey, is a thought leader, and he gives the best programming talks, e.g. his PLOP talk. That one and many others are programming-language agnostic, though others are very Clojure-specific.

1 Like

If you prefer video lessons, there’s a good YouTube channel called doggo.jl which has lots of lessons, ranging from absolute basics to relatively advanced stuff.

5 Likes

I am a biologist, but I do have a computational background as well. I work as a Software Engineer at a biomedical research institute.

My advice is to find some local help if possible. US RSE is quickly growing and trying to fill this very gap. Reaching out to your local Research Software Engineer organization may be fruitful. You could also see if there are any local meetups near you.

There are some online resources such as Julia Academy that may be helpful as well.

Frankly, the thing that I found most helpful is hanging around here and Slack / Zulip. Then ask a lot of very specific questions - and work on figuring out how to ask better questions.

11 Likes

Thanks for your well thought out answer. I’m still looking over some of it again. I think Julia can be a great first language as well, but it’s been much more difficult than I thought at first, coming from mostly R and bash. Thanks for the book suggestions. I’ve been reading a few, but slowly. I don’t use Julia every day. Perhaps that’s my problem, and I should just give up R and use Julia as much as possible. But that’s difficult when R has so many great packages and is more user friendly. RStudio alone is easier than VS Code imo. I’ll keep learning, thank you.

Thank you. I knew about Julia Academy and forgot. One of my comments came from one of these, I think. It was an intro to Julia webinar, and I still couldn’t understand what the symbols in the functions meant, so I got lost. There are just a million ways to do things in Julia, and that makes it complicated for a part-time guy with no programming background.

It’s not your problem, it makes sense to use the tools that you’re productive in. While I can attest to the fact that taking the plunge and using Julia exclusively will help you learn faster (and I personally think that I am now much more productive than I ever could have been with the mix of R and Python I was doing before), there was definitely a period where I was incredibly slow. I was lucky to be in a context where I had the luxury of taking my time.

That said, a couple of suggestions if you want to spend more time in Julia:

  1. If you do data science like stuff with tidyverse packages, use Tidier.jl
  2. If you use RMarkdown in RStudio, consider switching to Quarto, which can also run Julia (I think you can even do it within RStudio)
  3. Use the interop packages - RCall.jl to run R code in Julia and JuliaCall to run Julia in R.
  4. Don’t be so hard on yourself, and come here or the other social channels to ask for help - IMO the best thing about Julia is the community!

And as an aside, really great to have more biologists interested in Julia. We’re still a small community, but growing!

11 Likes

I’m sure I would get better with full-time use, but as I said, the package ecosystem is much more established in R, so I almost always have to use R for something. Most of my field has just started to explore Julia and only a few packages exist for us so far. This makes the changeover difficult until more statistics and field-specific packages come along. For pure data science work, I agree, I think I could be equally productive. I had started to explore Tidier.jl, and it looks nice. I do use Quarto, but when I used it with Julia, it seemed a bit awkward for some reason. I have not tried RCall.jl. It seems like every time I try to run these interop tools in any language I get goofy issues popping up constantly, so I often avoid them. Maybe RCall is exceptional; I will try it soon. Discourse has been very helpful, but I also realize that if I need to be productive and have a short turnaround, having to constantly post on boards to learn is very slow for newbies… So I’ve constantly had an on-again off-again situation with Julia :). Documentation has really started to improve with Julia Academy, books like the one from the DataFrames.jl author, and Think Julia, so I will try to go back to Julia and maybe try to do a full analysis with Julia if possible. Thanks for all your help!

2 Likes

I will also put in a plug for JuliaHub here. They have an excellent search capability for packages, docs, and even code. For example, here is a JuliaHub search for “biology”. Obviously biology encompasses a huge range of topics, but maybe it could help your decision just to see what packages exist (or don’t) for your subfield.

2 Likes

Despite the overhype, LLMs such as ChatGPT are excellent tools for learning. Ask ChatGPT how to perform a task in the language you’re learning, try the provided code on sample data, and ask for explanations when you encounter errors.

This comes very cheap in terms of mental effort, and it provides a good starting point. You see the code structure and what the answer should LOOK like.

Of course, ChatGPT doesn’t replace in-depth information from reliable sources. I start with a couple of prompts and end with the manual. Still, it definitely serves as a helpful aid in the learning process, especially near the start when it’s easy to get overwhelmed. You first have to learn what the “content” of code is.

1 Like

I usually share this resource, another one about the growing set of Julia bio-packages:
https://gensjulia.pages.dev/biology/

1 Like

There is no doubt that there is a steep learning curve for learning to program. But it is well worth it. I think a scientist who can’t program has one hand tied behind their back.

As a practicing scientist, I’d suggest that you start with a book like Kaminski’s “Julia for Data Analysis” and spend about a year just processing data and creating plots. Learn to do everything you might be tempted to do in Excel or whatever plotting package you use in Julia. Learn the syntax, learn the idioms, learn the core packages. Just become comfortable with working with the language. Learn which packages are useful and how to use them. Don’t even think about developing your own package.

Then after a year or so, try to make your own package to do one simple task. At first just make it work. Don’t worry about optimization or elegance. Once it works and you’ve developed a broad test set to prove it works, then think about improving the code quality and optimizing it. Make sure at each step that 1) you haven’t broken anything and 2) you’ve actually improved the code. Speak to others. Ask for advice.
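
As a rough idea of what a small "test set" can look like, here is a sketch using the `Test` standard library that ships with Julia. The function `mean_value` is a made-up example standing in for whatever your package computes.

```julia
using Test

# Hypothetical function under test: the mean of a collection of numbers.
mean_value(xs) = sum(xs) / length(xs)

# Each @test checks one expected result; @testset groups the checks
# and prints a summary of how many passed.
@testset "mean_value" begin
    @test mean_value([1, 2, 3]) == 2.0
    @test mean_value([2.5]) == 2.5
    @test mean_value([-1, 1]) == 0.0
end
```

Re-running a file like this after every change is one simple way to check that an "improvement" has not broken anything.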

Then you may be ready to branch out.

It will take time. In fact, learning to program takes a lifetime. I’ve been learning for 40 some years now.

3 Likes

Thanks, I do use several of the LLMs, and I get mixed results. When I first started, ChatGPT could hardly get anything correct, and I commented about it on Discourse. I think a guy named Logan fixed it, but I’m not sure… I have not tested it as extensively as before, but it seems to be better. I do not pay for GPT4, so who knows, maybe that one is far better yet. It can get simple things, I think, but it really struggled with any external packages. I presume it was not trained on Slack or Discourse results, maybe? I have no idea how they select data to use for training.

I do like to explore with these, but find books and manuals more helpful to my learning and understanding how to get good at programming in Julia, just need more time and practice. Thanks for your comment.