Preaching Julia to biologists

yakir12 · September 20, 2018, 6:22pm

Nice, thanks for the feedback. Yea, you’re right about how people intend to use the functionalities of a language. And to be honest, I’m not so much out to switch them over to programming in Julia, or from R/MATLAB/Python to Julia, as I am to have them understand why we prefer it.

I ~~will however~~ mention that Douglas Bates switched. For one, I didn’t know that. Secondly, one of the attendees is heavily into stats (but not necessarily programming) and he’ll appreciate that fact (as do I). Thanks!

ChrisRackauckas · September 20, 2018, 6:38pm

I honestly think that talking about programming language details will only be helpful when talking to people looking to do some heavy package development. To everyone else, show them packages and applications, and discuss how Julia enabled it. Otherwise you (I) end up giving one of those talks where you stumble on a definition of metaprogramming while trying to explain why code generating code is helpful, to someone who hasn’t programmed long enough to care.

yakir12 · September 20, 2018, 6:43pm

The main reason I’m not showcasing some of the top packages (listed in posts such as What package[s] are state-of-the art OR attract you to Julia, and make you stay there (not easily replicateable in e.g. Python, R, MATLAB)?) is that most/all of those packages are relevant to people in: machine learning, systems biology, numerical analysis, etc. The people I’m dealing with are not working in any of those fields. It doesn’t mean they wouldn’t be able to appreciate what Julia did in those cases, but it would take as much work to explain Julia’s contribution as it would to explain the (I think, pretty simple) concepts already in my presentation. But please, let me know if I’ve missed some obvious packages!!!

And again, thank you all for the awesome feedback!!!

Zach_Christensen · September 20, 2018, 6:50pm

Honestly, the usability gap between python and R keeps getting narrower. I was in a lecture yesterday where someone was arguing that R wasn’t as good with file systems and regular expressions as python is. Because this is all built into R, the advent of the tidyverse has made it as easy to do as in python, with some minor syntactic sugar (this is of course subjective). Discussions around syntactic ease would likely yield little progress in any audience familiar with coding. The best syntax is the one they’re used to.

I think a focus on the implications of Julia’s unique performance advantages is key (it’s what got me) and how that would affect their workflow.

I’m in neuroimaging research and it’s just not feasible to expect the comparatively small community that is specifically interested in computational neuroimaging to be able to recode everything in a reasonable amount of time. We can’t stop all of our research and redo our entire workflow while Julia works out the kinks in porting every C++ library. However, because Julia integrates with other languages so well I can call Julia home without losing all of my furniture in the move.

yakir12 · September 20, 2018, 6:58pm

I agree. Trying to stay clear of generalizations like that.

I guess you’re referring to: speed and flexibility? Please be more specific if I’m wrong.

Here you mean the zero overhead in regards to PyCall.jl, JavaCall.jl, RCall.jl, CCall.jl, Mathematica.jl, MATLAB.jl, right? I mention this here.

ChrisRackauckas · September 20, 2018, 7:03pm

What can performance do for you other than make you go “oh wow, that’s fast!”? That’s important to highlight. How long does a differential analysis take to do on a single-cell RNA-seq dataset with 200 cells? 2000 cells? How are microfluidics devices improving to increase the number of possible cells, and how is that making computation the bottleneck? In this sense, a 10x performance advantage might mean being able to handle the dataset from new Illumina machine X paired with cell sorter Y and get the analysis done in 2 weeks, while with current speeds it can take 3 months! That’s how you highlight where speed matters in a domain-specific way.

And it’s always a give-and-take like @antoine-levitt said. You just agree that yes, for basic statistics and such you will not notice a competitive advantage over other researchers with your newfound speed. But it does come into play when you’re dealing with big data that needs speed + parallelism, and problems where this applies are X, Y, Z.

Pick the packages that excite you. Bio.jl has really cool sequence structures that seem to be pretty fast. Find a nice benchmark against other tools. Find (or make) some domain-specific packages that will excite the people in your audience/department. Then point outside of the domain for pizzazz.

KZiemian · September 20, 2018, 7:05pm

Really nice presentation. Can I borrow a few of your concepts if I do something similar, but for a different audience?

yakir12 · September 20, 2018, 7:06pm

Please do!

Zach_Christensen · September 20, 2018, 7:23pm

Yes speed is huge. I’ve spent a large amount of time porting libraries/functions from other languages into R/python. However, the combination of speed and flexibility (interfacing with other programs) is what makes a difference. If all I cared about was speed I’d just use C, but I need to easily connect all the different code others have written.

I did notice this. I just wanted to build on why that was such a big deal to someone that may already feel comfortable in their current workflow. It looks like a great presentation.

In my field there’s enough memory usage that speed quickly translates into money. My current project is going to end up using about 50 TB of storage. You can imagine how much memory it will use to preprocess and analyze that.

yakir12 · September 20, 2018, 7:26pm

I really liked this, I added it to the notes. Awesome, thanks again.

ChrisRackauckas · September 20, 2018, 11:18pm

Oh, don’t make me imagine. How long does it take with conventional methods? For your types of codes, what kind of speedup do you usually see? What’s the actual speedup that happens in this case? If you run through that and show “look, 30x here means a day instead of a month”, I feel like it has a lot larger of an impact.
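As a rough sketch of the arithmetic behind that kind of statement (not from the original posts): the helper name `runtime_after_speedup` is hypothetical, a month is assumed to be 30 days, and the inputs are the illustrative figures mentioned in this thread, not measurements.

```python
def runtime_after_speedup(baseline_days, speedup):
    """Return the runtime in days once a job runs `speedup` times faster."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_days / speedup


# "30x here means a day instead of a month" (assuming a 30-day month)
month_at_30x = runtime_after_speedup(30, 30)

# A 3-month analysis (assumed to be 90 days) with a 10x advantage
quarter_at_10x = runtime_after_speedup(90, 10)

print(month_at_30x)    # 1.0
print(quarter_at_10x)  # 9.0
```

The second figure, about nine days, is in the same ballpark as the "2 weeks instead of 3 months" example given earlier in the thread. Real numbers for a talk would have to come from timing an actual workload.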

Tamas_Papp · September 21, 2018, 5:22am

My impression then is that your audience is mostly users who are not really interested in coding other than scripts that use existing, mature libraries, and at the same time Julia may not (yet) have all the tools they are used to.

In this context, I would hesitate to recommend Julia to them at this point. No matter how much we like a language, we should keep in mind that it may not be the best option for someone else at this point. Not everyone is an early adopter.

I think you are having a difficult time preparing this talk (as demonstrated by this discussion) because of this. Perhaps you can switch strategy, and instead of telling them why they should like Julia, explain what you like about it. Then they can decide for themselves, and it is perfectly fine if they just wait it out, at least they learned about the existence of another language they may explore in the future.

yakir12 · September 21, 2018, 5:30am

I really agree with this. I’m gonna add a slide or two mentioning the libraries I like (as Chris suggested) and mention that this is for early adopters and might not be for everyone. Apart from that I feel like the presentation does show why I love Julia (I could be more clear on that though).
Thanks!!!

kevbonham · September 21, 2018, 4:26pm

Another way to think about this though is that, if it’s already relevant and being used by people in such disparate fields, it’s not far-fetched to believe it will be useful in theirs. And if they can use a single language that some of the best new packages are being written in AND it’s good for all the stuff they already do AND the community is great AND it’s fast, that all seems like a pretty decent argument.

yakir12 · September 21, 2018, 4:30pm

For sure. Yea, I think it’ll go down well. I’ll be reporting back here to let you all know how it went.

stevengj · September 22, 2018, 7:26pm

ODE integration is one example where even if you use mature libraries in Python or R, performance is a serious challenge because user code is in the inner loop. This is one area where Julia really shines, and is relevant for many biologists and chemists because they are often trying to fit rate constants and ODE solvers are the bottleneck.
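To make the "user code in the inner loop" point concrete, here is a minimal Python sketch (not from the original post). It uses a hand-written fixed-step RK4 integrator as a stand-in for a mature library solver. The names `rk4_integrate` and `decay_rhs` and the rate constant `K` are hypothetical, chosen only for illustration. The solver calls the user's right-hand-side function four times per step, so however fast the library itself is, an interpreted right-hand side is executed thousands of times.

```python
import math


def rk4_integrate(rhs, y0, t0, t1, n_steps):
    """Integrate dy/dt = rhs(t, y) from t0 to t1 with classic fixed-step RK4.

    Returns the final state and the number of times rhs was called.
    """
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    calls = 0
    for _ in range(n_steps):
        # Four evaluations of the *user-supplied* function per step.
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h * k1 / 2)
        k3 = rhs(t + h / 2, y + h * k2 / 2)
        k4 = rhs(t + h, y + h * k3)
        calls += 4
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y, calls


K = 0.5  # hypothetical first-order rate constant


def decay_rhs(t, y):
    """User code: first-order decay, dy/dt = -K * y."""
    return -K * y


y_final, rhs_calls = rk4_integrate(decay_rhs, 1.0, 0.0, 10.0, 1000)
exact = math.exp(-K * 10.0)  # analytic solution for comparison
print(y_final, exact, rhs_calls)
```

Fitting a rate constant repeats this whole integration once per candidate value, which multiplies the number of calls into user code again. That is the cost the post refers to.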

yakir12 · September 25, 2018, 12:52pm

All in all it went well. People with more programming experience “got it” more than people with less, which is understandable (the programmer kept mouthing “wow, wow, wow”, while a non-programmer asked if Julia will suggest the correct model to input into a GLMM).

Thanks for all the help and support!

mkborregaard · September 25, 2018, 1:38pm

Would you share the latest version of the slides?

yakir12 · September 25, 2018, 1:40pm

It’s still in that github repo:

I should though add references to the people I stole content from.

kevbonham · September 26, 2018, 4:29pm

I plan to ~~steal~~ use your slides for a workshop I’m doing next month. Like I said earlier, the Julia community is awesome!
