Preaching Julia to biologists


#21

Nice, thanks for the feedback. Yeah, you’re right about how people intend to use the functionalities of a language. And to be honest, I’m not out to convert them from R/MATLAB/Python to Julia so much as I want them to understand why we prefer it.

I will, however, mention that Douglas Bates switched. For one, I didn’t know that. Secondly, one of the attendees is heavily into stats (but not necessarily programming) and he’ll appreciate that fact (as do I). Thanks!


#22

I honestly think that talking about programming language details will only be helpful when talking to people looking to do some heavy package development. To everyone else, show them packages and applications, and discuss how Julia enabled it. Otherwise you (I) end up giving one of those talks where you stumble on a definition of metaprogramming while trying to explain why code generating code is helpful, to someone who hasn’t programmed long enough to care.


#23

The main reason I’m not showcasing some of the top packages (listed in posts such as What package[s] are state-of-the art OR attract you to Julia, and make you stay there (not easily replicateable in e.g. Python, R, MATLAB)?) is that most/all of those packages are relevant to people in: machine learning, systems biology, numerical analysis, etc. The people I’m dealing with are not working in any of those fields. It doesn’t mean they wouldn’t be able to appreciate what Julia did in those cases, but it would take as much work to explain Julia’s contribution as it would to explain the (I think, pretty simple) concepts already in my presentation. But please, let me know if I’ve missed some obvious packages!!!

And again, thank you all for the awesome feedback!!!


#24

Honestly, the usability gap between Python and R keeps getting narrower. I was in a lecture yesterday where someone was arguing that R wasn’t as good with file systems and regular expressions as Python is. But this is all built into R, and the advent of the tidyverse has made it as easy to do as in Python, with some minor syntactic sugar (this is of course subjective). Discussions around syntactic ease would likely yield little progress with any audience familiar with coding. The best syntax is the one they’re used to.

I think the key is to focus on the implications of Julia’s unique performance advantages (it’s what got me) and how they would affect their workflow.

I’m in neuroimaging research and it’s just not feasible to expect the comparatively small community that is specifically interested in computational neuroimaging to be able to recode everything in a reasonable amount of time. We can’t stop all of our research and redo our entire workflow while Julia works out the kinks in porting every C++ library. However, because Julia integrates with other languages so well, I can call Julia home without losing all of my furniture in the move.


#25

I agree. Trying to stay clear of generalizations like that.

I guess you’re referring to: speed and flexibility? Please be more specific if I’m wrong.

Here you mean the zero overhead with regard to PyCall.jl, JavaCall.jl, RCall.jl, Mathematica.jl, MATLAB.jl, and the built-in ccall, right? I mention this here.


#26

What can performance do for you other than make you go “oh wow, that’s fast!”? That’s important to highlight. How long does a differential analysis take to do on a single-cell RNA-seq dataset with 200 cells? 2000 cells? How are microfluidics devices improving to increase the number of possible cells, and how is that making computation the bottleneck? In this sense, a 10x performance advantage might mean being able to handle the dataset from new Illumina machine X paired with cell sorter Y and get the analysis done in 2 weeks, while with current speeds it can take 3 months! That’s how you highlight where speed matters in a domain-specific way.

And it’s always a give-and-take like @antoine-levitt said. You just agree that yes, for basic statistics and such you will not notice a competitive advantage over other researchers with your newfound speed. But it does come into play when you’re dealing with big data that needs speed + parallelism, and the problems where this applies are X, Y, Z.

Pick the packages that excite you. Bio.jl has really cool sequence structures that seem to be pretty fast. Find a nice benchmark against other tools. Find (or make) some domain-specific packages that will excite the people in your audience/department. Then point outside of the domain for pizzazz.


#27

Really nice presentation. Can I borrow a few of your concepts if I do something similar, but for a different audience?


#28

Please do!


#29

Yes, speed is huge. I’ve spent a large amount of time porting libraries/functions from other languages into R/Python. However, the combination of speed and flexibility (interfacing with other programs) is what makes a difference. If all I cared about was speed I’d just use C, but I need to easily connect all the different code others have written.

I did notice this. I just wanted to build on why that was such a big deal to someone that may already feel comfortable in their current workflow. It looks like a great presentation.

In my field there’s enough memory usage that speed quickly translates into money. My current project is going to end up using about 50 TB of storage. You can imagine how much memory it will use to preprocess and analyze that.


#30

I really liked this, I added it to the notes. Awesome, thanks again.


#31

Oh, don’t make me imagine. How long does it take with conventional methods? For your types of codes, what kind of speedup do you usually see? What’s the actual speedup that happens in this case? If you run through that and show “look, 30x here means a day instead of a month”, I feel like it has a much larger impact.
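To make that kind of statement concrete, here is a back-of-the-envelope sketch in Python. The function name and the 30-day baseline are made up for illustration; they are not measurements from anyone's project.

```python
def runtime_after_speedup(baseline_days: float, speedup: float) -> float:
    """Return the runtime in days after applying a uniform speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_days / speedup


# Hypothetical baseline: an analysis that currently takes about a month.
baseline_days = 30.0

for speedup in (2, 10, 30):
    days = runtime_after_speedup(baseline_days, speedup)
    print(f"{speedup:>2}x speedup: {days:5.1f} days ({days * 24:6.1f} hours)")
```

Plugging in a measured baseline and a benchmarked speedup for the audience's own pipeline is what turns "it's fast" into "a day instead of a month".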


#32

My impression then is that your audience is mostly users who are not really interested in coding other than scripts that use existing, mature libraries, and at the same time Julia may not (yet) have all the tools they are used to.

In this context, I would hesitate to recommend Julia to them at this point. No matter how much we like a language, we should keep in mind that it may not be the best option for someone else at this point. Not everyone is an early adopter.

I think you are having a difficult time preparing this talk (as demonstrated by this discussion) because of this. Perhaps you can switch strategy, and instead of telling them why they should like Julia, explain what you like about it. Then they can decide for themselves, and it is perfectly fine if they just wait it out, at least they learned about the existence of another language they may explore in the future.


#33

I really agree with this. I’m gonna add a slide or two mentioning the libraries I like (as Chris suggested) and mention that this is for early adopters and might not be for everyone. Apart from that I feel like the presentation does show why I love Julia (I could be clearer on that though).
Thanks!!!


#34

Another way to think about this though is that, if it’s already relevant and being used by people in such disparate fields, it’s not far-fetched to believe it will be useful in theirs. And if they can use a single language that some of the best new packages are being written in AND it’s good for all the stuff they already do AND the community is great AND it’s fast, that all seems like a pretty decent argument :)


#35

For sure. Yea, I think it’ll go down well. I’ll be reporting back here to let you all know how it went.


#36

ODE integration is one example where, even if you use mature libraries in Python or R, performance is a serious challenge because user code is in the inner loop. This is one area where Julia really shines, and it is relevant for many biologists and chemists because they are often trying to fit rate constants and the ODE solver is the bottleneck.
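A minimal sketch of what "user code in the inner loop" means, in plain Python. It assumes a hand-rolled fixed-step RK4 integrator and a toy first-order decay dA/dt = -k*A; all names here are hypothetical, and real libraries use adaptive solvers and proper optimizers rather than a grid search. The point is only to count how often the solver calls back into the user's function while fitting a rate constant.

```python
import math


def rk4_solve(rhs, y0, t0, t1, n_steps):
    """Classic fixed-step RK4; calls the user-supplied rhs 4 times per step."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h * k1 / 2)
        k3 = rhs(t + h / 2, y + h * k2 / 2)
        k4 = rhs(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y


counter = {"rhs_calls": 0}  # tracks how often the user's function is invoked


def make_decay(k):
    """Build the user-defined right-hand side for dA/dt = -k * A."""
    def rhs(t, a):
        counter["rhs_calls"] += 1
        return -k * a
    return rhs


# Hypothetical "observed" concentration at t = 1, generated with k = 0.7.
observed = math.exp(-0.7)

# Crude fit: try 200 candidate rate constants and keep the closest match.
candidates = [i / 100 for i in range(1, 201)]
best_k = min(
    candidates,
    key=lambda k: abs(rk4_solve(make_decay(k), 1.0, 0.0, 1.0, 100) - observed),
)

print("best k:", best_k)
print("rhs calls:", counter["rhs_calls"])  # 200 candidates * 100 steps * 4 stages
```

Even this toy fit evaluates the user's function 80,000 times. However fast the library itself is, each of those calls runs at the speed of the language the user's function is written in.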


#37

All in all it went well. People with more programming experience “got it” more than people with less, which is understandable (the programmer kept mouthing “wow, wow, wow”, while a non-programmer asked if Julia will suggest the correct model to input into a GLMM).

Thanks for all the help and support!


#38

Would you share the latest version of the slides?


#39

It’s still in that GitHub repo:

I should, though, add references to the people I stole content from.


#40

I plan to ~~steal~~ use your slides for a workshop I’m doing next month :). Like I said earlier, the Julia community is awesome!