Stability of Julia

I’m not sure what part of biology you are in, but for me as an ecologist things are good but mixed. I started pre-1.0, and compared to then the core language and main packages are very stable. Proper versioning and package manifests mean you have less trouble with versions than in R: you can always recreate exactly what packages you used years ago.
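
For anyone unfamiliar with the manifest workflow mentioned above, a minimal sketch of restoring an old project might look like this. The helper name `restore_environment` and the example path are made up for illustration; `Pkg.activate`, `Pkg.instantiate` and `Pkg.status` are the standard Pkg calls. It assumes the project folder still contains the `Project.toml` and `Manifest.toml` that were saved at the time.

```julia
using Pkg

# Hypothetical helper: the name and the path argument are illustrative only.
function restore_environment(project_dir::AbstractString)
    # Switch to the environment described by Project.toml / Manifest.toml in project_dir
    Pkg.activate(project_dir)
    # Install the exact package versions recorded in Manifest.toml
    Pkg.instantiate()
    # Print the resolved package versions so they can be checked
    Pkg.status()
end

# Usage (not run here): restore_environment("path/to/old_analysis")
```

It usually also helps to run this with the same Julia version that the old project was written for.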

But peripheral, field-specific packages may be less stable than in R, because people are still writing them and filling out the design space. For an example of things ecologists care about, we just overhauled the geospatial data interface in the last few months (although I guess even that is happening currently in R as well). Spatial data plotting is just being written for Makie.jl, and people are slowly switching to it from Plots.jl.

Your code will generally be more reliable and reproducible in Julia if you are happy to use the version you started with for older projects. But using the hot new things will require some updates.

Also: in R there are macros literally everywhere (R calls it non-standard evaluation); you just can’t see them, so you don’t realise. Think about how defining a formula for lm even works at all. Yes, it’s effectively a macro. In Julia, the macro in lm is explicit with @formula. That means maybe a little more learning, but you will end up actually understanding the code, which is often not possible with R, as there is so much magic to make things “just work”.
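
To make the macro point concrete, here is a toy sketch of what “explicit” means. The macro `@capture_formula` is invented for this example and is not how StatsModels.jl implements `@formula`; it only shows that a macro receives the formula as unevaluated syntax, so `y`, `x` and `z` never need to exist as variables.

```julia
# Hypothetical macro for illustration only; NOT the real @formula implementation.
macro capture_formula(ex)
    # Hand back the expression itself instead of evaluating it
    return QuoteNode(ex)
end

# y, x and z are never looked up as variables; only the syntax is captured
f = @capture_formula y ~ x + z

println(f)          # the whole captured expression
println(f.args[2])  # left-hand side of the ~
println(f.args[3])  # right-hand side of the ~
```

In real code you would write something like `lm(@formula(y ~ x + z), df)` with GLM.jl, and the `@` tells you that the formula is being handled as syntax rather than evaluated.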

And: never email devs, that’s what the issue queue is for. A GitHub issue is public, so advice only has to be given once. Emails don’t help everyone else who has the same problem.

I think the OP conflates R with the Tidyverse in general, and RStudio in particular. And I get it: I was an R user for many years and I really liked how RStudio centralized a lot of data analysis tools under one roof, with the clear leadership of Hadley. As a user, that was good, no doubt about it, despite the complaints from other corners of the community that the Tidyverse was basically different from base R. So, yes, I understand why Julia might look different when that’s your experience (or Matlab, from what I hear). This is still a fairly young language with many packages still experimenting and communities slowly coalescing, in many cases without one clear leader à la Hadley. That’s obviously a bumpier road, but not uncommon in R in other parts of the ecosystem not touched by RStudio. I’ve seen plenty of abandoned packages in ecology and biology in general: some academic package that was developed for a paper and later abandoned. I think that CRAN helps clip those off eventually, but if you’re working outside the Tidyverse, things look more like Julia. So, plenty of dead ends in R also.

So, in general, I understand the OP’s point as a user, but that’s not an R thing in general; it’s an RStudio artifact, in that they successfully developed a nice product off an open source language. However, as a language, Julia has better tools for managing environments and ensuring reproducibility. I’ve seen posts here of people using Julia 0.3! In my work, dealing with a pre-Tidyverse data analysis pipeline was a nightmare and we had to keep a folder that couldn’t be touched at all with some old R version and tools like plyr and others. I feel that with Julia that’s an easier task.

Yes, Julia feels much more challenging, as a user, than R. And as a biologist myself, with very little computing background, it feels hard sometimes. But at the same time Julia, for me, lowered the bar between user and developer and I feel I can deal with the rough corners more easily than I did with R. And I don’t have to deal with C++ if I don’t want to. I stopped using R consistently about five years ago, and don’t really miss it. However, sometimes I do wish Bogumil, Milan, Peter, Quinn, and all the other developers of data analysis tools could join in a “JuliStudio”-type company and work full-time on creating a very coherent ecosystem.

(Unfortunately?) there is no organisation like this for Julia. The bar for registering a package is not too high, which has its advantages as well as disadvantages.

For devs, it’s an advantage: it saves time, and no documentation is fine. For users, it’s 100% a disadvantage.

Excellent discussion here. I completely understand they are different, but the tidyverse ecosystem is still a part of R and you can’t separate the two. Tidyverse comes into play from the first lines, and Quarto (the new RMarkdown) and RStudio as a product are far superior to any IDE I’ve used for Python or Julia. I absolutely hate Jupyter notebooks; compared to RStudio they are really simple and annoying to me. But all of this falls under the “R umbrella” to any basic user of R. So yes, the rest of your comments I understand, but I would still argue that R is the Tidyverse you have access to if you move to R. I remember the days before Tidyverse; it was so crappy to do what we all do now in a few lines of dplyr. It’s so powerful. I think it sounds like DataFrames and DataFramesMeta are getting there, but I’ll have to use them myself. Hadley just makes everything so easy and great for users, regardless of what programmers think of his methods. I think the tidyverse is the best thing I’ve ever seen across Python, Julia and R by a country mile (as a user, not a programmer).

Thanks for your response. Again, as I explained to others, I do think it’s important to be able to maintain old code and pin older versions of Julia and packages. That’s one part, but I was far more interested to see whether there were enough people maintaining key packages that they were not going to go dead in 1 year when we came back around to update a pipeline or something. We will still likely stay in R for most data analysis, but need Julia for some heavy tasks where memory and speed matter.

I want to know if Makie will still be around in 1 year with the other packages you and I would use.

Good point on macros, I guess in that sense, yes. They do seem hidden from us as users and easier to understand than the many methods I see in Julia.

I mean if they don’t want emails, why don’t they just document the package to begin with? I get the public advice, I agree completely, but you’d think it would be easier to spend 1 day documenting so you avoid all that, public or private, to begin with. It always astounds me. I have spent a lot of time documenting others’ packages (outside R or Julia).

I don’t disagree with that, and as a user, and someone without a computer science background, I really enjoyed RStudio for a few years. And I taught R and RStudio to people in my lab that knew SAS. When I first found Julia, I was scared to go back to the terminal again, but my “aha” moment came after wanting to understand how lm4 worked behind the scenes. In R, I quickly ran into C++, while with GLM.jl I could more or less follow the logic of the package.

That’s not important for most users, so I can see why the Tidyverse’s usability is such a huge success, but for users like me, Julia hits a sweet spot where I can be a very casual developer for packages that I use, fairly easily in my opinion, while also a user and possible collaborator with other parts of the ecosystem. That’s why I moved and have stayed. But as I said, I do appreciate RStudio’s work and Hadley’s vision.

Outside of the tidyverse, R documentation is really difficult to parse. Maybe it’s just me, but many times the PDF of the docs from CRAN is mostly just the API info, with little tutorial material outside of the examples in each function doc. Every open source project struggles to get beyond this.

I don’t see the logical connection with email. Why not open an issue requesting documentation?

To answer these questions you should really get familiar with GitHub repositories: looking at contributors, stars, and general activity of packages and orgs.

The Makie.jl repository alone has 1.7 thousand stars and 147 contributors. That’s a lot more than the vast majority of R packages, and half as many as ggplot, which has been around for three times as long.

So it’s probably here for a decade at least. Its development could slow, but that seems unlikely at this point: it’s self-generating, and we are all writing packages based on it. I have at least one.

As for docs: demanding that developers write documentation for you like that is also not so nice. Just writing packages is a difficult and occasionally thankless task. Of course, we eventually do need to write docs. But first, we need something worth documenting. And are you going to help with documentation at some stage? Will you contribute documentation fixes and improvements? Do you also make clear issues for any problems you experience?

This is a communal process, and no one actually owes you anything, but they will happily help 95% of the time if you learn how it works. Making GitHub issues is usually appreciated, but personally emailing is not, because it makes a public conversation private. Please have a little more respect for open source devs and the time people put in, usually for free, to help you.

I think maintainers tend to overestimate the value of their code to users relative to the value of documentation because they don’t need the documentation but they do need the code. When you know the API by heart, it’s hard to see how much documentation matters.

Also Julia doesn’t have a way to identify public-but-undocumented API, so the tooling for detecting documentation problems doesn’t exist.

I don’t think it is only a disadvantage. Registering the package means it won’t disappear some day: you are guaranteed to get access to every registered version of this package for the foreseeable future, which provides the stability you were asking for. For me as a user, having to use some semi-documented (not un-documented as you imagine) packages may be an annoyance (but asking questions helps), but being able to use these very packages at all is a bonus. Also, registered packages are, well, registered and thus easier to find.

I just want to point out that this is absolutely false. In fact, many consider the hijacking of R by the tidyverse to be actively harmful:

https://blog.ephorie.de/why-i-dont-use-the-tidyverse


The observation is basically:

[…]NSE (non-standard evaluation) is really annoying to work with in dplyr/tidyverse codebases, and this definitely inhibits people from building on top of them.

I’m guessing you’re mostly a user, writing few-line scripting / visualization projects in R.

Imagine you could only write usable packages on top of some package rather than the base language. That’s what it feels like, but I guess both R and Python devs are used to this constraint.

Another analogy may be MATLAB or Mathematica, where there’s basically no package manager or registry, yes, everything is more “curated”, even better than CRAN!

Oh boy. You seem to have a very flawed mental model of how open-source communities (or also public clubs/associations/societies for that matter) work. The reality is that typically only a handful of people do almost all of the work (you might want to google “bus factor” or read this). And (very often) they don’t get paid for it. Not by some external entity and, in particular, not by you. Does that mean that you can’t voice criticism towards them? No, you certainly can, especially constructive criticism. But saying things like the above sounds very ungrateful.

Also, IMO, the way to happiness is low expectations, which is why I try to have the following mental model: No one cares about your needs. If you want to see a package, a feature, or better documentation, then create it, implement it, or write it yourself. No one else will. And if, in reality, someone already created/implemented (half of) my desired package/feature, I’m grateful and happy that I have to do less work.

lol no I don’t. I have spent a lot of time documenting other people’s software. It’s not difficult. Just lazy developers.

You download it from CRAN, I don’t know what you are talking about lol

But why register it if you don’t want others to use it? And how can they use it without documentation? Just keep it on your local machine to yourself then. I keep all my functions away from others, but I still document them for myself because in 6 months I won’t remember anything. Seems odd to me is all.

Good answer. Thanks. I agree. I think sometimes people put stuff up online and don’t realize others will find it. I’m always surprised by people who stop me and tell me they use my material. I think this is important for devs to remember. I don’t think it applies in the case I was talking about, as it was CSV.read() and something everyone should be using. I couldn’t find the documentation for the equivalent of colClasses or col_types in the R functions that read data… Very odd that it didn’t exist right away… I did get an answer over email, but I spent 2 days looking for it online and couldn’t find it.
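
For later readers looking for the same thing: as far as I know, the closest equivalent in CSV.jl is the `types` keyword, which can take a mapping from column name to type. A small sketch, assuming CSV.jl and DataFrames.jl are installed; the column names and the inline data are made up for the example.

```julia
using CSV, DataFrames

# Made-up inline data standing in for a real file
data = IOBuffer("""
id,score,label
1,2.5,a
2,3.0,b
""")

# `types` maps column names to the element type each column should be parsed as;
# columns not listed (here: score) are still detected automatically
df = CSV.read(data, DataFrame; types = Dict(:id => Float64, :label => String))

println(eltype(df.id))
println(eltype(df.label))
```

Passing a file path instead of the `IOBuffer` works the same way.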

I don’t think you can use that as evidence. Julia, I find, is stacked with devs vs few basic users, while R is full of dummies like us in biology who are basic users… We don’t know programming. I know more than most and am still a terrible programmer (I don’t even use C++ or Python, the latter due to horrible package management). I use GitHub and don’t star ggplot2…

Yeah, I’ll never agree. I’m not demanding anything… It’s just, why would they spend a thousand hours on a piece of software, put it in a public place, then not let anyone learn it, with 0 documentation? I get this reaction from everyone in Julia who is too lazy to write documentation. I’ve spent many hours of my career documenting other people’s packages, which would have cost them 1/100th the amount of time. It still didn’t take me that long once I learned it, but with so little documentation it’s frustrating for most users. It’s just being lazy. I do it too, we all do; I just was surprised years ago when very core packages to Julia were not documented. I don’t care, if Julia doesn’t want users, that’s fine.

Since this seems to be a real issue that’s plaguing new users, it would be good to have a list of packages that lack documentation and see if the situation may be improved.