Designated Target Audience of Julia 1.0?

I’m not very well-versed in R, but I find it quite horrible for statistics. For example, in Julia, if you want to compute the pdf of a Normal distribution with parameters (μ,σ) at value x, you do:

pdf(Normal(μ,σ),x)

In R you do:

dnorm(x,μ,σ)

If you want a truncated Normal between zero and one, you do:

pdf(Truncated(Normal(μ,σ),0,1),x)

In R you do:

google for a package
...
dtrunc(x, spec="norm", a=0, b=1, mean=μ, sd=σ)

If you want a mixture of two Gaussians:

MixtureModel([Normal(μ1,σ1), Normal(μ2,σ2)],[1/2,1/2])

In R you do:

google for a package
...

If you want a BetaBinomial:

pdf(BetaBinomial(n,α,β),x)

In R you do:

google for a package
...

In Julia you have nice atomic concepts that are composable, while in R you just have a bunch of functions with unreadable names and packages with no common semantics.

I would be curious to see how this translates in R:

[f(D) for f in [mean,std,entropy], D in [Normal(0,1), BetaBinomial(10,0.1,0.1), Truncated(Normal(0,1),0,1)]]

Ironically the biggest issue with Distributions.jl is that it uses Rmath, but hopefully that will get fixed in time.
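For anyone who wants to run the Julia snippets above, here is a minimal sketch, assuming a recent Distributions.jl is installed. The parameter values are made up for illustration, and the lowercase `truncated` function is used because newer releases prefer it over the `Truncated` constructor. Which summary functions are defined varies by distribution type, so this sketch sticks to `mean` and `std` rather than assuming `entropy` works everywhere.

```julia
using Distributions  # third-party package: ] add Distributions

μ, σ, x = 0.0, 1.0, 0.5  # illustrative values, not from the thread

# Density of Normal(μ, σ) at x, checked against the closed-form formula.
d = Normal(μ, σ)
p_normal = pdf(d, x)
p_manual = exp(-((x - μ) / σ)^2 / 2) / (σ * sqrt(2π))

# The same Normal truncated to [0, 1].
t = truncated(d, 0.0, 1.0)
p_trunc = pdf(t, x)
# Inside the interval, truncation rescales the density by the retained mass.
p_trunc_manual = p_normal / (cdf(d, 1.0) - cdf(d, 0.0))

# An equal-weight mixture of two Gaussians (the second component is made up).
m = MixtureModel([Normal(μ, σ), Normal(2.0, 0.5)], [0.5, 0.5])
p_mix = pdf(m, x)
p_mix_manual = 0.5 * pdf(Normal(μ, σ), x) + 0.5 * pdf(Normal(2.0, 0.5), x)

# Summary statistics: rows are functions, columns are distributions.
dists = [Normal(0, 1), BetaBinomial(10, 0.1, 0.1), t]
table = [f(D) for f in [mean, std], D in dists]
```

The two-iterator comprehension returns a 2×3 matrix, so `table[1, 2]` is the mean of the BetaBinomial.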

19 Likes

I care somewhat (not about overtaking R, but about the “similar goals” :sunglasses:) , because I want there to be a vibrant enough job market, so that as a consultant, I can pick and choose interesting work, for as long as I care to continue working.

3 Likes

On a more basic level, it’s hard for me to write functions in R because it doesn’t have static typing. When I write code, I want to write f(x::Vector, y::Real) etc. In the same vein, not being able to use generators to make arrays in R is a source of frustration.
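A minimal sketch of the typed signatures and generators mentioned above, using only base Julia. The function names are hypothetical. Note that the annotations are checked when a method is selected at call time, so a mismatch shows up as a MethodError rather than a compile-time error.

```julia
# A method restricted by argument types: a vector of reals and a real scalar.
# `rescale` is a hypothetical name used only for illustration.
rescale(x::Vector{<:Real}, y::Real) = x .* y

# The same idea with an explicit type parameter: element type and scalar must match.
function rescale_same(x::Vector{T}, y::T) where {T<:Real}
    return x .* y
end

# Array comprehension: builds a Vector directly.
squares = [i^2 for i in 1:5]

# Generator expression: feeds sum without building an intermediate array.
total = sum(i^2 for i in 1:5)

# Calling with a non-matching type raises a MethodError instead of running silently.
rejected = try
    rescale(["a", "b"], 2)
    false
catch err
    err isa MethodError
end
```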

As the average master’s student starts writing functions as opposed to just scripts, the faults of R will become more visible. But imo Julia is already great for scripting, and I would recommend it for intro to data cleaning and regression.

I will be working on the JuliaEconometrics organization, which should bring Julia close to the state of the art for econometrics and let it play well with machine learning too.

4 Likes

R requires so much boilerplate for setup and structure. In theory one has S3/S4 classes, but they are considered “advanced” concepts and thus not used widely. Consider, for example, the variance function var:

var <- function (x, y = NULL, na.rm = FALSE, use) 
{
    if (missing(use)) 
        use <- if (na.rm) 
            "na.or.complete"
        else "everything"
    na.method <- pmatch(use, c("all.obs", "complete.obs", "pairwise.complete.obs", 
        "everything", "na.or.complete"))
    if (is.na(na.method)) 
        stop("invalid 'use' argument")
    if (is.data.frame(x)) 
        x <- as.matrix(x)
    else stopifnot(is.atomic(x))
    if (is.data.frame(y)) 
        y <- as.matrix(y)
    else stopifnot(is.atomic(y))
    .Call(C_cov, x, y, na.method, FALSE)
}

The very last line does the calculation, coded in C (I think). This is completely opaque from just looking at the R source. True connoisseurs are invited to examine lm (for linear regression).

To be fair, modern R can be much saner (but still of course not like Julia). Also, the language design goes back decades, and has a lot of legacy elements which more or less make it impossible to change or optimize without breaking a lot (and I really mean a lot) of code.

In contrast, modern languages like Julia and Rust went in the direction of making abstraction zero (or low) cost, and thus encouraging pervasively modular design. This really pays off in the long run. Usually people emphasize the speed of Julia because that is easier to quantify objectively, but the most important advantage is allowing nicely organized code without trade-offs.
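To make the contrast with the R listing concrete, here is a rough sketch of a variance function where the calculation is visible in the source and input handling is done by dispatch rather than by branching on the argument type. `sample_var` is a hypothetical name; this is not how `Statistics.var` is implemented, and the matrix method returns column-wise variances rather than the covariance matrix that R's var gives for a matrix.

```julia
# Unbiased sample variance for any vector of real numbers.
# `sample_var` is a hypothetical name used only for illustration.
function sample_var(x::AbstractVector{<:Real})
    n = length(x)
    n > 1 || throw(ArgumentError("need at least two observations"))
    m = sum(x) / n
    return sum(xi -> abs2(xi - m), x) / (n - 1)
end

# A second method handles matrix input by dispatching on the argument type.
# Each column is passed as a view, so no copies are made.
sample_var(X::AbstractMatrix{<:Real}) = [sample_var(col) for col in eachcol(X)]
```

Both methods work for any real element type (integers, floats, rationals) without extra code.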

5 Likes

I don’t want to diss R too much though, because the more success R has, and the more success the “intro to data science” movement by RStudio et al. has, the more of a userbase Julia can draw from. They are doing some incredible things and we get to piggy-back off of that.

1 Like

I’m guessing an academic paywall is no drama here:

https://doi.org/10.1016/j.jss.2017.06.095

This gives some perspective on the sheer scale of the R data science ecosystem. It’s not something Julia can directly compete with; it’s a social phenomenon an order of magnitude larger.

Julia can compete on raw technical merits for solving processor-intensive problems in abstracted, modular ways. That was always the promise of Julia to me, and it does live up to it.

I’m glad people find Julia useful for stats as well, but personally stats always means stats+GIS and Julia is years away from competing with R, and I’m not sure why it needs to.

I am a physicist and I am totally excited about Julia :wink:

10 Likes

Hasn’t @ChrisRackauckas made a package for R users (and another for Python users) to be able to easily use his DiffEq magic?
There are also the RCall and PyCall packages to go the other way.
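For the Julia-to-R direction, a minimal sketch with RCall, assuming the package and a working R installation are available. The variable names are made up for illustration.

```julia
using RCall  # third-party package; needs a working R installation

# Evaluate R code from Julia and convert the result to a Julia value.
r_density = rcopy(R"dnorm(0.5, mean = 0, sd = 1)")

# Send a Julia variable to R, compute there, and bring the result back.
xs = [0.1, 0.5, 0.9]
@rput xs
R"ds <- dnorm(xs, mean = 0, sd = 1)"
@rget ds
```

After `@rget`, `ds` is an ordinary Julia vector, so the rest of the script can stay in Julia.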

Julia doesn’t need to supplant R or Python to “win”, just replace them as the first “go-to” tool in a programmer or scientist’s toolbox, letting them still use whatever things are good from those ecosystems, just as people don’t have a problem using libraries written in Fortran, but they wouldn’t think of writing Fortran code themselves.

2 Likes

I think julia has great long-term potential for 3.0 to 5.0. (I am here for the static strong typing, programmability, and dual-language problem, too.)

alas, 1.0 needs good uses not just for us few dozens on discourse, but for a wider audience. yes, julia is open source, but if julia computing [jc] were to go away now, I think julia would die. it needs to get some traction and sooner rather than later.

for the short term, if I were an advisor to jc, I would advise tuning/focusing a few specific use targets and libraries, whatever it may be. this is not to overtake python or R (hopeless), but to be a viable alternative in some places.

here at UCLA, julia 1.0 is not viable for our finance MFE students. yes, it has a standard library—without data sets, graphics, and good data import/export. ergo, I cannot push julia onto them as a primary instruction language. they already learn two languages. pushing a third language into a 1-year program won’t happen. it doesn’t solve the 2-language problem, it would create a 3-language problem. trust me—I wish I could.

firms and universities are “sniffing” alternatives all the time. betting their future on an alternative is a much harder hurdle to overcome.

so, jc, please pick a few more good target uses for 1.0 and curate some love into it, whatever your targets may be.

/iaw

1 Like

Well, you are too pessimistic. While I agree with you that I wouldn’t suggest Julia as a first language for students yet, Julia is already great for my purposes (dynamic simulation and control of complex dynamic systems = wind drones). The package manager is great. Much better than those for Python that I used before. The performance is at least one order of magnitude better than Python+Numba. The code is easy to read and to maintain. So Julia will have a great future!

Uwe

7 Likes

Great point. It may be efficient to get Julia used in class first, and when the students get employed they will use Julia in industry. That is how R succeeded. Another thing is that R is easy to set up in class, for example, package installation and plotting are fast.

I emailed Julia Computing last year about setting up an API to Bloomberg and Wharton Research Data Services, but nothing happened.

I use Julia for everything except when I need rasters+stats for ecology, so rgdal, dismo with other stats/gis tools that just don’t exist in Julia in the same easy to use form, and seem to be years away (maybe I didn’t say that clearly).

The idea of wrapping Julia packages in R/Python is great, and I’m thinking of eventually doing that for some spatial modelling packages I’m working on.

1 Like

I just don’t understand this whole thread.

Julia already has best in class libraries for numerical work. It has a viable niche for a significant chunk of scientific work, and can build from there. I’m not sure what else you want.

8 Likes

Yes but this is very different from the tools that R’s core userbase uses.

OK, my point was more that making it easy for people writing in R or Python to access anything special in Julia, and vice versa, will make it more of a moot point which side you write the script in - you’ll write it in whatever you feel most comfortable with, Julia, Python or R, and pick and choose which packages you use from all 3 ecosystems.

There is a GDAL.jl package, and GMT.jl also lets you interface with the GDAL raster library (not the OGR one, yet).

We have recently started modeling with Julia in chemical engineering applications at a commercial R&D facility. Julia rapidly became our go-to tool for technical programming and works extremely well in combination with our in-house data processing platform, written in R. We find Julia and R to be very complementary and not in competition. For competition I would look at Matlab. In that fight Julia is my firm favourite.

9 Likes

Leaving out the stats, what are the GIS operations that you miss in Julia?

Try to import a tiff raster in Julia.

You just can’t compare rgdal and GDAL.jl/ArchGDAL at this point. ArchGDAL needs some user-facing wrappers as it’s pretty low level.

Importing and manipulating geospatial rasters and using them in stats models is incredibly easy in R. Average 3rd-year biology students in our department have no dramas whatsoever. Julia, not so much.