Quality of Julia code and speed - is this stressed enough?

oheil · April 23, 2019, 1:38pm

This discussion seems to degenerate into comparing apples and oranges.

Every now and then somebody starts a thread like this expressing some kind of disappointment about performance. I believe the reason is that people hear about the “new” Julia and its outstanding performance (equal to or better than C), then download it and try it with performance as the only criterion. What this approach misses is that Julia is not only fast but, most importantly, easy and fast to code in. It doesn’t make sense to compare a few single algorithms in Julia and C (or any other language) and draw conclusions about the whole language from that. Real-world problems are mostly more complex than a single well-defined algorithm.

With real problems the real Julia advantages come into play: a fully functional prototype is easy and fast to develop. This prototype is already fast, though not as fast as possible. Some real-world problems are solved at this point, because they don’t need to be as fast as possible. Even better, optimizing the prototype is quite easy and straightforward with the profiling tools. With a little extra work the prototype becomes about as fast as a normal programmer can make it (without special knowledge of whatever would be needed to go further).

In many cases these discussions about performance in comparison to other languages do not compare Julia to, e.g., R/Python; they compare native Julia with C algorithms called from R/Python. Going further, the comparison is not Julia vs. C but Julia-generated (LLVM) assembly code vs. C-compiler-generated assembly code, the latter typically highly optimized over generations of compilers and processors. Julia can’t win this comparison, and neither could any other language.

The first benefit stated at https://julialang.org/ is “Julia is fast!”, wow, great, click on the link “high performance” and look at the benchmarks, fantastic, julia is as fast, sometimes faster than C (like Lua, Rust, Go and Fortran by the way) … the focus is already lost and completely biased to performance. Benefits like “Dynamic, but optionally typed” or “Easy to use” are not well acknowledged by the first time user.

For me performance is a second-tier benefit. If performance came first, I would use C or C++. What I need is fast prototyping and code that is easy to read and comprehend, easy to modify, and easy to adapt to changing data; after that, it should not be slow. This is Julia.

My answer to the OP: performance is stressed too much. And it is not the quality of the code in general that makes it fast, only a special quality with respect to performance. E.g. readable fast code is more important than the fastest but unreadable code (except in some special cases). But I admit, nobody wants slow code; faster is better. Julia combines all these benefits and more into one language.

RoyiAvital · April 23, 2019, 2:21pm

DNF:

lobingera:

Ehm, just because you don’t understand it, it’s not an anti pattern …

The anti-pattern is that in Matlab this is the idiomatic and fast way to program for large arrays, x and y:
sum(sqrt(x.^2 + y.^2) > 1)
This creates 5 temporary arrays until you finally sum over the last one. It’s pretty crazy that they are able to make this fast, because it’s clearly suboptimal. The right way would be to take an element from each of x and y, operate on them, and then accumulate, creating zero temporary arrays.

The reason this is idiomatic Matlab code is that they batch operations to make them fast.

Given Matlab’s improved JIT I wonder if maybe a straight loop would also be pretty fast these days, but the vectorized approach is what everyone knows and uses for now, and it’s what some of them translate into Julia code, and expect to be fast.

Actually, MATLAB did better than that.
Behind the scenes it will generate a loop from this without any temporaries, but with vectorization and multithreading.
I am not sure about the above case, but they are working on it and it already works in many cases in R2018b and R2019a.
They have something like the devectorization Julia had in its early days (just not user controlled but JIT triggered).

stevengj · April 23, 2019, 2:48pm

Vectorized code is often convenient, and we agree that it should be made fast (and it is fast)! That’s why Julia has a whole syntax devoted to efficiently vectorizing arbitrary function compositions on arbitrary types, and can even vectorize in-place. But you need to use dot-call syntax in Julia if you want to exploit this — when you port code to a new language, is it so surprising that you need to learn slightly different idioms if you want to take full advantage?

The advantage of the dot-call syntax over relying on compiler optimizations like you would for sqrt(x.^2 + y.^2) > 1 in Matlab is that the latter can only work for a few data types and functions that are “built-in” to the compiler, as I explained in my blog post, whereas Julia’s approach is composable with user code and external libraries.
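As a rough illustration of the dot-call point above, here is a minimal sketch. The function `myfun` and the array sizes are hypothetical, chosen only for the example; they are not taken from the thread.

```julia
# Hypothetical user-defined scalar function (an assumption for illustration).
myfun(x) = sin(x) + 0.3

x = rand(1000)
y = rand(1000)

# Fused broadcast: all dotted calls compile into one loop that writes a
# single output array, with no intermediate arrays for myfun.(x), .^2 or .+
r = sqrt.(myfun.(x).^2 .+ y.^2)

# In-place variant: the same fused loop, writing into a preallocated array.
out = similar(x)
out .= sqrt.(myfun.(x).^2 .+ y.^2)

# Counting how many values exceed 1 without materializing any array at all.
n = count(i -> sqrt(myfun(x[i])^2 + y[i]^2) > 1, eachindex(x))
```

The same pattern should apply to any scalar function, built-in or user-defined, since the fusion is a syntactic guarantee rather than a compiler optimization for specific functions.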

Compared to Matlab’s vectorization model, the biggest current limitation is that Julia dot calls are currently single-threaded. That should change before too long, as Julia is rapidly adding more extensive threading capabilities and there is nothing fundamental that prevents this from being used for dot calls as well (see JuliaLang/julia issue #19777, “multi-threaded (@threads) dotcall/broadcast?”). And there will always be mature, highly optimized libraries developed for other languages that don’t exist or aren’t as well optimized yet in Julia.

But what won’t change is that you will have to learn something new to take full advantage of Julia, even if (perhaps especially if) you’ve spent years learning the performance model in Matlab. (C programmers generally have less difficulty writing fast code in Julia — C-like code is fast — but they have to learn to let go of specifying every type and write type-generic code with fast higher-order abstractions.) This is related to the Why can’t you just compile Matlab to Julia? FAQ — Julia’s advantages exist because its performance model is different (its syntax and semantics expose different information to the compiler) than other languages.

DNF · April 23, 2019, 3:16pm

My example code snippet wasn’t the best, it may not be that hard to optimize, with just a few built-in functions working on built-in types. It probably gets harder when you throw more complicated functions, user functions and even user classes into the mix.

mbauman · April 23, 2019, 3:16pm

And this brings us full circle back to the Julia is Fast notebook — one of the points I like to stress when I’m teaching with it is that Julia’s for loops are themselves fast. That doesn’t mean that every single possible for loop you write will be fast, but it does mean that it’s always possible to write your algorithm in a performant manner in Julia. In some senses, we’ve democratized the compiler tricks that other languages (sometimes) do for you with dot-broadcasting and the other points in the Performance Tips chapter. As with any new language, you do have to do a little bit of learning, but just knowing a handful of those is sufficient for pretty good performance. Julia’s performance is continuing to improve in the non-optimal cases, too — type stability isn’t nearly as important as it used to be if Julia can enumerate the possibilities.
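To make the "for loops are themselves fast" point concrete, here is a minimal sketch of the thread's running example written as a plain loop. The function name `count_outside` is made up for this illustration.

```julia
# Count the pairs (x[i], y[i]) whose Euclidean norm exceeds 1.
# A plain loop: one pass over the data, no temporary arrays.
function count_outside(x, y)
    n = 0
    for i in eachindex(x, y)   # also checks that x and y have matching indices
        # The comparison yields a Bool, which adds to an Int as 0 or 1.
        n += sqrt(x[i]^2 + y[i]^2) > 1
    end
    return n
end

x = rand(10_000)
y = rand(10_000)
n = count_outside(x, y)
```

The accumulator `n` stays an `Int` throughout, so the loop is type-stable, which is one of the handful of points from the Performance Tips chapter that matter most here.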

RoyiAvital · April 23, 2019, 3:25pm

@DNF, I know you were talking in general.
I also answered in general that MathWorks is making an effort to optimize those element-wise operations into fused operations, something similar to the . notation in Julia (MATLAB has it as well for some cases).

I’m happy that Julia gives the user full control of the process, while in MATLAB it is up to the JIT engine to decide.
So yes, there will probably be edge cases that MATLAB won’t optimize as well as a user could by hand.

affans · April 23, 2019, 3:27pm

Agreed. I think a lot of users are getting introduced to Julia as a language that’s as fast as C. What they don’t realize is that for it to be as fast as C, certain computer science principles are required. I was in the same boat. Someone told me about Julia at a conference and said I could essentially write “Matlab code” and have it run as fast as C. I slowly learnt that’s not really the case. One of my lab members does not know even the basics of computer science (i.e. types, methods, functions). If this person were to code an agent-based model, they’d be better off in Matlab/R, since Julia wouldn’t be a better option for them.

sdanisch · April 23, 2019, 3:40pm

I know that dot broadcasting is a very powerful answer to vectorized Matlab code, making that particular pattern easy to fix when porting from Matlab to Julia…

But normal Matlab code usually contains many more patterns, that don’t use mutation and instead just, at least on the syntax level, allocate huge matrices over and over again, which directly ported to Julia really stresses our allocator and makes it hard to match Matlab performance.
That code currently can’t be ported efficiently to Julia, without completely rewriting it - even though Julia does have powerful answers for most patterns, this needs cumbersome rethinking, which triggers a topic like this discourse thread.
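As a hedged sketch of the kind of rewrite meant here, compare a directly ported allocating loop with a version that reuses buffers. Both function names and the normalization step are invented for this illustration, and Float64 inputs are assumed.

```julia
using LinearAlgebra  # provides mul!

# Allocating style, as in a direct port: every iteration creates new arrays.
function step_alloc(A, x, nsteps)
    for _ in 1:nsteps
        x = A * x               # allocates a new vector
        x = x ./ sum(abs, x)    # allocates another one
    end
    return x
end

# Rewritten with mutation: two buffers are allocated once and then reused.
function step_inplace(A, x, nsteps)
    x = copy(x)                 # leave the caller's vector untouched
    tmp = similar(x)
    for _ in 1:nsteps
        mul!(tmp, A, x)               # tmp = A * x, written in place
        x .= tmp ./ sum(abs, tmp)     # fused broadcast into the existing x
    end
    return x
end

A = [2.0 1.0; 1.0 3.0]
x0 = [1.0, 1.0]
a = step_alloc(A, x0, 5)
b = step_inplace(A, x0, 5)
```

Both versions compute the same result; the second one simply makes the buffer reuse explicit, which is the rethinking the post describes as cumbersome for code that was written without mutation in mind.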

Funnily enough, this problem is unique to people moving from Matlab to Julia.
People coming from other languages usually either aren’t used to this kind of code being fast (R/Python), or already write code that easily yields good performance (C/C++, etc.).

Edit:
Before someone takes me up on this: of course it should be “this problem is particularly bad for Matlab”. It always takes effort to change languages and learn new idioms.

pkofod · April 23, 2019, 3:58pm

lobingera:

x and y were rand(40000000,1) and f11
function a = f11(x,y)
a = sum(sqrt(x.^2 + y.^2) > 1);
end
But I’m confused now: you claimed that, from your experience, this is memory-intensive, needs a special optimizer, and that the memory impact can only be read from the profiler.

What happens if you have a user-defined function myfun(x) in there before the .^2? In Julia you can still write myfun.(x); is the Matlab JIT clever enough to do that as well?

Jean_Michel · April 23, 2019, 5:03pm

I would love for the bug with closures to be fixed, so that the functional programming model would become possible in Julia without cost.

lobingera · April 23, 2019, 5:09pm

The computer that runs Matlab is not here, but I did an example with

function a = f12(x,y)
myfun = @(x) sin(x) + 0.3;
a = sum(sqrt(myfun(x).^2 + y.^2) > 1);
end

and afaics this creates one intermediate of size(x).

KZiemian · April 23, 2019, 7:11pm

I agree with both statements. My concern is that the manual is not the best thing for a beginner programmer to learn the language from. Also, my perception is that Julia wants to be the “first or second language to learn”, as Python is today; to achieve that, some outreach tutorials are needed.

I think that on the level of language design Julia has the potential to be attractive to beginners.

Tamas_Papp · April 24, 2019, 5:30am

I am unsure about first. Second, maybe. Where did you get this idea?

Julia as a first programming language is possible with some guidance from an expert, but I think it is unreasonable to expect to write optimal and elegant code from the beginning. This, again, applies to most, if not all, computer languages.

I think that a subset of Julia can be great for this purpose, but as a whole it is overwhelming. And, again, a course built around this idea would carefully avoid advanced topics.

I am not sure about the utility of bringing up the idea of Julia as a first programming language in a topic that you started with C-like speeds, @simd, and high performance computing. Do you seriously think that people new to programming should start with HPC?

longemen3000 · April 30, 2019, 4:28am

As a personal experience, I got my cousin (a biochemist) interested in learning Julia thanks to Unitful.jl and Measurements.jl. I didn’t mention anything about speed; composition was the key.

giordano · April 30, 2019, 8:58am

I also stopped talking about speed when explaining Julia’s features. I think it’s quite “dangerous” to claim it’s blazingly fast, when you then go on to show a preview and constantly have to say “ok… well… here comes compilation”. I’m leaving the speed advertisement for future times when compilation time will be a smaller issue.

I truly believe that at a certain point features like multiple dispatch, composability, metaprogramming, and others are more important than pure speed and Julia should be suggested for these reasons.

Azamat · April 30, 2019, 9:22am

Quoting @jeff.bezanson’s answer from “Julia vs R vs Python” (post #90):

This is an interesting point, and I agree with it — I like to think Julia has many selling points, and we should trumpet all of them. However, in practice, it is very hard to get anybody to adopt a new language. Performance is one of the few or only things that gets people’s attention. The other big thing, of course, is library support, but any new language will always have fewer libraries than existing languages, so that can’t be an initial reason to adopt a new language.

Anyway, try convincing somebody that language X has a nicer syntax or is easier to use than language Y. They won’t believe you, and even if they do it’s not really compelling enough to go through the difficulty of switching. Or try the default pitch of most research languages, which is that they will catch more errors at compile time. Well, it’s quite evident that a large percentage of programmers simply don’t care about that. But if you can take something that runs overnight and make it run in a minute, you have a real painkiller. If somebody doesn’t have any code that takes a while to run, getting them to switch languages might be impossible.

Performance is actually special. It’s not just another feature. All languages are Turing-complete so you can write anything in any of them. Performance is one of the only meaningful ways you can hit a wall with a language and not be able to do something.

Tamas_Papp · April 30, 2019, 10:09am

I think it is totally fine to sell Julia on speed, if at the same time it is emphasized that one won’t get fast code automatically in all situations. Writing fast Julia code is easier than most languages, but it is still a nontrivial skill that one has to learn.

Unless this is clear from the start, new users can have unreasonable expectations, turn away from the language after some initial frustration, and may not try it again for a long time.

Also, while compilation time will probably be improved, it is unlikely to disappear completely. Short and trivial scripts may never be faster (in terms of total execution time) than in some interpreted languages. This is again fine, as long as expectations are managed correctly.

cce · April 30, 2019, 11:16am

I believe that the design of the core system and its libraries is by far the most important predictor of things, because, frankly, value often comes down to usability/features. I see Julia as having the important core design correct, and this matters as we choose to engage and grow an ecosystem around it. In particular, I’d call out Julia’s multiple dispatch, type system, and macro system as being particularly fantastic. These core features are going to enable libraries to be imagined that go far beyond the state of the art in other languages. I think performance will come over time with maturity and with increasing attention due to enthusiasm.
