Understanding Julia performance in simple finite difference code

For the last part, have you rebuilt your system image (or compiled Julia from source) and set the optimization level to -O3? Julia won’t use SIMD automatically without that, and that could be the final kicker.
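For anyone wanting to check whether a particular loop is actually being vectorized, here is a minimal sketch, assuming Julia 0.7 or later. The `axpy!` kernel is a hypothetical stand-in, not the code from this thread, and the regex only looks for LLVM vector types such as `<4 x double>` in the generated IR. Whether it matches will depend on your CPU, your Julia build and the optimization level.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL (Julia 0.7 and later)

# Hypothetical kernel for illustration: y[i] += a * x[i]
function axpy!(y::Vector{Float64}, a::Float64, x::Vector{Float64})
    @inbounds @simd for i in eachindex(y, x)
        y[i] += a * x[i]
    end
    return y
end

x = ones(8)
y = zeros(8)
axpy!(y, 2.0, x)

# Capture the LLVM IR as a string and look for vector types like <4 x double>.
ir = sprint(io -> code_llvm(io, axpy!, (Vector{Float64}, Float64, Vector{Float64})))
vectorized = occursin(r"<\d+ x double>", ir)
println("vector types found in IR: ", vectorized)
```

Running this once with the default settings and once with `-O3` should show whether the flag changes anything for a given loop.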

My thinking at this point is largely influenced by Mathematica, where multiplying a list (1d array) by a scalar (e.g., Float64) simply applies the operation to each element in the list (multiplying by the scalar, presumably optimized for the Mathematica way of doing things). So, I’m having to retrain my thinking in several ways as I move into Julia (and away from Mathematica, C++, C, etc.).
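A small sketch of how the Mathematica-style "scalar times list" maps onto Julia's dot (broadcast) syntax. The names `c`, `k` and `out` are made up for illustration and are not taken from the code discussed in this thread.

```julia
c = [1.0, 2.0, 3.0]   # hypothetical 1d array
k = 0.5               # hypothetical scalar

scaled = k .* c       # elementwise product; allocates a new array
also_scaled = k * c   # scalar * array is also defined and gives the same values

out = similar(c)
out .= k .* c .+ 1.0  # the dotted operations fuse into one loop that writes into out

c .*= k               # scales c in place, no new array
```

The in-place forms (`.=` and `.*=`) are usually the ones that matter for performance in a time-stepping loop, since they avoid allocating a fresh array on every step.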

Thanks for the tips - I see you also modified another line to properly apply broadcasting so I’ll add that too, but I think I’m now making good progress. :+1:

That I’ve definitely not done! So far I’ve played with JuliaPro and the recently released 0.7, mainly on Mac OS X, though I’ve tried Linux too.

Is there a convenient online guide to doing the system image rebuild?

I always end up googling Chris’s blog when I need to remember how to rebuild the system image (see “gotcha #7”): www.stochasticlifestyle.com/7-julia-gotchas-handle . Those instructions work for v0.6, but I haven’t tried them in v0.7.

Enabling O3 is just a matter of starting julia with the -O3 command-line flag. I’m not sure how to do that in JuliaPro.
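If you are not sure whether the flag actually reached your session (for example inside an IDE), one way to check is sketched below. Note that `Base.JLOptions()` is an internal, undocumented struct, so treat this as an assumption that may not hold in every Julia version.

```julia
# Base.JLOptions() is internal; the opt_level field holds the -O level
# that the running Julia process was started with.
opt = Base.JLOptions().opt_level
println("optimization level: -O", opt)
```

Started as `julia -O3`, this should report level 3. Without the flag it reports whatever default your build uses.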


I launch Julia 0.7 with -O3 but I don’t get any difference in time.

If I understand the comments from rdeits this should be the same as system image rebuild, but let me know if I’m wrong!

Nope, -O3 is orthogonal to rebuilding the system image, and you need to do both to turn on every single fancy compiler optimization.

OK, got it - I’ll dig into rebuilding too

Oh, and one last point for the night (my time zone) - I turned off the file writing stuff in the code I posted, and that drops run time down to about 0.2 sec. It is a little hard to make an “apples to apples” comparison with the C code, because it includes some options for various kinds of I/O (it would be a little painful to go through and comment out undesired options). However, for test purposes, I’m running it set up to only print out a single file at the very end, so this is probably a reasonably fair comparison.

So, at this point, it is either a win for Julia or, at worst, a tie! :fireworks::tada:

Actually, it is a win for Julia either way, since the code is more readable and compact (at least as I start to grasp its nuances).


This is at least where it is in Juno


Going to Chris’s blog is always nice, but this particular solution is in the documentation:


Forgive me if I’m wrong, but it seems like the purpose of the nNodes input argument is to hold the length of the other vectors. That seems like a risky way of implementing it.

Shouldn’t you rather remove that variable and instead have

nNodes = length(c)

in whatever function you need it (or use end in the indexing expressions)? Right now it looks really fragile.
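Something along these lines, as a rough sketch. The function `second_difference!` and its body are hypothetical, since I don't know exactly what your update step looks like. The point is only that the length comes from the array itself and the sizes are checked once at the top.

```julia
# Hypothetical finite-difference kernel: out[i] = c[i-1] - 2c[i] + c[i+1]
function second_difference!(out::Vector{Float64}, c::Vector{Float64})
    nNodes = length(c)  # derived from the data, not passed in separately
    length(out) == nNodes || throw(DimensionMismatch("out and c must have the same length"))
    out[1] = 0.0        # boundary values fixed at zero for this sketch
    out[end] = 0.0      # `end` avoids needing nNodes here at all
    for i in 2:nNodes-1
        out[i] = c[i-1] - 2c[i] + c[i+1]
    end
    return out
end

c = [0.0, 1.0, 4.0, 9.0, 16.0]
out = second_difference!(similar(c), c)
```

With this, passing vectors of mismatched lengths fails loudly instead of silently reading or writing the wrong elements.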


I’ve been seeing that a lot lately myself, comparing C to Julia.
I wonder if it’s time to change some of the things in the documentation that say “nearly as fast as C” :grinning:

You are certainly correct as the code is right now. In the long run there does need to be a parameter nNodes that is a value controlling the set up of the vectors containing material properties. It needs to be set either by the user directly or, possibly, it could be determined based on the number of values in input files too.

However, once that value is set, there are probably places where it would make more sense to use length() - more things to address to finish porting from C!

For more experienced folks it may be obvious, but it took me a little time to find this settings panel. So, for other newcomers, you can get to it by selecting the following menu items:


Don’t try the general Preferences option :grinning:

I’d highly recommend that blog post to anyone coming to Julia from a C/C++-like background (or from Mathematica). The discussion of arrays in terms of pointers helped me get a much better grasp of why the changes in my code made such a difference.

Same thing with discussion of REPL and globals, as that was a key piece I’d missed before.
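For other newcomers, here is a minimal sketch of the globals issue as I understand it. The names `n`, `data`, `sum_global` and `sum_arg` are made up for illustration. Both functions compute the same sum, but the first reads non-constant globals, whose types the compiler cannot assume, while the second receives its data as an argument.

```julia
n = 1000          # non-const global: its type could change at any time
data = rand(n)    # likewise a non-const global

# Reads the globals directly, so the loop cannot be specialized on their types.
function sum_global()
    s = 0.0
    for i in 1:n
        s += data[i]
    end
    return s
end

# Takes the array as an argument, so the compiler knows its concrete type.
function sum_arg(v::Vector{Float64})
    s = 0.0
    for x in v
        s += x
    end
    return s
end

total_global = sum_global()
total_arg = sum_arg(data)
```

Timing the two with `@time` (after a first warm-up call) should show the difference. The exact ratio will depend on your machine and Julia version.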

I’ve been playing around with gcc, but for the life of me I can’t get simple programs to run as fast as in Julia.
Gcc seems relatively (extremely) hesitant to use 256-wide instructions and prefers 128. It’s thus often 50-100% slower than Julia on my computer.
And when coerced into using 256-wide via -mprefer-vector-width=256, it suddenly gets even slower.
There’s probably something I’m missing, so I asked about it on the gcc help mailing list:

(Disclaimer: I haven’t tried to reproduce this on different computers.)

Overall, my impression is that it’s way easier for non-computer scientists to get blistering fast Julia code than it is C/C++/Fortran!
Although, I’d be willing to accept Julia being several times slower just for the sake of ease of use…

On the recent hackernews post by oxinabox, I saw someone say that Julia is probably much slower than advertised in practice, citing lots of StackOverflow posts and the infamous “Giving up on Julia”.
I think there are just far fewer people moving from Julia to other languages. Probably because of my inexperience, C/C++/Fortran are all in all failing to live up to the “as fast as Julia!” I’ve implicitly been promised.


Any interest in writing a blog post to that effect? Maybe with a click-bait title :troll:


Hmmm - I first tried the official directions at docs.julialang.org, and hit errors that prevented completion. I then used “Chris’s blog”, and it worked like a charm. This was with JuliaPro; I did nothing to try to fix the issues (except using the alternate form of the statements from the blog):

Here is the error output using the official instructions:

julia> include("/Applications/JuliaPro-")

julia> build_sysimg(sysimg_path=default_sysimg_path(), cpu_target="native", userimg_path=nothing; force=false)
ERROR: MethodError: no method matching build_sysimg(::Void, ::String, ::Void; sysimg_path="/Applications/JuliaPro-
sources/julia/lib/julia/sys", cpu_target="native", userimg_path=nothing, force=false)
Closest candidates are:
  build_sysimg(::Any, ::Any, ::Any; force, debug) at /Applications/JuliaPro- got unsupported keyword arguments "sysimg_path", "cpu_target", "userimg_path"
  build_sysimg(::Any, ::Any) at /Applications/JuliaPro- got unsupported keyword arguments "sysimg_path", "cpu_target", "userimg_path", "force"
  build_sysimg(::Any) at /Applications/JuliaPro- got unsupported keyword arguments "sysimg_path", "cpu_target", "userimg_path", "force"
 [1] (::#kw##build_sysimg)(::Array{Any,1}, ::#build_sysimg, ::Void, ::String, ::Void) at ./<missing>:0 (repeats 2 times)
 [2] eval(::Module, ::Any) at ./boot.jl:235
 [3] eval(::Any) at ./boot.jl:234
 [4] macro expansion at /Applications/JuliaPro- [inlined]
 [5] anonymous at ./<missing>:?

I think that the method invocation changed a bit between versions, and I linked the latest one. Try e.g. the 0.6.3 version. EDIT: nope, it did not change at all, but I think what you have in the documentation is just the argument syntax, not an example invocation.
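To illustrate what I mean, here is a sketch with a hypothetical stand-in function, `demo_build`, that has the same shape as the closest candidate in your error output: three positional arguments with defaults, plus the keywords `force` and `debug`. It is not the real `build_sysimg` and it builds nothing. Positional arguments cannot be passed by name in Julia, which is why copying the signature line as a call gives a MethodError.

```julia
# Hypothetical stand-in mirroring the candidate shown in the error output:
#   build_sysimg(::Any, ::Any, ::Any; force, debug)
function demo_build(sysimg_path=nothing, cpu_target="native", userimg_path=nothing;
                    force=false, debug=false)
    return (sysimg_path, cpu_target, userimg_path, force, debug)
end

# Works: positional arguments given by position, keywords after the semicolon.
ok = demo_build("some/path", "native", nothing; force=true)

# Fails: positional arguments given by name are treated as unknown keywords.
threw = try
    demo_build(sysimg_path="some/path", cpu_target="native")
    false
catch err
    err isa MethodError
end
println("calling with named positional arguments threw a MethodError: ", threw)
```

So with the real function, the same three values would presumably go in by position, with only `force` (and `debug`) written as keywords.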