Julia gets mentioned in an article about FORTRAN

A different perspective.

I think “Python but fast” is a good way to explain Julia to programmers who have never heard of it. The main differences are:

  • speed - CPython is notoriously slow on many tasks, while Julia is fast enough that it doesn’t need C extensions in many cases. I think this is the biggest difference between the two. The original “Python but fast” is PyPy, whose speed has been hampered by maintaining compatibility with the old CPython C-API; recently the HPy project has started work on replacing that API. I suspect (without evidence) that if PyPy had been available when the scientific Python stack was being written (around 2000), that stack could have been implemented with much more Python and far fewer C extensions.

  • multiple dispatch - has some ecosystem effects, like encouraging more array types. I don’t think this has as much impact yet as it could, mostly because Julia lacks interfaces. Without an interface definition system, APIs are inconsistent, unreliable, and have unclear semantic relationships. I don’t think “automatic interoperability” is scalable in practice under an informal manual model. This is why Clojure has Spec and why Python has protocols, zope.interface, and ABCs.

  • macros - allow more concise syntax in some cases. Python might get these soon, though macros often do more harm than good. Julia’s macro hygiene is complicated.

  • uniform packaging story - Having a single declarative packaging tool makes it easier to statically analyze and manipulate packaging metadata. Python has good packaging systems now with Poetry and Conda, but there’s a ton of existing code that doesn’t use systems like these.
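To make the interface point concrete, here is a minimal sketch of what an explicitly declared interface looks like on the Python side, using `typing.Protocol`. The names `SupportsArea`, `Square`, `Label`, and `total_area` are hypothetical and exist only for illustration; the runtime check only verifies that the method is present, not that its signature or semantics match.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsArea(Protocol):
    """Hypothetical interface: any object that provides an area() method."""

    def area(self) -> float: ...


class Square:
    """Satisfies SupportsArea structurally, without inheriting from it."""

    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


class Label:
    """Does not provide area(), so it does not satisfy the interface."""

    def __init__(self, text: str) -> None:
        self.text = text


def total_area(shapes) -> float:
    # Fail early with a clear message instead of an AttributeError deep inside.
    for shape in shapes:
        if not isinstance(shape, SupportsArea):
            raise TypeError(f"{type(shape).__name__} does not implement SupportsArea")
    return sum(shape.area() for shape in shapes)


print(total_area([Square(2.0), Square(3.0)]))
```

The declared protocol gives static checkers and readers one place to look for what a generic function expects, which is the kind of contract the bullet on multiple dispatch says Julia currently leaves informal.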

Python and JavaScript are maybe the most-used languages in the world, so it makes sense to compare to them when possible. But the comparison is also pretty natural.

On to the similarities. I’ll pick some features and compare Julia and Python to alternative choices other languages have made.

  • Easy to start. print("hello") just works. Compare to Java which needs public static void main(String[] args) which can overwhelm a beginner before they even start.

  • Significant focus on scientific applications. Contrast with Common Lisp, a fast Lisp with a smaller scientific user focus.

  • Syntax. Mostly M-expressions but with some reserved keywords like continue and while. Contrast with Scheme, which uses S-expressions only.

  • Built-in REPL. Contrast with C++, which is often used in scientific applications but doesn’t have a REPL.

  • Dynamic typing. Values are tagged with their class at runtime. Contrast with C++ again.

  • Static type annotations. Both have a system for optionally declaring types which helps with reliability and documentation when used. Contrast with C++ which has required static annotations and Clojure which doesn’t have any.

  • Mutability. In both languages, mutable state is widely used. Contrast with Haskell which mostly disallows this.

  • Ability to evaluate code as data. Julia has syntax sugar :(a + b) which can be written in Python as ast.parse("a + b"). Since the distinction between compile time and run time is fuzzy under JIT compilation, these are more similar than they look. Python also has builtins compile() and eval(). Contrast with Go which doesn’t have this.

  • Program distribution. Run under the preinstalled runtime executable. Contrast with Go which makes static executable binaries.

  • Strings. Strings are sequences of code points. Contrast to Swift which uses extended grapheme clusters.

  • Partial extensibility. Users can define the value of the booleanized version of their expression, but can’t control the behavior of if or overload dispatch. Contrast to CLOS which lets users control dispatch.

  • Garbage collection. Allocated space is collected automatically by a garbage collector. Contrast to C and Rust which don’t have this.

  • Batch garbage collection. Latency can be inconsistent under this model, which is relevant in some applications. Contrast to Go, which considers GC latency an “existential threat” and designed its garbage collector to minimize latency.

  • Hard to control low-level performance-sensitive properties. For example, there is currently no way to disallow allocations in either language. Contrast to C, which makes allocations more explicit.

  • Unrestricted use of foreign objects. You can access private data on any object. Contrast to Java which disallows this.
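The "code as data" item above can be sketched on the Python side with the standard `ast` module together with the `compile()` and `eval()` builtins mentioned there. This is only an illustration of the idea, not a claim that it matches everything Julia's quoting and macros can do; the variable values are arbitrary.

```python
import ast

# Parse source text into a syntax tree (roughly what :(a + b) gives in Julia).
tree = ast.parse("a + b", mode="eval")

# The tree is ordinary data, so it can be inspected...
binop = tree.body
assert isinstance(binop, ast.BinOp) and isinstance(binop.op, ast.Add)

# ...or rewritten. Here the addition is turned into a multiplication.
binop.op = ast.Mult()
ast.fix_missing_locations(tree)

# Compile the modified tree and evaluate it with chosen variable bindings.
code = compile(tree, filename="<ast>", mode="eval")
result = eval(code, {"a": 6, "b": 7})
print(result)  # 42
```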

All in all, I don’t think Julia in 2021 is as different from Python as some users might think, except for tuned speed, but that’s not necessarily a bad thing: Python is popular for good reason. Julia is immature in comparison and hasn’t yet figured out its story on a lot of things, such as traits and interfaces, which are promising future directions (among many others). Julia’s capabilities are likely to mature over time in ways we haven’t seen yet.

Julia lets you turn GC off and on using GC.enable(false) and GC.enable(true) (very old versions spelled this gc_disable() and gc_enable()). This can be really useful in some performance-sensitive cases.

Python has gc.disable() for the same.
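A minimal sketch of how that is typically used, with a hypothetical helper name `with_gc_paused`. Note that in CPython `gc.disable()` only pauses the cyclic collector; reference counting still frees most objects immediately.

```python
import gc


def with_gc_paused(fn, *args):
    """Run fn(*args) with the cyclic collector paused, then restore its state."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        return fn(*args)
    finally:
        # Only re-enable if the collector was on when we started.
        if was_enabled:
            gc.enable()


# Example: a hot loop that should not be interrupted by a collection pass.
total = with_gc_paused(lambda n: sum(i * i for i in range(n)), 1000)
print(total)
```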

I agree with this; also, multithreading, debugging and profiling lag behind Fortran. This is what I meant by “tooling”.

I thought multithreading was easier in Julia; when I talk to people who use Fortran they mention MPI and other demons.

Almost, but that’s not quite there. That statement makes multiple dispatch seem like an addon, while if you really look at the design of Julia, multiple dispatch is the sole reason why it is able to look dynamic while compiling to fast code. It’s very integral to the whole story as to how the compiler is able to optimize the way it does, and the story would be significantly compromised without that mechanism.

That’s partially why a few billion dollars have been put into Python only to get a few JITs that are 30% faster: you simply cannot directly match the semantics of Python and make everything compile to fast code. You have to tone down some features and change some of the dynamic behavior into statically-definable behavior (i.e. multiple dispatch) if you really want to be able to map it all down to something statically well-defined for most use cases.
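As a rough illustration of what "dispatch on the types of all arguments" means, here is a toy registry in Python. Everything in it (`_methods`, `method`, `combine`) is hypothetical and written for this sketch; it only matches exact types, has no subtype lookup or ambiguity resolution, and does no compilation, so it is far from what Julia actually does. The point it shows is that once the full tuple of argument types is known, the implementation to run is fixed, which is the property a compiler can exploit to specialize code.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: maps a tuple of argument types to an implementation.
_methods: Dict[Tuple[type, ...], Callable] = {}


def method(*types):
    """Register the decorated function for exactly these argument types."""
    def register(fn):
        _methods[types] = fn
        return fn
    return register


def combine(*args):
    """Pick an implementation from the runtime types of all arguments."""
    key = tuple(type(a) for a in args)
    try:
        impl = _methods[key]
    except KeyError:
        raise TypeError(f"no method for argument types {key}") from None
    return impl(*args)


@method(int, int)
def _(x, y):
    return x + y


@method(str, int)
def _(s, n):
    return s * n


print(combine(2, 3))     # uses the (int, int) method
print(combine("ab", 2))  # uses the (str, int) method
```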

MPI is not for multithreading, it’s for distributed computing. You can still use MPI in Julia just as in Fortran. Multithreading in Julia is there, but it’s (as of now) slower and less well-integrated than OpenMP.

But is OpenMP a part of the Fortran language? I know it’s widely used, battle-tested, and fast, but I thought it’s more of an extension/library than a part of the language per se. Does Fortran have composable multi-threaded parallelism as a part of the language (similar to Base.Threads.@spawn)? Or is it maybe available as an extension in OpenMP?

That is a good point. However, I share the feeling with @antoine-levitt that resurrecting Fortran after the fact that Julia exists seems like a lot of energy put into the wrong places.

I am trying to be as impartial as possible here, but I will certainly fail at this without realizing it (we are all biased). In my opinion, fixing the executable size in Julia feels like something that could be done somewhat easily by experienced Fortran developers and LLVM experts, whereas adding multiple dispatch and other nice features of Julia to Fortran seems neither easy nor productive. The Julia ecosystem is already so smooth that it is hard to beat in any feasible time (documentation, tests, binary distribution, domain-specific packages).

Anyways, welcome @certik to our community. I hope both communities thrive in their own ways, and that human effort is put into the right places.

It’s not a part of the language, but it’s supported by all the major compilers, so what’s the difference? I don’t know how composable it is, but Julia’s fast, ubiquitous, composable multithreading is not exactly a reality right now. That’s not to say it won’t be at some point in the future; I was just pointing out that right now Julia’s multithreading is in flux and still lags behind OpenMP in several ways.

“That is a good point. However, I share the feeling with @antoine-levitt that resurrecting Fortran after the fact that Julia exists seems like a lot of energy put into the wrong places.”

To be clear, this is not a zero-sum game: all positive energy is always good, and certainly everyone benefits from having a better Fortran. It makes sense to support “legacy” languages and codes, just as it makes sense to develop new tools. But I definitely wouldn’t recommend that anybody use Fortran for a new project, unless the main objective is interoperability with an existing code base, or in very conservative environments where using anything less than 10 years old for serious purposes is a bad idea (I’m not saying this pejoratively, there are sometimes good reasons for this).

I have the impression that even the developers share this view, but I don’t completely get what is missing from what is implemented already. From a syntactic point of view, using @threads or even packages such as FLoops is much easier and cleaner than using OpenMP.

@threads has a known overhead if applied to parallelize fast computations (is that what you and others mean by it is not quite there yet?). FLoops (as an example) solves this for at least quite a few problems I have faced.

My experience, as a user of those tools, is that using threading in Julia is already much easier than using OpenMP. But it is likely that my usage is very narrow and I don’t capture what is missing.

“but it’s (as of now) slower”

Can you provide an example? It’s interesting to have some numbers to understand the size of the problem.

I didn’t mean to imply that there’s something wrong with Fortran or OpenMP. I’ve used OpenMP with Fortran a few times, on very trivially parallelizable code, and it was pleasant to work with for that case. I was just curious whether it’s a part of the language, as coarrays are, and whether it has the ability to schedule tasks depth-first. Thank you for the explanation.

Can you give some examples? I don’t think Fortran can perform secret compiler optimizations that other languages can’t have. The ML and LLVM communities also put a lot of effort into specialized compiler optimizations, such as vectorization and the polyhedral model, which are hard to beat.

The last time I looked at it was

but it would be awesome to have more up to date benchmarks.

Here is one example (not sure how representative it is):

Fortran:

Code
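PROGRAM Parallel_Hello_World
USE OMP_LIB
IMPLICIT NONE
INTEGER :: i
REAL*8 :: partial_Sum, total_Sum

! Initialize the shared total once, before any thread starts,
! so that threads do not write to it concurrently.
total_Sum = 0.d0

!$OMP PARALLEL PRIVATE(partial_Sum) SHARED(total_Sum)
    ! Each thread accumulates into its own private partial sum.
    partial_Sum = 0.d0

    !$OMP DO
    DO i=1,1000000000
        partial_Sum = partial_Sum + dsin(dble(i))
    END DO
    !$OMP END DO

    ! Combine the partial sums one thread at a time.
    !$OMP CRITICAL
        total_Sum = total_Sum + partial_Sum
    !$OMP END CRITICAL

!$OMP END PARALLEL
PRINT *, "Total Sum: ", total_Sum
END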

Result:

leandro@pitico:~/Drive/Work/JuliaPlay% gfortran -O3 omp.f95 -o omp
leandro@pitico:~/Drive/Work/JuliaPlay% time ./omp
 Total Sum:   0.42129448674541342

real    0m54,495s
user    0m54,476s
sys     0m0,004s
leandro@pitico:~/Drive/Work/JuliaPlay% gfortran -O3 -fopenmp omp.f95 -o omp
leandro@pitico:~/Drive/Work/JuliaPlay% time ./omp
 Total Sum:   0.42129448674645509

real    0m12,538s
user    1m29,020s
sys     0m0,016s

Julia using FLoops:

Code
using FLoops
function f()
    @floop for i in 1:1000000000
        @reduce(total_Sum += sin(i))
    end
    total_Sum
end
println("Total sum: ",f())

Result:

leandro@pitico:~/Drive/Work/JuliaPlay% time julia floop.jl
Total sum: 0.4212944867466973

real    0m40,729s
user    0m41,039s
sys     0m0,608s

leandro@pitico:~/Drive/Work/JuliaPlay% time julia -t8 floop.jl
Total sum: 0.4212944867465145

real    0m14,743s
user    1m4,074s
sys     0m0,751s

Julia using @threads:

Code
using Base.Threads
function f()
    total_Sum = zeros(nthreads())
    @threads for i in 1:1000000000
        total_Sum[threadid()] += sin(i)
    end
    sum(total_Sum)
end
println("Total sum: ",f())

Result:

leandro@pitico:~/Drive/Work/JuliaPlay% time julia -t8 threads.jl
Total sum: 0.4212944867465146

real    0m10,514s
user    1m11,253s
sys     0m0,544s

Note that Julia is faster than Fortran with a single thread (no idea why).
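All three versions above share the same structure: each worker accumulates a private partial sum over its chunk of the range, and the partial sums are combined at the end. A small Python sketch of that structure follows, with hypothetical names `partial_sum` and `chunked_sum` and a much smaller `n`. It is not a benchmark: in CPython the GIL keeps threads from speeding up CPU-bound loops, so this only shows the reduction pattern. It also shows why the printed totals above differ in their last digits between runs with different thread counts: floating-point addition is not associative, so a different grouping of the terms gives a slightly different result.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_sum(start: int, stop: int) -> float:
    """Sum sin(i) for i in [start, stop) into a private accumulator."""
    total = 0.0
    for i in range(start, stop):
        total += math.sin(i)
    return total


def chunked_sum(n: int, workers: int) -> float:
    """Sum sin(i) for i in 1..n, one chunk and one partial sum per worker."""
    step = -(-n // workers)  # ceiling division
    bounds = [(lo, min(lo + step, n + 1)) for lo in range(1, n + 1, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda b: partial_sum(*b), bounds))
    # Final combination step, the analogue of the CRITICAL section or sum(total_Sum).
    return sum(partials)


serial = partial_sum(1, 100_001)
chunked = chunked_sum(100_000, 8)
# The two totals agree closely but need not be bit-identical.
print(serial, chunked, abs(serial - chunked))
```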

I split this into a new thread. I guess I don’t really understand what is going on when using atomic operations, so I have now removed the initial test here, which used atomic_add!, because it didn’t make sense.

I read those posts and it is still not clear to me why someone who is not working with a ton of Fortran legacy code already would start with the language now, even if you manage all the ambitious goals of LFortran.

Don’t get me wrong, I think that Fortran was a terrific language for its time, but most of its key ingredients have been adopted by other languages since. E.g., all the highlights you list were already in Julia from day 0, except for generalized indexing, which was added later but required no fundamental redesign of anything deep.

I would tend to agree with Thomas Clune, quoted in the article, and @ChrisRackauckas above: aggressively devirtualized multiple dispatch is the game changer for Julia. Having a REPL, macros, and other nice things are just there because it would have been weird to design a language without them after the 1990s, just like seatbelts and later ABS became mandatory in cars: it is too obvious a benefit to miss.

Obviously, I cannot speak for Certik, but I thought I would mention that there’s a huge pressure on the employees of the national labs to keep Fortran codes alive. The national labs have huge investments in massive codes in Fortran, and whatever keeps them going is well rewarded.

Lack of aliasing makes Fortran a little hard to write, but it makes it easier for the compiler to optimize a few things. When aliasing is allowed, you need a bunch of analyses to figure out whether two bindings could ever be pointing to the same array (which may rule out some things like simd ivdep), while you get those guarantees for free if the language does not allow that to ever occur. Moving to a dynamic language like Julia makes that even harder than proving it in a language like C, and so the common “I got within 2x of C for a direct translation but getting to 1-1 is hard” usually ends up boiling down to whether aliasing or escaping can be proved.
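A small sketch of why aliasing matters, using plain Python lists and a hypothetical function `shift_add`. The same loop gives different results depending on whether the two arguments are distinct arrays or the same array. In the aliased case each iteration reads a value the previous iteration wrote, so the iterations cannot be reordered or run in SIMD lanes; a compiler that cannot rule out aliasing has to assume that case.

```python
def shift_add(dst, src):
    """Set dst[i] = src[i - 1] + 1 for i >= 1, written as a plain loop."""
    for i in range(1, len(src)):
        dst[i] = src[i - 1] + 1


# Distinct arrays: every iteration is independent of the others.
a = [0, 0, 0, 0]
out = [0, 0, 0, 0]
shift_add(out, a)
print(out)  # [0, 1, 1, 1]

# Aliased arrays: iteration i reads what iteration i - 1 just wrote.
b = [0, 0, 0, 0]
shift_add(b, b)
print(b)  # [0, 1, 2, 3]
```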

Except one.

Being able to produce small (or not that small) stand-alone executables. IMO this is a major feature whose importance users of interactive languages have trouble comprehending.

I wish I could upvote this twice. In my opinion this is a major disadvantage of Julia compared to other compiled languages.

For example, consider the Pixhawk hardware which is one of the most ubiquitous platforms to build autonomous planes, cars, boats, etc. on. It has at most 2 MB of flash memory. By some accounts a simple “Hello World” binary made using PackageCompiler.jl can be in excess of 100 MB. Perhaps this size can be reduced, but I’ve never seen any documentation or methodology on how to do so.

I think Julia is one of the most elegant and powerful languages available today. What’s been done with Julia in packages such as the SciML framework is astounding. But unless Julia gains the ability to compile compact standalone binaries, there will always be the need in domains such as robotics and autonomy to also use a second compiled language for final implementation.
