Discussion on "Why I no longer recommend Julia" by Yuri Vishnevsky

I am not sure I agree with that. For-loops are one of the simplest things you can do and in Julia they are fast. You don’t need any effort to get those benefits.

And while I don’t have any experience with that, I think that Julia could actually be a good introductory programming language, possibly better than Python. I think that the interactivity of Julia and its type inference make it really easy to start with. At the same time, the idea of type stability introduces just enough discipline in the code that, if the student later has to learn a statically compiled language, the jump is not as big as it is from Python. And most of the time, making sure that the code is type stable is not that hard, e.g., just make sure that if something will end up being an array, you don’t initialise it as an integer.
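The same rule applies to any variable whose type changes after it is initialised. A minimal sketch with a scalar accumulator (the function names are made up for illustration):

```julia
# Type-unstable: `total` starts as an Int and becomes a Float64
# as soon as a floating-point element is added to it.
function total_unstable(xs)
    total = 0
    for x in xs
        total += x
    end
    return total
end

# Type-stable: `total` starts as the zero of the element type,
# so it has the same type on every iteration.
function total_stable(xs)
    total = zero(eltype(xs))
    for x in xs
        total += x
    end
    return total
end

xs = [0.5, 1.5, 2.0]
total_unstable(xs) == total_stable(xs)   # both versions give the same answer
```

In the REPL, `@code_warntype total_unstable(xs)` should flag the variable whose type is not fixed.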

I partially disagree here. As I said, many students haven’t even heard of Julia. Someone has to let them know it exists, just like someone let them know that Python exists.

I agree 100% with these two points.

3 Likes

It isn’t hard, but it isn’t obvious either. Understanding type stability and the global scope issue requires some feel for a few programming concepts. Students will either find the “rules” just bizarre, or they will venture into learning programming at a deeper level.

Note that these concepts are not an issue in Python (slow) or in Fortran (fast). So, there is a learning curve.
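A rough sketch of the global scope issue, assuming a non-constant global named `scale` (all names here are illustrative). Both functions compute the same thing, but only the second lets the compiler know the type of `scale`:

```julia
scale = 2.0   # non-const global: the compiler cannot assume its type

# Reads the global directly, so the type of `scale` is unknown
# when this method is compiled.
function scaled_sum_global(xs)
    s = 0.0
    for x in xs
        s += scale * x
    end
    return s
end

# Takes the value as an argument, so its type is known
# when the method is compiled.
function scaled_sum(xs, scale)
    s = 0.0
    for x in xs
        s += scale * x
    end
    return s
end

scaled_sum_global([1.0, 2.0]) == scaled_sum([1.0, 2.0], scale)
```

Declaring the global with `const` is the other usual fix.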

It all depends on whether the users are just using a tool once or are on the path to implementing custom methods and becoming regular programmers.

(For the record, I’ve been using Julia in courses for beginners just fine.)

9 Likes

I use Numba to write simple fast loops when vectorized code wouldn’t make much sense, but there’s a lot of stuff I can’t write. Yes, there is active development and much room for improvement (gufuncs can’t be called in jit functions yet, ufuncs don’t specialize as readily as jit), but a fundamental limitation is the need to adhere to Python and NumPy. For example, instead of named structs, NumPy arrays have unnamed structured dtypes (similar to Julia’s eltype). But dtypes aren’t bona fide Python types, so you can’t make an instance of one; you can index a structured array, but the element is an instance of numpy.void no matter the particular structure. Likewise, Numba is incapable of making a vectorized ufunc that dispatches on or returns a structured element, and there doesn’t seem to be any active attempt to make it work somehow (GitHub issue #5329).
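For comparison, a small sketch of the Julia side of this (not Numba code; `Point`, `sqnorm` and `shift` are made-up names). The element type is an ordinary named struct, and broadcast functions can both dispatch on it and return it:

```julia
# A named struct is an ordinary type: it can be instantiated,
# stored in an array, and dispatched on.
struct Point
    x::Float64
    y::Float64
end

sqnorm(p::Point) = p.x^2 + p.y^2                # dispatches on the element type
shift(p::Point, d) = Point(p.x + d, p.y + d)    # returns a struct

pts = [Point(1.0, 2.0), Point(3.0, 4.0)]
sqnorm.(pts)        # broadcasting yields a Vector{Float64}
shift.(pts, 1.0)    # broadcasting yields a Vector{Point}
```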

6 Likes

For loops are fast in a lot of languages (eg C/C++/…).

I see the comparative advantage of Julia in writing composable tools that can interoperate to produce fast runtimes with little effort (compared to what it would take to do this in other languages), but no other language has combined the relevant building blocks this way before, so there are a ton of corner cases and problems one can run into, especially in the course of a computationally intensive PhD thesis or something similar.

The Julia community, in general, is well aware of these problems, and people are working on solutions. Some can be managed at a package level, while at the other end of the spectrum some need new features in the core Julia compiler. So… things take time.

Julia is still a great language with amazing potential and it is eminently usable as is, but it is important to manage expectations: one should read the manual thoroughly, study some practical examples, make a lot of educational mistakes. Importantly, the user will run into bugs almost surely with nontrivial code, at least in packages, and then should be prepared to isolate an MWE, open an issue, and occasionally provide a PR to speed things up.

It is like a passenger train where you are advised to bring a wrench in case you need to fix stuff on the way to your destination. And of course wear clothes you don’t mind getting stained with machine oil.

22 Likes

Excellent!

Ultimately, Numba is an a posteriori solution, while Julia was designed from the ground up to be both compiled and interactive.

For non-interactive, compiled languages, sure. But if we want interactive and fast for loops, Julia is really the only option (that I know of).

At the end of the day, I think that these are just growing pains. I am sure that as the Julia ecosystem matures, these issues will disappear.

2 Likes

Hit the nail on the head. Numba is designed to recompile the compiled bytecode of Python functions that use a subset of NumPy, so you must write valid Python. To deviate from Python syntax, you need something like Cython where the source files are compiled differently from the start. Considering that Numba has an LLVM-based just-ahead-of-time compiler, if implementing non-Python/NumPy features wouldn’t sacrifice the important aspect of Python compatibility, I’ll bet that the developers would overhaul their multiple dispatch, implement syntax for type parameters, introduce named structu- oh this is Julia.

3 Likes

I had the impression that Matlab’s for loops had become faster since it introduced JIT compilation. After a very simple test I found Julia 10x faster! Comparing the current version of Matlab with the latest 32 bit version, 2015b, which I keep around for a 32 bit toolbox I need, I found a >2x speed increase.

In this case there is no speed improvement in Julia using a function, probably because everything is local in the for loop anyway.

julia> using BenchmarkTools

julia> function temp(n)
           for i in 1:n
               sin(π*i/n)
           end
       end
temp (generic function with 1 method)

julia> @btime temp(1000000)
  8.696 ms (0 allocations: 0 bytes)

julia> @btime for i in 1:1000000
           sin(π*i/1000000)
       end
  8.883 ms (0 allocations: 0 bytes)

and the matlab version

tic
n = 1000000

    for i = 1:n
        sin(pi*i/n);
    end

toc

Elapsed time is 0.084775 seconds.   % R2022a
Elapsed time is 0.180900 seconds.   % R2015b (32 bit)

Conclusion: all the statements about for loops being slow in Matlab seem correct.

4 Likes

Those code samples seem a bit unrealistic. They don’t return anything, so it’s anyone’s guess what the compiler could do with them.

I made two functions:

% Matlab
function out = foo(n)
    out = zeros(n,1);
    for i = 1:n
        out(i) = sin(pi*i/n);
    end
end

# Julia
function foo(n)
    out = zeros(n)
    for i in eachindex(out)
        out[i] = sin(pi*i/n)
    end
    return out
end

Timings:

% matlab
>> timeit(@()foo(1e6))
ans =
    0.0358

julia> @btime foo(10^6);
  14.493 ms (2 allocations: 7.63 MiB)

That’s a 2.5x difference. Versions are Matlab R2021b and Julia 1.8-beta3.

7 Likes

I can relate as someone more in that middle. I remember from JuliaCon that there are a ton of extremely talented and smart people who know both science and programming really well. Then there is the group of people from academia with minimal programming experience. More average developers like me, who have spent a lot of time in industry but don’t really do science or academic work, seemed quite rare. As someone who writes Julia books and makes course material, that has kind of been the group of people I want to appeal to, as I think Julia has a lot of benefits for regular developers in industry.

Honestly, I have had a lot fewer of the problems described, as I don’t work on problems sophisticated enough to combine that many packages in novel ways. If you use Julia in a bit of a boring way, it seems to work very well to me.

If I compare with the work I’ve done with JavaScript, C++, Swift and Go, I am not sure Julia is really worse. Quite the contrary. A lot of this has to do with maturity, and I have worked with a lot of languages in their early phase. There tend to be a lot of problems early on.

There is also something to be said for the ability to diagnose problems. The REPL environment, at least for me, makes it a lot quicker to drill down to the problems I experience.

22 Likes

Just my 2 cents on the criticism. I’m not sure how much of use I have to add to this, but I feel Yuri may not be making a proper apples-to-apples comparison. Here is my hypothesis: many Julia libraries, and what you can achieve by combining them, are extremely powerful relative to the man-hours that went into making them. I suspect that when the quality of Julia libraries is compared, it is against solutions that required far more manpower to build.

Yes, combining libraries A, B and C creates a number of permutations which are hard to test, but it also gives a lot of functionality relative to the effort that went into making these libraries. As I remark in my story below, I think the Apache Arrow project is illustrative. Correct me if I am wrong, but that was the work of one guy in Julia, while dozens of C++ guys did the same. The Julia solution would have to have a lot of bugs to be considered a bad tradeoff compared to the C++ solution.

As an old school C++ developer, I have never combined and used libraries as frequently and with such ease as I do in Julia, not in C++ or in any other language I have used. Our ability to combine and use more libraries gives us more power but also exposes us to the potential for more bugs. I don’t think it is fair to say this power isn’t worth it or that we cannot overcome the problems.

If people can build massive, high-quality systems with C, COBOL, JavaScript, etc., then I have no doubt that it can be done with Julia as well. I have faith in the future of Julia.

34 Likes

Just a small follow up on combining libraries: To me this is one of the killer features of Julia. Yet, compared to other languages there can be some gotchas in Julia.
Python: When using an AD kernel for a specific tensor operation, I can be sure that someone had a look at its code as it had to be written explicitly. In Julia I cannot, the specific combination might have been written by the compiler!
C++/Haskell: I pick those here as these allow similar generic programming. Yet, combining functionalities is more constrained by the type system in these languages. What is nice and reassuring in Haskell are the laws stated for each typeclass (concepts in C++ serve a similar purpose). They specify exactly what I can and cannot assume about a type and its operations, i.e., my generic code should just work for any type obeying these laws. As soon as I require any additional properties, e.g., commutativity instead of the stated associativity, I’m on my own and all bets are off. In Julia, I’m often not sure about what exactly I can/cannot assume for an interface. Maybe laws would make a nice addition, either in documentation or, even better, as executable property tests (à la QuickCheck).
In any case, having the ability to freely combine code from different libraries provides a huge kickstart for Julia. Just compare how much time/people were needed to provide the same amount of functionality in other languages, e.g., AD toolboxes for Python. Yet, guidance and assurance on the correctness of combined code might be improved …
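As a rough idea of what an executable law could look like without any extra packages, here is a sketch of a randomized associativity check. The helper `is_associative` and the generator `small_int` are hypothetical names, and this only approximates what QuickCheck does (no shrinking, no automatic generators):

```julia
using Random

# Hypothetical helper: test `op` for associativity on random triples drawn by `gen`.
function is_associative(op, gen; trials = 100, rng = MersenneTwister(42))
    for _ in 1:trials
        a, b, c = gen(rng), gen(rng), gen(rng)
        op(op(a, b), c) == op(a, op(b, c)) || return false
    end
    return true
end

small_int(rng) = rand(rng, 1:1000)

is_associative(+, small_int)   # integer addition obeys the law
is_associative(-, small_int)   # subtraction does not
```

A check like this can only find counterexamples; passing it does not prove the law.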

11 Likes

My understanding is that Haskell typeclasses specify the above in type space. Which is somewhat useful, but still falls short of a complete contract.

To be concrete, consider

struct MyVector{T} <: AbstractVector{T}
    contents::Vector{T}
end

Base.getindex(v::MyVector, ix) = v.contents[ix]

Base.size(v::MyVector) = size(v.contents)

function Base.setindex!(v::MyVector, value, ix)
    if randn() ≤ 0.5
        @warn "I don't feel like setting the value at the moment"
    else
        v.contents[ix] = value
    end
    value
end

which fulfills every contract we can imagine in type space; a formal interface spec would probably not find fault with this.

But it is still a stupid implementation, and encountering something similar in a package developed for practical purposes, I would consider it a bug.

Whether it is worth the extra complication for a language is a matter of opinion. The revealed preference of Julia devs seems to be that it is not a top priority at the moment, which I agree with.

6 Likes

Quite off topic, but this is only true if your type space does not have purity modelling, which Haskell iirc does have. So the path throwing an error taints the method, resulting in a compiler error due to not conforming to the expected interface, should that require purity.

Julia does model these things internally to some extent (or at least is starting to) with the up-and-coming effects system, but that’s a very orthogonal thing to formally specifying and checking interface compliance.

1 Like

True, the example can be misinterpreted. Consider instead something like

function Base.setindex!(v::MyVector, value, ix)
    if value ≤ 0.5
        v.contents[ix] = value
    end
    value
end

The point is that contracts about types catch some bugs, but are far from being a solution to the majority of bugs.

IMO their importance is overstated — programs that compile in languages that enforce these things still can and do have plenty of bugs. At the same time, this kind of formal interface spec complicates both the language and code written in it, and complexity itself can hide bugs.

But, again, I understand that some people like this, and there have been whole languages designed around the concept. It’s just that Julia, at the moment, isn’t one of them, and personally I don’t see a compelling reason to change this.

Instead, I think that most interfaces should have their own test suite, as suggested above.

15 Likes

Sorry, I should have been more precise here:
The typeclass itself specifies just the required types for each operation/function and is checked by the Haskell compiler … your examples above (except maybe for the exception) would satisfy those.
The laws are additional algebraic properties that the operations (morally) need to satisfy in order for an implementation to be correct, e.g., the Functor typeclass (generalizing map to arbitrary containers) has the two laws:

  1. fmap id = id
  2. fmap (g . h) = (fmap g) . (fmap h)

These also apply at the value level and are not checked by the Haskell compiler, i.e., they are just documented and should also be part of a test suite as suggested above. They do constrain possible implementations though, e.g., the first law would disallow the following implementation

fmap(f, v::AbstractVector) = f.(reverse(v))

even though it is typed correctly.
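Both laws can be written down as ordinary predicates and run in a test suite. A sketch in Julia, where `good_fmap`, `bad_fmap` and the two law functions are illustrative names rather than an existing API:

```julia
good_fmap(f, v::AbstractVector) = f.(v)
bad_fmap(f, v::AbstractVector) = f.(reverse(v))   # the implementation from above

# Law 1: mapping the identity changes nothing.
identity_law(fmap, v) = fmap(identity, v) == v

# Law 2: mapping a composition equals composing the maps.
composition_law(fmap, g, h, v) = fmap(g ∘ h, v) == fmap(g, fmap(h, v))

v = [1, 2, 3]
identity_law(good_fmap, v)    # holds
identity_law(bad_fmap, v)     # fails: the result is reversed
```

Note that `bad_fmap` would slip through on a palindromic input such as `[1, 2, 1]`, which is one reason to test laws on many inputs.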

For your example, a property such as

getindex(setindex!(container, value, idx), idx) == value

should probably hold and be documented/tested.
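A sketch of how such a property could be tested. `PickyVector` is a renamed copy of the conditional `setindex!` example from earlier in the thread, and `roundtrips` is a hypothetical test helper; the check uses indexing syntax, so it does not depend on what `setindex!` returns:

```julia
struct PickyVector{T} <: AbstractVector{T}
    contents::Vector{T}
end

Base.size(v::PickyVector) = size(v.contents)
Base.IndexStyle(::Type{<:PickyVector}) = IndexLinear()
Base.getindex(v::PickyVector, i::Int) = v.contents[i]

function Base.setindex!(v::PickyVector, value, i::Int)
    if value ≤ 0.5            # silently ignores larger values
        v.contents[i] = value
    end
    return v
end

# Hypothetical interface test: a value written at `idx` can be read back.
function roundtrips(container, value, idx)
    container[idx] = value
    return container[idx] == value
end

roundtrips(zeros(3), 0.9, 2)                # a plain Vector passes
roundtrips(PickyVector(zeros(3)), 0.9, 2)   # the picky container fails
```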

6 Likes

100% agree. For example, the OffsetArrays problem is not one of missing formal interface definitions as far as I can tell. The “flaw” is that people write algorithms assuming arrays start counting at 1 and hardcode that number in a variety of ways - many not captured by the interfaces at all - because it is much easier and 99.99% of Julia arrays do start at 1. I don’t know Fortran’s ecosystem especially well, but I would be shocked if you could pick a random chunk of code from the internet (e.g. Burkardt’s webpage or TOMS or wherever people get code) and expect it to work with non-standard indices without reading the source.
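A minimal sketch of the kind of hardcoding meant here (function names are illustrative). Both signatures accept any `AbstractVector`, but only the second asks the array for its own indices:

```julia
# Hardcodes the assumption that indices run from 1 to length(v).
function sum_hardcoded(v::AbstractVector)
    s = zero(eltype(v))
    for i in 1:length(v)
        s += v[i]
    end
    return s
end

# Asks the array for its own indices instead.
function sum_generic(v::AbstractVector)
    s = zero(eltype(v))
    for i in eachindex(v)
        s += v[i]
    end
    return s
end

v = [1.0, 2.0, 3.0]
sum_hardcoded(v) == sum_generic(v)   # identical for an ordinary Vector
```

The two only diverge for arrays whose indices do not start at 1, where the hardcoded version would read the wrong indices or throw a `BoundsError`.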

What more formal interfaces would do is change the way that dispatching works. But it is unclear to me if Julia has the same problems or if something like traits would be better in Julia 2.x or 3.x etc.

C++ had a miserable time in the standardization of concepts (and eventually scaled them back last minute to the point that they help with static dispatching but had none of the fancy features that had previously been discussed). But C++ also had enormous hacks with SFINAE etc., which even scaled-back concepts immediately helped with, and a stark difference between static and dynamic dispatching. Julia is a totally different model.

3 Likes

Lost in this discussion is that OffsetArrays offers a view with no offsets, OffsetArrays.no_offset_view, which can be used to call functions that don’t support offset arrays.

10 Likes

I partially agree/disagree. If by formal specification you mean that this has to be checked at compile time, then I agree that this is not a requirement. On the other hand, when writing generic code it is important to know which properties of a type you can and cannot rely on (that is why I mentioned the laws in Haskell, which are actually not checked by the compiler). In the end, generic code has to work across different concrete types, i.e., based on more abstract properties. Thus, you are right that code is “flawed” if it states AbstractArray in its signature and then assumes that indexing starts at 1, as this property is nowhere mentioned in the interface for abstract arrays. In Haskell I would actually expect that code

  • stating Ix (the typeclass for abstract indexing) in its signature works correctly if passed an array with custom indexing
  • stating Functor in its signature works for any value from a type in that typeclass – no matter if you have a good intuition about how some particular instance is implemented or not.

2 Likes

Bugs are one thing. But it would be nice if the compiler told me which methods I forgot or didn’t know to implement.
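Short of compiler support, a test-time check can get part of the way using `hasmethod` from Base. A sketch with a hypothetical two-function interface (`area`, `perimeter`, `SHAPE_INTERFACE` and `missing_methods` are all made-up names):

```julia
# A hypothetical two-function interface.
function area end
function perimeter end

const SHAPE_INTERFACE = (area, perimeter)

# List the interface functions that have no method accepting a `T`.
missing_methods(::Type{T}) where {T} =
    [f for f in SHAPE_INTERFACE if !hasmethod(f, Tuple{T})]

struct Square
    side::Float64
end

area(s::Square) = s.side^2      # `perimeter` was forgotten

missing_methods(Square)         # reports the forgotten function
```

One caveat: if an abstract supertype defines a fallback method that only throws, `hasmethod` still returns `true`, so this catches missing methods but not placeholder ones.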

2 Likes

For sure. I know where you are coming from.

I think the issue here is a community one (which is why I brought up that it would not have fixed the OffsetArrays thing in particular). Most Julia users do not have the expertise of Haskell, CLOS, or C++ template developers, and I think the places where people are getting stung tend to be more in the ecosystem than in the language itself. For me at least, I think that something like traits would solve immediate problems with dispatching. Beyond that, given how Julia is used and who tends to write the code, I am concerned that making things more fancy with formal generic programming specifications might otherwise drive users away. The code already looks complicated to someone coming from Python or Matlab. Concepts in C++ solved a pressing need. Here I think the benefits of formalizing interfaces are less clear, even if it would be nice to have as an option.
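For readers who haven’t seen traits in Julia, here is a sketch of how they can already be emulated today with the pattern often called Holy traits, which Base itself uses for `IndexStyle`. The trait `IndexBase` and the functions below are hypothetical, chosen to echo the indexing discussion above:

```julia
# Trait values describing how a container is indexed.
abstract type IndexBase end
struct OneBased <: IndexBase end
struct CustomBased <: IndexBase end

# Trait function: a conservative default, with an opt-in for plain Arrays.
indexbase(::Type) = CustomBased()
indexbase(::Type{<:Array}) = OneBased()

# Dispatch on the trait value rather than on the array's supertype.
first_element(v::AbstractVector) = first_element(indexbase(typeof(v)), v)
first_element(::OneBased, v) = v[1]
first_element(::CustomBased, v) = v[firstindex(v)]

first_element([10, 20])                  # takes the OneBased path
first_element(view([10, 20, 30], 2:3))   # takes the CustomBased path
```

The trait is looked up from the type alone, so the extra dispatch is typically resolved at compile time.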

5 Likes