Did Julia community do something to improve its correctness?

It feels like the discussion is running in circles. One of the things that bothers me a bit is that discussing away bugs seems somewhat easier in Julia.
First, its interfaces are usually not formally defined or enforced, so one can always argue about whether something is a bug or not. E.g., what would you expect when calling cycle on a stateful iterator: a) it caches all values and reuses those (which could lead to unexpected memory leaks; just search for lazy IO in Haskell), b) it throws an error (making it well defined, but essentially useless), or c) it does something which you might or might not expect (putting it in the realm of undefined behaviour, which is morally defensible as no correct behaviour seems possible anyway)?
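For illustration, here is a minimal sketch of that ambiguity (which of a), b), or c) you actually observe may depend on the Julia version; the concrete values are made up for the example):

it = Iterators.Stateful([1, 2, 3])    # a stateful iterator
c = Iterators.cycle(it)               # what should cycling a stateful iterator mean?
collect(Iterators.take(c, 7))         # cached 1, 2, 3 repeated? an error? something else?
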
Secondly, the assumptions behind generic interfaces are also not well specified, and some implementations make slightly different choices than others. I would certainly agree that the implementation

reverse(z::Zip) = Zip(Base.map(reverse, z.is)) # n.b. we assume all iterators are the same length

is wrong, as its assumption is squarely at odds with the docstring of zip:

  zip(iters...)

  Run multiple iterators at the same time, until any of them is exhausted.
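
A small sketch of where that assumption bites (the second result is whatever the implementation above produces, which need not match the reverse of the first):

z = zip([1, 2, 3], [4, 5])
collect(z)                      # [(1, 4), (2, 5)]; zip stops at the shorter iterator
collect(Iterators.reverse(z))   # with the definition above, the pairs can come out
                                # misaligned rather than equal to reverse(collect(z))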

Generic programming enables very composable code, but it also requires higher levels of abstraction to work flawlessly, and at points some Julia code appears a bit careless about what can and cannot be assumed. While I don’t expect Haskell-like abstractions and laws, some documentation of the assumed invariants in generic interfaces could be helpful at times.

7 Likes

Something like Keno/InterfaceSpecs.jl on GitHub (a playground for formal specifications of interfaces in Julia) would be a great start.

It wouldn’t surprise me to find that Julia lets people hit edge cases a lot faster than other languages do.

I’m writing a mathematical modeling book; in one example I fit a posterior distribution over Chebyshev polynomials using ApproxFun.jl and Turing.jl. Because of the science, we expect these Chebyshev polynomials to be non-decreasing, so I use a trick to make this work.

Thanks to the magic of ApproxFun I can just make a function f, then calculate g = Integral()*(f^2) to get an integral of f^2 as an ApproxFun. Could I do this in R? Not a chance. Could I use R and call out to Stan? Well, I asked the Stan people to add Chebyshev functions in 2013 or something, and they never did and just closed the bug after a while. Could I do this in Python? I don’t know but if I did it’d probably not be in Python per se, it’d be in numpy or some big library of C/C++ functions callable from Python.
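
For concreteness, a minimal sketch of that pattern (assuming ApproxFun.jl is installed; the particular function and interval are made up for illustration):

using ApproxFun

f = Fun(x -> x^2 + 1, 0..1)   # some function expanded in a Chebyshev basis on [0, 1]
g = Integral() * (f^2)        # an antiderivative of f^2, again as an ApproxFun Fun
g(0.5) - g(0.0)               # the definite integral of f^2 from 0 to 0.5 (any constant of integration cancels)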

It’s magically perfect in Julia.

So, for me, Julia enables me to do things that make sense and are sophisticated, things that are virtually impossible in other languages in the same niche. But when the realm of things I might try is combinatorially bigger, it certainly makes for a bigger potential bug space as well. I should probably pause before trying to take forward diffs of matrix algebra on Unitful units, for example.

That being said, while I don’t program in it all day every day, I have never encountered a serious correctness bug that secretly gave me wrong answers that I’m aware of. I’ve definitely found some bugs that crash weirdly or similar, but never just returning a secretly wrong answer. I think a couple times I was able to do something like make a Loess.jl fit return NaN.

When I used to program in C/C++ decades ago, you’d constantly be running your code and debugging use-after-free or NULL pointers or type casts or memory leaks or whatever. In the last few decades of doing data analysis and such in R, I’d just never do anything actually requiring computation in R itself; everything is a call out to some C/C++ code.

I suppose perhaps if there were an actual competitor to Julia and it were substantially less likely to give me wrong answers, I’d be upset about how Julia secretly returns wrong results. That isn’t the scenario I find myself in though.

I still think a repository that stress-tests large swaths of the Julia ecosystem would be a good project though. I’d like to be able to say “in these 800 heavy tests of Julia doing statistical calculations and simulations, exactly 3 bugs have been found; 2 of them cause a crash, while the third causes a minor bias in some Monte Carlo output that can be fixed by workaround X”

or stuff like that.

26 Likes

There is a difference between the incorrectness of the language and the incorrectness of the compiler. A lot of people here conflate those two.

Edit: Btw, what @Sukera had in mind is yet another category, incorrectness of the program. I think that Julia does not have an incorrectness problem as a language. The compiler has bugs, as every other compiler does. The general packages (and the standard libraries, in some cases) may have program correctness bugs. But show me a C++ library without bugs…

4 Likes

I don’t know. In Julia the compiler and the language are sort of integrated. In Fortran I had the opposite experience. You can even get Fortran committee people telling you that what you wrote is not a Fortran program, even if the syntax has been supported by a major compiler for decades.

4 Likes

Are there bugs in the Base language where code silently returns results that are definitely and objectively incorrect? Yes, sure, I even have one issue filed quite some time ago – skipmissing on Dict: inconsistent results · Issue #48379 · JuliaLang/julia · GitHub (adding to the small heap of yet-unfixed issues in this thread (: ).
But this is the only one in Base I have encountered over a few years of very active Julia usage; such bugs definitely don’t affect my general perception of Julia.

Popular libraries can also have silent bugs, of course; they aren’t common but can happen, and unfortunately they can sometimes take an embarrassingly long time to fix (e.g. https://github.com/JuliaStats/StatsBase.jl/issues/518). Again, this is the only silent bug in Julia libraries I have encountered (looking at the history of GitHub issues).

Of course, making software bug-free would be great, but that doesn’t seem generally possible for now. I wonder what concrete suggestions could help reduce the number of bugs in Julia, to gain another advantage over other languages and libraries.

2 Likes

I think we again need to distinguish between the ecosystem and the language. The language has some correctness bugs, but I don’t think they indicate a fundamental problem with Julia. Julia Base is probably a bit buggier than Python or C++; I’ve found a few bugs in it. But I don’t think that’s all that unusual for a language with a community of roughly this size, and things do seem to be getting better over time.

What I’m much more concerned about is the Julia ecosystem, where I regularly see correctness bugs in common cases, not just unusual compositions of edge cases. As another example, today we discovered that the cumulant function in StatsBase is often just flat-out wrong for higher-order moments, and cumulants aren’t exactly a niche feature.

3 Likes

I see you happily discussing onwards in cycles; how about taking a step back and figuring out what you want from this discussion?

2 Likes

It seems (from your source) that Python (SciPy) has exactly the same “bug”, but “hides” it by returning only 4 cumulants? So what about the “correctness” of Python (and its ecosystem)?

Is this really a “language/ecosystem” “correctness” issue that needs such a long discussion?

Note: I am not implying SciPy is incorrect; I do not know the rationale for limiting it to 4 (it would be interesting to see the related discussion, if it exists). What I mean is that it does not reflect well on the language to go hunting for examples showing that every language has the same troubles.

3 Likes

Quoting just because it triggered my thought. This is not against the quoted person.

But discussions like these do affect the perception of Julia among a wider audience, and not in a good way.

The OP started with a vague claim implying everything and nothing, even putting responsibility on the community. This is a pattern that has come up now and then over the past few years. Sometimes the OP happily steers the discussion into the negative, or, as in this one, the community does it on its own. Nothing from the OP since the discussion started!

It’s even unclear what this is all about from an outside view. Of course every single person here has a clear view of their own issues, but the overall conciseness is quite low.

These discussions tend to be unproductive. I can’t see how this brings any benefit or anything we can work on. In the end it just stands as yet another apparent proof that Julia is not mature.

We should be aware of this possible outcome of discussions like that.

12 Likes

That’s a completeness issue, not a correctness issue, which is both different and much milder. Saying “Sorry, that feature’s not implemented” is a whole different kind of problem from giving the wrong answer. I really don’t think we want to go down the route of calling every missing feature a “correctness bug,” given how many features haven’t been implemented yet in Julia (compared to Python/R).

7 Likes

personally, what I would prefer to influence is community priorities, in that

  1. when there is a tension between fixing an edge case / bug and performance, generally the bugfix is preferred
  2. when there is a tension between fixing an edge case / bug and semver, generally the bugfix is preferred
  3. when there is a tension between fixing an edge case / bug and genericness of implementation, generally the bugfix is preferred

I have seen it happen quite a few times that a PR pops up in the tracker that fixes a bug (or something that is at least arguably a bug) and either stalls or is outright rejected because it leads to a 10% performance regression in such and such a use case, or breaks semver in a pedantically minor way, or would require runtime checks inside the implementation rather than relying only on the method signature.

15 Likes

Yes. On the other hand, it is not good to simply call a library perfect because it does nothing and therefore has no bugs. The point was that SciPy also had the same issue, so it simply sidestepped it by limiting its own features.

I didn’t call the library perfect; I just said that this particular function isn’t returning incorrect answers (unlike the Julia equivalent). There’s a clear ordering here: wrong answer < error < correct answer.

2 Likes

IIRC Steve McConnell calls this “cargo cult software engineering”, meaning things like:

  1. Thoroughly reviewing or nitpicking new PRs rather than addressing bigger issues in existing code (forgetting the purpose of code review is to fix bugs)
  2. Obsession with keeping packages “lightweight” by refusing to take on even widely-used, quick-to-compile dependencies (forgetting the purpose of lightweight packages is to reduce compile time and fragility)
  3. Religious adherence to written contracts of compatibility/API (forgetting the purpose of avoiding breaking changes is to reduce maintenance burden)

Even much better-established languages are more willing to make big changes than Julia is, despite having much larger codebases to maintain. For example:

  • Python breaks some code every minor update; they just give 3 years’ warning, more than enough time for even poorly maintained packages to fix issues.
  • Dart literally switched from dynamic to static typing and abolished null pointers (two massive changes) in the 1->2 shift. A few months ago, they removed f(name: arg) for named keyword arguments.

Meanwhile, I think this is the most common reply to almost any request to fix a problem. And changes that don’t break the Official Contract provided by the docs (most notably, offset indices) are treated as non-breaking, without any mitigating measures (like scripts or tools to help people convert, @inbounds deprecation, or required ordinal indexing).

1 Like

In my experience, the PSA is often a response to mildly breaking feature requests, not to correctness bugs that may be fixed in a non-breaking manner.

11 Likes

Yes, that’s my point (and @adienes’ as well). In many cases, fixing a correctness bug is arguably mildly breaking for a handful of people who rely on incorrect or dangerous behavior.

(In other cases, any changes would be much more breaking and probably require a 2.0 release, e.g. a satisfactory solution to out-of-bounds array indexing, but the devs have ruled out a 2.0 release for the foreseeable future.)

1 Like

Isn’t there some rule of internet discussions that says referencing XKCD means the end of the thread? I think maybe Munroe’s law?

11 Likes

I’d say this particular horse was beaten to death dozens of posts ago. Let’s close it.

8 Likes