Did the Julia community do something to improve its correctness?

(Edited) It was fixed after beta1 and is still tagged for backporting. It has been merged into “Backports for julia v1.10.0-beta2” (Pull Request #50708 in JuliaLang/julia, by KristofferC), so it should be in beta2.

7 Likes

Recently I’ve got a project that’s big enough for a deployment and also fairly technical. I was torn between Rust (Rust isn’t built for technical computing, but the ecosystem is catching up fast and the other advantages Rust provides are almost too good) and Julia (I have always wanted to try Julia for some real work in scientific computing, whereas I’ve mainly been using R, and MATLAB several years back). But my attempt to do some work in Julia has come to a halt, mainly because of Yuri’s blog post. I’ve followed the Reddit discussion of Yuri’s post, and an earlier Julia community discussion of the same post.

By now a good number of opinions have been exchanged between both sides, across all 3 discussions, on the correctness issue raised by Yuri. But I would really appreciate a formal address from the Julia Dev Team on the correctness issue in Julia, regardless of whether the team agrees or disagrees (or, if there is one already, please kindly point me to it).

Some argued that we should let Julia mature: these types of bugs are common, many languages suffered from them before, and time will cure all. Opponents disagreed, citing core R and core MATLAB as examples. Core R and MATLAB really have been robust for a very long time (there is abundant evidence in all 3 discussions and elsewhere). CRAN (R’s official package repository) is so pedantic (almost notoriously so) that the majority of graduate students would not even consider publishing packages there.

Some said that the problem lies in third-party packages from researchers and graduate students who were a bit careless when it comes to programming (as opposed to solving math problems using a computer language). Yuri, however, drew attention to Base Julia near the very top of the post, and those issues were raised around Julia 1.7, back in 2020 or 2021 (e.g., the prod function from Base, raised in Jan 2021; Yuri’s post was written before May 2022 at the latest). It’s the number of correctness issues Yuri found, and the likelihood of similar issues still persisting in core Julia, that worried Yuri in the first place. I believe the OP was worried too, hence the question about whether new measures for correctness are being employed. This is the very thing I’d hope the team would address as well, as a potential Julia user.

Others mentioned testing tools available to users, such as Aqua.jl or JET.jl. I am not sure that I, as a user, ought to write unit tests for Base Julia functions as basic as prod, or to set up tests only to find out, fairly frequently, that it was Base Julia I’d have to fix.
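
For a sense of what such a user-side check would involve, here is a minimal sketch using the Test standard library. The helper reference_prod and the chosen inputs are illustrative assumptions, not taken from any of the issues discussed:

```julia
using Test

# Hypothetical reference implementation: a plain loop to compare Base.prod against.
function reference_prod(xs)
    acc = one(eltype(xs))
    for x in xs
        acc *= x
    end
    return acc
end

@testset "prod sanity checks" begin
    @test prod([1, 2, 3, 4]) == 24
    @test prod(Int[]) == 1                      # empty product is the multiplicative identity
    @test prod([2.0, 0.5]) == 1.0
    @test prod(1:10) == reference_prod(1:10)    # compare against the hand-written loop
end
```

Checks like these only cover the inputs someone thought to write down, which is part of why they feel like an odd burden for a function as basic as prod.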

While most critical posts about Julia that turn up on Google were merely complaints about user experience and how hard it is to depart from habits built in other languages, Yuri’s post made valid points about one of the main objectives of a technical computing language, IMHO: to be correct in its computed results.

I have liked Julia since it came out (I’ve been playing around with it in bite-size pieces since 0.8, but I’ve not really used Julia for anything). I appreciate and support Julia’s ambitions, be it solving the “two-language problem” or offering syntax similar to MATLAB with performance almost on par with C/C++; and yeah, multiple dispatch as well. Hand on heart, I’d really hope Julia thrives and becomes the future of technical computing.

5 Likes

While it’s hard to quantify for sure, I figure someone could write a Yuri-style blog post by picking the oldest things out of: https://github.com/rust-lang/rust/issues?q=is%3Aopen+is%3Aissue+label%3AI-unsound

The fact that there are bug reports is a good thing, because it means people are using it. But if you look at Rust’s or Julia’s list of “correctness” bugs, some are very much edge cases, and some are arguably correct, just surprising. I think Julia itself is not more buggy than any other major programming language project.

9 Likes

Could you define what you are looking for exactly?

  1. Who is the “Julia Dev Team”?
  2. What constitutes a “formal address”?

4 Likes

There’s not been a formal address from the project (it’s not really clear what such an address would look like), but many prominent contributors have indeed weighed in on the many discussions around these — in addition to the Reddit and prior discourse threads, see also the Hacker News discussion from when the post was first published.

I see the hardest issues in Yuri’s post being symptomatic of — as Stefan writes in the HN thread — “the flip side of Julia’s composability is that composing generic code with types that implement abstractions can easily expose bugs when the caller and the callee don’t agree on exactly what the abstraction is.” And that’s where things get interesting. There’s some really cool work to formalize what those abstractions are (e.g., [ANN] RequiredInterfaces.jl). But as I myself previously said,

Yuri really pushed on the ecosystem — and indeed most of the issues he listed are about the composability of packages. As someone who sporadically interfaced with him on-and-off in issues like these, I earnestly thought he enjoyed living on the cutting edge and slaying these dragons. That’s really the biggest loss in my view: that he burned out doing so and we lost his voice and work from the community.

So: Are there still dragons out there? Most definitely — especially if you find yourself bushwhacking through the weeds of packages nobody has tried together before. But it’s continuing to be easier and easier to stay on golden brick roads as Julia and its package ecosystem continue to develop and thrive.

33 Likes

  1. I am not sure who’d be the Julia Dev Team in this case; off the top of my head, it could probably be someone, or some group, who has focused on development in the relevant areas and who can take ownership? Or possibly the authors/founders of Julia? I understand contributors to Julia core have participated in the discussions of Yuri’s post.

  2. But isn’t this a “community” discussion? A community discussion, for me, surely isn’t formal. By “formal” I guess I mean that I can be certain whether or not there is future effort or internal discussion by the contributors/subgroups who own the relevant areas (I may well have missed them, and I’d appreciate it if you could point me to them). As mentioned by @mschauer, this thread has come to an awkward end. What do we know for sure, out of this thread, that can answer the OP’s question? Perhaps the contributors who’ve participated and cared before could come up with some agreed response to the community?

Yep, may well be. Yuri brought our attention to correctness, and I agree that a good number of the issues were edge cases; but there were also cases that were not so edgy, and that’s what worried him, I guess. I admit that I probably fret too much after reading Yuri’s post, given how much I’d want to use Julia in the new project; it is precisely because I wanted to try Julia that I stopped here, since correctness has very high stakes in the project. Yuri’s experience, together with other public accounts, has really done a good job of validating the point.

1 Like

I recommend you give Julia an earnest try for yourself :slight_smile:

5 Likes

Good points were made in this reply. Not being a release blocker doesn’t mean we couldn’t add some documentation to help users avoid the issue, especially since the issue has been open since 1.5.3. What is stopping us from adding some minimal documentation for an issue that has existed for 2.5 years? It’s swamped in the Jiras, I guess, and never had a high enough priority.

This is actually one of the issues raised by Yuri, and it is one that’s not an edge case. It was planned for 1.10 but removed because the team has higher priorities (which is fair); but I probably wouldn’t call this issue “random stuff”.

Or it could be random, given that what’s ahead for Julia is really some game-changing stuff and the team is really committed to getting it out on time. I know how much pressure there is. I would just ask that correctness issues be given some priority (there are 25 issues labeled as correctness issues, and 2 of them were scheduled for 1.10 and removed thereafter).

This blog post is a good recap:

The most clear cut line that can be drawn is that there is a set of people who have commit access to the JuliaLang GitHub organization: there are currently 67 committers (36 active and 31 dormant). This set of people doesn’t really define the project, however, since there are many people who are prolific contributors to the Julia ecosystem but who do not have “commit bit.” The communal nature of open source makes it difficult to precisely define where the Julia project ends and the greater community begins, which is exactly how we like it.

4 Likes

@stucash, I made my peace with these kinds of issues. One can argue that many more issues are still more urgent than this - so the limited (human) resources are invested as seen fit by those who put in the time.

My experience is this:

  • As somebody who comes from F#, I find myself needing to write 80-90% more tests. The rule of thumb is to test everything. However, I have never encountered any correctness issues in real life.
  • Julia is transparent enough so you can inspect almost everything you use (if that code is too weird, it is almost guaranteed that somebody here will happily answer any questions).
  • Julia community is a superpower: you have actual experts who will give you their time for free (and sometimes more than one of them will compete to ensure you are getting either the most accurate answer or the most accessible answer).

Yes - we can be (still) justified to complain about all kinds of stuff. But I feel that all those cons are redeemed by the excellent mix of power and helpful community.

Enjoy!

19 Likes

Is that really a correctness bug though? From my point of view, a correctness bug would be: Use a Julia function correctly, but get the wrong answer. Issue #39385 uses sum! incorrectly and gets an incorrect answer.

help?> sum!
search: sum! cumsum! sum summary cumsum isnumeric VersionNumber issubnormal get_zero_subnormals set_zero_subnormals

  sum!(r, A)


  Sum elements of A over the singleton dimensions of r, and write results to r.

To me, it’s clear that I’m supposed to pass two different objects, not the same one twice. So claiming that Julia “produces incorrect results” for sum! seems odd.
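
For reference, a short sketch of the usage that docstring seems to describe, with a destination array that is separate from the source (the variable names are only illustrative):

```julia
A = [1 2; 3 4]

# Destination with a singleton first dimension: sums down each column.
col_sums = zeros(Int, 1, 2)
sum!(col_sums, A)      # col_sums is now [4 6]

# Destination with a singleton second dimension: sums across each row.
row_sums = zeros(Int, 2, 1)
sum!(row_sums, A)      # row_sums now holds 3 and 7
```

Here the shape of the destination decides which dimensions get reduced, which is what “the singleton dimensions of r” refers to.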

But the larger point is, if you care about getting mathematically correct answers, then Julia is perfectly fine, and not really any different from any other language.

11 Likes

Issue #39385 uses sum! incorrectly

I don’t think this is a fair statement given that there is no warning in the docstring to the contrary

To me, it’s clear that I’m supposed to pass two different objects

that is not clear to me. after all, I think the “naive” implementation of sum! would probably work correctly here

3 Likes

When I look at the docstring, “sum elements of A and write to r” implies to me that A and r are different things. Otherwise, for writing to the same location, I would expect “sum elements of A and write to A”. But the fact that this is ambiguous means that updating the docs is a good idea.

I am more questioning the idea of what “correctness bug” and “correctness issue” truly mean because that itself is ambiguous to me.

3 Likes

I have been trying hard to add warnings to the docs, but the PR has stalled. Help appreciated

12 Likes

Yes, of course:

julia> function sum!(a,b)
           for i in eachindex(a,b)
               a[i] = a[i] + b[i]
           end
           return a
       end
sum! (generic function with 1 method)

julia> x = [1,2,3];

julia> sum!(x,x)
3-element Vector{Int64}:
 2
 4
 6

At the same time, I also think that the importance of such an issue is overrated. Does anyone really use a mutating function like that without checking the result?

Important libraries like numpy and others are full of gotchas, and while I think that bug should be fixed properly (not only through documentation), I find it hard to believe that someone writing some complex math and working on performance tuning (which is how one arrives at trying to use this) would have a program with a hidden bug like that.

By the way, can someone explain this?

In [1]: import numpy as np

In [2]: a = np.array([1,2,3])

In [3]: sum(a,a)
Out[3]: array([7, 8, 9])

It seems that sum is not intended to sum two arrays (one argument is the array being summed; what the other is, I didn’t understand from the documentation). Fair enough, but I think it makes my point: I do not trust my intuition about what a function should do… not here, not with any other language.

8 Likes

2 posts were split to a new topic: Strange behaviors in python and numpy

In the case of sum! the problem is that Julia initializes the result array before doing anything.

julia> x = [1,2,3];

julia> sum!(x,x; init=false)
3-element Vector{Int64}:
 2
 4
 6
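
One way to picture why initializing first breaks the aliased call is the following sketch. The function sketch_sum! is a hypothetical stand-in that assumes both arguments have the same shape; it is not how Base implements sum!:

```julia
# Hypothetical stand-in for an "initialize, then accumulate" reduction.
# Simplified: assumes r and A have the same shape; this is not Base's code.
function sketch_sum!(r, A; init=true)
    init && fill!(r, zero(eltype(r)))   # wipes A too if r and A are the same array
    for i in eachindex(r, A)
        r[i] += A[i]
    end
    return r
end

x = [1, 2, 3]
sketch_sum!(x, x)                  # [0, 0, 0]: the input was zeroed before being read

y = [1, 2, 3]
sketch_sum!(y, y; init=false)      # [2, 4, 6]: each element is added to itself

z = [1, 2, 3]
sketch_sum!(zeros(Int, 3), z)      # [1, 2, 3]: distinct arrays behave as expected
```

In this sketch neither aliased call returns the input unchanged, and which wrong answer appears depends only on whether the destination was reset first.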

The aliasing warning is noted in the documentation in development: sum!.

help?> sum!
search: sum! cumsum! sum summary cumsum isnumeric VersionNumber issubnormal

  sum!(r, A)

  Sum elements of A over the singleton dimensions of r, and write results to
  r. Note that since the sum! function is intended to operate without making
  any allocations, the target should not alias with the source.
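
If aliasing cannot be ruled out at the call site, one possible guard is sketched below. guarded_sum! is a hypothetical wrapper, and Base.unalias is an unexported internal helper, so this assumes it keeps its current behavior; it also allocates a copy in the aliased case, which gives up the no-allocation goal:

```julia
# Hypothetical wrapper, not part of Base. It relies on the unexported helper
# Base.unalias, which returns a copy of A when A may share memory with r.
guarded_sum!(r, A) = sum!(r, Base.unalias(r, A))

x = [1, 2, 3]
guarded_sum!(x, x)     # x stays [1, 2, 3]: the source was copied before r was reset
```

When the two arguments do not share memory, the wrapper passes A through untouched, so the common case stays allocation-free.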

7 Likes

FWIW, Numpy handles the aliasing without any apparent issue:

>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> np.sum(a.reshape((-1, 1)), axis=1, out=a)
array([1, 2, 3])
>>> a
array([1, 2, 3])
>>> np.prod(a.reshape((-1, 1)), axis=1, out=a)
array([1, 2, 3])
>>> a
array([1, 2, 3])

But notably, neither the init keyword nor what it does in sum! is to be found anywhere in that same docstring. I don’t think I’m alone in only learning today of its existence in the mutating sum!/prod!.

5 Likes

Last I looked into it, using numpy’s out argument doesn’t actually avoid allocations in the manner we’re trying to support here. Indeed, the goal of avoiding allocations doesn’t really make much sense when you’re using a predominantly “vectorized” API.

In any case, this isn’t a competition, nor is it a zero-sum game. Some comparisons can indeed be helpful, but let’s not overdo it and pile on here.

9 Likes