Alarming incorrect computation in Flux / Juno?

Have a look at this bug: https://github.com/FluxML/Flux.jl/issues/982

Am I correct in thinking that this is a really serious error?
Conv gives non-repeatable, seemingly random results.

I was able to reproduce the bug running in Atom/Juno on Mac with Julia 1.3.1 + Flux 0.9.

I was thinking it was time to give up on Julia and run!

However, when I ran the same examples in the terminal REPL, the bug did not happen. This is using the same Julia that Juno uses, 1.3.1 + Flux 0.9. I also tried Julia 1.3.0 + Flux 0.10 in a terminal on Linux, with no issue there.

So it seems like this is something to do with running in Juno???

An even smaller way to show the problem:

    batchsize = 2
    ii = randn(Float32, 5,4,3,batchsize)
    ll = Conv((3,3), 3=>3)
    maximum(ll(ii)-ll(ii))

Repeat that last line about 10 times. Usually the result is zero (correct), but sometimes it is a large number. Increasing batchsize makes it happen sooner.
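For anyone who would rather not re-run the line by hand, here is a rough sketch of the same check in a loop. The helper name `max_discrepancies` is made up for this post, and it takes the absolute difference (the line above does not), so negative deviations are caught as well. It assumes Flux is installed.

```julia
using Flux  # assumes Flux is installed (this thread used Flux 0.9 and 0.10)

# Hypothetical helper: apply `f` to `x` twice per trial and record the largest
# absolute difference between the two outputs. A deterministic `f` gives all zeros.
function max_discrepancies(f, x; trials = 10)
    return [maximum(abs.(f(x) .- f(x))) for _ in 1:trials]
end

batchsize = 2
ii = randn(Float32, 5, 4, 3, batchsize)  # same input shape as in the example above
ll = Conv((3, 3), 3 => 3)

diffs = max_discrepancies(ll, ii)
println("nonzero discrepancies: ", count(!iszero, diffs), " of ", length(diffs))
```

Any nonzero count means the layer returned different outputs for the same input; raising `trials` or `batchsize` should make that show up sooner if the problem is present.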

Since there is already an open issue, it would be best to discuss the bug there. You could check the package versions and the Julia version you have when running in Juno (pkg> st --manifest Flux) and the REPL, to see if there is any difference.
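A minimal sketch of that comparison: run the snippet below once in Juno and once in the terminal REPL, then compare the output. The helper name `environment_summary` is invented for illustration. Besides the versions, it also prints the thread count, on the assumption that the two environments may start Julia with different thread settings (the thread count turns out to matter later in this thread).

```julia
using Pkg

# Hypothetical helper: collect settings that could differ between Juno and a terminal REPL.
function environment_summary()
    return (
        julia_version = VERSION,
        threads = Threads.nthreads(),                              # threads Julia was started with
        num_threads_env = get(ENV, "JULIA_NUM_THREADS", "unset"),  # environment variable, if any
    )
end

println(environment_summary())
Pkg.status()  # prints the packages in the active environment and their versions
```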

That is of course up to you; but bugs happen in all complex software.

I do understand jaynick! It’s a bit frustrating when Flux switches to Zygote and a number of bugs come up. Before, Flux (0.8.3) did what it was supposed to do (that’s my impression), and now, again from my point of view, things have gone quiet on fixes and further development.

But it is clear, I will not give up Julia! :grinning:

I believe this is not related to Zygote at all. The example does not use AD, and it does not happen outside of Juno in my test.

Yes of course. I let myself go a little bit because Flux “doesn’t run smoothly anymore”. :roll_eyes:

I posted about this at
https://discourse.julialang.org/t/seriously-incorrect-computation-in-juno-repl/33050
because it seems (to me) to be a problem in Juno, maybe.

Further discussion at the bottom of https://github.com/FluxML/Flux.jl/issues/982:
it is caused by multithreading, not Juno.