**Simple addition using Transducers via single-threaded `foldl` and multi-threaded `foldxt` calculates two (slightly) different answers.**

**IOW: is the single-threaded `foldl`, the multi-threaded `foldxt`, or neither of them correct?**

**Does someone, maybe @tkf, know what the issue is, or how to address it?**

I know Transducers.jl is beta-ish, but it is really hard to imagine why results differ between *single-threaded `foldl`* and *multi-threaded `foldxt`* for the SAME data set. Any ideas?

Using Transducers, single-threaded `foldl` always returns slightly (e.g. ±1e-10 to 1e-12) different values than multi-threaded `foldxt`.

*Taken from the parallel processing tutorial @ Tutorial: Parallelism · Transducers.jl*
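For context, one well-known effect (not specific to Transducers.jl) that can produce exactly this kind of tiny discrepancy is that floating-point addition is not associative, so any reduction that regroups the `+` operations can round differently. A minimal sketch of the effect:

```julia
# Floating-point addition is not associative: regrouping changes rounding.
a, b, c = 0.1, 0.2, 0.3

left  = (a + b) + c   # grouping a sequential left fold would use
right = a + (b + c)   # a different grouping, as a tree reduction might use

@show left right
@show left == right   # false: the two groupings round differently
```

Both results are "correct" at Float64 precision; they simply round intermediate sums at different points.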

**Issue in a nutshell / MWE**

**Maybe we need more TDD (regression) tests like:**

julia> xs = randn(10_000_000)

10000000-element Array{Float64,1}:

julia> maximum(xs)

4.953824504715016

julia> minimum(xs)

-5.157454219276268

**## @@ At the REPL command line, per julia> prompt**

**## Single-threaded sum and foldl agree to ALL 16 significant figures / 12 decimals here.**

julia> sum(sin(x) for x in xs)

-3847.143374357468

julia> foldl(+, (sin(x) for x in xs))

-3847.143374357468

julia> foldl(+, Map(sin), xs)

-3847.143374357468

**## Multi-threaded sums agree with the above to the 8th decimal, but all give different answers after the 9th decimal, IOW differences larger than 1.0e-9**

julia> foldxt(+, (sin(x) for x in xs))

-3847.143374357**7104**

julia> foldxd(+, (sin(x) for x in xs))

-3847.143374357**7104**

julia> foldr(+, (sin(x) for x in xs))

-3847.143374357**8973**

julia> foldxd(+, (sinpi(x) for x in xs))

-283.660173405**38497**

julia> foldl(+, (sinpi(x) for x in xs))

-283.660173405**2002**

julia> foldxt(+, (sinh(x) for x in xs))

-4193.185340837**956**

julia> foldxd(+, (sinh(x) for x in xs))

-4193.185340837**956**

julia> foldl(+, (sinh(x) for x in xs))

-4193.185340837**231**

**## Possibly related to "Multi-threading changing results", but how exactly to address it?**

**@@ Multi-threading changing results - #18 by tkf**
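To see how a threaded fold can change the answer without any thread mis-computing, note that a parallel fold typically splits the input into chunks, reduces each chunk, and then combines the per-chunk results, which is a different association order than a strict left fold. A purely sequential sketch of that idea (the chunk size here is an arbitrary illustration, not what `foldxt` actually uses):

```julia
# Emulate a chunked (parallel-style) reduction order, sequentially.
xs = randn(1_000_000)
ys = sin.(xs)

seq = foldl(+, ys)   # strict left-to-right fold

# Reduce each chunk, then combine the chunk sums -- like a parallel fold's order.
chunk_sums = [foldl(+, c) for c in Iterators.partition(ys, 100_000)]
chunked = foldl(+, chunk_sums)

@show seq chunked seq - chunked   # typically a tiny nonzero difference
```

No threads are involved here, yet the two totals usually differ at the ~1e-10 level, matching the `foldl` vs `foldxt` discrepancy above.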

**## Multiple runs inside a .jl program (i.e. NOT in a REPL session) with just a few (10) random numbers, e.g. xs = randn(10) ## DEBUG**

##3440-2 DEBUG r_multithread_foldxt - r_onethread_foldl := -4.440892098500626e-16

##3440-2 DEBUG r_multithread_foldxt - r_onethread_foldl := 0.0

##3440-2 DEBUG r_multithread_foldxt - r_onethread_foldl := 4.440892098500626e-16

julia> eps() * 10

2.220446049250313e-15

julia> eps() * 2

4.440892098500626e-16

julia> eps() * 1.0e+7

2.220446049250313e-9

julia> eps() * 2.0e+7

4.440892098500626e-9

**## NOTE/TIP: The eps() calculations above are guesses at the maximum upper bounds of the error, and they agree with the empirical test results. But why don't the errors cancel out in the long run, over many millions of items in the vector/set/matrix? Maybe this points to an accumulative systematic (not random) calculation error, e.g. something like losing accuracy by calculating or accumulating with 32-bit (4-byte) floats instead of 64-bit (8-byte) floats??**
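One way to check the 32-bit-accumulation hypothesis directly is to compare the Float64 result against a high-precision reference: if anything were accumulating in Float32, the error would be around 1e-7 relative, many orders of magnitude larger than the ~1e-16-per-operation differences seen above. A small sketch using `BigFloat` as the reference (the conversion `big.(ys)` is exact for Float64 inputs):

```julia
# Compare a Float64 left fold against a BigFloat reference to bound its error.
xs = randn(10_000)
ys = sin.(xs)

ref = Float64(sum(big.(ys)))   # high-precision sum, rounded once at the end
err = abs(foldl(+, ys) - ref)

@show err   # typically ~1e-13 or smaller: Float64-level rounding only
```

If `err` stays at the Float64 rounding level, that would rule out 32-bit accumulation and leave reordering of rounded additions as the explanation for the `foldl`/`foldxt` gap.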