I hope this is the appropriate place for this; if not, please feel free to move it.
I am wondering whether Julia would accept a PR for additional @parallel macros optimized for lower (and upper) triangular matrices.
The benefit: loops over a triangular domain finish in less wall time (in the limit, in about half the time).
The cost: another function in Base and a couple of new macros for something that someone who knows what they are doing could replicate with user code and @spawn.
My working macro name is @parallelLT, but I am not attached to it and would welcome a better (shorter) name, especially if there are existing names for fairly allocating triangular domains that I am unaware of.
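As an illustration of what "fairly allocating" a triangular domain could mean, here is a minimal sketch. It is not the code in the fork, and `balanced_lt_ranges` and `lt_work` are hypothetical names introduced only for this example. It assumes a lower-triangular loop in which row i costs i units of work, and compares an even split of the rows with a split that equalizes the work per chunk.

```julia
# Hypothetical helper (an assumption, not the fork's implementation):
# split rows 1:n into p contiguous ranges so that each range covers
# roughly the same number of lower-triangle entries.
function balanced_lt_ranges(n::Integer, p::Integer)
    # Cumulative work through row r is r(r+1)/2, roughly r^2/2,
    # so the k-th cut falls near n*sqrt(k/p).
    cuts = [round(Int, n * sqrt(k / p)) for k in 0:p]
    return [(cuts[k] + 1):cuts[k + 1] for k in 1:p]
end

# Work for a range of rows when row i contributes i inner iterations.
lt_work(rows) = sum(rows)

n, p = 100_000, 4

# Even split: every worker gets the same number of rows.
even = [((k - 1) * n ÷ p + 1):(k * n ÷ p) for k in 1:p]
balanced = balanced_lt_ranges(n, p)

total = n * (n + 1) ÷ 2
println("even split, largest share of work:     ", maximum(lt_work.(even)) / total)
println("balanced split, largest share of work: ", maximum(lt_work.(balanced)) / total)
```

With an even split over p workers, the busiest worker holds about (2p - 1)/p^2 of the total work (roughly 7/16 for p = 4), while the ideal share is 1/p. The ratio (2p - 1)/p approaches 2 as p grows, which is where the "twice" limit comes from. For a loop shaped like `for j in i:n`, where the work shrinks as i grows, the same cuts would apply mirrored from the other end.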
You can test drive the @parallelLT macro by building my fork at:
https://github.com/TomConlin/julia
A crude example, timed the wrong way (in global scope):
julia> @time (@sync @parallel for i in 1:99999 for j in i:100000 z=i^j; end end)
138.581066 seconds (91.34 k allocations: 4.648 MiB)
4-element Array{Future,1}:
Future(2, 1, 129, Nullable{Any}())
Future(3, 1, 130, Nullable{Any}())
Future(4, 1, 131, Nullable{Any}())
Future(5, 1, 132, Nullable{Any}())
julia> @time (@sync @parallelLT for i in 1:99999 for j in i:100000 z=i^j; end end)
89.096365 seconds (171.68 k allocations: 8.759 MiB, 0.01% gc time)
4-element Array{Future,1}:
Future(2, 1, 137, Nullable{Any}())
Future(3, 1, 138, Nullable{Any}())
Future(4, 1, 139, Nullable{Any}())
Future(5, 1, 140, Nullable{Any}())