Tullio is brilliant: not only do you get threads and AVX for free, you also get a concise notation that makes your code much clearer and far less bug-prone. I’m adopting it immediately for whenever I need to do array twiddling!
I’ll test the speed and post a comparison in a few hours. Thanks for this!
Ok, just to see how well it works… here’s the Tullio version:
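using Tullio, LoopVectorization  # Tullio uses LoopVectorization when it is loaded
function matmultull(A, v)
    if size(A, 2) != length(v)
        # $(...) interpolates the call result; a bare $size would interpolate the function itself
        throw(DimensionMismatch("second dimension of A, $(size(A, 2)), does not match length of v, $(length(v))"))
    end
    # := allocates B with length size(A, 1), so non-square A works too; j is summed over
    @tullio B[i] := A[i, j] * v[j]
    return B
end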
ZERO loops for that undergrad to get wrong.
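For comparison, the single @tullio line stands in for roughly the loops below. This is only a plain-Julia sketch of what the index notation means (sum over the repeated index j), not Tullio's actual generated code, and the name matmul_loops is made up for illustration:

```julia
# Hypothetical hand-written equivalent of `@tullio B[i] := A[i, j] * v[j]`.
function matmul_loops(A, v)
    if size(A, 2) != length(v)
        throw(DimensionMismatch("second dimension of A, $(size(A, 2)), does not match length of v, $(length(v))"))
    end
    # Output has one entry per row of A; element type follows the inputs.
    B = zeros(promote_type(eltype(A), eltype(v)), size(A, 1))
    # Julia arrays are column-major, so j is the outer loop and i the inner one.
    for j in axes(A, 2), i in axes(A, 1)
        B[i] += A[i, j] * v[j]
    end
    return B
end

A = [1.0 2.0 3.0; 4.0 5.0 6.0]   # 2×3, deliberately non-square
v = [1.0, 0.0, -1.0]
@assert matmul_loops(A, v) ≈ A * v
```

Every line in that loop nest (the output size, the loop order, the accumulation) is something the index notation takes care of for you.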
How’s the speed? Essentially the same as my hand-written undergrad-ish loop code + @avx (about 0.99 ms minimum for both), and within about 20% of the built-in * (about 0.83 ms minimum).
julia> @benchmark matmul($testA,$testV)
BenchmarkTools.Trial:
memory estimate: 14.80 KiB
allocs estimate: 24
--------------
minimum time: 990.039 μs (0.00% GC)
median time: 1.084 ms (0.00% GC)
mean time: 1.102 ms (0.00% GC)
maximum time: 1.707 ms (0.00% GC)
--------------
samples: 4508
evals/sample: 1
julia> @benchmark matmultull($testA,$testV)
BenchmarkTools.Trial:
memory estimate: 18.55 KiB
allocs estimate: 99
--------------
minimum time: 993.795 μs (0.00% GC)
median time: 1.054 ms (0.00% GC)
mean time: 1.089 ms (0.00% GC)
maximum time: 2.290 ms (0.00% GC)
--------------
samples: 4562
evals/sample: 1
julia> @benchmark $testA * $testV
BenchmarkTools.Trial:
memory estimate: 11.88 KiB
allocs estimate: 1
--------------
minimum time: 831.787 μs (0.00% GC)
median time: 924.277 μs (0.00% GC)
mean time: 1.027 ms (0.00% GC)
maximum time: 10.739 ms (0.00% GC)
--------------
samples: 4811
evals/sample: 1
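If you want to try something similar yourself, a setup along these lines should work. The definitions of testA and testV aren't shown here, so the sizes below are assumptions picked only for illustration (they are not the ones behind the numbers above), and tullio_matvec is a hypothetical one-line stand-in for matmultull:

```julia
using BenchmarkTools, LinearAlgebra
using Tullio, LoopVectorization   # load LoopVectorization so @tullio can use it

testA = rand(1000, 1000)   # assumed size, not taken from the post
testV = rand(1000)

# Hypothetical one-liner; := makes @tullio allocate and return the result.
tullio_matvec(A, v) = @tullio B[i] := A[i, j] * v[j]

# Check correctness before timing anything.
@assert tullio_matvec(testA, testV) ≈ testA * testV

# Thread counts affect both sides of the comparison, so print them.
@show Threads.nthreads() BLAS.get_num_threads()

# The $ interpolation keeps global-variable lookup out of the measurement.
display(@benchmark tullio_matvec($testA, $testV))
display(@benchmark $testA * $testV)
```

Timings will depend on the array sizes, on how many threads Julia was started with, and on the BLAS library in use.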
Thanks again!