Use the form of maximum that takes a function as its first argument, and/or turn the broadcast into a generator expression, because the broadcast currently has to allocate an array.
You're iterating over points and v, but to determine a maximum the results don't all need to be stored in a vector at the same time; it is enough to compute them one by one. So either you write a function that is called on each element of the collection you're interested in, or you use a generator expression, which also only ever computes one element at a time. Since you have two collections, I use zip here; you'd do the same for the functional version, à la maximum(f, zip(points, v)):
julia> @btime maximum(abs($h * dot(_v, p) / (dot(p, p) + $eta2)) for (p, _v) in zip($points, $v))
2.237 ms (0 allocations: 0 bytes)
0.002499999788221453
The functional version:
julia> @btime maximum(zip($points, $v)) do (p, _v)
           abs($h * dot(_v, p) / (dot(p, p) + $eta2))
       end
2.173 ms (0 allocations: 0 bytes)
0.002499999788221453
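For anyone who wants to try the three forms side by side, here is a minimal sketch. The data is made up: I'm assuming points and v are vectors of 2D tuples, and the names term, max_broadcast, max_generator and max_functional are mine, not from the original code. The broadcast version materializes the whole vector of terms before reducing it, while the other two feed maximum one term at a time.

```julia
using LinearAlgebra  # provides dot, which also accepts tuples

# Hypothetical stand-in data (not the original poster's arrays).
N = 1_000
points = [(cos(0.1i), sin(0.1i)) for i in 1:N]
v = [(0.01 * sin(0.3i), 0.01 * cos(0.2i)) for i in 1:N]
h, eta2 = 0.02, 1e-4

# The per-element quantity whose maximum we want.
term(p, w, h, eta2) = abs(h * dot(w, p) / (dot(p, p) + eta2))

# Broadcast: builds a temporary vector of all terms, then reduces it.
max_broadcast(v, points, h, eta2) = maximum(term.(points, v, h, eta2))

# Generator: terms are produced lazily, one at a time.
max_generator(v, points, h, eta2) =
    maximum(term(p, w, h, eta2) for (p, w) in zip(points, v))

# Functional: maximum(f, itr) applies f to each (p, w) pair from zip.
max_functional(v, points, h, eta2) =
    maximum(pw -> term(pw[1], pw[2], h, eta2), zip(points, v))

println(max_broadcast(v, points, h, eta2))
println(max_generator(v, points, h, eta2))
println(max_functional(v, points, h, eta2))
```

All three should print the same value; timing each one with @btime, as above, shows how the allocations differ.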
By the way, thank you so much for showing me this approach!
It gives me a completely allocation-free time-stepping function, since I am using the principle you just showed me.
That seems to give a faster end result, even though the benchmark test is slower, which is a bit confusing to me. Either way, I'm really happy that you and @nsajko took the time to show me how one could approach it.
julia> function maxvp(v,points,h,eta2)
           maxval = -Inf
           for i in eachindex(v,points)
               maxval = max(maxval, abs(h * dot(v[i],points[i]) / (dot(points[i],points[i]) + eta2)))
           end
           return maxval
       end
maxvp (generic function with 1 method)
julia> @btime maxvp($v,$points,$h,$eta2)
351.774 μs (0 allocations: 0 bytes)
0.002499999787200059
If you add @fastmath to the maxval = ... line, the time drops by roughly a factor of 2.
That did give an extra boost, yes. Of course, now it is getting down to extremely small performance benefits, so it's probably best to stop improving it now, hehe.
Thank you for making me aware. Indeed, it seems that the most straightforward approach from a functional perspective is not yet developed enough for performance.
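To make the placement concrete, here is a rough sketch of the loop with @fastmath on the update line. The name maxvp_fast and the sample data are hypothetical. One change is my own assumption rather than part of the suggestion: the accumulator starts at 0.0 instead of -Inf, because @fastmath relaxes strict IEEE semantics and every term is an absolute value anyway. Note that this returns 0.0 instead of -Inf for empty input, and results may differ from the strict version in the last digits.

```julia
using LinearAlgebra  # provides dot, which also accepts tuples

function maxvp_fast(v, points, h, eta2)
    maxval = 0.0  # assumption: terms are nonnegative, so 0.0 is a safe start
    for i in eachindex(v, points)
        p = points[i]
        # @fastmath lets the compiler use relaxed floating-point rules here.
        @fastmath maxval = max(maxval, abs(h * dot(v[i], p) / (dot(p, p) + eta2)))
    end
    return maxval
end

# Hypothetical stand-in data (not the original poster's arrays).
N = 1_000
points = [(cos(0.1i), sin(0.1i)) for i in 1:N]
v = [(0.01 * sin(0.3i), 0.01 * cos(0.2i)) for i in 1:N]
h, eta2 = 0.02, 1e-4

println(maxvp_fast(v, points, h, eta2))
```

Whether the factor of 2 shows up will depend on the element types and the machine, so it is worth re-running @btime on the real data.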