I have a question, and I will use a simple polynomial function as an example to describe it.
When I define an ordinary polynomial function and call it with literal values, the evaluation is very fast (about 0.8 ns):
using BenchmarkTools
f1(x, y) = x^3*y^3 + 2*x^2*y + 3*x*y^2
@benchmark f1(1.0,2.0)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 0.791 ns … 12.333 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 0.875 ns ┊ GC (median): 0.00%
Time (mean ± σ): 0.874 ns ± 0.128 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂ ▂
0.791 ns Histogram: frequency by time 0.917 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
However, when the values are passed through global variables or read from an array, the same call is much slower (about 17 ns and 51 ns):
a = 1.0
b = 2.0
@benchmark f1(a,b)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
Range (min … max): 16.366 ns … 36.281 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 17.577 ns ┊ GC (median): 0.00%
Time (mean ± σ): 17.706 ns ± 1.295 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▃▃▁▃▄▂▅██▅▃ ▂
▇████████████▇▇▅▅▅▆▆▅▆▄▄▄▄▄▆▅▆██▇▇▄▆▅▄▅▄▃▃▃▁▅▄▄▄▁▃▁▁▃▁▃▁▄▄▃ █
16.4 ns Histogram: log(frequency) by time 24.4 ns <
Memory estimate: 16 bytes, allocs estimate: 1.
c=[1.0,2.0]
@benchmark f1(c[1],c[2])
BenchmarkTools.Trial: 10000 samples with 990 evaluations.
Range (min … max): 45.833 ns … 202.441 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 49.706 ns ┊ GC (median): 0.00%
Time (mean ± σ): 51.062 ns ± 5.727 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁█ ▃▂
▄▂▁▁▁▁▁▂▃██▆██▄▃▃▂▃▃▃▃▂▂▂▁▁▁▁▂▂▂▃▂▂▁▁▁▁▁▁▂▂▂▁▁▁▁▁▁▁▁▁▁
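To separate the cost of f1 itself from the cost of looking up untyped global variables, the benchmarks can also be written with `$` interpolation, which is the pattern described in the BenchmarkTools manual. This is only a minimal sketch: I am assuming that `a`, `b`, and `c` are the same non-constant globals as above, and I have not included timings because I have not measured them.

```julia
using BenchmarkTools

f1(x, y) = x^3*y^3 + 2*x^2*y + 3*x*y^2

a = 1.0
b = 2.0
c = [1.0, 2.0]

# `$` interpolates the current value of each global into the benchmark, so
# the lookup of an untyped global is not part of what is being timed.
@btime f1($a, $b)

# `$c[1]` interpolates the array `c` and then indexes it inside the benchmark.
@btime f1($c[1], $c[2])

# Wrapping the value in a Ref and dereferencing it inside the benchmark is the
# BenchmarkTools idiom for preventing the compiler from treating the input as
# a compile-time constant.
@btime f1($(Ref(a))[], $(Ref(b))[])
```

If the interpolated versions are much faster than `@benchmark f1(a,b)`, that would suggest most of the measured time comes from the global variable access rather than from the polynomial itself.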
As a result, when I read data from an array and pass it to this function in order to build a new function, the new function is also slow:
arr = [1.0, 1.0, 1.0, 1.0, 1.0]
function f2(y)
    res = 0.0
    for i = 1:5
        res += f1(arr[i], y)
    end
    return res
end
@benchmark f2(2.0)
BenchmarkTools.Trial: 10000 samples with 362 evaluations.
Range (min … max): 248.848 ns … 126.508 μs ┊ GC (min … max): 0.00% … 99.75%
Time (median): 266.804 ns ┊ GC (median): 0.00%
Time (mean ± σ): 286.499 ns ± 1.264 μs ┊ GC (mean ± σ): 4.88% ± 2.70%
█
▃▄▂▂▂█▅▁▁▃▅█▇▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▃▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
249 ns Histogram: frequency by time 341 ns <
Memory estimate: 320 bytes, allocs estimate: 20.
As you can see, f2, which loops 5 times, takes about 280 ns, which is roughly 5 times the cost of a single f1 call on array elements (51 ns).
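For comparison, here is a minimal sketch of two variants of the loop that do not read from a non-constant global. The names `f2_arg`, `f2_const`, and `ARR` are hypothetical and not part of my code above. I am assuming that the 20 allocations reported for f2 come from `arr` being an untyped global, and I have not benchmarked these variants.

```julia
f1(x, y) = x^3*y^3 + 2*x^2*y + 3*x*y^2

# Variant 1 (hypothetical): the array is passed as an argument, so its
# concrete type is known when the method is compiled.
function f2_arg(arr, y)
    res = 0.0
    for x in arr
        res += f1(x, y)
    end
    return res
end

# Variant 2 (hypothetical): the array stays global but is declared `const`,
# so its type can no longer change and the compiler can rely on it.
const ARR = [1.0, 1.0, 1.0, 1.0, 1.0]

function f2_const(y)
    res = 0.0
    for i in eachindex(ARR)
        res += f1(ARR[i], y)
    end
    return res
end
```

Either variant could then be timed with `@benchmark f2_arg($arr, 2.0)` or `@benchmark f2_const(2.0)` and compared against the original f2.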
For the simple function above, the difference between these ways of supplying the arguments may not matter much. However, for a very complex polynomial function, the gap between calling it with literal values and calling it with variables or array elements can be extremely large (from about 1 ns to approximately 700 μs).
Therefore, when I use the complex function in a loop over array data in order to define a new function, the resulting speed is not ideal.
My first question is: what is the main source of this speed difference? I suspect that some optimization is triggered when constants are passed directly to the function, but I do not understand the mechanism or why the difference is so large.
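One way I could probe this suspicion is to look at the code the compiler generates. The following is only a sketch: `f1_literal` is a hypothetical wrapper I am introducing for illustration, and I am assuming that a call with literal arguments may be evaluated at compile time.

```julia
using InteractiveUtils  # standard library; provides @code_llvm outside the REPL

f1(x, y) = x^3*y^3 + 2*x^2*y + 3*x*y^2

# Hypothetical wrapper: both arguments are literals, so the compiler is free
# to evaluate the whole expression ahead of time.
f1_literal() = f1(1.0, 2.0)

# Print the optimized LLVM IR for the wrapper. If constant folding happened,
# the body should contain no floating-point arithmetic.
@code_llvm f1_literal()

# @code_llvm only uses the argument types, not their values, so this prints
# the general Float64 method body with the actual multiplications and adds.
@code_llvm f1(1.0, 2.0)
```

If the first listing contains no arithmetic while the second one does, the 0.8 ns figure would be measuring a precomputed constant rather than the polynomial evaluation.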
My second question is: since evaluating the function directly on literal values is fast, is there a way to make the array-element accesses in the f2 loop above run as fast as such a direct evaluation? Or is this fundamentally impossible?
I look forward to your answers and greatly appreciate any help!