It’s not the operators +, * or ⊕, ⊗ that give different allocations, but the parentheses:
julia> @macroexpand @. dTA_dH +=
-1.0 ⊗ 1.0 ⊗ 1.0 ⊗ (2 ⊗ H^3 ⊕ H^2 ⊗ 1.0 - 1.0 ⊗ 1.0 ⊗ 1.0) /
(H^3 ⊕ H^2 ⊗ 1.0 ⊕ H ⊗ 1.0 ⊗ 1.0 ⊕ 1.0 ⊗ 1.0 ⊗ 1.0)^2
:(dTA_dH .+= (/).((⊗).((⊗).((⊗).(-1.0, 1.0), 1.0), (-).((⊕).((⊗).(2, (^).(H, 3)), (⊗).((^).(H, 2), 1.0)), (⊗).((⊗).(1.0, 1.0), 1.0))), (^).((⊕).((⊕).((⊕).((^).(H, 3), (⊗).((^).(H, 2), 1.0)), (⊗).((⊗).(H, 1.0), 1.0)), (⊗).((⊗).(1.0, 1.0), 1.0)), 2)))
julia> function test2(dTA_dH, H)
dTA_dH .+= (/).((⊗).((⊗).((⊗).(-1.0, 1.0), 1.0), (-).((⊕).((⊗).(2, (^).(H, 3)), (⊗).((^).(H, 2), 1.0)), (⊗).((⊗).(1.0, 1.0), 1.0))), (^).((⊕).((⊕).((⊕).((^).(H, 3), (⊗).((^).(H, 2), 1.0)), (⊗).((⊗).(H, 1.0), 1.0)), (⊗).((⊗).(1.0, 1.0), 1.0)), 2))
end
test2 (generic function with 1 method)
julia> @btime test2($dTA_dH, $H);
298.513 ns (10 allocations: 80 bytes)
## Now replace ⊗ with * and ⊕ with + in test2 above to define test:
julia> function test(dTA_dH, H)
dTA_dH .+= (/).((*).((*).((*).(-1.0, 1.0), 1.0), (-).((+).((*).(2, (^).(H, 3)), (*).((^).(H, 2), 1.0)), (*).((*).(1.0, 1.0), 1.0))), (^).((+).((+).((+).((^).(H, 3), (*).((^).(H, 2), 1.0)), (*).((*).(H, 1.0), 1.0)), (*).((*).(1.0, 1.0), 1.0)), 2))
end
test (generic function with 1 method)
julia> @btime test($dTA_dH, $H);
298.092 ns (10 allocations: 80 bytes)
With the example above you can see that using exactly the same parentheses in test and test2, with only the operator symbols changed, results in the same allocations.
You can check the other way round starting with:
julia> @macroexpand @. dTA_dH +=
-1.0 * 1.0 * 1.0 * (2 * H^3 + H^2 * 1.0 - 1.0 * 1.0 * 1.0) /
(H^3 + H^2 * 1.0 + H * 1.0 * 1.0 + 1.0 * 1.0 * 1.0)^2
:(dTA_dH .+= (/).((*).(-1.0, 1.0, 1.0, (-).((+).((*).(2, (^).(H, 3)), (*).((^).(H, 2), 1.0)), (*).(1.0, 1.0, 1.0))), (^).((+).((^).(H, 3), (*).((^).(H, 2), 1.0), (*).(H, 1.0, 1.0), (*).(1.0, 1.0, 1.0)), 2)))
and you will see that both functions now allocate the higher amount.
So the number of allocations depends on how the operations are grouped, i.e. on the order in which they are performed.
But how to optimize this is quite a bit beyond me.
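The different groupings in the two expansions above appear to come from the parser rather than from `@.` itself: chained `+` and `*` are parsed into a single n-ary call, while other infix operators of the same precedence, such as `⊕` and `⊗`, are parsed as nested left-associative binary calls. The following is only a minimal sketch to make that visible. The variable names `a`, `b`, `c` and the helper `nargs` are placeholders introduced here for illustration and are not part of the original code.

```julia
# Compare how the parser groups chained standard and custom operators.
# Only expressions are built here; nothing is evaluated, so ⊗ needs no definition.
ex_std    = Meta.parse("a * b * c")     # standard operator, chained
ex_custom = Meta.parse("a ⊗ b ⊗ c")     # custom operator with the precedence of *
ex_paren  = Meta.parse("(a * b) * c")   # standard operator, explicit grouping

# Hypothetical helper: number of arguments of the outermost call
# (args[1] is the operator itself, so it is not counted).
nargs(ex) = length(ex.args) - 1

println(ex_std.args)     # one call to * with three arguments
println(ex_custom.args)  # outer ⊗ with a nested ⊗ call as its first argument
println(ex_paren.args)   # explicit parentheses keep * nested as well
```

If that is indeed the cause, writing the parentheses explicitly with `+` and `*` should reproduce the nested form, so both groupings can be benchmarked with the same operators.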