To make this concrete, you could provide an example where broadcasting loop fusion makes code faster than the equivalent code in R. You could steal the following example and add an R microbenchmark comparison: