This query was originally going to be me complaining about why I was getting inconsistent performance over some benchmark times, it turns out that it was at some kind of cache threshold and modifying the size of the vector fixed the issue.

My original code was in R and I decided to speed things up by implementing it in C++ using Rcpp:

```
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
#include <functional>
#include <cmath>
#include <Rcpp.h>
using namespace Rcpp;
double bananaObjective(NumericVector x)
{
return 100*std::pow(x[1] - std::pow(x[0], 2.0), 2.0) + std::pow(1 - x[0], 2.0);
}
//[[Rcpp::export]]
NumericVector bananaObjective(NumericMatrix x)
{
int n = x.nrow();
NumericVector output(x.nrow());
for(int i = 0; i < n; ++i)
{
NumericVector tmp = x(i, _);
output[i] = bananaObjective(tmp);
}
return output;
}
//[[Rcpp::export]]
NumericVector benchmarkBanana(int len, int n)
{
NumericVector output(len);
for(int i = 0; i < len; ++i)
{
//arma::mat tmp(n, 2, arma::fill::randu);
NumericVector v = runif(n*2);
v.attr("dim") = Dimension(n, 2);
NumericMatrix tmp = as<NumericMatrix>(v);
arma::wall_clock timer;
timer.tic();
auto x = bananaObjective(tmp);
output[i] = timer.toc();
}
return output;
}
```

Okay before I carry on, there is this horrible myth that you can get okay performance from R if you pre-cache a vector even in a loop. The performance in an R loop is **always** bad but *sometimes* it can be less bad if you pre-cache. Anyway we can get a x10 speedup just by moving the functions to C++, here is the equivalent R code and benchmark for comparison:

```
.bananaObjective = function(x)
100*(x[2] - (x[1])^2)^2 + (1 - x[1])^2
RBananaObjective = function(x)
{
n = nrow(x)
output = vector(mode = "numeric", length = n)
for(i in 1:n)
output[i] = .bananaObjective(x[i, ])
return(output)
}
require(Rcpp)
sourceCpp("test.cpp")
n = 5E5; x = matrix(runif(n*2), ncol = 2)
system.time(RBananaObjective(x))
# user system elapsed
# 0.540 0.000 0.541
system.time(bananaObjective(x))
# user system elapsed
# 0.047 0.000 0.047
```

Okay great, now we can run the benchmark function in C++ 100 times maybe the returning to R each time has an overhead - apparently it doesnāt:

```
x = benchmarkBanana(100, 500000)
mean(x)
# [1] 0.04853786
```

So what about Julia?

```
function bananaObjective(x::Array{T, 1}) where {T <: AbstractFloat}
return 100*(x[2] - (x[1])^2)^2 + (1 - x[1])^2
end
function bananaObjective(x::Array{T, 2}) where {T <: AbstractFloat}
n = size(x)[1]
output = zeros(T, n)
for i in 1:n
output[i] = bananaObjective(x[i, :])
end
return output
end
# bench(Float64, 100, 1000_000)
function bench(T::Type{<: AbstractFloat}, len::I, n::I) where {I <: Integer}
times = zeros(T, len)
for i in 1:len
x = rand(T, n, 2)
t1 = time(); bananaObjective(x); t2 = time()
times[i] = t2 - t1
end
return times
end
import Statistics: mean
x = bench(Float64, 100, 500_000)
mean(x)
# 0.019942469596862793
```

Thatās more than x2 the speed of the code implemented with Rcpp. Whatās more this code is **much** easier to write than the C++ code and **far** more flexible and generic.