Buried inside a for loop of mine, I have a function which, morally, looks like:
function f!(a, b)
    @. a = b
    return a
end
It gets used inside the for loop like:
# construct vector x of length nx
# construct vector y of length ny > nx
for i in 1:iters
    # code
    rows = sample(1:ny, nx, replace = false)  # grab random rows of y
    f!(x, y[rows])
end
The above code works without issue. However, there are some cases where, instead of x and y being vectors, they are matrices with the same number of columns. The equivalent code is then:
# construct matrix x of size (nx,d)
# construct matrix y of size (ny,d)
for i in 1:iters
    # code
    rows = sample(1:ny, nx, replace = false)  # grab random rows of y
    f!(x, y[rows,:])
end
Notice that I have had to use y[rows,:] here. I could use this in the vector case too, but I take a performance hit:
using Random, StatsBase, BenchmarkTools  # seed!, sample, @btime
Random.seed!(100);
x = zeros(100);
y = randn(1000);
rows = sample(1:1000, 100, replace=false);
@btime f!(x,y[rows]);
124.491 ns (2 allocations: 928 bytes)
@btime f!(x,y[rows,:]);
320.127 ns (2 allocations: 944 bytes)
Also, if I create x and y as zeros(100,1) and randn(1000,1), I also take the performance hit.
The case where I am only working with vectors is common enough for me that I don’t want to take this performance hit. To avoid it, I currently have two versions of my outer function (the one containing the for loop), one for the vector case and one for the matrix case, selected via multiple dispatch. But it seems really silly to duplicate the whole function when the only difference is y[rows] versus y[rows,:]. Is there a cleaner way to reconcile this?
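To make the question concrete, here is a minimal sketch of the kind of thing I have in mind: push the dispatch down into a tiny row-selection helper so that the outer function exists only once. The names selectrows and run! are hypothetical (made up for this post), and randperm is only a standard-library stand-in for sample so the snippet runs without StatsBase. I have not benchmarked it.

```julia
using Random

# Same in-place copy as in the post.
function f!(a, b)
    @. a = b
    return a
end

# Hypothetical helper: only this part differs between the two cases.
selectrows(y::AbstractVector, rows) = y[rows]
selectrows(y::AbstractMatrix, rows) = y[rows, :]

# Hypothetical outer function: one loop serves vectors and matrices.
function run!(x, y, iters)
    ny, nx = size(y, 1), size(x, 1)
    for _ in 1:iters
        # stand-in for sample(1:ny, nx, replace = false)
        rows = randperm(ny)[1:nx]
        f!(x, selectrows(y, rows))
    end
    return x
end

Random.seed!(100)
xv = run!(zeros(100), randn(1000), 10)        # vector case
xm = run!(zeros(100, 3), randn(1000, 3), 10)  # matrix case
```

Base also has selectdim(y, 1, rows), which I believe gives a view of the chosen rows for either shape, but I do not know how it compares in speed.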