Is there a more elegant / performant / built-in way to do this? Thanks in advance.
"Apply each function in f_array to each element of the corresponding row."
function func_array_eachrow!(W, f_array)
    for i = 1:size(W, 1)
        W[i, :] = f_array[i].(W[i, :])
    end
end
W = rand(3,4)
@show W
func_array_eachrow!(W, [sin, cos, sqrt])
@show W
The application is to use a different activation function for each output from a Flux layer.
Maybe
function func_array_eachrow2!(W, f_array)
    W .= mapslices(x->map(x->x[1](x[2]), zip(f_array, x)), W, dims=1)
end
There must be a nicer way to say x->x[1](x[2]); it doesn’t seem to be called apply in Julia.
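One candidate for the missing "apply" is the pipe operator |> from Base, where x |> f means f(x). The sketch below broadcasts it, relying on a length-n vector of functions broadcasting as a column against an n×m matrix, so that row i is passed to f_array[i]. The name func_array_eachrow_pipe! is made up for this example, and nothing here has been benchmarked against the other versions in this thread.

```julia
# Sketch: broadcast the pipe operator so that W[i, j] is passed to f_array[i].
# f_array (length n) broadcasts as a column against the n×m matrix W.
function func_array_eachrow_pipe!(W, f_array)
    W .= W .|> f_array   # fused, in-place broadcast of (x, f) -> f(x)
    return W
end

W = rand(3, 4)
func_array_eachrow_pipe!(W, [sin, cos, sqrt])
```

The same thing can be spelled W .= (|>).(W, f_array) if the dotted infix form reads poorly.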
That’s a really creative way to get it in one line; I definitely wouldn’t have figured that out. mapslices is a good one for me to remember.
I tested out the performance with the @time macro for different sizes of W and f_array. Typically, the one-line version makes about 10x as many allocations, and the time to completion is about 10x as long as the original. I also replaced the single loop in the original with a nested loop; completion time was between the same and 2x as long as the original. So the original appears to be the fastest, at least among these options.
Using map!:
function func_array_eachrow3!(W, f)
    @inbounds for i in 1:size(W, 1)
        wi = view(W, i, :)
        map!(f[i], wi, wi)
    end
    W
end
using BenchmarkTools  # provides @btime

f_array = [sin, cos, sqrt]
W = rand(3, 10)
@btime func_array_eachrow!(V, $f_array) setup=(V=copy(W))
@btime func_array_eachrow3!(V, $f_array) setup=(V=copy(W))
727.948 ns (15 allocations: 1.08 KiB)
182.216 ns (6 allocations: 288 bytes)
That’s a very fast solution, and I like the use of map! and view.
From my tests with a 12-element f_array and a 12-row W, it’s consistently 3.3x as fast as the original. And removing the @inbounds doesn’t hurt performance.
As arrays get large (length(f_array)=50, size(W, 1)=5000), performance gets to be on par with (but still better than) the original. But for smaller arrays, which is what I’m working with, what you suggested is much faster (4x).
Thanks!
Note that due to memory order (Julia arrays are stored column-major), if you can transpose your problem and make the application column-wise, your CPU will thank you. If not applicable, please ignore this comment.
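A minimal sketch of what the column-wise layout could look like, assuming the data can be stored transposed so that each function applies to one column. The name func_array_eachcol! and the transposed matrix Wt are made up for this example. Each column view is contiguous in memory, which is the point of the suggestion; whether it pays off for small matrices would need measuring.

```julia
# Hypothetical column-wise variant: function f[j] is applied to column j of Wt.
# Because Julia arrays are column-major, view(Wt, :, j) is a contiguous block.
function func_array_eachcol!(Wt, f)
    for j in axes(Wt, 2)
        wj = view(Wt, :, j)
        map!(f[j], wj, wj)   # overwrite column j in place
    end
    return Wt
end

Wt = rand(10, 3)             # transposed layout: 10 samples × 3 functions
func_array_eachcol!(Wt, [sin, cos, sqrt])
```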
It’s not obvious to me why the following func_array_eachrow4! function should allocate at all. Why should applying a function coming from an array be different from, say, applying the intrinsic sin function inside the double loop?
function func_array_eachrow3!(W, f)
    for i in 1:size(W, 1)
        wi = view(W, i, :)
        map!(f[i], wi, wi)
    end
    W
end
function func_array_eachrow4!(W, f)
    for i = 1:size(W, 1)
        fn = f[i]
        for j = 1:size(W, 2)
            W[i, j] = fn(W[i, j])
        end
    end
end
f_array = [sin, cos, sqrt]
W = rand(3, 10)
@btime func_array_eachrow3!(V, $f_array) setup=(V=copy(W))
@btime func_array_eachrow4!(V, $f_array) setup=(V=copy(W))
247.872 ns (6 allocations: 288 bytes)
2.344 μs (60 allocations: 960 bytes)
The same allocations happen if I write, e.g.:
function func_array_eachrow5!(W, f)
    map.(f, W)
end
but in this case we can imagine that the calculations are done in a separate temp array without mutating the original array W, so allocations are justified here. Can someone comment on the above double loop please?
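One likely explanation, offered as an assumption rather than something confirmed in this thread: [sin, cos, sqrt] has element type Function, which is abstract, so inside the double loop the compiler cannot know the concrete type of fn or the return type of fn(W[i,j]). Each call then goes through dynamic dispatch and its result may be boxed, which would account for per-element allocations. The map! version hands f[i] to another function, which gets compiled for the concrete function type, so the dynamic dispatch happens once per row instead of once per element. The sketch below applies the same idea to the double loop with an explicit function barrier; apply_row! and func_array_eachrow_barrier! are names made up for this example.

```julia
# Inner kernel: Julia compiles a specialized method for each concrete type of
# fn (typeof(sin), typeof(cos), ...), so the loop body is type-stable.
function apply_row!(fn, W, i)
    for j in axes(W, 2)
        W[i, j] = fn(W[i, j])
    end
    return nothing
end

# Outer loop: f[i] is only known to be a Function here, so the call to
# apply_row! is dispatched dynamically, but only once per row.
function func_array_eachrow_barrier!(W, f)
    for i in axes(W, 1)
        apply_row!(f[i], W, i)
    end
    return W
end

W = rand(3, 10)
func_array_eachrow_barrier!(W, [sin, cos, sqrt])
```

Benchmarking this against func_array_eachrow3! and func_array_eachrow4! with @btime, as above, would show whether the allocations actually go away on a given Julia version.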