Feedforward NN using StaticArrays with no allocation

bender9000 · January 24, 2019, 2:20pm

I am writing a performance-sensitive application that cannot have any allocations and that also runs inference using a multilayer perceptron. The output dimensions of each layer in the neural network are fairly small, and benchmarking the forward pass of individual layers showed that StaticArrays was faster.

My question is: how do I implement the inference step in an elegant way using StaticArrays? My current implementation uses a Matrix and a Vector for the weights and bias of each layer, and A_mul_B! to avoid any allocations.

ChrisRackauckas · January 24, 2019, 2:28pm

Just use out-of-place operations. StaticArrays are stack-allocated structures, so they won’t allocate when you create them. They’re more like a high-dimensional number, like a complex number or a Float64; they don’t heap-allocate memory. So the algorithm is just tanh(W2*sigma(W1*x)), etc.
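For concreteness, here is a minimal sketch of that out-of-place forward pass. It assumes Julia 1.x syntax (the thread itself targets 0.6), and the names `relu` and `forward` are illustrative, not part of StaticArrays. The bias terms come from the original question rather than from the formula above.

```julia
using StaticArrays

# Illustrative activation; `sigma` above stands for any elementwise nonlinearity.
relu(x) = max(zero(x), x)

# Out-of-place two-layer forward pass. Each intermediate result is a new
# SVector, which is an immutable value type and needs no heap storage.
forward(W1, b1, W2, b2, x) = tanh.(W2 * relu.(W1 * x + b1) + b2)

W1 = @SMatrix rand(10, 5); b1 = @SVector rand(10)
W2 = @SMatrix rand(1, 10); b2 = @SVector rand(1)
x  = @SVector rand(5)

y = forward(W1, b1, W2, b2, x)   # y is an SVector{1,Float64}
```

Because the sizes are part of the types, the compiler can unroll the small matrix-vector products, which is likely where the speedup over Matrix and Vector comes from at these dimensions.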

bender9000 · January 24, 2019, 2:44pm

Thanks for your response. I am still getting allocations when I run this simple foobar example in Julia 0.6. I think I might be missing something.

using BenchmarkTools
using StaticArrays

@inline relu{T <: AbstractFloat}(x::T) = max(zero(T), x)

function profile()
    input = @SVector rand(5);
    W1 = @SMatrix rand(10,5); b1 = @SVector rand(10);
    W2 = @SMatrix rand(1,10); b2 = @SVector rand(1);
    
    @btime W2*relu.(W1*input+b1)+b2
end

profile()
# prints the following
# 274.584 ns (5 allocations: 320 bytes)
# 1-element StaticArrays.SArray{Tuple{1},Float64,1,1}:
# 11.5083

bender9000 · January 24, 2019, 2:49pm

@ChrisRackauckas Actually, I take that back. If I use @allocated instead of @btime, I get zero allocations. Not sure what’s happening there. I am assuming that means there are no allocations in practice.

kristoffer.carlsson · January 24, 2019, 2:49pm

Is this coded on Julia 0.6? 1.0 has a lot of performance fixes.

Also, interpolate the variables into the @btime macro, or make them const.
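A sketch of what interpolation looks like, assuming Julia 1.x and the same layer sizes used earlier in the thread. The `$` prefix is BenchmarkTools syntax for splicing a value into the benchmarked expression, so it is not looked up as an untyped global on every evaluation.

```julia
using BenchmarkTools
using StaticArrays

relu(x) = max(zero(x), x)

W1 = @SMatrix rand(10, 5); b1 = @SVector rand(10)
W2 = @SMatrix rand(1, 10); b2 = @SVector rand(1)
input = @SVector rand(5)

# Every variable is interpolated with `$`, so the benchmark measures only
# the arithmetic and not the cost of accessing non-constant globals.
@btime $W2 * relu.($W1 * $input + $b1) + $b2
```

The alternative mentioned above is to declare the globals with `const`, which also lets the compiler know their types.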

bender9000 · January 24, 2019, 3:02pm

Yes, it is coded in Julia 0.6 (I edited the comment to state that explicitly). I guess the few allocations were because of the macro creating the closure. Following your suggestion brought it down to zero allocations, as shown below:

using BenchmarkTools
using StaticArrays

@inline relu{T <: AbstractFloat}(x::T) = max(zero(T), x)

function profile()
    @btime W2*relu.(W1*input+b1)+b2 setup=(input = @SVector rand(5); W1 = @SMatrix rand(10,5); b1 = @SVector rand(10); W2 = @SMatrix rand(1,10); b2 = @SVector rand(1);)
end

profile()
#  22.266 ns (0 allocations: 0 bytes)
# 1-element StaticArrays.SArray{Tuple{1},Float64,1,1}:
# 10.2185
