Speeding up antialiased non-linearity

Hi,

I’m playing around with guitar amplifier modeling, and in the course of that I implemented the antiderivative anti-aliased non-linearity from Martin Vicanek’s “Note on Alias Suppression in Digital Distortion” in a naive form:

function dist_aa2(x)
  x0 = x[3:end,:,:]
  x1 = x[2:(end-1),:,:]
  x2 = x[1:(end-2),:,:]

  F1 = sqrt.(1 .+ x1.^2)
  F12 = sqrt.(1 .+ ((x0 + x1)./2).^2)
  F23 = sqrt.(1 .+ ((x1 + x2)./2).^2)

  ((x0 .+ 3 .* x1) ./ (F12 .+ F1) .+ (x2 .+ 3 .* x1) ./ (F23 .+ F1)) ./ 4
end

While it works in my experiments (i.e. training converges and produces nice results), it seems quite slow in its current form, and I wonder if there is an obvious way to speed it up. I’m using Flux with CUDA.jl/cuDNN.jl and use it as an activation, with Flux.Chain(Flux.Conv(#= 1D convolution without activation here =#), dist_aa2) as a single layer.

Thanks!

You are allocating lots of temporary arrays. At least on a CPU, I would just write a single scalar function and then broadcast it over views:

function dist_aa2(x0, x1, x2)
  F1 = sqrt(1 + x1^2)
  F12 = sqrt(1 + ((x0 + x1)/2)^2)
  F23 = sqrt(1 + ((x1 + x2)/2)^2)
  return ((x0 + 3 * x1) / (F12 + F1) + (x2 + 3 * x1) / (F23 + F1)) * (1//4)
end

dist_aa2(x) = @views dist_aa2.(x[3:end,:,:], x[2:(end-1),:,:], x[1:(end-2),:,:])

(For x = rand(100,100,100) on my CPU, though, this is only about a 40% speedup.)
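
If you want to check this yourself, here is a minimal CPU-only sketch that compares the two versions for equal results and for allocated bytes. The names dist_aa2_naive, aa_kernel and dist_aa2_fused are made up for this comparison, and the Float32 input is an assumption to mimic typical Flux data. It does not touch the GPU or gradients, so it says nothing definite about the training setup in the question.

```julia
# Naive version from the question: slicing copies the data and each
# broadcasted statement allocates its own temporary array.
function dist_aa2_naive(x)
  x0 = x[3:end,:,:]
  x1 = x[2:(end-1),:,:]
  x2 = x[1:(end-2),:,:]
  F1 = sqrt.(1 .+ x1.^2)
  F12 = sqrt.(1 .+ ((x0 + x1)./2).^2)
  F23 = sqrt.(1 .+ ((x1 + x2)./2).^2)
  ((x0 .+ 3 .* x1) ./ (F12 .+ F1) .+ (x2 .+ 3 .* x1) ./ (F23 .+ F1)) ./ 4
end

# Scalar kernel (hypothetical name): same arithmetic on three numbers.
function aa_kernel(x0, x1, x2)
  F1 = sqrt(1 + x1^2)
  F12 = sqrt(1 + ((x0 + x1)/2)^2)
  F23 = sqrt(1 + ((x1 + x2)/2)^2)
  return ((x0 + 3 * x1) / (F12 + F1) + (x2 + 3 * x1) / (F23 + F1)) * (1//4)
end

# One fused broadcast over views: only the output array is allocated.
dist_aa2_fused(x) = @views aa_kernel.(x[3:end,:,:], x[2:(end-1),:,:], x[1:(end-2),:,:])

x = rand(Float32, 100, 100, 100)   # assumed test input
y_naive = dist_aa2_naive(x)        # these calls also serve as warm-up (compilation)
y_fused = dist_aa2_fused(x)

# Both versions drop two samples along the first dimension and should agree.
@assert size(y_fused) == (98, 100, 100)
@assert y_naive ≈ y_fused

bytes_naive = @allocated dist_aa2_naive(x)
bytes_fused = @allocated dist_aa2_fused(x)
println("naive: ", bytes_naive, " bytes, fused: ", bytes_fused, " bytes")
```

For timings rather than allocations, @btime from BenchmarkTools.jl is the usual tool; on the GPU the picture may look different, so it is worth measuring there separately.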