Computing linear convolution efficiently

GunnarFarneback · August 17, 2021, 8:06pm

Fair enough. But if you want to do repeated computations with the same kernel you will benefit from saving the invariant parts (for both options).
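To make the "save the invariant parts" idea concrete, here is a minimal sketch for the FFT route. The helper name make_linear_conv and its arguments are made up for illustration and are not from this thread or from any package. It assumes every signal has the same length n, so the padded kernel spectrum and the FFT plan can be built once and reused:

```julia
using FFTW

# Hypothetical helper: precompute everything that depends only on the kernel g
# and the signal length n, then return a closure that convolves new signals.
function make_linear_conv(g::AbstractVector, n::Integer)
    m = n + length(g) - 1              # length of the full linear convolution
    gpad = zeros(ComplexF64, m)
    gpad[1:length(g)] .= g             # zero-padded kernel
    p = plan_fft(gpad)                 # default FFTW.ESTIMATE flag, planned once
    G = p * gpad                       # kernel spectrum, computed once
    fpad = zeros(ComplexF64, m)        # scratch buffer reused by every call
    function conv_with_g(f::AbstractVector)
        length(f) == n || throw(DimensionMismatch("expected a signal of length $n"))
        fpad[1:n] .= f                 # the tail of fpad is never written, so it stays zero
        return p \ ((p * fpad) .* G)   # inverse transform of the product of spectra
    end
    return conv_with_g
end
```

The result is complex, so for real inputs you would take real.(...) of it. The closure shares one scratch buffer, so this sketch should not be called from several threads at once.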

apo383 · August 17, 2021, 9:01pm

I got a very modest 16 MB savings in allocations by using a view:

julia> myconv(f,g) = @view DSP.conv(fftshift(f),g)[N:2N-1];
julia> @btime directconv($f,$g);
  305.838 ms (88 allocations: 183.11 MiB)
julia> @btime myconv($f,$g);
  295.373 ms (87 allocations: 167.85 MiB)

Part of what @GunnarFarneback was referring to is that you define F outside of the function, which hides its allocations (about 38 MB) and therefore gives toepconv an unfair advantage over the other methods.

julia> @btime toepconv($F,$g);
  393.324 ms (79 allocations: 106.82 MiB)
julia> toepconv2(f,g) = Toeplitz(f[1:N],f[vcat(1,2N-1:-1:N+1)])*g;
julia> @btime toepconv2($f,$g);
  388.091 ms (91 allocations: 144.96 MiB)

I do suspect it should be possible to get the speed of DSP.conv and the allocations of toepconv2 at the same time. For example, fftshift(f) allocates 30 MB by itself, which is more than the gap between myconv and toepconv2 above. Perhaps the shift could be done in-place, or the convolution altered to avoid shifting at all. Or there may be a way to avoid the reflection in the definition of F.
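One possible way to take the shift out of the allocation count, as a sketch only: for a vector, fftshift is a circular shift by half the length, so Base's circshift! can write the shifted copy into a buffer that is allocated once and reused. The name shift_into! is hypothetical, and this only pays off when the same buffer is used for many calls:

```julia
# Hypothetical helper: write the fftshift-ed copy of f into a preallocated buffer.
# For a vector, fftshift(f) equals circshift(f, length(f) ÷ 2).
function shift_into!(buf::AbstractVector, f::AbstractVector)
    length(buf) == length(f) ||
        throw(DimensionMismatch("buffer and input must have equal length"))
    return circshift!(buf, f, length(f) ÷ 2)   # buf must not alias f
end

f = collect(1.0:5.0)
buf = similar(f)          # allocated once, outside the hot loop
shift_into!(buf, f)       # buf now holds the shifted data
```

The convolution itself still allocates its output, so this removes only the shift's share of the total.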

roflmaostc · August 18, 2021, 5:24pm

Depending on your data type, you can also save a factor of 2 by using real-valued FFTs (rfft).
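As a rough illustration of the rfft route for real vectors (the function name rfft_conv is made up here, and this is a sketch rather than a tuned implementation): both inputs are zero-padded to the full output length, and only about half of the spectrum is computed and multiplied.

```julia
using FFTW

# Hypothetical helper: linear convolution of two real vectors via rfft/irfft.
function rfft_conv(f::AbstractVector{<:Real}, g::AbstractVector{<:Real})
    m = length(f) + length(g) - 1            # length of the full linear convolution
    fpad = zeros(Float64, m); fpad[1:length(f)] .= f
    gpad = zeros(Float64, m); gpad[1:length(g)] .= g
    # rfft returns m ÷ 2 + 1 coefficients instead of m, hence the saving
    return irfft(rfft(fpad) .* rfft(gpad), m)  # irfft needs the original length m
end
```

The second argument of irfft is required because the length of a real signal cannot be recovered from its half spectrum alone.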

If you need performant wrappers for FFT convolution planning (depending on the type), you can also check out FourierTools.jl (no paid advertisement).

We provide a wrapper to get a planned convolution similar to plan_fft, etc. That provides a little bit of speedup.

If you want to bring memory consumption to almost zero, you can also try fft!, although it only works on complex arrays.

julia> using FFTW

julia> function plan_conv!(v; flags=FFTW.MEASURE)
           v_ft = fft(v)
           p = plan_fft!(copy(v), flags=flags)
           function conv!(u)
               u .= (p * u) .* v_ft
               return inv(p) * u
           end
           return conv!
       end
plan_conv! (generic function with 1 method)

julia> v = randn(ComplexF32, (1024, 1024));

julia> u = randn(ComplexF32, (1024, 1024));

julia> @time my_conv! = plan_conv!(v);
  0.043965 seconds (25.53 k allocations: 9.676 MiB, 14.00% compilation time)

julia> @time my_conv! = plan_conv!(v);
  0.039729 seconds (65 allocations: 8.005 MiB)

julia> @time my_conv!(u);
  0.028485 seconds (21.88 k allocations: 8.962 MiB, 72.59% compilation time)

julia> @time my_conv!(u);
  0.007707 seconds (2 allocations: 160 bytes)

Edit: Note that plan_fft! might overwrite the provided array with zeros when planning with flags such as FFTW.MEASURE. Hence we should copy it first!

EditEdit: I have now added a generic version to FourierTools.jl that works without any allocations during the convolution itself (excluding planning).
