Separable convolution

#1

Hi All,

I would like to run separate convolutions with individual filters. Specifically, imagine I have two matrices and I want to convolve each of them with its own filter. I wonder whether it is possible to write this as a single convolution.
An MWE would be as follows:

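using Flux

# Two single-channel inputs (width × height × channels × batch), each with its own filter
x, y = randn(Float32, 3, 3, 1, 1), randn(Float32, 3, 3, 1, 1)
Cx, Cy = Conv((2,2), 1 => 1), Conv((2,2), 1 => 1)

# Stack the inputs along the channel dimension
xy = cat(x, y, dims = 3)
# C = ???
# Desired property: C(xy) == cat(Cx(x), Cy(y), dims = 3)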

where C = ??? stands for an operator I do not know how to write.

Thanks for any help in advance.
Tomas

#2

I am not certain what all your notation means. But ImageFiltering.jl has support for separable convolution; you can define the kernel as a tuple of lower-dimensional objects that get applied sequentially. It even does pipeline optimization to improve cache utilization.

#3

Sweet, I guess this is exactly what I am interested in. Of course, I would need compatibility with Flux.

#4

I think what you’re asking for is called a “depthwise convolution” in Flux:

C = DepthwiseConv((2,2), 2=>1)

The 2=>1 is really confusing here, but it seems to map a two-channel input to a two-channel output with the property that changing x only changes output[:,:,1,:], and changing y only changes output[:,:,2,:].
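
To make the intended per-channel behaviour concrete, here is a minimal plain-Julia sketch that does not use Flux at all. The helper names `xcorr2` and `depthwise` are made up for illustration and are not library functions. The sketch uses cross-correlation (no kernel flip), stride 1 and no padding, which may differ from the convention of a given library.

```julia
# `xcorr2` and `depthwise` are hypothetical helpers, not part of Flux or NNlib.

# "Valid" 2-D cross-correlation of one matrix with one kernel.
function xcorr2(x::AbstractMatrix, w::AbstractMatrix)
    kh, kw = size(w)
    oh, ow = size(x, 1) - kh + 1, size(x, 2) - kw + 1
    out = zeros(oh, ow)
    for j in 1:ow, i in 1:oh
        out[i, j] = sum(x[i:i+kh-1, j:j+kw-1] .* w)
    end
    return out
end

# Apply filter w[:, :, c] to channel c of x (layout: width × height × channel × batch).
function depthwise(x, w)
    kh, kw, nchan = size(w)
    out = zeros(size(x, 1) - kh + 1, size(x, 2) - kw + 1, nchan, size(x, 4))
    for n in axes(x, 4), c in 1:nchan
        out[:, :, c, n] = xcorr2(x[:, :, c, n], w[:, :, c])
    end
    return out
end

x, y = randn(3, 3, 1, 1), randn(3, 3, 1, 1)
wx, wy = randn(2, 2), randn(2, 2)

xy = cat(x, y, dims = 3)        # 3×3×2×1 input
w = cat(wx, wy, dims = 3)       # 2×2×2 stack of per-channel filters
out = depthwise(xy, w)          # 2×2×2×1 output

# Each output channel depends only on its own input channel and filter.
@assert out[:, :, 1, 1] ≈ xcorr2(x[:, :, 1, 1], wx)
@assert out[:, :, 2, 1] ≈ xcorr2(y[:, :, 1, 1], wy)
```

This is only meant to pin down the semantics asked for in #1; it has no trainable-parameter plumbing and no GPU support.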

Using ImageFiltering would probably work on the CPU if wrapped in an appropriate layer with the Flux params plumbed through. But I suspect it would be a lot of work to get GPU training working.

#5

In figuring out the above, I ended up in a rabbit hole of neural network jargon and discovered some other things which might be useful to someone in the future. It’s a bit off topic, but having written most of it already I thought I’d post it here in case someone finds it useful.

It seems that the depthwise conv layer is a special case of what’s known as grouped convolutions, where the number of groups equals the number of channels. I gather this terminology has been around for a while but was only recently added to TensorFlow (see https://github.com/tensorflow/tensorflow/pull/25818).
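
As a rough illustration of that relationship, here is a plain-Julia sketch of a grouped convolution. The function names `xcorr2` and `grouped`, the weight layout, and the `groups` keyword are my own assumptions for this example, not an existing API. With `groups` equal to the number of channels it reduces to one filter per channel; with `groups = 1` every output channel sums over all input channels.

```julia
# `xcorr2` and `grouped` are hypothetical helpers written for this sketch.

# "Valid" 2-D cross-correlation of one matrix with one kernel.
function xcorr2(x::AbstractMatrix, w::AbstractMatrix)
    kh, kw = size(w)
    oh, ow = size(x, 1) - kh + 1, size(x, 2) - kw + 1
    out = zeros(oh, ow)
    for j in 1:ow, i in 1:oh
        out[i, j] = sum(x[i:i+kh-1, j:j+kw-1] .* w)
    end
    return out
end

# x: W × H × Cin, w: kW × kH × (Cin ÷ groups) × Cout (assumed layout).
# Output channel o only sees the input channels belonging to its own group.
function grouped(x, w; groups = 1)
    cin, cout = size(x, 3), size(w, 4)
    @assert cin % groups == 0 && cout % groups == 0
    cin_g, cout_g = cin ÷ groups, cout ÷ groups
    @assert size(w, 3) == cin_g
    kh, kw = size(w, 1), size(w, 2)
    out = zeros(size(x, 1) - kh + 1, size(x, 2) - kw + 1, cout)
    for o in 1:cout
        g = cld(o, cout_g)                  # group that output channel o belongs to
        for k in 1:cin_g
            c = (g - 1) * cin_g + k         # matching input channel within that group
            out[:, :, o] .+= xcorr2(x[:, :, c], w[:, :, k, o])
        end
    end
    return out
end

x = randn(4, 4, 2)
w = randn(2, 2, 1, 2)                       # groups == channels: one filter per channel
out = grouped(x, w; groups = 2)

@assert out[:, :, 1] ≈ xcorr2(x[:, :, 1], w[:, :, 1, 1])
@assert out[:, :, 2] ≈ xcorr2(x[:, :, 2], w[:, :, 1, 2])
```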

AFAICT from looking at the Flux source this is not yet present in Flux (though I remain confused about the channel mapping behavior of DepthwiseConv). The following post from last year mentions grouped convolutions, and I see no sign that it’s been added in the meantime:

For GPU training of grouped convolutions, I think something like the following would be required:

  1. Add support for cudnnSetConvolutionGroupCount to CuArrays.
  2. Plumb this through NNlib.jl, presumably in analogy with what needed to be added for DepthwiseConv https://github.com/FluxML/NNlib.jl/pull/42.
  3. Add an API in Flux to support groups. This might be a groups keyword to Conv, by analogy with other high-level frameworks.
#6

Thank you both very much for the help. Intuitively, I was thinking that separable convolution is the thing I want, but frankly speaking, I was not sure. I am not an expert in convolutions and I am not that interested in CNNs. I just ran into a small problem I am curious about.

#7

The term separable convolution is a well-established term in image processing and means that an N-dimensional kernel is replaced by a sequence of N 1-dimensional kernels, with the equivalent result. This is not possible for kernels in general but does work for a number of practically important ones, not least Gaussians and derivatives of Gaussians.
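
A small plain-Julia sketch of this idea, assuming a 3×3 kernel built as the outer product of a 1-D binomial filter (a common approximation of a Gaussian). The helper `xcorr2` is a made-up name for this example. One 2-D pass and two 1-D passes give the same result:

```julia
# `xcorr2` is a hypothetical helper: "valid" 2-D cross-correlation.
function xcorr2(x::AbstractMatrix, w::AbstractMatrix)
    kh, kw = size(w)
    oh, ow = size(x, 1) - kh + 1, size(x, 2) - kw + 1
    out = zeros(oh, ow)
    for j in 1:ow, i in 1:oh
        out[i, j] = sum(x[i:i+kh-1, j:j+kw-1] .* w)
    end
    return out
end

g = [1.0, 2.0, 1.0] ./ 4        # 1-D binomial filter
K = g * g'                      # 3×3 separable kernel (outer product of g with itself)

img = randn(8, 8)

onepass = xcorr2(img, K)                          # single 2-D pass
cols    = xcorr2(img, reshape(g, :, 1))           # 1-D pass along the first dimension
twopass = xcorr2(cols, reshape(g, 1, :))          # 1-D pass along the second dimension

@assert onepass ≈ twopass
```

For a k×k kernel the two-pass version needs about 2k multiplications per output pixel instead of k², which is where the speed-up comes from.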

From what I can tell this is not what you are looking for.

#8

Yes, it looks like a pair of unrelated convolutions for each channel is desired. That’s definitely not a separable convolution.

One super confusing thing I found reading the NN jargon was that “depthwise convolution” seems to be different from “depthwise separable convolution”, with the latter being a “depthwise convolution” followed by a “1x1 convolution” (i.e., “color mixing”: weights connecting all input channels to all output channels, the same for every pixel).
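
Here is a plain-Julia sketch of that two-step structure, as I understand it. The names `xcorr2`, `wd` and `M` are invented for this example, and the batch dimension is dropped for brevity. The final check shows that the two steps together equal an ordinary convolution whose kernel for each (input, output) channel pair is the per-channel filter scaled by the mixing weight.

```julia
# `xcorr2` is a hypothetical helper: "valid" 2-D cross-correlation.
function xcorr2(x::AbstractMatrix, w::AbstractMatrix)
    kh, kw = size(w)
    oh, ow = size(x, 1) - kh + 1, size(x, 2) - kw + 1
    out = zeros(oh, ow)
    for j in 1:ow, i in 1:oh
        out[i, j] = sum(x[i:i+kh-1, j:j+kw-1] .* w)
    end
    return out
end

x  = randn(5, 5, 3)     # width × height × 3 input channels
wd = randn(2, 2, 3)     # depthwise step: one 2×2 filter per input channel
M  = randn(4, 3)        # 1x1 step: mixing weights, 3 channels in, 4 channels out

# Step 1: depthwise convolution, channel c only sees filter c.
d = cat((xcorr2(x[:, :, c], wd[:, :, c]) for c in 1:3)..., dims = 3)

# Step 2: 1x1 convolution, the same channel-mixing matrix applied at every pixel.
out = zeros(size(d, 1), size(d, 2), size(M, 1))
for o in axes(M, 1), c in axes(M, 2)
    out[:, :, o] .+= M[o, c] .* d[:, :, c]
end

# Equivalent full convolution: kernel for (c, o) is M[o, c] * wd[:, :, c].
ref = sum(xcorr2(x[:, :, c], M[1, c] .* wd[:, :, c]) for c in 1:3)
@assert out[:, :, 1] ≈ ref
```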