Separable convolution

Hi All,

I would like to run separate convolutions with individual filters. Specifically, imagine I have two matrices and I want to convolve each of them with its own filter. I wonder if it is possible to write this as a single convolution.
An MWE would be as follows:

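using Flux

x, y = randn(3, 3, 1, 1), randn(3, 3, 1, 1)            # two single-channel inputs
Cx, Cy = Conv((2, 2), 1 => 1), Conv((2, 2), 1 => 1)    # one independent filter each

xy = cat(x, y, dims = 3)                               # stack into one two-channel input
C = ???                                                # the unknown combined layer
# desired: C(xy) == cat(Cx(x), Cy(y), dims = 3)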

where C = ??? stands for an operator I do not know how to write.

Thanks in advance for any help.

I am not certain what all your notation means. But ImageFiltering.jl has support for separable convolution; you can define the kernel as a tuple of lower-dimensional objects that get applied sequentially. It even does pipeline optimization to improve cache utilization.

Sweet, I guess this is exactly what I am interested in. Of course, I would need compatibility with Flux.

I think what you’re asking for is called a “depthwise convolution” in Flux:

C = DepthwiseConv((2,2), 2=>1)

The 2=>1 is really confusing here, but it seems to map a two-channel input to a two-channel output, with the property that changing x only changes output[:,:,1,:] and changing y only changes output[:,:,2,:].
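To make the per-channel behaviour concrete without relying on any particular Flux version, here is a minimal sketch in plain Julia. The helper `correlate2d` and the function `depthwise` are hypothetical names written for this illustration, not Flux or NNlib API. The sketch uses a "valid" cross-correlation (no kernel flip, no padding) and ignores the bias and batch dimension that a real layer would have.

```julia
# Hypothetical helper: plain 2-D "valid" cross-correlation written with loops.
function correlate2d(img::AbstractMatrix, k::AbstractMatrix)
    kh, kw = size(k)
    out = zeros(size(img, 1) - kh + 1, size(img, 2) - kw + 1)
    for j in axes(out, 2), i in axes(out, 1)
        out[i, j] = sum(img[i:i+kh-1, j:j+kw-1] .* k)
    end
    return out
end

x, y = randn(3, 3), randn(3, 3)
kx, ky = randn(2, 2), randn(2, 2)      # one independent 2x2 filter per channel

xy = cat(x, y, dims = 3)               # 3x3x2 two-channel input
filters = cat(kx, ky, dims = 3)        # 2x2x2: filter c is applied to channel c only

# Depthwise: output channel c depends only on input channel c.
depthwise(input, filters) =
    cat((correlate2d(input[:, :, c], filters[:, :, c]) for c in axes(input, 3))...; dims = 3)

out = depthwise(xy, filters)

# Same result as filtering each matrix on its own and stacking the outputs.
matches = out ≈ cat(correlate2d(x, kx), correlate2d(y, ky), dims = 3)
```

Here `matches` should be `true`, which is exactly the identity asked for in the original question.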

Using ImageFiltering would probably work on the CPU if wrapped in an appropriate layer with the Flux params plumbed through. But I suspect it would be a lot of work to get GPU training working.

In figuring out the above, I ended up in a rabbit hole of neural network jargon and discovered some other things which might be useful to someone in the future. It’s a bit off topic, but having written most of it already I thought I’d post it here in case someone finds it useful.

It seems that the depthwise conv layer is a special case of what’s known as grouped convolutions, where the number of groups equals the number of channels. I gather this terminology has been around for a while but was only just added to tensorflow recently (see

AFAICT from looking at the Flux source this is not yet present in Flux (though I remain confused about the channel mapping behavior of DepthwiseConv). The following post from last year mentions grouped convolutions, and I see no sign that it’s been added in the meantime:

For GPU training of grouped convolutions, I think something like the following would be required:

  1. Add support for cudnnSetConvolutionGroupCount to CuArrays.
  2. Plumb this through NNlib.jl, presumably in analogy with what needed to be added for DepthwiseConv.
  3. Add an API in Flux to support groups. This might be a groups keyword to Conv in analogy to similar high level frameworks.
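As a rough illustration of what a groups option would compute, here is a CPU-only sketch in plain Julia. The function `grouped_correlate` and its `groups` keyword are hypothetical and written for this post; they are not the cuDNN, NNlib, or Flux interface. The weight layout `(kh, kw, Cin ÷ groups, Cout)` is also just an assumption made for the sketch.

```julia
# Hypothetical helper: "valid" cross-correlation with channel groups.
# x has size (height, width, Cin); w has size (kh, kw, Cin ÷ groups, Cout).
function grouped_correlate(x::AbstractArray{<:Real,3}, w::AbstractArray{<:Real,4}; groups::Int = 1)
    height, width, cin = size(x)
    kh, kw, cin_g, cout = size(w)
    @assert cin == cin_g * groups "input channels must equal size(w, 3) * groups"
    @assert cout % groups == 0 "output channels must be divisible by groups"
    cout_g = cout ÷ groups
    out = zeros(height - kh + 1, width - kw + 1, cout)
    for o in 1:cout
        g = (o - 1) ÷ cout_g                           # zero-based group of output channel o
        chans = (g * cin_g + 1):((g + 1) * cin_g)      # input channels visible to that group
        for j in axes(out, 2), i in axes(out, 1)
            out[i, j, o] = sum(x[i:i+kh-1, j:j+kw-1, chans] .* w[:, :, :, o])
        end
    end
    return out
end

x = rand(5, 5, 4)
w = rand(2, 2, 2, 4)                       # 2 groups: each sees 2 input channels
y = grouped_correlate(x, w; groups = 2)

# groups == number of channels gives the depthwise special case.
w_dw = rand(2, 2, 1, 4)
y_dw = grouped_correlate(x, w_dw; groups = 4)
```

With `groups = 2`, output channels 1 and 2 only see input channels 1 and 2, and output channels 3 and 4 only see input channels 3 and 4. With `groups = 1` the function reduces to an ordinary multi-channel convolution.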

Thank you both very much for the help. Intuitively, I was thinking that separable convolution was the thing I wanted, but frankly speaking, I was not sure. I am not an expert in convolutions and I am not that interested in CNNs. I just ran into a small problem I am curious about.

The term separable convolution is a well-established term in image processing and means that an N-dimensional kernel is replaced by a sequence of N 1-dimensional kernels, with the equivalent result. This is not possible for kernels in general but does work for a number of practically important ones, not least Gaussians and derivatives of Gaussians.
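For anyone who wants to see this numerically, here is a small sketch in plain Julia. The helper `correlate2d` is a hypothetical function written for this post (ImageFiltering.jl has its own, far more efficient machinery), and the kernel is a Sobel-type filter chosen only because it is an outer product of two 1-D kernels.

```julia
# Hypothetical helper: plain 2-D "valid" cross-correlation written with loops.
function correlate2d(img::AbstractMatrix, k::AbstractMatrix)
    kh, kw = size(k)
    out = zeros(size(img, 1) - kh + 1, size(img, 2) - kw + 1)
    for j in axes(out, 2), i in axes(out, 1)
        out[i, j] = sum(img[i:i+kh-1, j:j+kw-1] .* k)
    end
    return out
end

col = [1.0, 2.0, 1.0]          # 1-D kernel applied along the first dimension
row = [1.0, 0.0, -1.0]         # 1-D kernel applied along the second dimension
kernel2d = col * row'          # rank-1 (separable) 3x3 kernel

img = rand(8, 8)

# One pass with the full 3x3 kernel ...
full = correlate2d(img, kernel2d)

# ... versus two passes with the 1-D kernels, one per dimension.
twopass = correlate2d(correlate2d(img, reshape(col, :, 1)), reshape(row, 1, :))

separable_ok = full ≈ twopass
```

The two results agree up to floating-point rounding. For a k×k kernel the two-pass version needs about 2k multiplications per pixel instead of k², which is where the speed-up comes from.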

From what I can tell this is not what you are looking for.

Yes, it looks like a pair of unrelated convolutions for each channel is desired. That’s definitely not a separable convolution.

One super confusing thing I found reading the NN jargon was that “depthwise convolution” seems to be different from “depthwise separable convolution”, with the latter being a “depthwise convolution” followed by a “1x1 convolution” (i.e., “color mixing”: weights connecting all input channels to all output channels, the same for every pixel).
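A sketch of that two-step structure in plain Julia, under the same assumptions as before: `correlate2d` is a hypothetical helper, the sizes are arbitrary, and bias terms and the batch dimension are left out.

```julia
# Hypothetical helper: plain 2-D "valid" cross-correlation written with loops.
function correlate2d(img::AbstractMatrix, k::AbstractMatrix)
    kh, kw = size(k)
    out = zeros(size(img, 1) - kh + 1, size(img, 2) - kw + 1)
    for j in axes(out, 2), i in axes(out, 1)
        out[i, j] = sum(img[i:i+kh-1, j:j+kw-1] .* k)
    end
    return out
end

cin, cout = 3, 4
x  = rand(6, 6, cin)        # height x width x input channels
dw = rand(3, 3, cin)        # depthwise step: one 3x3 filter per input channel
pw = rand(cin, cout)        # pointwise step: 1x1 convolution, i.e. a channel-mixing matrix

# Step 1: depthwise convolution, each channel filtered on its own.
hidden = cat((correlate2d(x[:, :, c], dw[:, :, c]) for c in 1:cin)...; dims = 3)

# Step 2: 1x1 convolution, the same cin-by-cout mixing applied at every pixel.
h, w = size(hidden, 1), size(hidden, 2)
out = reshape(reshape(hidden, h * w, cin) * pw, h, w, cout)
```

In this example the two steps use 3·3·3 + 3·4 = 39 weights, whereas a full 3x3 convolution from 3 to 4 channels would use 3·3·3·4 = 108.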