I would like to run separate convolutions, each with its own filter. Specifically, imagine I have two matrices and I want to convolve each of them with its own filter. I wonder whether it is possible to write this as a single convolution.
An MWE would be as follows:
x,y = randn(3,3,1,1),randn(3,3,1,1)
Cx, Cy = Conv((2,2), 1 => 1), Conv((2,2), 1 => 1)
xy = cat(x, y, dims = 3)
C = ???
C(xy) = cat(Cx(x), Cy(y), dims = 3)
C = ??? stands for the operator I do not know how to write; the last line states the property it should satisfy.
Thanks in advance for any help.
I am not certain what all your notation means. But ImageFiltering.jl has support for separable convolution; you can define the kernel as a tuple of lower-dimensional objects that get applied sequentially. It even does pipeline optimization to improve cache utilization.
Sweet, I guess this is exactly what I am interested in. Of course, I would need compatibility with Flux.
I think what you’re asking for is called a “depthwise convolution” in Flux:
C = DepthwiseConv((2,2), 2=>1)
2=>1 is really confusing here, but it seems to map a two-channel input to a two-channel output with the property that changing x only changes output[:,:,1,:], and changing y only changes output[:,:,2,:].
ImageFiltering would probably work on the CPU if wrapped in an appropriate layer with the flux params plumbed through. But I suspect it would be a lot of work to get GPU training working.
In figuring out the above, I ended up in a rabbit hole of neural network jargon and discovered some other things which might be useful to someone in the future. It’s a bit off topic, but having written most of it already I thought I’d post it here in case someone finds it useful.
It seems that the depthwise conv layer is a special case of what’s known as grouped convolutions, where the number of groups equals the number of channels. I gather this terminology has been around for a while but was only just added to tensorflow recently (see https://github.com/tensorflow/tensorflow/pull/25818).
AFAICT from looking at the Flux source this is not yet present in Flux (though I remain confused about the channel mapping behavior of DepthwiseConv). The following post from last year mentions grouped convolutions, and I see no sign that it’s been added in the meantime:
For GPU training of grouped convolutions, I think something like the following would be required:
- Add support for
- Plumb this through NNlib.jl, presumably in analogy with what needed to be added for
- Add an API in Flux to support groups. This might be a groups keyword to Conv, in analogy to similar high-level frameworks.
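For anyone reading this later: as far as I know, more recent Flux releases did gain a `groups` keyword on `Conv`. Assuming such a version (not the Flux discussed above), the operator from the original question could be sketched by stacking the two filters along the output-channel dimension. The `weight` and `bias` field names are those of recent Flux versions and may differ in older ones.

```julia
using Flux

# Two independent single-channel layers, as in the original question.
x, y = randn(Float32, 3, 3, 1, 1), randn(Float32, 3, 3, 1, 1)
Cx, Cy = Conv((2, 2), 1 => 1), Conv((2, 2), 1 => 1)

# Stack the two filters along the output-channel dimension (dims = 4).
# With groups = 2 the weight array has size (2, 2, 1, 2): each output
# channel is connected to exactly one input channel.
W = cat(Cx.weight, Cy.weight; dims = 4)
b = vcat(Cx.bias, Cy.bias)

# Assumption: a Flux version whose Conv constructor accepts `groups`.
C = Conv(W, b; groups = 2)

xy = cat(x, y; dims = 3)
@assert C(xy) ≈ cat(Cx(x), Cy(y); dims = 3)
```

Because `W` is built from the existing weights, `C` reproduces `Cx` and `Cy` at construction time only; training `C` afterwards does not update the original two layers.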
Thank you both very much for the help. Intuitively, I was thinking that separable convolution was the thing I wanted, but frankly speaking, I was not sure. I am not an expert in convolutions and I am not that interested in CNNs. I just ran into a small problem I am curious about.
Separable convolution is a well-established term in image processing. It means that an N-dimensional kernel is replaced by a sequence of N 1-dimensional kernels, with an equivalent result. This is not possible for kernels in general but does work for a number of practically important ones, not least Gaussians and derivatives of Gaussians.
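To make the definition concrete, here is a minimal sketch in plain Julia, with no packages. The helper `corr2` and the example kernels `u` and `v` are made up for illustration and are not part of ImageFiltering.jl. It checks that one pass with a 2-D kernel formed as an outer product gives the same result as two 1-D passes.

```julia
# Plain "valid" 2-D correlation, written out for clarity rather than speed.
function corr2(img::AbstractMatrix, k::AbstractMatrix)
    m, n = size(k)
    out = zeros(size(img, 1) - m + 1, size(img, 2) - n + 1)
    for j in axes(out, 2), i in axes(out, 1)
        out[i, j] = sum(img[i:i+m-1, j:j+n-1] .* k)
    end
    return out
end

u = [1.0, 2.0, 1.0]     # 1-D kernel applied along the first dimension
v = [1.0, 0.0, -1.0]    # 1-D kernel applied along the second dimension
K = u * v'              # 3×3 kernel that is separable by construction

img = randn(6, 6)
onepass = corr2(img, K)                                          # single 2-D pass
twopass = corr2(corr2(img, reshape(u, :, 1)), reshape(v, 1, :))  # two 1-D passes
@assert onepass ≈ twopass
```

The two-pass version touches 3 + 3 kernel entries per output pixel instead of 3 × 3, which is where the speed-up for larger kernels comes from.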
From what I can tell this is not what you are looking for.
Yes, it looks like a pair of unrelated convolutions for each channel is desired. That’s definitely not a separable convolution.
One super confusing thing I found reading the NN jargon was that “depthwise convolution” seems to be different from “depthwise separable convolution”, with the latter being a “depthwise convolution” followed by a “1x1 convolution” (i.e., “color mixing”: weights connecting all input channels to all output channels, the same for every pixel).
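A small sketch of the distinction, again assuming a recent Flux release where `Conv` accepts a `groups` keyword; the layer names and the choice of three output channels are arbitrary and only for illustration.

```julia
using Flux

# Depthwise: one 2×2 filter per input channel, no mixing between channels.
depthwise = Conv((2, 2), 2 => 2; groups = 2)
# Pointwise (1x1): mixes channels with the same weights at every pixel.
pointwise = Conv((1, 1), 2 => 3)
# "Depthwise separable" convolution: depthwise followed by pointwise.
separable = Chain(depthwise, pointwise)

xy = randn(Float32, 3, 3, 2, 1)

# Perturb only the first input channel.
xy2 = copy(xy)
xy2[:, :, 1, :] .+= 1

d1, d2 = depthwise(xy), depthwise(xy2)
@assert d1[:, :, 2, :] ≈ d2[:, :, 2, :]     # second output channel is unaffected
@assert size(separable(xy)) == (2, 2, 3, 1) # channels are mixed into 3 outputs
```

Only the depthwise layer on its own keeps the channels independent, which is the behaviour asked for at the top of the thread; adding the 1x1 step couples them again.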