Have you checked out Knet.jl?
By "pure convolution" I mean only forward convolution, i.e. applying a filter to an image, but not finding its gradient or applying pooling.
Any time you're thinking about multichannels, what you're really doing is filtering "vector" objects over a spatial domain. But most languages don't have nice ways of storing vector objects packed densely in an array, so they tend to use another array axis to encode the channel information. Even though it's very standard, it's unfortunately confusing and messy, because you're now having to break what would otherwise be a very nice and natural abstraction (that the axes of the array correspond to "space" and that you're filtering over space).
In Julia it's easy to do this explicitly:
julia> a = rand(4, 5, 5);
julia> using StaticArrays
julia> b = reinterpret(SVector{4,Float64}, a, (5, 5))
5×5 Array{SVector{4,Float64},2}:
[0.144815, 0.772607, 0.872914, 0.908799] [0.633158, 0.0525735, 0.280742, 0.902825] … [0.896061, 0.407699, 0.380729, 0.0341831] [0.755399, 0.0471598, 0.663326, 0.601342]
[0.549754, 0.690097, 0.220682, 0.0802711] [0.28825, 0.994905, 0.805212, 0.89021] [0.743983, 0.966301, 0.18642, 0.702578] [0.286288, 0.378818, 0.836149, 0.543906]
[0.657729, 0.508657, 0.881363, 0.785144] [0.512295, 0.714254, 0.729959, 0.273817] [0.169743, 0.525499, 0.949791, 0.376547] [0.225779, 0.494444, 0.41264, 0.851775]
[0.00444584, 0.933053, 0.187804, 0.500937] [0.355278, 0.154059, 0.011323, 0.514391] [0.197709, 0.550803, 0.649878, 0.111545] [0.871994, 0.446872, 0.859054, 0.238003]
[0.652901, 0.486696, 0.952878, 0.0807595] [0.0998179, 0.281748, 0.477665, 0.74011] [0.26828, 0.278152, 0.925838, 0.386637] [0.400142, 0.0622027, 0.291799, 0.561965]
julia> using ImageFiltering
julia> kern = centered(rand(3,3))
OffsetArrays.OffsetArray{Float64,2,Array{Float64,2}} with indices -1:1×-1:1:
0.994265 0.895043 0.577934
0.381196 0.874816 0.395609
0.519834 0.999507 0.371597
julia> imfilter(b, kern)
5×5 Array{SVector{4,Float64},2}:
[2.01427, 3.89947, 3.65341, 4.19016] [2.93492, 2.8359, 3.76376, 4.5498] … [4.63843, 2.33211, 3.43263, 2.63515] [4.08414, 1.71187, 3.5868, 2.80771]
[2.63374, 3.78868, 4.01752, 3.98642] [3.09131, 3.54901, 3.96213, 4.09826] [3.7568, 2.84139, 3.55824, 3.13144] [3.04872, 2.27554, 3.54844, 3.24217]
[2.3728, 4.27511, 2.56762, 2.71285] [2.79171, 3.80449, 2.87623, 3.18184] [2.89213, 3.54058, 3.93211, 3.17953] [2.81132, 3.24626, 3.81895, 3.10968]
[2.71393, 3.45083, 3.95264, 2.87203] [2.53135, 2.75678, 3.22067, 3.22934] [2.23564, 2.67726, 3.78592, 2.74956] [2.3726, 2.25835, 3.77262, 2.94542]
[2.10234, 3.41877, 3.3724, 2.03566] [1.3769, 2.29489, 2.89553, 2.85193] [1.948, 1.63945, 4.43804, 2.11742] [2.77972, 1.62081, 3.51652, 2.2943]
Sure, but there's also ReverseDiff. In Julia you shouldn't have to do anything "special": the generic implementation for "plain convolution/correlation" should work as long as you define your kernel in terms of the right kinds of numbers. You can literally save tens of thousands of lines of code that way.
I've thought about this, but the output of deeper networks often has too many channels (e.g. a 32x32x2048 array), which means the vector object would be 2048 elements long; I guess it's not a good idea to fit that into a StaticVector.
PS: For people who may not be familiar with deep learning networks, here is a visualization app called netscope in which you can check the dimensions of the input and output of each layer. Popular networks are listed at the end of the page.
Makes sense. In that case, construct your filter so that it has size 1 along the axes that you don't want to filter along:
julia> using ImageFiltering, OffsetArrays
julia> A = zeros(5, 7); A[:,1] = 1:5;
julia> kern = OffsetArray(rand(1,3), 0:0, -2:0)
OffsetArrays.OffsetArray{Float64,2,Array{Float64,2}} with indices 0:0×-2:0:
0.0251132 0.155058 0.0603386
julia> imfilter(A, kern, Fill(0))
5×7 Array{Float64,2}:
0.0603386 0.155058 0.0251132 0.0 0.0 0.0 0.0
0.120677 0.310116 0.0502264 0.0 0.0 0.0 0.0
0.181016 0.465174 0.0753396 0.0 0.0 0.0 0.0
0.241354 0.620232 0.100453 0.0 0.0 0.0 0.0
0.301693 0.77529 0.125566 0.0 0.0 0.0 0.0
This is exactly the point - ReverseDiff currently doesn't support convolutions:
julia> using ReverseDiff
INFO: Precompiling module ReverseDiff.
julia> f1(x) = sum(2 .* x)
f1 (generic function with 1 method)
julia> ReverseDiff.gradient(f1, (rand(10,)))
10-element Array{Float64,1}:
2.0
2.0
2.0
2.0
2.0
2.0
2.0
2.0
2.0
2.0
julia> f2(x, w) = sum(conv2(x, w))
f2 (generic function with 1 method)
julia> ReverseDiff.gradient(f2, (rand(11, 11), rand(3,3)))
ERROR: MethodError: no method matching conv2(::ReverseDiff.TrackedArray{Float64,Float64,2,Array{Float64,2},Array{Float64,2}}, ::ReverseDiff.TrackedArray{Float64,Float64,2,Array{Float64,2},Array{Float64,2}})
Stacktrace:
[1] f2(::ReverseDiff.TrackedArray{Float64,Float64,2,Array{Float64,2},Array{Float64,2}}, ::ReverseDiff.TrackedArray{Float64,Float64,2,Array{Float64,2},Array{Float64,2}}) at ./REPL[5]:1
[2] ReverseDiff.GradientTape(::Function, ::Tuple{Array{Float64,2},Array{Float64,2}}, ::ReverseDiff.GradientConfig{Tuple{ReverseDiff.TrackedArray{Float64,Float64,2,Array{Float64,2},Array{Float64,2}},ReverseDiff.TrackedArray{Float64,Float64,2,Array{Float64,2},Array{Float64,2}}}}) at /home/dfdx/.julia/v0.6/ReverseDiff/src/api/tape.jl:207
[3] gradient(::Function, ::Tuple{Array{Float64,2},Array{Float64,2}}, ::ReverseDiff.GradientConfig{Tuple{ReverseDiff.TrackedArray{Float64,Float64,2,Array{Float64,2},Array{Float64,2}},ReverseDiff.TrackedArray{Float64,Float64,2,Array{Float64,2},Array{Float64,2}}}}) at /home/dfdx/.julia/v0.6/ReverseDiff/src/api/gradients.jl:22 (repeats 2 times)
To make it work, one needs to add a definition of the gradient of convolution to ReverseDiff. And to do so, one at least needs a function that computes that gradient given the input parameters. If somebody wants to use ReverseDiff for deep learning, the same functions (convolution and its gradient) should also be defined for the GPU (presumably, by dispatching on the array type). Summarize all this and you get the first post in this topic.
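For the 2-d "valid" case, such a hand-written gradient is short. Here is a minimal sketch (the names `corr2_valid` and `grad_kernel` are made up for illustration) of a direct correlation and the gradient of `sum(corr2_valid(x, w))` with respect to the kernel `w`:

```julia
# Hypothetical sketch (corr2_valid / grad_kernel are made-up names):
# a "valid"-size 2-d correlation, and the hand-derived gradient of
# sum(corr2_valid(x, w)) with respect to the kernel w.
function corr2_valid(x::AbstractMatrix, w::AbstractMatrix)
    m, n = size(w)
    [sum(x[i:i+m-1, j:j+n-1] .* w)
     for i in 1:size(x,1)-m+1, j in 1:size(x,2)-n+1]
end

# Each kernel entry w[p,q] multiplies x[i+p-1, j+q-1] at every output
# position (i,j), so d(sum)/d(w[p,q]) is the sum of the matching sub-block of x.
function grad_kernel(x::AbstractMatrix, w::AbstractMatrix)
    m, n = size(w)
    [sum(x[p:p+size(x,1)-m, q:q+size(x,2)-n]) for p in 1:m, q in 1:n]
end

# Finite-difference check: sum(corr2_valid(x, w)) is linear in w,
# so the difference quotient matches the gradient entry exactly.
x, w = rand(5, 5), rand(3, 3)
g = grad_kernel(x, w)
h = 1e-6
wp = copy(w); wp[1, 1] += h
@assert isapprox((sum(corr2_valid(x, wp)) - sum(corr2_valid(x, w))) / h,
                 g[1, 1]; atol = 1e-4)
```

The GPU version would dispatch the same function names on the array type, as suggested above.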
Seems to work for me:
julia> using ReverseDiff, ImageFiltering
julia> f2(x, w) = sum(imfilter(x, (w,)))
f2 (generic function with 1 method)
julia> g = ReverseDiff.gradient(f2, (rand(11, 11), rand(3,3)))
([0.0 0.0 ⌠0.0 0.0; 0.0 0.67748 ⌠1.73115 4.81457; ⌠; 0.0 2.58537 ⌠5.78043 16.352; 0.0 8.06819 ⌠17.9961 50.9701], [61.2169 61.5058 62.326; 61.794 61.3385 61.8287; 63.5663 62.6786 62.9682])
julia> g[1]
11×11 Array{Float64,2}:
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.67748 1.43255 1.73115 1.73115 1.73115 1.73115 1.73115 1.73115 1.73115 4.81457
0.0 1.59582 2.35881 3.3945 3.3945 3.3945 3.3945 3.3945 3.3945 3.3945 9.62338
0.0 2.58537 4.18432 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 16.352
0.0 2.58537 4.18432 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 16.352
0.0 2.58537 4.18432 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 16.352
0.0 2.58537 4.18432 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 16.352
0.0 2.58537 4.18432 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 16.352
0.0 2.58537 4.18432 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 16.352
0.0 2.58537 4.18432 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 5.78043 16.352
0.0 8.06819 12.9459 17.9961 17.9961 17.9961 17.9961 17.9961 17.9961 17.9961 50.9701
julia> g[2]
3×3 Array{Float64,2}:
61.2169 61.5058 62.326
61.794 61.3385 61.8287
63.5663 62.6786 62.9682
EDIT: might want to fix up ReverseDiff to handle arbitrary indices, though.
Could you elaborate more on this? Specifically, how to avoid calling imfilter repeatedly in the loops below:
A = rand(32,32,128)
ker = rand(3,3,128,256)
B = zeros(32,32,256)
for kn = 1:256
for cn = 1:128
kern = OffsetArray(ker[:,:,cn,kn], -1:1, -1:1)
B[:,:,kn] += imfilter(A[:,:,cn], kern, Fill(0))
end
end
Oh, cool example. Sorry, I wasn't careful enough in reading that link you posted earlier; now I understand better that it's not multichannel but multi-filter.
But the same principles still apply. Since you're using multiple filters I can't get rid of the outer loop, but the cn loop is just a dot product. So one can do this:
using ImageFiltering, OffsetArrays
# Define a type that supports a*b, where * is meant in the sense of a dot product
struct Scalar{N,T}
v::NTuple{N,T}
end
function Base.:*(a::Scalar{N}, b::Scalar{N}) where N
ret = a.v[1]*b.v[1]
for i = 2:N
ret += a.v[i]*b.v[i]
end
ret
end
Base.@pure Base.zero(::Type{Scalar{N,T}}) where {N,T} =
Scalar{N,T}((zeros(T,N)...))
# Here's one ugly tweak (safe_for_prod is about handling arithmetic overflow, e.g., 8-bit pixels)
ImageFiltering.safe_for_prod(x::Scalar{N,Float64}, ref) where N = x
A = reinterpret(Scalar{128,Float64}, rand(128,32,32), (32, 32))
kerA = reinterpret(Scalar{128,Float64}, rand(128,3,3,256), (3,3,256))
ker = OffsetArray(kerA, -1:1, -1:1, 1:256)
B = zeros(32,32,256)
tmp = similar(B, 32, 32)
for kn = 1:256
imfilter!(view(B, :, :, kn), A, (view(ker, :, :, kn),), Fill(zero(eltype(A))))
end
I would be fine with adding Scalar to ImageFiltering or elsewhere.
I had to make a one-character change to ImageFiltering to fix a type bug in padding (I'll submit it in the next day or so), but that was it.
CUDNN is actually already fixed to work with CUDAdrv on the latest release. If you want to try hooking up convolutions or anything else, you'd be more than welcome. See softmax for a working example. More refactoring will happen but it doesn't have to block this.
Yeah, using ReverseDiff with imfilter as-is like this is definitely not a good idea. It will work numerically, but only by differentiating through tens of millions of scalar operations; even a moderately sized image (500x500) takes minutes for me. If you compile the tape it will do better, but still be handily beaten by a gradient kernel.
All this is somewhat moot given that AD can't differentiate GPU kernels or library calls; convolutions will certainly need to use NVIDIA's library and will therefore need an explicit gradient.
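For anyone reproducing the tape-compilation step, the workflow is roughly the following (the function `f` and the sizes here are placeholders, not the exact setup that produced the timing shown later in the thread):

```julia
using ReverseDiff

# Minimal compiled-tape sketch (f and the sizes are placeholders).
f(w) = sum(w .* w)                       # gradient is 2w
w0   = rand(3, 3)

# Record the operations once, then compile the tape for faster replay.
tape          = ReverseDiff.GradientTape(f, w0)
compiled_tape = ReverseDiff.compile(tape)

# Replay the compiled tape on (possibly new) inputs of the same shape.
res = similar(w0)
ReverseDiff.gradient!(res, compiled_tape, w0)   # res now holds 2 .* w0
```

Compilation avoids re-recording the operation graph on every call, which is where much of the speedup over the naive `ReverseDiff.gradient` call comes from.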
We could rename the package ArrayFiltering or ArrayConvolution if you think thatâs better.
Please no! The field "image processing" handles d-dimensional signals. So the name Image has a general meaning here. It also structures various packages quite nicely:
http://juliaimages.github.io/latest/
Please do not rename one individual package when it really belongs to the "Image Toolbox" that Images.jl provides (as a metapackage).
But wait, it is you who created that package family :-)))
@MikeInnes, thanks for chiming in. FWIW, for a 500x500 image I get something that's a lot better, if I compile and ask only for the gradient with respect to the kernel:
julia> @time ReverseDiff.gradient!(res, compiled_f_tape, kern)
0.468191 seconds (4 allocations: 160 bytes)
3×3 Array{Float64,2}:
1.24595e5 1.24343e5 1.24098e5
124347.0 124095.0 1.23851e5
124089.0 1.23838e5 1.23594e5
But you're absolutely right, that does still seem slow given that the raw filtering operation is only a few ms and there are only 9 components. So, there does seem to be room for a hand-computed gradient. Fortunately for convolution that's easy, though for stuff that comes after it could get a little hairier.
Sweet! I definitely will.
Hello,
Is Intel IPP integrated as well?
Since Intel MKL is integrated into Julia, the next step would be integrating Intel IPP.
FYI, there is a fast ND convolution package:
https://github.com/aamini/FastConv.jl
There is also https://github.com/uschmidt83/ImageProcessing.jl which has a Julia im2col implementation.
FastConv is nice if you want something that grows the array by the kernel size (like conv2, something I've never gotten around to implementing in ImageFiltering). Otherwise, though, imfilter is both faster and more general:
A = rand(1000,1000);
k = rand(3,3);
julia> @btime convn(A, k);
15.451 ms (20 allocations: 7.66 MiB)
julia> @btime imfilter(A, (k,));
12.473 ms (44 allocations: 15.32 MiB)
julia> B = view(A, 1:900, 1:900);
julia> @btime imfilter(B, (k,));
10.139 ms (44 allocations: 12.42 MiB)
julia> @btime convn(B, (k,));
ERROR: MethodError: no method matching convn(::SubArray{Float64,2,Array{Float64,2},Tuple{UnitRange{Int64},UnitRange{Int64}},false}, ::Tuple{Array{Float64,2}})
Just a quick note: I will put together an FFT-based implementation as soon as I find the time; I haven't forgotten this topic. Also, it seems like it's still not clear which package will ultimately collect all these implementations?
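For concreteness, here is a minimal sketch of what such an FFT-based implementation could look like (`conv_fft` is a made-up name; this computes the "full"-size convolution like conv2 did):

```julia
using FFTW  # on Julia 0.6, fft/ifft live in Base instead

# Hypothetical sketch (conv_fft is a made-up name): "full"-size 2-d linear
# convolution computed by zero-padding both arrays to size(a) .+ size(k) .- 1
# and multiplying pointwise in the frequency domain.
function conv_fft(a::AbstractMatrix{<:Real}, k::AbstractMatrix{<:Real})
    sz = size(a) .+ size(k) .- 1
    pa = zeros(sz); pa[1:size(a, 1), 1:size(a, 2)] = a
    pk = zeros(sz); pk[1:size(k, 1), 1:size(k, 2)] = k
    real.(ifft(fft(pa) .* fft(pk)))
end
```

Padding to the full linear-convolution size avoids circular wrap-around; for large kernels the O(N log N) cost of the transforms wins over direct filtering.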