I wish to fastly convert the image blocks to columns. Such a function is called im2col in Matlab. I was wondering if there is any off-the-shelf function in Julia that does the some job.
We have a solution that IMO is considerably better: mapwindow
in the ImageFiltering package. The reason I say it is better is that it doesn’t force you to allocate an array that is much bigger than your image. Here are the first few lines of its docstring (?mapwindow
from the REPL prompt):
julia> using ImageFiltering
help?> mapwindow
search: mapwindow mapwindow!mapwindow(f, img, window, [border=“replicate”], [imginds=axes(img)]) → imgf
Apply
f
to sliding windows ofimg
, with window size or axes specified bywindow
. For example,mapwindow(median!, img, window)
returns an Array of values similar toimg
(median-filtered, of course), whereasmapwindow(extrema, img, window)
returns an Array of(min,max)
tuples over a window of sizewindow
centered on each point ofimg
.The function
f
receives a bufferbuf
for the window of data surrounding the current point. If window is specified as a Dims-tuple (tuple-of-integers), then all the integers must be odd and the window is centered around the current image point. For example, ifwindow=(3,3)
, thenf
will receive an Arraybuf
corresponding to offsets(-1:1, -1:1)
from theimgf[i,j]
for which this is currently being computed. Alternatively, window can be a tuple of AbstractUnitRanges, in which case the specified ranges are used forbuf
; this allows you to use asymmetric windows if needed.
(There’s more to it, but this probably gives you the idea.)
By default the window preserves its shape, which can be handy for some operations. If you don’t want that, then notice the part about specializing default_shape
.
@tim.holy, I think what you described is similar to MATLAB’s nlfilter
(Which in many cases could be replicated with more efficient colfilt
).
But sometimes you need to change the layout in memory for more efficient memory access (Caching) and easier vectorization.
Does mapwindow
do that?
Is there explicit equivalent of im2col
(And col2im
)?
Yes, that looks like roughly the same thing, except theirs doesn’t support different boundary conditions or asymmetric windows. Also, ours is fast either for passing a multidimensional array (as in nlfilter
) or a column (as in colfilt
).
Also, note that “easier vectorization” isn’t particularly relevant for Julia, it’s only important for slow languages. But I agree the cache-efficiency might be important in some applications, so I can see why you’d want this.
Fortunately, it comes for free via composition (note: example on julia 0.6, there seems to be some issue with 1.0, I’ll look into it):
julia> using ImageFiltering, StaticArrays
julia> A = collect(reshape(linspace(0, 1, 15), 3, 5))
3Ă—5 Array{Float64,2}:
0.0 0.214286 0.428571 0.642857 0.857143
0.0714286 0.285714 0.5 0.714286 0.928571
0.142857 0.357143 0.571429 0.785714 1.0
julia> out = similar(A, SVector{4,eltype(A)})
3Ă—5 Array{StaticArrays.SArray{Tuple{4},Float64,1,4},2}:
[6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310] [6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310] … [6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310] [6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310]
[6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310] [6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310] [6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310] [6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310]
[6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310] [6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310] [6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310] [6.91108e-310, 6.91108e-310, 6.91108e-310, 6.91108e-310]
julia> mapwindow!(identity, out, A, (0:1, 0:1))
3Ă—5 Array{StaticArrays.SArray{Tuple{4},Float64,1,4},2}:
[0.0, 0.0714286, 0.214286, 0.285714] [0.214286, 0.285714, 0.428571, 0.5] [0.428571, 0.5, 0.642857, 0.714286] [0.642857, 0.714286, 0.857143, 0.928571] [0.857143, 0.928571, 0.857143, 0.928571]
[0.0714286, 0.142857, 0.285714, 0.357143] [0.285714, 0.357143, 0.5, 0.571429] [0.5, 0.571429, 0.714286, 0.785714] [0.714286, 0.785714, 0.928571, 1.0] [0.928571, 1.0, 0.928571, 1.0]
[0.142857, 0.142857, 0.357143, 0.357143] [0.357143, 0.357143, 0.571429, 0.571429] [0.571429, 0.571429, 0.785714, 0.785714] [0.785714, 0.785714, 1.0, 1.0] [1.0, 1.0, 1.0, 1.0]
If you like the plain-old im2col
output it’s just
julia> i2c = reinterpret(Float64, out, (4, 15))
4Ă—15 Array{Float64,2}:
0.0 0.0714286 0.142857 0.214286 0.285714 0.357143 0.428571 0.5 0.571429 0.642857 0.714286 0.785714 0.857143 0.928571 1.0
0.0714286 0.142857 0.142857 0.285714 0.357143 0.357143 0.5 0.571429 0.571429 0.714286 0.785714 0.785714 0.928571 1.0 1.0
0.214286 0.285714 0.357143 0.428571 0.5 0.571429 0.642857 0.714286 0.785714 0.857143 0.928571 1.0 0.857143 0.928571 1.0
0.285714 0.357143 0.357143 0.5 0.571429 0.571429 0.714286 0.785714 0.785714 0.928571 1.0 1.0 0.928571 1.0 1.0
Note we get one column per input pixel. If you want something like Matlab’s version which only returns columns corresponding to interior windows, use Inner()
for the border:
julia> out = similar(A, SVector{4,eltype(A)}, (2,4))
2Ă—4 Array{StaticArrays.SArray{Tuple{4},Float64,1,4},2}:
[0.0, 0.0, 0.0, 0.0] [0.0, 0.0, 0.0, 0.0] [0.0, 0.0, 0.0, 0.0] [0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0] [0.0, 0.0, 0.0, 0.0] [0.0, 0.0, 0.0, 0.0] [0.0, 0.0, 0.0, 0.0]
julia> mapwindow!(identity, out, A, (0:1, 0:1), Inner())
2Ă—4 Array{StaticArrays.SArray{Tuple{4},Float64,1,4},2}:
[0.0, 0.0714286, 0.214286, 0.285714] [0.214286, 0.285714, 0.428571, 0.5] [0.428571, 0.5, 0.642857, 0.714286] [0.642857, 0.714286, 0.857143, 0.928571]
[0.0714286, 0.142857, 0.285714, 0.357143] [0.285714, 0.357143, 0.5, 0.571429] [0.5, 0.571429, 0.714286, 0.785714] [0.714286, 0.785714, 0.928571, 1.0]
julia> i2c = reinterpret(Float64, out, (4, 8))
4Ă—8 Array{Float64,2}:
0.0 0.0714286 0.214286 0.285714 0.428571 0.5 0.642857 0.714286
0.0714286 0.142857 0.285714 0.357143 0.5 0.571429 0.714286 0.785714
0.214286 0.285714 0.428571 0.5 0.642857 0.714286 0.857143 0.928571
0.285714 0.357143 0.5 0.571429 0.714286 0.785714 0.928571 1.0
If this is something you do a lot, then of course you should wrap it in a function.
As is hopefully clear, the Julia approach has advantages in terms of flexibility:
- you can have as many output “windows” as there are pixels in the original image (or not, it’s up to you). This can make alignment of the input and the result of some computation considerably easier.
- you can have the output keep each column as an inner vector (or matrix, if you used
SMatrix{2,2}
instead ofSVector{4}
) and preserve the shape of the original image, or you can “flatten” into a matrix representation. -
mapwindow
is also more general than Matlab’ssliding
vsdistinct
by allowing you to specify the output indices you want (see theimginds
argument)
I had no doubt Julia’s implementation will be more efficient and flexible.
Yet when I was talking about Vectorization I meant the ability of the compiler to vectorize the code (Yield SIMD code).
I’m not an expert but I think it is easier for the compiler in cases the data (Image patch) is contiguous in memory.
Hence sometimes im2col
approach, though memory intensive, is more efficient (In the run time sense).
Well, like tools, you better have them both and use them properly based on the task.
By the way, the mapwindow
funciton screams for something like ISPC (SPMD Programming) which can basically solve all problems.
What do you think?
The nice thing about Julia is that you can implement your own algorithm quite easily and it turns out to be also fast. Here is a very rough implementation of MATLAB’s im2col(A, [m,n])
in Julia:
julia> function im2col(A, m,n) # mxn: block_size
M,N = size(A)
mc = M-m+1 # no. vertical blocks
nc = N-n+1 # no. horizontal blocks
B = Array{eltype(A)}(undef, m*n, mc*nc)
for j = 1:nc
for i = 1:mc
@views block = A[i:i+m-1, j:j+n-1]
for k=1:m*n
B[k,(j-1)*mc+i] = block[k]
end
end
end
B
end
im2col (generic function with 1 method)
julia> A = reshape(1:64,8,:) # A: 8x8
8Ă—8 reshape(::UnitRange{Int64}, 8, 8) with eltype Int64:
1 9 17 25 33 41 49 57
2 10 18 26 34 42 50 58
3 11 19 27 35 43 51 59
4 12 20 28 36 44 52 60
5 13 21 29 37 45 53 61
6 14 22 30 38 46 54 62
7 15 23 31 39 47 55 63
8 16 24 32 40 48 56 64
julia> @btime B = im2col(A, 2,2)
882.021 ns (1 allocation: 1.77 KiB)
4Ă—49 Array{Int64,2}:
1 2 3 4 5 6 7 9 10 11 12 13 14 15 … 41 42 43 44 45 46 47 49 50 51 52 53 54 55
2 3 4 5 6 7 8 10 11 12 13 14 15 16 42 43 44 45 46 47 48 50 51 52 53 54 55 56
9 10 11 12 13 14 15 17 18 19 20 21 22 23 49 50 51 52 53 54 55 57 58 59 60 61 62 63
10 11 12 13 14 15 16 18 19 20 21 22 23 24 50 51 52 53 54 55 56 58 59 60 61 62 63 64
Got it, yes, usage of SIMD could still use improvement. It’s a hard problem, but worthy.
If we can build “Sub Compiler” like ISPC for Julia you’ll get SIMD + Multi Threading for free.
It’s not an easy task as one can read through the original developer blog - Matt Pharr - The Story of ISPC (Really worth reading).
And if you’re brave enough, here is a version (quite cryptic though) that’s 80X faster than MATLAB:
julia> using BenchmarkTools
julia> @inline function im2col(A, m,n) # mxn: block_size
M,N = size(A)
B = Array{eltype(A)}(undef, m*n, (M-m+1)*(N-n+1))
indx = reshape(1:M*N, M,N)[1:M-m+1,1:N-n+1]
for (i,value) in enumerate(indx)
for j = 0:n-1
@views B[(i-1)*m*n+j*m+1:(i-1)m*n+(j+1)m] = A[value+j*M:value+m-1+j*M]
end
end
B
end
im2col (generic function with 1 method)
julia> A = reshape(1:8^2,8,:); # 8x8
julia> @btime im2col($A, 2,2);
364.194 ns (2 allocations: 2.30 KiB)
And here is the MATLAB version (timings taken from MATLAB profiler):
I think your function is somehow getting an unfair advantage from operating on a reshaped range. On honest arrays, a generalized version of @tim.holy’s code is quite competitive (on my modest hardware, at least):
function im2col2(A,m,n)
sz = size(A)
out = similar(A, SVector{m*n,eltype(A)}, sz .- 1)
mapwindow!(identity, out, A, (0:m-1, 0:n-1), Inner())
nout = prod(size(out))
i2c = reshape(reinterpret(eltype(A), out), (m*n, nout))
end