Size Mismatch Convolution Layer

I have some experience using PyTorch, so I decided to try to learn another language for Deep Learning and happened upon Julia and Flux.jl. I am trying to get started with a simple cat/dog classification problem, and I have the data all set up and manipulated (they are all set to grayscale images that are 100x100x1). I am doing this in Jupyter with iJulia. Here is the transformation:

using Images, FileIO  # `load` and `imresize` come from these
using ColorTypes, ColorVectorSpace

function createArray(directory)
    # load the images from a collection of file paths (e.g. from Glob.jl)
    temp = [load(i) for i in directory]
    # resize down to 100x100
    temp = [imresize(i, (100, 100)) for i in temp]
    # convert to grayscale
    temp = [Gray.(i) for i in temp]
    # normalize using 0.5 for both mean and std
    temp = [(i .- 0.5) ./ 0.5 for i in temp]
    return temp
end

To build this array, I pass in a glob of file paths and get back an array of images. When I call

typeof(dogs)
# it returns
# Vector{Matrix{Gray{Float64}}} (alias for Array{Array{Gray{Float64}, 2}, 1})
# and doing
typeof(dogs[1])
size(dogs[1])
# returns:
# Matrix{Gray{Float64}} (alias for Array{Gray{Float64}, 2})
# and
# (100, 100)
# respectively

Actually, as I am writing this, I realize that the size is only 100x100. Is the gray channel implied, or is something wrong with the size (should it return 100x100x1)? That may be the solution, but in case it isn't, I'll continue on.

Here is my model:

final_dims = 25*25*16
model = Flux.Chain(
    Flux.Conv((3, 3), 1 => 8, relu; pad=1),
    Flux.MaxPool((2, 2)),
    Flux.BatchNorm(8),
    Flux.Conv((3, 3), 8 => 16, relu; pad=1),
    Flux.MaxPool((2, 2)),
    Flux.BatchNorm(16),
    Flux.flatten,
    Flux.Dense(final_dims, 8, relu),
    Flux.Dropout(0.4),
    Flux.Dense(8, 2)
)
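As a sanity check on final_dims (my own arithmetic, not anything Flux reports): each (2, 2) max-pool halves the spatial size, so 100 → 50 → 25, and the last conv block has 16 channels:

```julia
# Each MaxPool((2, 2)) halves the spatial dimensions (÷ is integer division).
h = w = 100
h, w = h ÷ 2, w ÷ 2   # after first MaxPool -> 50×50
h, w = h ÷ 2, w ÷ 2   # after second MaxPool -> 25×25
channels = 16          # output channels of the second Conv
final_dims = h * w * channels
println(final_dims)    # 10000 == 25 * 25 * 16
```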

And when I run

model(dogs[1])

just to see if I have everything set up correctly, I get the following error:

#=
DimensionMismatch("Rank of x and w must match! (2 vs. 4)")

Stacktrace:
 [1] DenseConvDims(x::Matrix{Gray{Float64}}, w::Array{Float32, 4}; kwargs::Base.Iterators.Pairs{Symbol, Tuple{Int64, Int64, Vararg{Int64, N} where N}, Tuple{Symbol, Symbol, Symbol}, NamedTuple{(:stride, :padding, :dilation), Tuple{Tuple{Int64, Int64}, NTuple{4, Int64}, Tuple{Int64, Int64}}}})
   @ NNlib C:\Users\tyler\.julia\packages\NNlib\3MZcC\src\dim_helpers\DenseConvDims.jl:50
 [2] (::Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}})(x::Matrix{Gray{Float64}})
   @ Flux C:\Users\tyler\.julia\packages\Flux\0c9kI\src\layers\conv.jl:156
 [3] applychain(fs::Tuple{Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, MaxPool{2, 4}, BatchNorm{typeof(identity), Vector{Float32}, Float32, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, MaxPool{2, 4}, BatchNorm{typeof(identity), Vector{Float32}, Float32, Vector{Float32}}, typeof(Flux.flatten), Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dropout{Float64, Colon}, Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}, x::Matrix{Gray{Float64}})
   @ Flux C:\Users\tyler\.julia\packages\Flux\0c9kI\src\layers\basic.jl:36
 [4] (::Chain{Tuple{Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, MaxPool{2, 4}, BatchNorm{typeof(identity), Vector{Float32}, Float32, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, MaxPool{2, 4}, BatchNorm{typeof(identity), Vector{Float32}, Float32, Vector{Float32}}, typeof(Flux.flatten), Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dropout{Float64, Colon}, Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}})(x::Matrix{Gray{Float64}})
   @ Flux C:\Users\tyler\.julia\packages\Flux\0c9kI\src\layers\basic.jl:38
 [5] top-level scope
   @ In[237]:1
 [6] eval
   @ .\boot.jl:360 [inlined]
 [7] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
   @ Base .\loading.jl:1094
=#

Where is the size mismatch happening? I tried to include as much as I thought was relevant to the issue.

You seem to be on the right track. 2D conv layers require 4D input (width × height × channels × batch), and you need to add the channel and batch dimensions manually. Something like model(reshape(dogs[1], size(dogs[1])..., 1, 1)) should put a single training sample through.
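A minimal shape check, using a plain Float64 matrix as a stand-in for one of your grayscale images:

```julia
# One 100×100 image as a plain matrix (stand-in for a Matrix{Gray{Float64}}).
img = rand(Float64, 100, 100)

# Flux's 2D Conv expects WHCN order: width × height × channels × batch,
# so append singleton channel and batch dimensions.
x = reshape(img, size(img)..., 1, 1)
size(x)   # (100, 100, 1, 1) — now rank 4, matching the conv weights
```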

I think (typing on the phone) that something like cat(dogs[1:16]; dims=4) will give a batch of 16 examples (and it also looks a bit funny in the context).


I think this needs a splat: multi = cat(dogs[1:16]...; dims=4). You could also do multi == reshape(reduce(hcat, dogs[1:16]), 100, 100, 1, :).
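For example, with small plain matrices in place of the Gray images, both forms build the same 4D batch:

```julia
# Four 2×2 "images" standing in for dogs[1:16].
imgs = [fill(Float64(i), 2, 2) for i in 1:4]

# Splatting passes each matrix as a separate argument to cat.
a = cat(imgs...; dims=4)

# Equivalent: concatenate side by side, then reshape to W×H×C×N.
b = reshape(reduce(hcat, imgs), 2, 2, 1, :)

a == b        # true
size(a)       # (2, 2, 1, 4)
```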


Ok, so that worked, thank you! What exactly does the "…" do? Could I find out more by searching for cat() in the Julia docs? Sorry this took a bit longer to respond to, I've been busy, but thanks again.
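Follow-up: ... is Julia's "splat" operator. It unpacks a collection into individual function arguments, so cat(v...; dims=4) is the same call as cat(v[1], v[2], ..., v[end]; dims=4). It's covered under "Varargs Functions" in the Julia manual, separately from the cat() docs. A tiny illustration:

```julia
v = [10, 20, 30]
# `v...` splats the vector into separate arguments, so this is +(10, 20, 30):
+(v...)   # → 60, same as 10 + 20 + 30
```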