CNN for MNIST

Hello guys,

I am trying to make my own CNN for MNIST classification, but I am getting a weird error.

LoadError: DimensionMismatch("Rank of x and w must match! (2 vs. 4)")
in expression starting at C:\Users\tomic\OneDrive\Plocha\BP\julia\3_3.jl:28
DenseConvDims(::Array{Float64,2}, ::Array{Float32,4}; kwargs::Base.Iterators.Pairs{Symbol,Tuple{Int64,Int64},Tuple{Symbol,Symbol,Symbol},NamedTuple{(:stride, :padding, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}}) at DenseConvDims.jl:50
(::Core.var"#Type##kw")(::NamedTuple{(:stride, :padding, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}, ::Type{DenseConvDims}, ::Array{Float64,2}, ::Array{Float32,4}) at DenseConvDims.jl:49
#adjoint#1133 at nnlib.jl:37 [inlined]
(::ZygoteRules.var"#adjoint##kw")(::NamedTuple{(:stride, :padding, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}, ::typeof(ZygoteRules.adjoint), ::Zygote.Context, ::Type{DenseConvDims}, ::Array{Float64,2}, ::Array{Float32,4}) at none:0
_pullback at adjoint.jl:53 [inlined]
Conv at conv.jl:146 [inlined]
_pullback(::Zygote.Context, ::Conv{2,2,typeof(relu),Array{Float32,4},Array{Float32,1}}, ::Array{Float64,2}) at interface2.jl:0
applychain at basic.jl:36 [inlined]
_pullback(::Zygote.Context, ::typeof(Flux.applychain), ::Tuple{Conv{2,2,typeof(relu),Array{Float32,4},Array{Float32,1}},MaxPool{2,4},Conv{2,2,typeof(relu),Array{Float32,4},Array{Float32,1}},MaxPool{2,4},Conv{2,2,typeof(relu),Array{Float32,4},Array{Float32,1}},MaxPool{2,4},typeof(flatten),Dense{typeof(identity),Array{Float32,2},Array{Float32,1}}}, ::Array{Float64,2}) at interface2.jl:0
Chain at basic.jl:38 [inlined]
_pullback(::Zygote.Context, ::Chain{Tuple{Conv{2,2,typeof(relu),Array{Float32,4},Array{Float32,1}},MaxPool{2,4},Conv{2,2,typeof(relu),Array{Float32,4},Array{Float32,1}},MaxPool{2,4},Conv{2,2,typeof(relu),Array{Float32,4},Array{Float32,1}},MaxPool{2,4},typeof(flatten),Dense{typeof(identity),Array{Float32,2},Array{Float32,1}}}}, ::Array{Float64,2}) at interface2.jl:0
L at 3_3.jl:25 [inlined]
_pullback(::Zygote.Context, ::typeof(L), ::Array{Float64,2}, ::Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}) at interface2.jl:0
adjoint at lib.jl:188 [inlined]
_pullback at adjoint.jl:47 [inlined]
#14 at train.jl:103 [inlined]
_pullback(::Zygote.Context, ::Flux.Optimise.var"#14#20"{typeof(L),Tuple{Array{Float64,2},Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}}}) at interface2.jl:0
pullback(::Function, ::Zygote.Params) at interface.jl:167
gradient(::Function, ::Zygote.Params) at interface.jl:48
macro expansion at train.jl:102 [inlined]
macro expansion at progress.jl:119 [inlined]
train!(::Function, ::Zygote.Params, ::Base.Iterators.Take{Base.Iterators.Repeated{Tuple{Array{Float64,2},Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}}}}, ::Descent; cb::Flux.var"#throttled#42"{Flux.var"#throttled#38#43"{Bool,Bool,var"#88#89",Int64}}) at train.jl:100
(::Flux.Optimise.var"#train!##kw")(::NamedTuple{(:cb,),Tuple{Flux.var"#throttled#42"{Flux.var"#throttled#38#43"{Bool,Bool,var"#88#89",Int64}}}}, ::typeof(Flux.Optimise.train!), ::Function, ::Zygote.Params, ::Base.Iterators.Take{Base.Iterators.Repeated{Tuple{Array{Float64,2},Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}}}}, ::Descent) at train.jl:98
top-level scope at 3_3.jl:28
include_string(::Function, ::Module, ::String, ::String) at loading.jl:1088

And this is my code… I can't figure out which array has 4 dimensions; I'm really confused…

using Pkg
Pkg.add("Flux")
Pkg.add("Images")
Pkg.add("Plots")
using Flux, Flux.Data.MNIST, Images, Plots

labels = MNIST.labels();
images = MNIST.images();

xs = [vec(Float64.(img)) for img in images[1:5000]]
ys = [Flux.onehot(label, 0:9) for label in labels[1:5000]]

imgsize = (28,28,1)
model = Chain(Conv((3, 3), imgsize[3]=>16, pad=(1,1), relu),
        MaxPool((2,2)),
        Conv((3, 3), 16=>32, pad=(1,1), relu),
        MaxPool((2,2)),
        Conv((3, 3), 32=>32, pad=(1,1), relu),
        MaxPool((2,2)),
        flatten,
        Dense(prod(Int.(floor.([imgsize[1]/8,imgsize[2]/8,32]))), 10))



L(x, y) = Flux.crossentropy(model(x), y)
opt = Descent(0.1)
databatch = (Flux.batch(xs), Flux.batch(ys))
Flux.train!(L, params(model), Iterators.repeated(databatch, 1000), opt,
cb = Flux.throttle(() -> println("Training in progress"), 5))

test(i) = findmax(model(vec(Float64.(images[i]))))[2]-1
sum(test(i) == labels[i] for i in 1:60000)/60000
julia> xs[1] |> size
(784,)

julia> Flux.batch(xs) |> size
(784, 5000)

The inputs are being flattened to 1D (28 × 28 = 784) by vec instead of being reshaped to imgsize. Float32.(img) already returns a 2D array, so all that needs to be done is adding a third unit dimension for the channels:

...
xs = [Flux.unsqueeze(Float32.(img), 3) for img in images[1:5000]]
...

julia> xs[1] |> size
(28, 28, 1)

julia> Flux.batch(xs) |> size
(28, 28, 1, 5000)

I did that, and my error changed to the following…

LoadError: DomainError with -0.16674832:
log will only return a complex result if called with a complex argument. Try log(Complex(x)).
in expression starting at C:\Users\tomic\OneDrive\Plocha\BP\julia\test3_3.jl:28
throw_complex_domainerror(::Symbol, ::Float32) at math.jl:33
log(::Float32) at log.jl:321
xlogy at utils.jl:22 [inlined]
_broadcast_getindex_evalf at broadcast.jl:648 [inlined]
_broadcast_getindex at broadcast.jl:621 [inlined]
getindex at broadcast.jl:575 [inlined]
macro expansion at broadcast.jl:932 [inlined]
macro expansion at simdloop.jl:77 [inlined]
copyto! at broadcast.jl:931 [inlined]
copyto! at broadcast.jl:886 [inlined]
copy at broadcast.jl:862 [inlined]
materialize at broadcast.jl:837 [inlined]
adjoint at utils.jl:32 [inlined]
_pullback at adjoint.jl:47 [inlined]
#crossentropy#9 at functions.jl:69 [inlined]
_pullback(::Zygote.Context, ::Flux.Losses.var"##crossentropy#9", ::Int64, ::typeof(Statistics.mean), ::Float32, ::typeof(Flux.Losses.crossentropy), ::Array{Float32,2}, ::Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}) at interface2.jl:0
crossentropy at functions.jl:69 [inlined]
_pullback(::Zygote.Context, ::typeof(Flux.Losses.crossentropy), ::Array{Float32,2}, ::Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}) at interface2.jl:0
L at test3_3.jl:25 [inlined]
_pullback(::Zygote.Context, ::typeof(L), ::Array{Float32,4}, ::Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}) at interface2.jl:0
adjoint at lib.jl:188 [inlined]
_pullback at adjoint.jl:47 [inlined]
#14 at train.jl:103 [inlined]
_pullback(::Zygote.Context, ::Flux.Optimise.var"#14#20"{typeof(L),Tuple{Array{Float32,4},Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}}}) at interface2.jl:0
pullback(::Function, ::Zygote.Params) at interface.jl:167
gradient(::Function, ::Zygote.Params) at interface.jl:48
macro expansion at train.jl:102 [inlined]
macro expansion at progress.jl:119 [inlined]
train!(::Function, ::Zygote.Params, ::Base.Iterators.Take{Base.Iterators.Repeated{Tuple{Array{Float32,4},Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}}}}, ::Descent; cb::Flux.var"#throttled#42"{Flux.var"#throttled#38#43"{Bool,Bool,var"#19#20",Int64}}) at train.jl:100
(::Flux.Optimise.var"#train!##kw")(::NamedTuple{(:cb,),Tuple{Flux.var"#throttled#42"{Flux.var"#throttled#38#43"{Bool,Bool,var"#19#20",Int64}}}}, ::typeof(Flux.Optimise.train!), ::Function, ::Zygote.Params, ::Base.Iterators.Take{Base.Iterators.Repeated{Tuple{Array{Float32,4},Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}}}}, ::Descent) at train.jl:98
top-level scope at test3_3.jl:28
include_string(::Function, ::Module, ::String, ::String) at loading.jl:1088
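For what it's worth, this DomainError can be reproduced without Flux: crossentropy takes the log of the model's output, but the Chain ends in a plain Dense layer, so the outputs are raw logits that can be negative, and log(-0.16674832) is exactly what the trace shows. A minimal sketch of the two standard fixes, using nothing beyond Base Julia:

```julia
# crossentropy needs probabilities in (0, 1]; a Dense layer without softmax
# can emit negative values, and log of a negative Float throws DomainError.
logits = [-0.16674832, 1.3, 0.5]    # raw Dense outputs (the first is from the trace)

# Fix 1: push the logits through softmax first (i.e. append softmax to the Chain):
probs = exp.(logits) ./ sum(exp.(logits))
@assert all(0 .< probs .< 1)        # now log.(probs) is safe

# Fix 2: keep raw logits and use Flux.logitcrossentropy instead, which folds
# the softmax and the log together and is also more numerically stable.
println(probs)
```

The final version of the code later in this thread does switch to logitcrossentropy.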

I made some other changes, like moving to the GPU, and now training works and is fast. But I have problems viewing my results on the last two lines. The error is shown below the code.

using Pkg
Pkg.add("Flux")
Pkg.add("Images")
Pkg.add("Plots")
Pkg.add("CUDA")
using Flux, Flux.Data.MNIST, Images, Plots, CUDA

labels = MNIST.labels();
images = MNIST.images();

xs = [Flux.unsqueeze(Float64.(img), 3) for img in images[1:5000]]
ys = [Flux.onehot(label, 0:9) for label in labels[1:5000]]

imgsize = (28,28,1)
model = Chain(Conv((3, 3), imgsize[3]=>16, pad=(1,1), relu),
        MaxPool((2,2)),
        Conv((3, 3), 16=>32, pad=(1,1), relu),
        MaxPool((2,2)),
        Conv((3, 3), 32=>32, pad=(1,1), relu),
        MaxPool((2,2)),
        flatten,
        Dense(prod([3,3,32]), 10)) |> gpu

L(x, y) = Flux.crossentropy(model(x), y)
opt = Descent(0.1)
databatch = (Flux.batch(xs), Flux.batch(ys)) |> gpu
Flux.train!(L, params(model), Iterators.repeated(databatch, 500), opt,
cb = Flux.throttle(() -> println("Training in progress"), 5))

test(i) = findmax(model(vec(Float64.(images[i]))))[2]-1
sum(test(i) == labels[i] for i in 1:60000)/60000
LoadError: DimensionMismatch("Rank of x and w must match! (1 vs. 4)")
in expression starting at C:\Users\tomic\OneDrive\Plocha\BP\julia\3_3.jl:37
DenseConvDims(::Array{Float64,1}, ::CuArray{Float32,4}; kwargs::Base.Iterators.Pairs{Symbol,Tuple{Int64,Int64},Tuple{Symbol,Symbol,Symbol},NamedTuple{(:stride, :padding, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}}) at DenseConvDims.jl:50
(::Core.var"#Type##kw")(::NamedTuple{(:stride, :padding, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}, ::Type{DenseConvDims}, ::Array{Float64,1}, ::CuArray{Float32,4}) at DenseConvDims.jl:49
(::Conv{2,2,typeof(relu),CuArray{Float32,4},CuArray{Float32,1}})(::Array{Float64,1}) at conv.jl:146
applychain(::Tuple{Conv{2,2,typeof(relu),CuArray{Float32,4},CuArray{Float32,1}},MaxPool{2,4},Conv{2,2,typeof(relu),CuArray{Float32,4},CuArray{Float32,1}},MaxPool{2,4},Conv{2,2,typeof(relu),CuArray{Float32,4},CuArray{Float32,1}},MaxPool{2,4},typeof(flatten),Dense{typeof(identity),CuArray{Float32,2},CuArray{Float32,1}}}, ::Array{Float64,1}) at basic.jl:36
(::Chain{Tuple{Conv{2,2,typeof(relu),CuArray{Float32,4},CuArray{Float32,1}},MaxPool{2,4},Conv{2,2,typeof(relu),CuArray{Float32,4},CuArray{Float32,1}},MaxPool{2,4},Conv{2,2,typeof(relu),CuArray{Float32,4},CuArray{Float32,1}},MaxPool{2,4},typeof(flatten),Dense{typeof(identity),CuArray{Float32,2},CuArray{Float32,1}}}})(::Array{Float64,1}) at basic.jl:38
test(::Int64) at 3_3.jl:36
(::var"#39#40")(::Int64) at none:0
MappingRF at reduce.jl:93 [inlined]
_foldl_impl(::Base.MappingRF{var"#39#40",Base.BottomRF{typeof(Base.add_sum)}}, ::Base._InitialValue, ::UnitRange{Int64}) at reduce.jl:58
foldl_impl at reduce.jl:48 [inlined]
mapfoldl_impl at reduce.jl:44 [inlined]
#mapfoldl#204 at reduce.jl:160 [inlined]
mapfoldl at reduce.jl:160 [inlined]
#mapreduce#208 at reduce.jl:287 [inlined]
mapreduce at reduce.jl:287 [inlined]
sum at reduce.jl:494 [inlined]
sum(::Base.Generator{UnitRange{Int64},var"#39#40"}) at reduce.jl:511
top-level scope at 3_3.jl:37
include_string(::Function, ::Module, ::String, ::String) at loading.jl:1088

This line is still flattening images out into a 1D array instead of reshaping them into the 4D (W×H×C×B) array the model is expecting. Stack traces can be a little noisy, but if you ignore all the library code, the error pops out.
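The rank mismatch is easy to see in plain Base Julia (a sketch; rand stands in for an actual MNIST image):

```julia
img = rand(Float32, 28, 28)         # stand-in for Float64.(images[i])
size(vec(img))                      # (784,) -- rank 1, hence "(1 vs. 4)" in the error
size(reshape(img, 28, 28, 1, 1))    # (28, 28, 1, 1) -- the rank-4 W×H×C×B shape Conv wants
```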

I found that later on, but I still cannot figure out how to make it 4D in the test phase… Can you help?

I tried changing it to this. Now I am getting no errors, but it cannot calculate the test even for one image.

test(i) = findmax(model(Flux.batch(xs)))[2]-1
sum(test(i) == labels[i] for i in 1:1)/1

Can you post the full code snippet again and elaborate on what “cannot calculate the test even for one image” means?

I changed it to this, and now it works fine. I originally wanted to test it on its training data, because I need it as short and simple as possible, but with a CNN it seems impossible to use the training data for testing.

using Pkg
Pkg.add("Flux")
Pkg.add("CUDA")
Pkg.add("Statistics")
using Flux, Flux.Data.MNIST, CUDA, Statistics

labels = MNIST.labels();
images = MNIST.images();

xs = [Flux.unsqueeze(Float64.(img), 3) for img in images[1:5000]]
ys = [Flux.onehot(label, 0:9) for label in labels[1:5000]]

model = Chain(Conv((3, 3), 1=>16, pad=(1,1), relu),
        MaxPool((2,2)),
        Conv((3, 3), 16=>32, pad=(1,1), relu),
        MaxPool((2,2)),
        Conv((3, 3), 32=>32, pad=(1,1), relu),
        MaxPool((2,2)),
        flatten,
        Dense(288,10),
        softmax) |> gpu

L(x, y) = Flux.logitcrossentropy(model(x), y)
opt = Descent(0.1)
databatch = (Flux.batch(xs), Flux.batch(ys)) |> gpu
Flux.train!(L, params(model), Iterators.repeated(databatch, 1000), opt,
cb = Flux.throttle(() -> println("Training in progress"), 5))

test_images = MNIST.images(:test)
test_labels = MNIST.labels(:test)
txs = [Flux.unsqueeze(Float64.(img), 3) for img in test_images[1:1000]]
tys = [Flux.onehot(label, 0:9) for label in test_labels[1:1000]]

test_set = (Flux.batch(txs), Flux.batch(tys)) |> gpu

accuracy(x, y, model) = mean(Flux.onecold(cpu(model(x))) .== Flux.onecold(cpu(y)))
acc = accuracy(test_set..., model)

The thing to remember about CNNs is that their input is 3d (instead of the “typical” 1d vector input): X + Y + channel. Thus, when you batch lots of them together, you get a 4d input (instead of the “typical” matrix).

I think what might be tripping you up is that the MNIST dataset is implicitly 1-channel, so you’ve used unsqueeze to add in that third dimension. The batching is what adds that fourth dimension. You can of course test single images from either the test or train set — but you just need to either batch them together or make a single one 4d (again with unsqueeze).
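The whole bookkeeping can be sketched in Base Julia — Flux.unsqueeze(x, d) is essentially a reshape that inserts a singleton dimension at position d, and Flux.batch stacks same-shaped arrays along one new trailing dimension:

```julia
img = rand(Float32, 28, 28)                 # one grayscale image: rank 2
x3  = reshape(img, 28, 28, 1)               # ~ Flux.unsqueeze(img, 3): add the channel dim
x4  = reshape(img, 28, 28, 1, 1)            # ~ unsqueeze again: a "batch of one" a Conv accepts
xs  = [reshape(rand(Float32, 28, 28), 28, 28, 1) for _ in 1:5]
b   = cat(xs...; dims = 4)                  # ~ Flux.batch(xs): stack along a new 4th dim
size(b)                                     # (28, 28, 1, 5)
```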
