Hello, I have a problem with my neural network: I want to classify 7-segment LED digits.
The problem is that I get this error:
```
DimensionMismatch: matrix A has dimensions (100,10), matrix B has dimensions (1,7)

 1. _generic_matmatmul!(::Matrix{Float32}, ::Char, ::Char, ::Matrix{Float32}, ::Vector{Int64}, ::LinearAlgebra.MulAddMul{true, true, Bool, Bool}) @ matmul.jl:856
 2. generic_matmatmul! @ matmul.jl:847 [inlined]
 3. mul! @ matmul.jl:407 [inlined]
 4. mul! @ matmul.jl:276 [inlined]
 5. * @ matmul.jl:141 [inlined]
 6. rrule @ arraymath.jl:40 [inlined]
 7. rrule @ rules.jl:134 [inlined]
 8. chain_rrule @ chainrules.jl:223 [inlined]
 9. macro expansion @ interface2.jl:101 [inlined]
10. _pullback @ interface2.jl:101 [inlined]
11. _pullback @ Other: 10 [inlined]
12. _pullback(::Zygote.Context{true}, ::Main.var"workspace#4".Layer, ::LinearAlgebra.Transpose{Int64, Vector{Int64}}) @ interface2.jl:0
13. _apply @ boot.jl:838 [inlined]
14. adjoint @ lib.jl:203 [inlined]
15. _pullback @ adjoint.jl:66 [inlined]
16. _pullback @ operators.jl:1035 [inlined]
17. _pullback @ operators.jl:1034 [inlined]
18. _pullback @ operators.jl:1031 [inlined]
19. _pullback(::Zygote.Context{true}, ::Base.var"##_#97", ::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ::ComposedFunction{Main.var"workspace#4".Layer, Main.var"workspace#4".Layer}, ::LinearAlgebra.Transpose{Int64, Vector{Int64}}) @ interface2.jl:0
20. _apply(::Function, ::Vararg{Any}) @ boot.jl:838
21. adjoint @ lib.jl:203 [inlined]
22. _pullback @ adjoint.jl:66 [inlined]
23. _pullback @ operators.jl:1031 [inlined]
24. _pullback(::Zygote.Context{true}, ::ComposedFunction{Main.var"workspace#4".Layer, Main.var"workspace#4".Layer}, ::LinearAlgebra.Transpose{Int64, Vector{Int64}}) @ interface2.jl:0
25. _pullback @ Other: 8 [inlined]
26. _pullback(::Zygote.Context{true}, ::Main.var"workspace#124".Network, ::LinearAlgebra.Transpose{Int64, Vector{Int64}}) @ interface2.jl:0
27. _pullback @ Local: 18 [inlined]
28. _pullback(::Zygote.Context{true}, ::Main.var"workspace#2623".var"#1#2"{Int64}) @ interface2.jl:0
29. pullback(::Function, ::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}}) @ interface.jl:414
30. gradient(::Function, ::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}}) @ interface.jl:96
31. top-level scope @ Local: 17
```
Here is my code for this problem:
```julia
begin
    struct Layer
        W::Matrix{Float32} # weight matrix - Float32 for faster gradients
        b::Vector{Float32} # bias vector
        activation::Function
        Layer(in::Int64, out::Int64, activation::Function=identityFunction) =
            new(randn(out, in), randn(out), activation) # constructor
    end
    (m::Layer)(x) = m.activation.(m.W * x .+ m.b) # feed-forward pass
end
```
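On its own, a single Layer seems to behave the way I expect when I give it a plain vector. This is just a sanity check I ran in a separate cell; the sizes here are made up and are not the ones from my network:

```julia
# separate cell: a made-up Layer with 7 inputs and 3 outputs, just to test the struct
testLayer = Layer(7, 3, ReLu)
testLayer(Float32[1, 0, 1, 1, 0, 1, 1]) # returns a length-3 vector of outputs
```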
```julia
begin
    ReLu(x) = max(0, x)
    identityFunction(x) = x
end;
```
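These are scalar functions; in the Layer call above they get broadcast element-wise over the output vector by the dot, e.g.:

```julia
# separate cell: broadcasting the scalar activation over a vector
ReLu.([-1.0, 0.5, 2.0]) # [0.0, 0.5, 2.0]
```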
```julia
begin
    struct Network
        layers::Vector{Layer}
        Network(layers::Vararg{Layer}) = new(vcat(layers...))
        # constructor - allow arbitrarily many layers
    end
    (n::Network)(x) = reduce((left, right) -> right ∘ left, n.layers)(x)
    # perform layer-wise operations over arbitrarily many layers
end
```
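My understanding of the reduce over the layers is that the first layer in the list is applied first. I checked the composition order with plain functions in a separate cell (these functions are only for the check, they are not part of my network):

```julia
# separate cell: checking the composition order of the reduce
f(x) = x .+ 1
g(x) = 2 .* x
reduce((left, right) -> right ∘ left, [f, g])([1, 2]) # g(f([1, 2])) == [4, 6]
```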
```julia
begin
    inputs = [
        1 1 1 1 1 1 0;
        0 1 1 0 0 0 0;
        1 1 0 1 1 0 1;
        1 1 1 1 0 0 1;
        0 1 1 0 0 1 1;
        1 0 1 1 0 1 1;
        1 0 1 1 1 1 1;
        1 1 1 0 0 0 0;
        1 1 1 1 0 0 1;
        1 1 1 1 0 1 1
    ]
    # create training data
    targetOutput = [
        1 0 0 0 0 0 0 0 0 0;
        0 1 0 0 0 0 0 0 0 0;
        0 0 1 0 0 0 0 0 0 0;
        0 0 0 1 0 0 0 0 0 0;
        0 0 0 0 1 0 0 0 0 0;
        0 0 0 0 0 1 0 0 0 0;
        0 0 0 0 0 0 1 0 0 0;
        0 0 0 0 0 0 0 1 0 0;
        0 0 0 0 0 0 0 0 1 0;
        0 0 0 0 0 0 0 0 0 1
    ]
    mse(x, y) = sum((x .- y) .^ 2) / length(x) # MSE will be our loss function
    using Random
    Random.seed!(54321) # for reproducibility
    twoLayerNeuralNet = Network(Layer(10, 100, ReLu), Layer(100, 10)) # instantiate a two-layer network
end
```
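For reference, these are the sizes I get for the training data, plus a quick check that I read my own loss function correctly (run in a separate cell):

```julia
# separate cell: data sizes and a small mse check
size(inputs)                # (10, 7)  - one row of 7 segment values per digit
size(targetOutput)          # (10, 10) - one one-hot row per digit
mse([1.0, 0.0], [0.0, 0.0]) # 0.5
```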
```julia
begin
    # Packages for automatic differentiation and neural networks
    # (i.e. Tensorflow for Julia)
    using Flux, Zygote
    Flux.@functor Layer   # set the Layer-struct as being differentiable
    Flux.@functor Network # set the Network-struct as being differentiable
    parameters = Flux.params(twoLayerNeuralNet)
    # obtain the parameters of the layers (recurses through network)
    optimizer = ADAM(0.01) # from Flux-library
    netOutput = []  # store output for plotting
    lossCurve = []  # store loss for plotting
    for i in 1:500
        for j in shuffle(0:9)
            # Calculate the gradients for the network parameters
            gradients = Zygote.gradient(
                () -> mse(
                    twoLayerNeuralNet(transpose(inputs[j, :]))[:],
                    targetOutput[j, :]),
                parameters
            )
            # Update the parameters using the gradients and optimiser settings.
            Flux.Optimise.update!(optimizer, parameters, gradients)
            # Log the performance for later plotting
            actualOutput = twoLayerNeuralNet(transpose(inputs[j, :]))[:]
            push!(netOutput, actualOutput)
            push!(lossCurve,
                mse(
                    actualOutput,
                    targetOutput[j, :]))
        end
    end
end
```
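What confuses me is that when I try to reproduce the shapes from the error message by hand in a separate cell, I think I can see where the (100,10) and the (1,7) come from, but I don't know what the correct way to feed the data into the network is:

```julia
# separate cell: trying to reproduce the shapes from the error message
size(twoLayerNeuralNet.layers[1].W) # (100, 10) - weights of Layer(10, 100, ReLu)
size(transpose(inputs[3, :]))       # (1, 7)    - what I pass into the network
```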
I don't know how to resolve the problem. Any help would be appreciated.