DimensionMismatch: matrix A has dimensions (100,10), matrix B has dimensions (1,7)

Hello, I have a problem with my neural network. I want to classify 7-segment LED digits.

The problem is that I get this error:

DimensionMismatch: matrix A has dimensions (100,10), matrix B has dimensions (1,7)

1. _generic_matmatmul!(::Matrix{Float32}, ::Char, ::Char, ::Matrix{Float32}, ::Vector{Int64}, ::LinearAlgebra.MulAddMul{true, true, Bool, Bool}) @ matmul.jl:856
2. generic_matmatmul! @ matmul.jl:847 [inlined]
3. mul! @ matmul.jl:407 [inlined]
4. mul! @ matmul.jl:276 [inlined]
5. * @ matmul.jl:141 [inlined]
6. rrule @ arraymath.jl:40 [inlined]
7. rrule @ rules.jl:134 [inlined]
8. chain_rrule @ chainrules.jl:223 [inlined]
9. macro expansion @ interface2.jl:101 [inlined]
10. _pullback @ interface2.jl:101 [inlined]
11. _pullback @ Other: 10 [inlined]
12. _pullback(::Zygote.Context{true}, ::Main.var"workspace#4".Layer, ::LinearAlgebra.Transpose{Int64, Vector{Int64}}) @ interface2.jl:0
13. _apply @ boot.jl:838 [inlined]
14. adjoint @ lib.jl:203 [inlined]
15. _pullback @ adjoint.jl:66 [inlined]
16. _pullback @ operators.jl:1035 [inlined]
17. _pullback @ operators.jl:1034 [inlined]
18. _pullback @ operators.jl:1031 [inlined]
19. _pullback(::Zygote.Context{true}, ::Base.var"##_#97", ::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ::ComposedFunction{Main.var"workspace#4".Layer, Main.var"workspace#4".Layer}, ::LinearAlgebra.Transpose{Int64, Vector{Int64}}) @ interface2.jl:0
20. _apply(::Function, ::Vararg{Any}) @ boot.jl:838
21. adjoint @ lib.jl:203 [inlined]
22. _pullback @ adjoint.jl:66 [inlined]
23. _pullback @ operators.jl:1031 [inlined]
24. _pullback(::Zygote.Context{true}, ::ComposedFunction{Main.var"workspace#4".Layer, Main.var"workspace#4".Layer}, ::LinearAlgebra.Transpose{Int64, Vector{Int64}}) @ interface2.jl:0
25. _pullback @ Other: 8 [inlined]
26. _pullback(::Zygote.Context{true}, ::Main.var"workspace#124".Network, ::LinearAlgebra.Transpose{Int64, Vector{Int64}}) @ interface2.jl:0
27. _pullback @ Local: 18 [inlined]
28. _pullback(::Zygote.Context{true}, ::Main.var"workspace#2623".var"#1#2"{Int64}) @ interface2.jl:0
29. pullback(::Function, ::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}}) @ interface.jl:414
30. gradient(::Function, ::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}}) @ interface.jl:96
31. top-level scope @ Local: 17
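
The first frame is a plain matrix multiplication. For reference, the mismatch can be reproduced in isolation (a minimal sketch; here A stands in for the first layer's 100×10 weight matrix and B for one transposed 7-element input row, both assumptions based on the shapes in the error):

A = randn(Float32, 100, 10)          # same shape as the first layer's weights
B = transpose([1, 1, 1, 1, 1, 1, 0]) # one 7-segment pattern as a 1×7 row
A * B # throws DimensionMismatch: (100,10) vs (1,7)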

Here is my code for this problem:

begin
	struct Layer
		W::Matrix{Float32} # weight matrix - Float32 for faster gradients
		b::Vector{Float32} # bias vector
		activation::Function
		Layer(in::Int64, out::Int64, activation::Function=identityFunction) =
			new(randn(Float32, out, in), randn(Float32, out), activation) # constructor
	end
	
	(m::Layer)(x) = m.activation.(m.W*x .+ m.b) # feed-forward pass
end
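
As a sanity check, a Layer(in, out) stores an out×in weight matrix, so it expects a length-in column vector (a small sketch reusing the definitions above):

layer = Layer(7, 100, ReLu)      # W is 100×7, b has length 100
x = Float32[1, 1, 1, 1, 1, 1, 0] # one 7-segment pattern as a column vector
size(layer(x))                   # (100,), since W*x requires length(x) == 7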

begin
	ReLu(x) = max(0,x)
	identityFunction(x) = x
end;
begin
	struct Network
		layers::Vector{Layer} 
		Network(layers::Vararg{Layer}) = new(vcat(layers...)) 
			# constructor - allow arbitrarily many layers
	end
	
	(n::Network)(x) = reduce((left,right)->right∘left, n.layers)(x) 
		# perform layer-wise operations over arbitrarily many layers
end
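
For two layers, the reduce builds layers[2] ∘ layers[1], so the input flows through the layers left to right (a small sketch reusing the structs above):

net = Network(Layer(7, 100, ReLu), Layer(100, 10))
composed = reduce((left, right) -> right ∘ left, net.layers)
# composed is net.layers[2] ∘ net.layers[1], so net(x) computes layers[2](layers[1](x))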

begin
	inputs = [
  1 1 1 1 1 1 0;
  0 1 1 0 0 0 0;
  1 1 0 1 1 0 1;
  1 1 1 1 0 0 1;
  0 1 1 0 0 1 1;
  1 0 1 1 0 1 1;
  1 0 1 1 1 1 1;
  1 1 1 0 0 0 0;
  1 1 1 1 0 0 1;
  1 1 1 1 0 1 1   
] 
	# create training data
	targetOutput =  [
	1 0 0 0 0 0 0 0 0 0; 
	0 1 0 0 0 0 0 0 0 0;
	0 0 1 0 0 0 0 0 0 0;
	0 0 0 1 0 0 0 0 0 0; 
	0 0 0 0 1 0 0 0 0 0; 
	0 0 0 0 0 1 0 0 0 0; 
	0 0 0 0 0 0 1 0 0 0; 
	0 0 0 0 0 0 0 1 0 0; 
	0 0 0 0 0 0 0 0 1 0; 
	0 0 0 0 0 0 0 0 0 1
] 
	
	mse(x,y) = sum((x .- y).^2)/length(x) # MSE will be our loss function
	
	using Random
	Random.seed!(54321) # for reproducibility
	
	twoLayerNeuralNet = Network(Layer(10,100,ReLu), Layer(100,10)) # instantiate a two-layer network

end

begin
	# Packages for automatic differentiation and neural networks
	# (i.e. TensorFlow for Julia)
	using Flux, Zygote
	
	Flux.@functor Layer # mark the Layer struct as differentiable
	Flux.@functor Network # mark the Network struct as differentiable
	
	parameters = Flux.params(twoLayerNeuralNet)
	# obtain the parameters of the layers (recurses through the network)
	
	optimizer = ADAM(0.01) # from the Flux library
	netOutput = [] # store outputs for plotting
	lossCurve = [] # store losses for plotting
	
	for i in 1:500
		for j in shuffle(1:10) # one pass over the ten digits in random order
			# Calculate the gradients for the network parameters
			gradients = Zygote.gradient(
				() -> mse(
					twoLayerNeuralNet(transpose(inputs[j,:]))[:],
					targetOutput[j,:]),
				parameters)
			# Update the parameters using the gradients and optimiser settings.
			Flux.Optimise.update!(optimizer, parameters, gradients)
			# Log the performance for later plotting
			actualOutput = twoLayerNeuralNet(transpose(inputs[j,:]))[:]
			push!(netOutput, actualOutput)
			push!(lossCurve, mse(actualOutput, targetOutput[j,:]))
		end
	end
end

I don't know how to resolve this problem. Any help would be appreciated.

The inputs are vectors with 7 features, yet the network is defined with 10 inputs. Maybe changing:

Network(Layer(10,100,ReLu), Layer(100,10))

to

Network(Layer(7,100,ReLu), Layer(100,10))

would help.
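
A quick way to confirm is to compare the shapes directly (a sketch, reusing your definitions):

size(Layer(10, 100, ReLu).W) # (100, 10): the first layer expects 10 input features
size(inputs[1, :])           # (7,): but each digit pattern has only 7 segments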


@Dan Now the error is [DimensionMismatch: matrix A has dimensions (100,7), matrix B has dimensions (1,7)]. I used 10 because I wanted to randomize the input nodes, so that every iteration starts with a different randomized node.

Perhaps also replace:

transpose(inputs[j,:])

with

vec(inputs[j,:])
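
Putting both suggestions together (a sketch, not tested against your notebook): inputs[j,:] already materialises row j as a plain Vector, so it is the transpose that turns it into an incompatible 1×7 row:

twoLayerNeuralNet = Network(Layer(7, 100, ReLu), Layer(100, 10))

x = vec(inputs[1, :])      # length-7 column vector, size (7,)
# transpose(inputs[1, :])  # would be 1×7, which a 100×7 weight matrix cannot multiply
size(twoLayerNeuralNet(x)) # (10,): one score per digit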

On a “meta” note, this same question also appears on StackOverflow, twice under two different names. Isn't asking once enough? Or once on each platform.

Yeah, that's true, I asked on Stack Overflow once. I don't know about the other post. Still, thank you for your help!

@SimonHubert Did you get help? I am also having a similar problem.