DimensionMismatch("A has dimensions (83,5) but B has dimensions (83,5)") when using the Flux package in Julia

Hey, I am a beginner when it comes to machine learning and using Julia. For a current project I have encountered a DimensionMismatch error and I am not sure how to go about solving it. I tried making the dimensions the same, but to no avail; it gives me the same error. Here are the matrices I am using:

83×5 Matrix (the input features X; first rows shown):
  0.685468    2.71934    -1.3916    -1.64212    -2.46184
  1.9476     -0.368776   -0.706665   0.552662    0.0840651
 -0.896637   -0.774699    0.320741  -0.167762    0.471714
  1.07655    -0.0556672   0.320741   1.69273     0.074243
  0.836567   -0.590463    0.663209  -0.36163     0.469094
 -1.13662     0.620122   -0.706665   1.21484    -1.04679
  1.1921     -0.382114    0.320741  -0.0105933   0.854123

83-element Vector (the targets Y)


The error comes when I run this specific line of code:

Flux.train!(loss(x,y), params(ms), data, opt)

The specific error:

DimensionMismatch("A has dimensions (83,5) but B has dimensions (83,5)")
gemm_wrapper!(::Matrix{Float64}, ::Char, ::Char, ::Matrix{Float64}, ::Matrix{Float64}, ::LinearAlgebra.MulAddMul{true, true, Bool, Bool})@matmul.jl:643
(::Flux.Dense{typeof(NNlib.relu), Matrix{Float32}, Vector{Float32}})(::Matrix{Float64})@basic.jl:158
(::Flux.Chain{Tuple{Flux.Dense{typeof(NNlib.relu), Matrix{Float32}, Vector{Float32}}, Flux.Dense{typeof(NNlib.relu), Matrix{Float32}, Vector{Float32}}, Flux.Dense{typeof(NNlib.relu), Matrix{Float32}, Vector{Float32}}}})(::Matrix{Float64})@basic.jl:49
loss(::Matrix{Float64}, ::Vector{Float64})@Other: 2
top-level scope@Local: 2[inlined]

The model architecture I am trying to use is a multi-layer perceptron; I have 5 input features and 1 output, and I believe this is where my problem lies. I am using the Flux package.

ms = Chain(
  Dense(83, 5, relu),
  Dense(5, 83, relu),
)

This is how I defined my loss function:

loss(x, y) = Flux.mse(ms(x), y)

Could anyone please give me some form of guidance or a solution to fix this? 🙂

Hi @zed_unseened, and welcome!
When you post on a forum like this one, please provide the complete example leading to the error, so that we may reproduce it.
In my opinion, your problem is related to your choice of architecture: I don’t understand your layer sizes, but I don’t think they can fit together. The dimensions of your layers should not depend on the number of training samples (which seems to be 83). The idea is to have one neuron per input feature in the first layer, and one neuron per output in the last layer.
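As a concrete illustration of the shape rule (a standard-library sketch, not Flux itself — the layer size 16 is a hypothetical example): a Dense(in, out) layer stores a weight matrix of size (out, in) and computes W * x .+ b, so its input must have in rows — one row per feature, one column per sample — regardless of how many samples there are:

```julia
# Standard-library sketch of the shape rule behind a Dense layer.
# Dense(in, out) stores W of size (out, in) and computes W * x .+ b.
X  = randn(83, 5)          # 83 samples × 5 features, as in the question
W1 = randn(16, 5)          # a hypothetical Dense(5, 16) hidden layer
h1 = W1 * permutedims(X)   # (16,5) * (5,83) -> (16,83): inner dims match
size(h1)                   # (16, 83)
# W1 * X would throw DimensionMismatch: inner dims are 5 and 83.
```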
So I would recommend you change your model to something like this:

ms = Chain(
    Dense(5, d1, relu),
    Dense(d1, d2, relu),
    Dense(d2, 1, relu),
)
Does that solve your issue? And more importantly, if it does, do you understand why?

Thank you for your response. I believe my understanding got better after your explanation, but what do you think would be suitable d1, d2 values, given the dataset X and Y above? Also, is there a specific guide I can follow when trying to implement a multi-layer perceptron model using Flux?

As for the choice of hidden layer sizes, it is not a matter of code correctness, so I won’t be able to help much. I’m not very experienced with deep learning either, so maybe you’ll find more wisdom in standard books on the subject.
If you want examples of Flux syntax, you can check out the model zoo, but I’m afraid they don’t have a simple MLP for array data (they do have one for images, though).
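In the meantime, a minimal MLP-on-array-data training loop might serve as a starting point. This is again only a sketch: the data is synthetic, and the layer sizes, optimizer, and epoch count are arbitrary choices, not recommendations:

```julia
using Flux

# Synthetic regression data: 200 samples, 5 features, 1 target.
X = randn(Float32, 5, 200)          # features × samples, as Flux expects
y = reshape(sum(X; dims=1), 1, :)   # a simple made-up target: row sums

model = Chain(Dense(5, 32, relu), Dense(32, 16, relu), Dense(16, 1))
loss(x, y) = Flux.mse(model(x), y)
opt = ADAM(0.01)

# Full-batch training: one (X, y) pair per epoch.
for epoch in 1:100
    Flux.train!(loss, Flux.params(model), [(X, y)], opt)
end
loss(X, y)   # should be much smaller than before training
```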