Flux: Dimension mismatch error

Hi everyone,

I’m a beginner with Flux, and I want to use it to approximate a function with a high-dimensional input. However, I’m running into a dimension mismatch error. Here’s an MWE, where I’m just trying to approximate f(x,y) = x^2 + y^2.

using Flux
using Surrogates
using Statistics
using Pipe

#Defining toy function to approximate
f(x) = x[1]^2 + x[2]^2;

#Generating 2D sample inputs
n_samples = 100;
lower_bound = [-1.0, -1.0];
upper_bound = [1.0, 1.0];

xys = Surrogates.sample(n_samples, lower_bound, upper_bound, SobolSample())   # Sobol sampling - gives a 100-element Vector of 2-element Tuples

rawInputs = convert(Vector{Tuple{Float32, Float32}}, xys)

#Corresponding outputs
rawOutputs = @pipe [[f(xy)] for xy in xys] |> 
             convert(Vector{Vector{Float32}}, _);


#Defining neural network
dim_input = 2;   # 2-dimensional input (x and y)
dim_output = 1;  # 1-dimensional output (f(x,y))
Q1 = 784;        #Number of nodes for the first hidden layer
Q2 = 50;         #Number of nodes for the second hidden layer

# Two inputs, one output
model = Chain(Dense(dim_input, Q1, relu),
              Dense(Q1, Q2, relu),
              Dense(Q2, dim_output, identity))

# Define loss function and weights
loss(x, y) = Flux.Losses.mse(model(collect(x)), y);

lr = 0.001; # learning rate

opt = Descent(lr);

epochs = 1000; # Define the number of epochs
trainingLosses = zeros(epochs);  # Initialize a vector to keep track of the training progress
ps = Flux.params(model)  # collect the trainable weights

trainingData = [(rawInputs, rawOutputs)];

# Training loop
@time for ii in 1:epochs

    Flux.train!(loss, ps, trainingData, opt)

end

ERROR: DimensionMismatch: layer Dense(2 => 784, relu) expects size(input, 1) == 2, but got 100-element Vector{Tuple{Float32, Float32}}

FYI, I know that using

trainingData = zip(rawInputs, rawOutputs);

instead of

trainingData = [(rawInputs, rawOutputs)];

resolves the issue, but I eventually want to put this onto my GPU, where I’d then run into a “scalar indexing is disallowed” error.

Thank you for your help!

Dense wants either a vector (for one sample) or a matrix (whose columns are many samples), but it’s getting a vector of tuples. One way to convert is this:

julia> rawInputs
100-element Vector{Tuple{Float32, Float32}}:
 (-0.953125, -0.203125)
 (0.046875, 0.796875)
 (0.546875, -0.703125)
...

julia> stack(rawInputs)
2×100 Matrix{Float32}:
 -0.953125  0.046875   0.546875  -0.453125  …  -0.882812  0.117188   0.617188  -0.382812
 -0.203125  0.796875  -0.703125   0.296875     -0.867188  0.132812  -0.367188   0.632812

julia> size(ans, 1) == 2
true
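
As a quick sanity check, the stacked matrix can be fed straight through your model, giving one column per sample (the shape below is what I’d expect; I haven’t re-run your exact numbers):

julia> size(model(stack(rawInputs)))
(1, 100)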

The way you have written training is the old “implicit” style; I’d recommend writing it like this (see the docs here for more):

julia> loss(m, x, y) = Flux.Losses.mse(m(x), y);  # takes model as explicit argument

julia> opt = Flux.setup(Descent(lr), model);  # optimiser state; really only needed for rules other than Descent

julia> train_data = [(stack(rawInputs), stack(rawOutputs))];

julia> @time for ii in 1:epochs
           Flux.train!(loss, model, train_data, opt)
       end
  0.483642 seconds (375.89 k allocations: 1.744 GiB, 23.82% gc time, 14.30% compilation time: 100% of which was recompilation)
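
By the way, your trainingLosses vector never actually gets filled. In the explicit style you can record the loss yourself each epoch; a sketch, reusing the names from above:

julia> for ii in 1:epochs
           Flux.train!(loss, model, train_data, opt)
           trainingLosses[ii] = loss(model, train_data[1]...)  # record the loss on the full batch
       end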
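And about your GPU worry: once the data is stacked into ordinary matrices like this, there is no scalar indexing involved, so moving everything over should just work. A minimal sketch, assuming a working CUDA.jl setup (untested here):

using CUDA  # assumption: a working CUDA.jl installation

gpu_model = model |> gpu                       # move the parameters to the GPU
gpu_data  = [(stack(rawInputs) |> gpu, stack(rawOutputs) |> gpu)]
gpu_opt   = Flux.setup(Descent(lr), gpu_model) # optimiser state for the GPU copy

for ii in 1:epochs
    Flux.train!(loss, gpu_model, gpu_data, gpu_opt)
end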