I’m having trouble fitting a really basic neural network in Flux. I wanted to get a handle on the syntax by simulating simple data from a logistic regression, but the model isn’t converging to anything reasonable. I think this is probably due to a misunderstanding of the syntax on my part. Any thoughts on what I’m doing wrong? Code below:
# generate logistic regression data
using Distributions
## size of data
n = 1000
d = 10 # covariates including bias
## seed
srand(1)
## coefficients
b = rand(Normal(), d)
## covariates
X = rand(Normal(), n, d)
X[:, 1] = ones(n)
## output
θ = sigmoid.(X * b) # true probabilities
y = rand.(Bernoulli.(θ))
mean((θ - y) .^ 2) # MSE with truth
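As a quick sanity check on the simulated data (assuming the GLM package is installed; this sketch is not part of the fitting code), a plain logistic regression should roughly recover b:
## optional: verify the simulated data with an ordinary logistic regression
using GLM
sanity = glm(X, y, Binomial(), LogitLink()) # X still includes the bias column here
coef(sanity) # should land near b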
## now drop bias from data
X = X[:, 2:d]
# set up neural network
using Flux
using Flux.Tracker
using Flux: @epochs
model = Chain(Dense(d - 1, 5, sigmoid),
Dense(5, 1, sigmoid))
# train model
## set up loss to minimize, optimizer, and data
loss(x, y) = Flux.mse(model(x), y)
opt = ADAM(params(model))
data = [(X', y)] # note the transpose of covariates!
loss(X', y) # 267.83
# now fit model
@epochs 100 Flux.train!(loss, data, opt)
# check model performance
loss(X', y) # 252.88
I fixed the issue by minimizing crossentropy and using onehotbatch instead of minimizing MSE.
# load packages
using Distributions
using Flux
using Flux.Tracker
using Flux: onehotbatch, argmax, crossentropy, throttle, @epochs
# generate logistic regression data
## size of data
n = 1000
d = 10 # covariates including bias
## seed
srand(1)
## coefficients
b = rand(Normal(), d)
## covariates
X = rand(Normal(), n, d)
X[:, 1] = ones(n)
## output
θ = sigmoid.(X * b) # true probabilities
Y = onehotbatch(rand.(Bernoulli.(θ)), 0:1)
## now drop bias from data and transpose matrix
X = X[:, 2:d]
X = X'
# set up neural network
model = Chain(Dense(d - 1, 5, sigmoid),
Dense(5, 2),
softmax)
# train model
## set up loss to minimize, optimizer, and data
loss(x, y) = crossentropy(model(x), y)
opt = ADAM(params(model))
data = [(X, Y)] # X was already transposed above
## accuracy
accuracy(x, y) = mean(argmax(model(x)) .== argmax(y))
accuracy(X, Y) # 32.5%
# now fit model
@epochs 1000 Flux.train!(loss, data, opt)
# check model performance
accuracy(X, Y) # 86.7%
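As a further check (assuming I have the onehotbatch row ordering right, i.e. row 2 of the softmax output corresponds to label 1), the fitted probabilities can be compared against the true θ:
## compare fitted probabilities with the true ones
p̂ = Flux.Tracker.data(model(X))[2, :] # strip tracking; row 2 ↔ label 1
mean(abs.(p̂ .- θ)) # mean absolute error against the true probabilities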
Circling back on this, I found that the issue was the dimension of the output. In my original post, X was of dimension d × n and y was of dimension n × 1, so model(X) returned a 1 × n row while y was a length-n column, and Flux.mse silently broadcast the two into an n × n matrix instead of comparing them elementwise. As a side effect of changing y to a one-hot encoding, I corrected the dimension to 2 × n.
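A minimal sketch of that broadcast, with small stand-in arrays rather than the actual model output:
## a 1 × n row broadcast against a length-n column yields an n × n matrix
ŷ = rand(1, 5) # stands in for model(X'), which was 1 × n
y = rand(5)    # stands in for the original label vector
size(ŷ .- y)   # (5, 5): every prediction gets compared with every label
So the loss was summing over n² differences rather than n. I believe reshaping y to a 1 × n row (reshape(y, 1, n)) would also have fixed the original MSE version.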