ADAM crash

I am getting started with Flux and am learning machine learning and Julia at the same time. I looked at some examples, loaded MNIST, and ran things in Jupyter, and that all works fine for me. To check that I understand things correctly, I started experimenting with the simplest possible model: a plain softmax classifier (i.e. multidimensional logistic regression), with and without my own initialization, like this:

model1LRi = Chain(
    # Dense(784, 10),
    Dense(W, -W*m),          # 784 x 10 + 10 = 7850 parameters
    softmax)

and then optimizing it with either Descent or ADAM like this:

optimizer = Descent(0.1) # ADAM(0.001) 
train_data = [(train_x, train_y)]
for i in 0:400
    if i % 25 == 0 println(i, " ", loss1LRi(train_x, train_y)) end
    Flux.train!(loss1LRi, params1LRi, train_data, optimizer)
end

All combinations work except one: if I initialize the model myself and use ADAM, I get

TypeError: in typeassert, expected Tuple{Transpose{Float32, Matrix{Float32}}, Transpose{Float32, Matrix{Float32}}, Vector{Float64}}, got a value of type Tuple{Matrix{Float32}, Matrix{Float32}, Vector{Float64}}
 [1] apply!(o::Adam, x::Transpose{Float32, Matrix{Float32}}, Δ::Matrix{Float32})
   @ Flux.Optimise ~/.julia/packages/Flux/KkC79/src/optimise/optimisers.jl:179
 [2] update!(opt::Adam, x::Transpose{Float32, Matrix{Float32}}, x̄::Matrix{Float32})
   @ Flux.Optimise ~/.julia/packages/Flux/KkC79/src/optimise/train.jl:18

What did I do wrong?

EDIT: It also says

WARNING: both Losses and NNlib export "ctc_loss"; uses of it in module Flux must be qualified


Hi! Can you post a complete reproducible example, including the package imports and data definitions? Thanks :slight_smile:

Thanks. Here is the extracted example. I ran it from the command line and also pasted it into the REPL; both give the same error.

using MLDatasets
train_x_raw, train_y_raw = MNIST(split = :train)[:];
test_x_raw,  test_y_raw  = MNIST(split = :test)[:];
using Flux
train_x = Flux.flatten(train_x_raw);
test_x  = Flux.flatten(test_x_raw);
train_y = Flux.onehotbatch(train_y_raw, 0:9);
test_y  = Flux.onehotbatch(test_y_raw, 0:9);
using Statistics       # mean
using LinearAlgebra    # norm, transpose
# Row-wise mean of W, centered by m and scaled to unit length.
function mynormvec(W, m)
    w = mapslices(mean, W, dims=2)[:] .- m
    return w ./ norm(w)
end
m = mapslices(mean, train_x, dims=2)[:]
W = [mynormvec(train_x[:, train_y_raw .== i], m) for i = 0:9]
W = hcat(W...)
W = transpose(W)
model1LRi = Chain(
    #Dense(784, 10),
    Dense(W, -W*m),          # 784 x 10 + 10 = 7850 parameters
    softmax)
params1LRi = Flux.params(model1LRi)
loss1LRi(x,y)  = Flux.Losses.crossentropy(model1LRi(x),y)
println("STARTLOSS: ", loss1LRi(train_x, train_y))
optimizer = ADAM(0.001) 
train_data = [(train_x, train_y)]
for i in 1:400
    if i % 25 == 0 println(i, " ", loss1LRi(train_x, train_y)) end
    Flux.train!(loss1LRi, params1LRi, train_data, optimizer)
end

Oddly enough, this example runs fine for me. Maybe try upgrading Flux, or change the line W = transpose(W) to W = collect(transpose(W)).


  1. Yes, strangely enough the collect thing made it work.

  2. Yes, upgrading also made it work.
     I went from a half-year-old setup (Julia 1.9 and Flux v0.13.4) to bleeding edge.

Thanks a lot.

Well, not so strange after all:
The error came from apply!(o::Adam, x::Transpose{Float32, Matrix{Float32}}, Δ::Matrix{Float32}), so I suspected the problem was that the weights are of type Transpose{...} while the gradient is a Matrix.
Since I did not get this error myself, it could have been a version issue. The weights became a Transpose{...} in the line W = transpose(W), so passing a regular matrix instead should also fix it:

julia> W = rand(2, 3);

julia> typeof(transpose(W))
LinearAlgebra.Transpose{Float64, Matrix{Float64}}

# collect it into a fresh matrix
julia> typeof(collect(transpose(W)))
Matrix{Float64}
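
To make the mismatch concrete without Flux, here is a minimal sketch that uses only LinearAlgebra. The helper init_state is hypothetical: it stands in for an optimizer that allocates its state with zero(x) and asserts that the state has the same type as the parameter. That is my reading of the typeassert in the stack trace, not a copy of Flux's code.

```julia
using LinearAlgebra

W  = rand(Float32, 3, 2)
Wt = transpose(W)     # lazy wrapper around W, no copy is made
Wc = collect(Wt)      # materialized copy with the same values

# Hypothetical stand-in for optimizer state allocation (an assumption, not Flux's API):
# allocate zeros shaped like x and assert they have exactly x's type.
init_state(x) = zero(x)::typeof(x)

println(typeof(Wt))
println(typeof(Wc))

init_state(Wc)        # Matrix in, Matrix out: the assertion holds

try
    init_state(Wt)    # fails if zero(Wt) comes back as a plain Matrix
    println("no type mismatch on this setup")
catch err
    println("type mismatch: ", typeof(err))
end
```

Whether the second call throws depends on what zero returns for a Transpose on your installation. After collect, the parameter and its freshly allocated state have the same type either way, which is why that change sidesteps the problem.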

Hope that explains it.