Reproducing a TF model with Flux: why is training so slow?

Hello everyone. I’m new to Flux, and as a learning exercise I’m trying to reproduce an example TF model. So far it seems to be working, but I’ve hit one major snag: the Flux training is roughly 25x slower. The TF training epochs take about 30 s each, while the Flux training epochs take about 13 minutes each.

Here is what I am trying to reproduce:

INPUT_SHAPE = [train_df.shape[1]]  ## 1024
BATCH_SIZE = 5120

model = tf.keras.Sequential([
    tf.keras.layers.BatchNormalization(input_shape=INPUT_SHAPE),    
    tf.keras.layers.Dense(units=512, activation='relu'),
    tf.keras.layers.Dense(units=512, activation='relu'),
    tf.keras.layers.Dense(units=512, activation='relu'),
    tf.keras.layers.Dense(units=num_of_labels,activation='sigmoid')  #num_of_labels = 1500
])


# Compile model
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss='binary_crossentropy',
    metrics=['binary_accuracy', tf.keras.metrics.AUC()],
)

history = model.fit(
    train_df, labels_df,
    batch_size=BATCH_SIZE,
    epochs=5
)

My “translation” is:

INPUT_SHAPE = size(train_df, 2)   # 1024
BATCH_SIZE = 5120

model = Chain(
    BatchNorm(INPUT_SHAPE),
    Dense(1024 => 512, relu),
    Dense(512 => 512, relu),
    Dense(512 => 512, relu),
    Dense(512 => 1500, sigmoid)   # num_of_labels = 1500
)


obs = Matrix(train_df) |> permutedims
labels = Matrix(labels_df) |> permutedims

loader = Flux.DataLoader((data = obs, label = labels), batchsize = BATCH_SIZE)

optim = Flux.setup(Flux.Adam(0.001, (0.9, 0.999), 1.0e-7), model)


for epoch in 1:5
    println("epoch: $epoch")
    @showprogress for (data, label) in loader
        grads = Flux.gradient(model) do m
            result = m(data)
            Flux.Losses.binarycrossentropy(result, label)
        end
        Flux.update!(optim, model, grads[1])
    end
end

As I mentioned, the TF training epochs run in about 30 s each, while the Flux epochs take ~13 minutes each.

I feel like I have to be missing something simple. Any feedback you may have would be greatly appreciated.

Thanks

Solved:

Turns out this:

labels = Matrix(labels_df) |> permutedims

was producing a Matrix{Any} instead of a Matrix{Float32}, which caused the slowdown.
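For reference, a minimal sketch of the fix (the `Any` matrix below just stands in for what `Matrix(labels_df) |> permutedims` produced; the values are made up):

```julia
# A Matrix{Any} like the one produced from the DataFrame:
raw = Any[0 1 1; 1 0 0]

# Broadcasting Float32 over it gives a concrete Matrix{Float32},
# which lets Flux use fast BLAS kernels instead of slow generic fallbacks:
labels = Float32.(raw)

eltype(labels)   # Float32
```

The same conversion applies to `obs`; checking `eltype` on every array fed to the model is a cheap sanity check before training.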

The Flux documentation even warns about exactly this kind of element-type issue.

After correcting, training speed is on par with TF version


Nice finding! You can mark the thread as solved if you want to.