# Why does binarycrossentropy need an index in a denoising autoencoder?

Hi there,

I am still somewhat of a rookie in Julia and Flux, and I have a problem understanding what is going on when I switch between the `crossentropy` and `binarycrossentropy` loss functions.

I coded the following simple denoising autoencoder:

```julia
using Flux, Random

data = rand(2000, 100)
data_corrupted = copy(data)

# Corrupt the data: zero out a random subset of entries in each sample.
# The RNG is seeded once, outside the loop, so each sample gets a
# different corruption mask.
rng = MersenneTwister(1234)
for sample_index in 1:size(data, 2)
    # Create random indices
    indices = findall(bitrand(rng, 2000))

    # Change values at those indices to 0
    for i in eachindex(indices)
        data_corrupted[indices[i], sample_index] = 0
    end
end

# Partition into batches of 10
data = [data[:, i:min(i+10-1, size(data, 2))] for i in 1:10:size(data, 2)]
data_corrupted = [data_corrupted[:, i:min(i+10-1, size(data_corrupted, 2))] for i in 1:10:size(data_corrupted, 2)]

# Define the model
encoder = Dense(2000, 50, σ)
decoder = Dense(50, 2000, σ)
m = Chain(encoder, decoder)

# Define the loss function
loss(x, y) = Flux.crossentropy(m(x), y)

# Define the optimiser
opt = ADAM()

# Train
Flux.train!(loss, params(m), zip(data_corrupted, data), opt)
```

This runs fine.

But if I then change the loss function to:

```julia
loss(x, y) = Flux.binarycrossentropy(m(x), y)
```

I get the following error:

```
ERROR: LoadError: MethodError: no method matching eps(::Array{Float32,2})
Closest candidates are:
  eps(!Matched::Dates.Time) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/Dates/src/types.jl:387
  eps(!Matched::Dates.Date) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/Dates/src/types.jl:386
  eps(!Matched::Dates.DateTime) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/Dates/src/types.jl:385
...
```

However, if I change the loss to what was suggested here, indexing into the output and target:

```julia
loss(x, y) = Flux.binarycrossentropy(m(x)[1], y[1])
```

The model trains without any problem.

I have a hard time understanding why I need this indexing for `binarycrossentropy`, while I do not need it for `crossentropy`. I understand that the `eps` function requires a scalar, but I am confused as to whether the loss will now be calculated only on the first batch instead of on all the data.

Any insights are very welcome!

Many thanks,
Sander

The loss function needs to return a scalar.

From the definitions of `crossentropy` and `binarycrossentropy`, `crossentropy` includes a sum over the model output, while `binarycrossentropy` expects a scalar model output.
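Roughly, the two losses look like this (simplified sketches, not the exact Flux source, which may differ between versions):

```julia
# Sketch of crossentropy: it reduces over the whole array with sum,
# so a matrix of predictions already produces a scalar loss.
crossentropy_sketch(ŷ, y) = -sum(y .* log.(ŷ)) / size(y, 2)

# Sketch of binarycrossentropy: it treats ŷ and y as single numbers.
# eps(ŷ) is only defined for scalar floats, which is why passing a
# whole matrix fails with `no method matching eps(::Array{Float32,2})`.
binarycrossentropy_sketch(ŷ, y) =
    -y * log(ŷ + eps(ŷ)) - (1 - y) * log(1 - ŷ + eps(ŷ))
```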

Try broadcasting `binarycrossentropy` over the output and summing like this:

```julia
loss(x, y) = sum(Flux.binarycrossentropy.(m(x), y))
```
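To see the shapes involved, here is a quick check (with made-up dimensions matching the model above):

```julia
using Flux

m = Chain(Dense(2000, 50, σ), Dense(50, 2000, σ))
x = rand(Float32, 2000, 10)
y = rand(Float32, 2000, 10)

# crossentropy already reduces its input to a single number.
Flux.crossentropy(m(x), y)

# Broadcasting binarycrossentropy gives a 2000×10 matrix of
# per-element losses; summing it produces the scalar that
# Flux.train! needs for gradient computation.
per_element = Flux.binarycrossentropy.(m(x), y)
size(per_element)     # (2000, 10)
sum(per_element)      # a single number
```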

That makes a whole lot of sense and works like a charm. Thank you for pointing this out to me!