The same network performs differently in Flux.jl and TensorFlow

Could the batch size be the issue? If `batch_size` is not specified, Keras defaults to 32 (see `fit` in the `Model` class).
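
To make this concrete, here is a rough sketch of how much the batch size changes the number of optimizer steps per epoch. The dataset size of 1000 and the helper `updates_per_epoch` are made up for illustration (neither is a Keras or Flux API), and I am only assuming that the Flux loop might be feeding the whole dataset as a single batch, since the post does not show it.

```python
import math


def updates_per_epoch(n_samples: int, batch_size: int) -> int:
    """Optimizer steps in one pass over the data, keeping the last partial batch."""
    return math.ceil(n_samples / batch_size)


n_samples = 1000  # hypothetical dataset size, not taken from the post

# Keras: fit() uses batch_size=32 when none is given.
keras_steps = updates_per_epoch(n_samples, 32)

# Assumed Flux setup: the whole dataset passed as one batch.
full_batch_steps = updates_per_epoch(n_samples, n_samples)

print(f"batch_size=32: {keras_steps} updates per epoch")
print(f"full batch:    {full_batch_steps} update per epoch")
```

If that is what is happening, the two runs take very different numbers of gradient steps for the same epoch count, so it may be worth setting the batch size explicitly to the same value on both sides before comparing.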