I believe fmap returns a copy of the model with the converted parameters rather than modifying the model in place. If you write mod = fmap(f64, mod), are all of mod’s parameters still Float32?
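To illustrate the copy-versus-in-place distinction without relying on Flux, here is a minimal sketch in plain Julia. `TinyDense` and `to_f64` are hypothetical names made up for this example. They are not part of Flux and only mimic the idea of rebuilding a layer with converted weights.

```julia
# Hypothetical stand-in for a layer; this is NOT Flux's Dense.
struct TinyDense{M<:AbstractMatrix}
    W::M
end

# Hypothetical helper: builds a new layer whose weights are converted to Float64.
# Broadcasting Float64 over the matrix allocates a new array, so the input is untouched.
to_f64(layer::TinyDense) = TinyDense(Float64.(layer.W))

layer = TinyDense(rand(Float32, 4, 4))
converted = to_f64(layer)

println(eltype(layer.W))      # element type of the original weights
println(eltype(converted.W))  # element type of the converted copy
```

Because the conversion returns a new object, the result has to be assigned to a name (for example back to the original variable) for the change to be visible, which is the same pattern as m = fmap(f64, m).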
Edit: this seems to work for me, on Flux version 0.10.0:
julia> using Flux
[ Info: CUDAdrv.jl failed to initialize, GPU functionality unavailable (set JULIA_CUDA_SILENT or JULIA_CUDA_VERBOSE to silence or expand this message)
julia> m = Chain(Dense(4, 4, relu), Dense(4, 4), softmax)
Chain(Dense(4, 4, relu), Dense(4, 4), softmax)
julia> m[1].W
4×4 Array{Float32,2}:
  0.823812    -0.593816   -0.799553  -0.570861
 -0.242723    -0.529529    0.012944   0.740021
  0.00234743   0.542825   -0.627849  -0.746003
  0.640687     0.0562518   0.183272   0.300056
julia> m = fmap(f64, m)
Chain(Dense(4, 4, relu), Dense(4, 4), softmax)
julia> m[1].W
4×4 Array{Float64,2}:
  0.823812    -0.593816   -0.799553  -0.570861
 -0.242723    -0.529529    0.012944   0.740021
  0.00234743   0.542825   -0.627849  -0.746003
  0.640687     0.0562518   0.183272   0.300056
In fact, I’ve just learned that you can skip fmap altogether and call f64 on the model directly:
julia> m = Chain(Dense(4, 4, relu), softmax) |> f64
Chain(Dense(4, 4, relu), softmax)
julia> m[1].W
4×4 Array{Float64,2}:
 -0.378957    0.74928    0.776147   0.804017
 -0.298902   -0.494922  -0.519403  -0.668515
  0.0772961  -0.219601   0.776379  -0.067537
  0.64462    -0.259356  -0.697054  -0.218149