The Julia code leaves all the inputs as Float64, while the TF code uses Float32 by default. Make sure to write literals with the f0 suffix (e.g. 1f0 or 0.5f0), and convert non-literals (e.g. with Float32(x)). With just those changes, the Flux version runs an order of magnitude faster on my machine.
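For illustration, here is a minimal sketch of the literal and conversion changes described above, using only Base Julia. The variable names (`a`, `b`, `x`, `x32`, `y32`) and array sizes are made up for the example and are not taken from the original code.

```julia
# Float32 literals use the f0 suffix; plain decimal literals are Float64.
a = 0.5f0        # Float32
b = 0.5          # Float64

# Arrays from rand/randn/zeros are Float64 unless a type is given.
x = rand(10, 4)            # Matrix{Float64}
x32 = Float32.(x)          # elementwise conversion to Float32
y32 = rand(Float32, 10)    # or create Float32 data directly

# Mixing the two promotes back to Float64, so one stray Float64
# literal or input can undo the conversion.
c = a * b

println(typeof(a), " ", typeof(b), " ", typeof(c))
println(eltype(x), " ", eltype(x32), " ", eltype(y32))
```

The last point is the usual reason a single missed literal matters: the result of mixed Float32/Float64 arithmetic is Float64, so it is worth checking `eltype` on the data and parameters that reach the model.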