Do Flux.jl layers make use of tensor cores in Nvidia GPUs?

I searched for an answer in Flux.jl, CUDA.jl, and cuDNN.jl, but only found [], which talks about independent GPU operations using CUDA.jl.
I have not found any specific information about Flux.jl exploiting this technology.

I don’t think Flux uses mixed precision, so probably not. It is possible, however, to configure CUDA.jl to use tensor cores more eagerly, at the expense of some precision, by starting Julia with fast math enabled or by calling CUDA.math_mode!(CUDA.FAST_MATH), which will, for example, use TF32 when doing a Float32 × Float32 matmul. Further speed-ups are possible by setting CUDA.jl’s math precision to :BFloat16 or even :Float16. Ideally, though, I guess Flux.jl would have an interface for mixed-precision arithmetic.
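
As a rough sketch of what that configuration could look like (not something Flux does for you): the matrix sizes are arbitrary, the `precision` keyword is written from my recollection of the CUDA.jl API and should be checked against the version you have installed, and whether tensor cores are actually used depends on the GPU generation and the CUBLAS heuristics.

```julia
using CUDA

# Let CUBLAS trade some precision for tensor-core throughput.
# Starting Julia with `--math-mode=fast` should have a similar effect.
CUDA.math_mode!(CUDA.FAST_MATH)

A = CUDA.rand(Float32, 2048, 2048)
B = CUDA.rand(Float32, 2048, 2048)

# Float32 x Float32 matmul; with fast math this may be computed in TF32
# on GPUs that support it (Ampere or newer).
C = A * B

# Optionally lower the intermediate precision further.
# Assumption: the precision is passed as a keyword to `math_mode!`.
CUDA.math_mode!(CUDA.FAST_MATH; precision=:BFloat16)
C_low = A * B

# Go back to the default behaviour when full Float32 accuracy matters.
CUDA.math_mode!(CUDA.DEFAULT_MATH)
```

Comparing `C` and `C_low` against a result computed under `CUDA.DEFAULT_MATH` is a quick way to see how much accuracy you give up for the speed-up.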