Let’s say I need my weights to be determined to an accuracy of about 10^-10, and hence want to train my network using Float64. Does it make sense to train the network with Float32 first and then, once I get to around 10^-7 accuracy, switch to Float64?
Also, I have a feeling that this idea of first training with lower-precision numbers (even Float16) and then moving on to higher precision would speed up training a lot. Is there a reason it isn’t already built in?
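To make it concrete, here is a minimal sketch of what I have in mind, in plain Julia with no framework assumed. The toy data, the `grad` helper, and the `descend` function are just made up for illustration; the only real mechanism is that the working precision follows the element type of the weight, and the switch is an ordinary `Float64` conversion.

```julia
# Toy data: y = 3x, so the "network" is a single weight w whose true value is 3.
xs = collect(1.0f0:10.0f0)          # Float32 inputs
ys = 3.0f0 .* xs                    # Float32 targets

# Mean-squared-error gradient dL/dw, written analytically to keep this self-contained.
grad(w, xs, ys) = 2 * sum((w .* xs .- ys) .* xs) / length(xs)

# Plain gradient descent; the element type of `w` determines the working precision.
function descend(w::T, xs, ys; lr = T(0.01), steps = 5_000) where {T}
    for _ in 1:steps
        w -= lr * grad(w, xs, ys)
    end
    return w
end

# Stage 1: cheap Float32 pass gets w to roughly 1e-7 of the optimum.
w32 = descend(1.0f0, xs, ys)

# Stage 2: promote the weight and the data to Float64 and keep refining toward ~1e-10.
w64 = descend(Float64(w32), Float64.(xs), Float64.(ys))

println("after Float32 stage: w = ", w32, "  |error| = ", abs(w32 - 3))
println("after Float64 stage: w = ", w64, "  |error| = ", abs(w64 - 3))
```

With a real model I imagine the switch would amount to the same thing: convert the trained parameters and the data to Float64 and resume training from there.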