Note that Flux.train! only trains for one “epoch”, that is, a single pass over the data you give it. In this case, that means one iteration of gradient descent with the default step size. The training section of the documentation explains how to train your model for many “epochs”.
In general, you will have to implement your own convergence criterion, though in this simple case training for a few thousand epochs with the default settings will likely be sufficient.
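As a rough illustration, here is a minimal sketch of calling Flux.train! repeatedly and stopping once the loss stops changing. It assumes a recent Flux version with the explicit-style API (Flux.setup and a loss that takes the model as its first argument); older versions use Flux.params and a different train! signature. The toy data, the one-parameter Dense model, the helper name train_until_converged!, and the values of max_epochs and tol are all made up for the example and are not taken from the original question.

```julia
using Flux

# Toy data (an assumption for this sketch): points on the line y = 4x + 2.
x = reshape(Float32.(0:0.1:1), 1, :)   # 1×11 matrix, one column per sample
y = 4f0 .* x .+ 2f0

model = Dense(1 => 1)                         # a single weight and bias
loss(m, x, y) = Flux.Losses.mse(m(x), y)      # mean squared error
opt_state = Flux.setup(Descent(), model)      # gradient descent, default step size
data = [(x, y)]                               # one batch, so one epoch is one step

# Hypothetical helper: repeat one-epoch calls until the loss changes by less
# than `tol` between epochs, or until `max_epochs` is reached.
function train_until_converged!(model, data, opt_state; max_epochs = 5_000, tol = 1f-6)
    xb, yb = first(data)                      # assumes `data` holds a single batch
    prev = loss(model, xb, yb)
    for epoch in 1:max_epochs
        Flux.train!(loss, model, data, opt_state)   # exactly one epoch
        current = loss(model, xb, yb)
        abs(prev - current) < tol && return epoch, current
        prev = current
    end
    return max_epochs, prev
end

epochs, final_loss = train_until_converged!(model, data, opt_state)
println("stopped after ", epochs, " epochs, loss = ", final_loss)
```

The stopping rule used here (change in loss below a tolerance) is only one possible choice. Depending on the problem, checking the gradient norm or the loss on held-out data may be more appropriate.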