Flux Custom Loss Function Not Working Properly

By convention, optimizers always minimize the target function. Therefore, to maximize a reward, you have to use its negative as the loss, i.e. multiply the reward by -1.
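
To illustrate the sign flip, here is a minimal, library-agnostic sketch in plain Python. It is not Flux code. The toy `reward`, its hand-written derivative `loss_grad`, and the `gradient_descent` helper are all hypothetical names made up for this example. The point is only that a minimizer applied to the negated reward ends up maximizing the reward.

```python
# Toy reward with a single maximum at theta = 3 (hypothetical, for illustration).
def reward(theta):
    return -(theta - 3.0) ** 2


# The loss handed to a minimizer is the negated reward.
def loss(theta):
    return -reward(theta)


# Analytic derivative of the loss: d/dtheta (theta - 3)^2 = 2 * (theta - 3).
def loss_grad(theta):
    return 2.0 * (theta - 3.0)


def gradient_descent(theta, lr=0.1, steps=200):
    # Plain gradient descent: repeatedly step against the gradient of the loss.
    for _ in range(steps):
        theta -= lr * loss_grad(theta)
    return theta


# Minimizing the loss drives theta toward the point where the reward is largest.
theta_opt = gradient_descent(0.0)
```

Here `theta_opt` ends up close to 3, where the reward peaks. If `reward` were passed to the minimizer directly, without the sign flip, the updates would move away from the maximum instead.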
