Ah, just noticed you’re running on CPU with tanh activations. Flux vs. PyTorch CPU performance is most likely the culprit (long story short, small dense MLPs with tanh on CPU hit a bunch of areas in Flux that still need to be optimized), only more pronounced in your case because you’re also running the backward pass.