I don’t know about GPUs, but on my CPU, using `ifelse` instead of `max` and `min` led to significant performance improvements. `ifelse` produces `cmov` or `vmax`/`vmin` instructions, whereas `min` and `max` may produce `jmp` instructions, which can cause pipeline stalls. I’m not certain what `@fastmath` does with them, though; it probably uses `ifelse`.
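
For illustration, here is a minimal sketch of the kind of rewrite I mean, using a clamp as the example. The names `clamp_minmax` and `clamp_ifelse` are made up for this post, and whether the `ifelse` version is actually faster will depend on your CPU and Julia version, so treat it as something to benchmark rather than a guarantee.

```julia
# Hypothetical helper names, chosen only for this example.

# Version built on min/max.
clamp_minmax(x, lo, hi) = max(lo, min(x, hi))

# Version built on ifelse: both candidate values are computed and one is
# selected, so no branch is needed in the source.
clamp_ifelse(x, lo, hi) = ifelse(x < lo, lo, ifelse(x > hi, hi, x))

xs = [-2.0, 0.25, 0.5, 3.0]
a = clamp_minmax.(xs, 0.0, 1.0)
b = clamp_ifelse.(xs, 0.0, 1.0)

# To compare the generated instructions on your own machine (in the REPL):
# @code_native clamp_minmax(0.5, 0.0, 1.0)
# @code_native clamp_ifelse(0.5, 0.0, 1.0)
```

Both versions agree on ordinary inputs like the ones above, but they are not identical in every edge case (signed zeros, for example), so check that the difference doesn't matter for your data before swapping one for the other.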