Investigating numerical change in function return value between v1.4 and v1.5

I’m curious about this as well.
FWIW, you don’t need -O3; -instcombine alone is enough to get the fdiv fast. So it seems the instcombine pass changed between versions.

I also recently encountered an issue with fdiv and instcombine, starting with LLVM 9, where instcombine moves an fdiv inside a loop, dramatically worsening performance.
My issue on LLVM was closed because that was intended behavior; you’re supposed to place a licm at some point after the last instcombine to move the division back out of the loop.

Maybe you could file this as an instcombine issue with LLVM. It seems reasonably likely the two are connected (instcombine getting more aggressive with divisions from LLVM 8 to 9), but your example seems harder for them to close as a Julia issue, since it shows up with the default -O3 optimization pipeline.
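
As a rough illustration of why this kind of rewrite can change a return value at all: I'm assuming here that the transformation trades a multiply by a precomputed reciprocal for a direct division (or the reverse), which I haven't confirmed is what happens in your case. The two forms are not bit-identical in floating point, because the reciprocal form rounds twice. The helper name `count_mismatches` is made up for this sketch, and it's plain Python rather than anything LLVM-specific:

```python
def count_mismatches(n):
    """Count pairs (x, d), 1 <= x, d <= n, where x / d != x * (1 / d)."""
    mismatches = 0
    for d in range(1, n + 1):
        recip = 1.0 / d  # the "hoisted" reciprocal, rounded once here
        for x in range(1, n + 1):
            # x / d is rounded once; x * recip picks up a second rounding
            if x / d != x * recip:
                mismatches += 1
    return mismatches


if __name__ == "__main__":
    n = 100
    print(f"{count_mismatches(n)} of {n * n} pairs differ")
```

When the two forms do differ, it is only in the last bits, but that is enough to show up as a changed result if the optimizer picks a different form from one version to the next.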
