@inbounds code slower than one without

Elrod · March 6, 2019, 10:37pm

This has been discussed before here. AVX512 machines, like @mbauman’s, do have a SIMD instruction to convert Int64 to Float64, which is why it is fast for him.

AVX2 and earlier do not have the instruction. Using Int32 instead should work for these architectures, assuming

julia> typemax(Int32)
2147483647

is big enough.
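
To make that concrete, here is a minimal sketch of the kind of loop I mean. It is not the code from the original post: `fill_from_index!` is a hypothetical helper, and I'm assuming the hot spot is an integer loop index being converted to Float64 inside an `@inbounds` loop. The only thing that changes between the two calls is the integer type of the index.

```julia
# Hypothetical helper: writes Float64(i) into y[i], using T as the
# integer type of the loop index. Not the code from the original post.
function fill_from_index!(y::Vector{Float64}, ::Type{T}) where {T<:Integer}
    # T(length(y)) throws an InexactError if the length does not fit in T,
    # which is exactly the typemax(Int32) caveat mentioned above.
    n = T(length(y))
    @inbounds for i in one(T):n
        y[i] = Float64(i)   # integer -> Float64 conversion on every iteration
    end
    return y
end

y64 = fill_from_index!(zeros(1_000), Int64)  # Int64 index
y32 = fill_from_index!(zeros(1_000), Int32)  # Int32 index
```

Both calls produce the same values. Whether the Int32 version is actually faster depends on your CPU and Julia version, so time both on your own machine (or look at `@code_native`) before changing anything.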
