Strange performance of a loop

RoyiAvital · July 20, 2018, 7:03pm

I think that prior to Sandy Bridge if you accessed Aligned Data using the non aligned load it wouldn’t be efficient.
In modern CPU’s if the data is aligned it doesn’t matter if you use the load which assumes alignment or not.
But I still think accessing unaligned data is slower than aligned data.

But my point is different.
We must make sure the length of the data allocated it a multiplication of 16 Bytes (For SSE) / 32 Bytes (For AVX) / 64 Byte (For AVX512).

The tricky part is dealing with 1D / 2D / 3D / Etc… arrays.

Topic		Replies	Views
The cost of size() in for-loops Performance	11	1933	July 20, 2018
Static array over 1,024 elements is slow to compile? General Usage	11	3260	March 25, 2021
Optimizing iteration over slices of multiple matrices Performance question , array	14	2038	December 30, 2018
Avoiding allocations of small but non-trivial arrays (work array alternative?) Performance question	38	4984	November 17, 2022
Local arrays in for loops New to Julia	55	2180	March 3, 2021

Strange performance of a loop

Related topics