SIMD: Need some help to speed up sampling a code vector

minetest2048 · July 16, 2024, 3:46am

I like the fixed-point implementation better; it's how hardware NCOs work in hardware GPS receivers.
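For readers who haven't seen the NCO approach, here is a minimal sketch of what I mean, not the code from earlier in the thread. The function name `sample_code_fixed_point`, the `frac_bits` keyword, and the argument names are all made up for illustration. It assumes the phase step per sample is smaller than the code length, so one conditional subtraction is enough to wrap.

```julia
# Hypothetical helper: sample `code` with a fixed-point phase accumulator,
# the same way a hardware NCO steps through a code table.
function sample_code_fixed_point(code::Vector{Int8}, num_samples::Int,
                                 code_freq::Float64, sampling_freq::Float64;
                                 frac_bits::Int = 32)
    # Phase step per sample in chips, scaled by 2^frac_bits.
    delta = round(UInt64, code_freq / sampling_freq * 2.0^frac_bits)
    # Phase value at which the code repeats (code length in fixed point).
    wrap = UInt64(length(code)) << frac_bits
    phase = UInt64(0)
    out = Vector{Int8}(undef, num_samples)
    @inbounds for i in 1:num_samples
        # The integer part of the phase is the zero-based chip index.
        out[i] = code[(phase >> frac_bits) + 1]
        phase += delta
        # Branch-free wrap; assumes delta < wrap.
        phase = ifelse(phase >= wrap, phase - wrap, phase)
    end
    return out
end
```

Calling `sample_code_fixed_point(Int8[1, -1, 1, 1], 8, 1.0, 2.0)` samples each chip twice. The accumulator only needs an add, a compare and a shift per sample, with no floating-point `floor` or `mod` in the loop.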

One option is to vectorize the fixed point index calculation and table lookup using gather instructions:

A SIMD intrinsic correlator library for GNSS software receivers | GPS Solutions

We can do a vectorized load from a lookup table if we can somehow convince Julia to emit a gather instruction such as vgatherdpd or vgatherdps. Whether this is faster than an indexing loop on a CPU is debatable:

The paper shows that it's profitable on an i9-7900X processor with AVX512:

(reg_standalone is the scalar indexing loop)
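To make the index calculation plus lookup concrete, here is a rough sketch in plain Julia without any SIMD package. The name `sample_code_independent!` and its arguments are hypothetical. The point is that each sample's index is computed from `i` alone, so there is no loop-carried phase and the compiler is free to vectorize the index math. Whether the table read actually turns into a hardware gather depends on the Julia/LLVM version and the target CPU, so I would check the output of `@code_native` before relying on it.

```julia
# Hypothetical helper: every iteration is independent of the previous one,
# which is what allows the index calculation to be vectorized.
function sample_code_independent!(out::Vector{Int8}, code::Vector{Int8},
                                  delta::UInt64, frac_bits::Int)
    code_len = UInt64(length(code))
    @inbounds @simd for i in eachindex(out)
        # Integer part of the accumulated phase, wrapped to the code length.
        # Assumes (i - 1) * delta does not overflow UInt64.
        idx = ((UInt64(i - 1) * delta) >> frac_bits) % code_len
        # Table lookup with a data-dependent index (a gather when vectorized).
        out[i] = code[idx + 1]
    end
    return out
end
```

With `delta = UInt64(1) << 31` and `frac_bits = 32` this steps half a chip per sample. The `%` by a non-power-of-two code length is not cheap, so this is only meant to show the structure, not to be the fastest formulation.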

But a security update might make this fast vectorized lookup table code go 50% slower:

Another option is to run the code LFSRs in parallel instead of indexing into a lookup table:

I haven’t seen anyone doing this for GNSS PRN generators, though.
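One way to read "LFSRs in parallel" is bit-slicing, where each bit position of a machine word holds a separate shift register. The sketch below is my own illustration of that idea, not taken from any GNSS library, and the names `lfsr_step!` and `lfsr_run` are made up. It uses the 10-stage register with feedback taps 3 and 10 (the GPS C/A G1 polynomial) and steps 64 registers per call.

```julia
# Bit-sliced sketch: s[k] holds stage k of 64 independent 10-stage LFSRs,
# one LFSR per bit position of the UInt64 word.
function lfsr_step!(s::Vector{UInt64})
    out = s[10]            # output chip is taken from the last stage
    fb = s[3] ⊻ s[10]      # feedback from taps 3 and 10 (GPS C/A G1)
    for k in 10:-1:2       # shift every stage by one position
        s[k] = s[k - 1]
    end
    s[1] = fb
    return out             # bit j of `out` is the output chip of LFSR j
end

# Run `num_chips` steps with every lane starting from the all-ones state.
function lfsr_run(num_chips::Int)
    s = fill(typemax(UInt64), 10)
    return [lfsr_step!(s) for _ in 1:num_chips]
end
```

Here all 64 lanes start from the same state, so they produce identical chips. To get any benefit, each lane would need a different seed, for example a different starting phase of the same code, and working out those seeds is the part I have not seen done for GNSS.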
