Hi,
I am currently working on some practice projects in Julia that involve using SIMD.jl to write/generate microkernels. My issue is working out the “best” way to load Vec types from a higher-dimensional (currently 2D) array when the loads are contiguous (no need for gather ops).
In 1D, the following works:
using SIMD
arr = Vector{Float64}(undef, 100)
i = 1  # any index with i + 3 <= length(arr)
xs = vload(Vec{4,Float64}, arr, i)
In 2D, however, linear indexing fails. Let's say I want to load the elements x[1:4, 1], i.e. those at linear indices 1:4:
x = zeros(100, 100)
xs = vload(Vec{4,Float64}, x, 1)
(MethodError)
A workaround is using raw pointers:
x = zeros(100, 100)
xs = vload(Vec{4,Float64}, pointer(x, 1))
This does work, but it relies on raw pointers, which makes debugging and bounds checking a pain (and the code starts to look awfully C-like…).
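To make the bounds-checking concern concrete, here is a minimal sketch of how the pointer workaround could be wrapped so that the check happens in one place. `checked_vload` is a hypothetical helper name, not part of SIMD.jl, and the sketch assumes a plain column-major `Matrix` whose load stays within a single column.

```julia
using SIMD

# Hypothetical helper (not a SIMD.jl function): load the N contiguous
# elements x[i:i+N-1, j] from a column-major Matrix, with a bounds check.
@inline function checked_vload(::Type{Vec{N,T}}, x::Matrix{T},
                               i::Integer, j::Integer) where {N,T}
    # Fails with a BoundsError if the N rows do not fit inside column j.
    @boundscheck checkbounds(x, i:(i + N - 1), j)
    # Keep x rooted for the GC while the raw pointer is in use.
    GC.@preserve x begin
        vload(Vec{N,T}, pointer(x, LinearIndices(x)[i, j]))
    end
end

x = reshape(collect(1.0:20.0), 5, 4)          # 5×4 matrix, values 1.0 to 20.0
xs = checked_vload(Vec{4,Float64}, x, 1, 2)   # elements x[1:4, 2]
```

The raw pointer is still there, but it is confined to one small function, and an out-of-range request such as rows 3:6 of a 5-row matrix raises a `BoundsError` instead of reading past the column.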
My question is: is there a better, safer, or more idiomatic way of achieving this, or should I instead rewrite the kernels to operate on explicitly 1-D arrays (thus losing some abstraction)?
If so, are there any examples of code where SIMD.jl is used with explicitly higher-dimensional, but contiguous, loads?
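For comparison, one candidate that avoids rewriting the kernels around 1-D arrays is to keep the 2-D array and take a flat alias of it with `vec`, which for an `Array` shares memory rather than copying. The following is only a sketch under that assumption; the variable names (`flat`, `k`) are illustrative, and whether this counts as the idiomatic SIMD.jl approach is exactly what I am unsure about.

```julia
using SIMD

x = reshape(collect(1.0:20.0), 5, 4)   # 5×4 matrix, column-major
flat = vec(x)                          # Vector{Float64} sharing memory with x

# Linear index of x[1, 3]; the elements x[1:4, 3] follow it contiguously.
k = LinearIndices(x)[1, 3]
xs = vload(Vec{4,Float64}, flat, k)    # same 1-D vload as in the 1D case

# Stores through the flat alias are visible in the original matrix.
vstore(xs + xs, flat, k)
```

One caveat with this sketch: the bounds check only sees the flat vector, so a load that runs past the end of a column (but not past the end of the array) silently continues into the next column.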
Thank you for your help in advance.