Help defining (masked) vload and vstore operations for SArrays (or other isbits structs) using llvmcall

What, exactly, do you want to achieve?

Can you post Julia code that is semantically equivalent to what you want? Then we can work out what the correct assembly should look like, and try to coax LLVM into emitting it.

Or can you post C code using the Intel _mm_something intrinsics that does what you want?
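To illustrate the kind of Julia reference I mean, here is a minimal sketch of a masked load written lane by lane. The name masked_vload_ref and the choice to zero-fill the masked-off lanes are my assumptions, not something from your post; adjust it to the semantics you actually need.

```julia
# Hypothetical scalar reference for a masked load over an isbits tuple.
# Assumption: lanes whose mask bit is false are filled with zero(T).
function masked_vload_ref(src::NTuple{N,T}, mask::NTuple{N,Bool}) where {N,T}
    # Build the result lane by lane; Val(N) keeps the tuple length static.
    return ntuple(i -> mask[i] ? src[i] : zero(T), Val(N))
end

src  = (1.0, 2.0, 3.0, 4.0)
mask = (true, false, true, false)
result = masked_vload_ref(src, mask)
```

Something this small is enough to pin down the semantics before we look at what the generated code should be.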

I think r = Ref(something_bitstype), followed by pointer_from_objref(r) used the same way you would with an MMatrix, has a decent chance of working just as well as the MMatrix.
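A rough sketch of that idea, untested for performance. The helper names lane_load and lane_store! are made up for illustration, and it assumes T is an isbits type so that the tuple is stored inline at the start of the Ref:

```julia
# Hypothetical helpers: read and write single lanes of an isbits tuple
# held in a Ref, via a raw pointer obtained from pointer_from_objref.
# Assumption: T is isbits, so the tuple data sits inline in the RefValue.
function lane_load(r::Base.RefValue{NTuple{N,T}}, i::Integer) where {N,T}
    1 <= i <= N || throw(BoundsError(r[], i))
    # Keep r rooted while the raw pointer is in use.
    GC.@preserve r begin
        p = Ptr{T}(pointer_from_objref(r))
        return unsafe_load(p, i)
    end
end

function lane_store!(r::Base.RefValue{NTuple{N,T}}, x, i::Integer) where {N,T}
    1 <= i <= N || throw(BoundsError(r[], i))
    GC.@preserve r begin
        p = Ptr{T}(pointer_from_objref(r))
        unsafe_store!(p, convert(T, x), i)
    end
    return r
end

r = Ref((1.0, 2.0, 3.0, 4.0))
third = lane_load(r, 3)   # reads lane 3
lane_store!(r, 10.0, 2)   # overwrites lane 2 in place
```

The same pointer could then be handed to an llvmcall-based vload or vstore in place of unsafe_load and unsafe_store!. Whether the Ref gets optimized as well as an MMatrix is something you would have to check in the generated code.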