When to use or not use `reinterpret`?

I reinterpret an array of bytes (UInt8) as Int64 like this:

dict = reinterpret(Int64, uncompressed_data)

and then I access dict at a bunch of essentially random indices like so:

for i in indices
  do_something(dict[i])
end

So I am accessing dict[i] like a random-access array. Is this a bad way to use reinterpret? I think creating dict with unsafe_wrap instead gives faster random access overall, but the code is complex, so I need to write a simple MWE to confirm.
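A minimal sketch of what such an MWE might look like, comparing the lazy reinterpret view against a plain Vector{Int64} copied from it. The random bytes stand in for uncompressed_data, and sum_at is a hypothetical replacement for the do_something loop; neither comes from the real code.

```julia
# Hypothetical stand-in for uncompressed_data: 1_000 Int64 values' worth of bytes.
bytes = rand(UInt8, 8 * 1_000)

view_dict = reinterpret(Int64, bytes)   # lazy view over the bytes, no copy
copy_dict = collect(view_dict)          # plain Vector{Int64}, copied once up front

# Essentially random indices into the array.
indices = rand(1:length(view_dict), 10_000)

# Hypothetical stand-in for the do_something loop.
sum_at(v, idx) = sum(v[i] for i in idx)

# Both containers must yield the same values (this also compiles both methods).
@assert sum_at(view_dict, indices) == sum_at(copy_dict, indices)

t_view = @elapsed sum_at(view_dict, indices)
t_copy = @elapsed sum_at(copy_dict, indices)
println("reinterpret view: ", t_view, " s; copied Vector: ", t_copy, " s")
```

A single @elapsed call is noisy, so for numbers worth trusting it is probably better to run this under @btime from BenchmarkTools.jl.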

… naming an array dict isn’t a particularly good choice IMHO…


Anyway, if you get the data directly as a pointer or read it from somewhere else, you should create the array with the right type from the start.
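As a rough sketch of the "right type from the start" idea, assuming the data arrives through an IO stream: allocate a Vector{Int64} and fill it with read!, so no byte array ever needs reinterpreting. The IOBuffer and the sample values here are hypothetical stand-ins for the real source.

```julia
# Hypothetical sample data, written out as raw bytes to imitate a file or stream.
raw_values = Int64[10, 20, 30, 40]
io = IOBuffer()
write(io, raw_values)
seekstart(io)

# Allocate the destination with the final element type and read straight into it.
vec_dict = Vector{Int64}(undef, length(raw_values))
read!(io, vec_dict)

@assert vec_dict == raw_values
```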

If you already have the data in a normal Array, then creating another array from it using unsafe_wrap is illegal. You must not pass a pointer from any Julia object to unsafe_wrap. Doing so can crash your program, and you are just lucky that it didn’t.


That’s a good point. Parquet calls the array a “dictionary” though, so the name comes from the Parquet reader. Maybe I’ll call it vec_dict or something.

Awesome to know.