I reinterpret an array of bytes (`UInt8`) as `Int64` like this:
dict = reinterpret(Int64, uncompressed_data)
and then I use `dict` via a bunch of essentially random indices, like so:
for i in indices
    do_something(dict[i])
end
So I am accessing `dict[i]` like a random-access array. Is this a bad way to use `reinterpret`? I think using `unsafe_wrap` to create the `dict` results in faster random access overall, although the code is complex, so I need to write a simple MWE to confirm.
… naming an array `dict` isn’t a particularly good choice IMHO…
Anyway, if you get the data directly as a pointer or read it from somewhere else, you should create the array with the right type from the start.
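As a sketch of the "right type from the start" idea when the data is read from a stream: allocate a `Vector{Int64}` and fill it with `read!`. The `IOBuffer` below is only a hypothetical stand-in for the real source, and this assumes the bytes are already in the host's byte order:

```julia
# Hypothetical stand-in for the real source: an in-memory stream of raw bytes.
source_values = Int64[10, 20, 30, 40]
io = IOBuffer()
write(io, source_values)   # writes the raw bytes of the array
seekstart(io)

# Allocate the array with the final element type and read straight into it,
# so no UInt8 buffer and no reinterpret are needed afterwards.
n = length(source_values)
vec_dict = Vector{Int64}(undef, n)
read!(io, vec_dict)

println(vec_dict)
```

This way the result is an ordinary `Vector{Int64}` that owns its memory.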
If you already have the data in a normal `Array`, then creating another array from it using `unsafe_wrap` is illegal. You must not pass any pointer from any Julia object to `unsafe_wrap`. Doing so can crash your program, and you are just lucky that it didn’t.
That’s a good point. Parquet calls the array a “dictionary” though, so the name comes from the Parquet reader. Maybe I’ll call it `vec_dict` or something.
Awesome to know.