Is it possible to reinterpret and reshape without allocating?

So I have data I’m getting over HTTP:

r = rand(UInt8, 847296)

It is arranged in 8-byte chunks grouped in threes, where the first chunk of each group is unwanted metadata, so every (3n-2)th chunk is unwanted.
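For concreteness, here is a toy version of that layout (the values and variable names are invented for illustration): each group of three 8-byte chunks starts with a metadata chunk to be dropped.

```julia
# Toy version of the layout: each group of three Float64s (3 × 8 bytes)
# starts with a metadata value we want to discard.
floats = [99.0, 1.0, 2.0,   # group 1: metadata, data, data
          98.0, 3.0, 4.0]   # group 2: metadata, data, data

# The raw form as it would arrive over HTTP: a flat byte vector.
bytes = collect(reinterpret(UInt8, floats))

length(bytes)  # 48 bytes = 6 chunks × 8 bytes each
```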
I have written a function that does this:

function parseRawData(data)
    data = @view reinterpret(Float64, data)[(3:length(data)÷8+2) .% 3 .!= 0]
    return reshape(data, 2, length(data)÷2)
end

julia> @time parseRawData(r)
  0.000224 seconds (8 allocations: 564.811 KiB)

But this allocates 0.55MB. Is it possible to do this while reusing all the memory of the original vector with minimal allocations?

I’ve heard that it’s bad to try to access non-contiguous memory like this, so is that why it can’t be done much better?

This allocates an array.

Yes, but only a small one compared to the 0.5MB one

julia> @time (3:length(r)÷8+2) .% 3 .!= 0
  0.000204 seconds (10 allocations: 13.194 KiB)

Is it possible to do this without allocating?

This code returns the same result but does no allocations:

function parseRawData2(data)
    d = reshape(reinterpret(Float64, data), 3, :)
    return @view d[2:3,:]
end
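As a quick sanity check that the range-slice view really selects the same elements, here is a sketch on a small synthetic buffer (the names and values are invented, not from the thread):

```julia
# Six Float64s packed as 48 UInt8 bytes: two groups of three 8-byte chunks,
# where the first chunk of each group (1.0 and 4.0 here) is metadata.
raw = collect(reinterpret(UInt8, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))

d = reshape(reinterpret(Float64, raw), 3, :)  # 3×2 matrix, no copy
kept = @view d[2:3, :]                        # drop row 1 (the metadata chunks)

kept == [2.0 5.0; 3.0 6.0]  # true: the metadata values 1.0 and 4.0 are gone
```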

Views involving logical indexing (arrays of booleans) are much less efficient to work with than range slices like 2:3. (I also think parseRawData2 is clearer.)


Wow! That’s surprising. Very cool.
So I should always try to avoid logical indexing in favor of finding a pure @view slice that does the same thing?
What about stacking views if your criteria get more complex?

Is there a point where this stops being worth it? (other than the code becoming unreadable)

Basically, logical indexing on the view is better than a view that’s created with logical indexing. The hard part is that a view has to behave as if it were contiguous, so if it’s constructed over disjoint indices it has to do extra work.
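A small sketch of that distinction (variable names are illustrative): a view built with a boolean mask must materialize an integer index vector at construction time, whereas a range view only stores the range itself.

```julia
x = rand(3, 1000)
mask = [false, true, true]      # keep chunks 2 and 3 of each group

v_logical = @view x[mask, :]    # constructing this collects the mask into a
                                # Vector of integer indices (an allocation)
v_range   = @view x[2:3, :]     # this only stores the UnitRange 2:3

v_logical == v_range            # true: identical contents either way
```

Running `@time` on each construction should show the allocation difference between the two forms.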

Can you recommend something to read on the subject? Seems really interesting.

You can look at the subarray.jl file which is the source for how views are implemented.

If you are not working with live data, then I think you may get better performance if you download the data first and then analyze it offline.

What do you mean by download? Isn’t that what I’m doing when I do an HTTP.get(url)?

Yes, you download using HTTP, but you connect and disconnect from the URL many times, and this may increase the time. Simply try my suggestion; it may increase speed. Earlier I was also working on online data, and in my case I saw a big speed :high_speed_train: improvement.

So you mean like opening a web-socket?