Serialization of a SubArray forgets position in parent array

I am working in a codebase where we use Vectors divided into multiple “viewing windows” (SubArrays) representing some data.
Recently I had to serialize those views and noticed that the information about the parent array and index offsets was lost in the process.

As an example:

import Serialization

original = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
window = @view original[3:5]

Serialization.serialize("example", window)
serialized = Serialization.deserialize("example")

If we compare serialized with window, both the parent and offsets are wrong.

parent(window)      # [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
parent(serialized)  # [30.0, 40.0, 50.0]

window.indices      # (3:5,)
serialized.indices  # (1:3,)

This is the behavior even when serializing both the original vector and subarray together.

Serialization.serialize("example2", (original, window))
(w, y) = Serialization.deserialize("example2")

There is no relation between w and parent(y), and there seems to be no way (short of searching w for matching values) to recover which indices of w the view y is supposed to refer to, which is information I needed.

So my question is: is there some rationale behind this behavior, or did I just stumble onto a bug?
And does anybody know if, in the current Julia version (LTS or Release), there is a way to serialize Vectors together with SubArrays without losing the link between them? Or will I have to roll my own serializer?

This is the behavior I would have expected. Serialize only the values inside the window. Serializing both the parent and the view seems wasteful. The view only contains the ‘visible’ values.

However, the fact that the parent is accessible via the parent function makes me a bit less confident in my conclusion.

Yes, this is intentional. There are lots of smarts in SubArray that recompute both parent and indices in many places — they are not promised to be held as you initially constructed them. For example, we might reshape the parent upon construction or make an unaliased copy of an index. In this case it can be a huge savings in disk/network to “trim” the parent and recompute the indices if possible.
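A rough way to see the savings is to compare how many bytes get written for a small view versus its full parent. This is only a sketch: serialized_size is a made-up helper for illustration, not something Serialization provides, and the exact byte counts will vary between Julia versions.

```julia
import Serialization

# Hypothetical helper (not part of Serialization): number of bytes written
# for a value, measured with an in-memory buffer.
function serialized_size(x)
    io = IOBuffer()
    Serialization.serialize(io, x)
    return length(take!(io))
end

big = collect(1.0:1_000_000.0)   # roughly 8 MB of Float64 data
small_view = view(big, 1:10)     # a 10-element window into it

view_bytes = serialized_size(small_view)
parent_bytes = serialized_size(big)

# The view should be far smaller, since only the visible values are written.
println("view: ", view_bytes, " bytes, parent: ", parent_bytes, " bytes")
```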

That said, the decision to do this trimming when serializing was made very long ago and predates a lot of serializer smarts. I wonder if it’d now be possible to see if the parent was already serialized and use a reference to it if that’s the case. I don’t know if such a thing would be possible without having some crazy order-of-serialization dependency to it.

Thanks for the explanations! Apparently I will have to think about another way to do what I needed.
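One workaround I can imagine, as a minimal sketch rather than a tested solution: store the parent once along with each window's parentindices, and rebuild the views after loading. The names save_windows and load_windows are hypothetical helpers I made up here, and the sketch assumes every window is a view of the same parent array.

```julia
import Serialization

# Hypothetical helper: write the parent once, plus the indices of each window.
function save_windows(io, parent_array, windows)
    # Assumption: every window must be a view of exactly this parent.
    all(w -> parent(w) === parent_array, windows) ||
        throw(ArgumentError("all windows must be views of parent_array"))
    Serialization.serialize(io, (parent_array, map(parentindices, windows)))
end

# Hypothetical helper: read the parent back and recreate the views on top of it.
function load_windows(io)
    parent_array, index_list = Serialization.deserialize(io)
    windows = [view(parent_array, idx...) for idx in index_list]
    return parent_array, windows
end

original = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
windows = [view(original, 1:2), view(original, 3:5)]

io = IOBuffer()
save_windows(io, original, windows)
seekstart(io)                     # rewind the buffer before reading
p, ws = load_windows(io)

parent(ws[2]) === p               # the rebuilt view points at the loaded parent
parentindices(ws[2])              # the original offsets are kept
```

The cost is that the whole parent is always written, so the trimming described above is given up in exchange for keeping the link.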

Something that still bugs me is that deepcopy works exactly as I would expect: it makes a copy of the parent when copying only the SubArray and preserves the relationship when copying a struct/tuple containing both the parent and the SubArray.
Maybe this behavior was implemented after the one in Serialization… I don’t know.
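To make the comparison concrete, here is a small sketch of the deepcopy behavior I am describing, using the same vector as in my first post:

```julia
original = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
window = view(original, 3:5)

# Copying only the view: the full parent is copied and the offsets are kept.
lone = deepcopy(window)
parent(lone) == original        # same values as the original parent
parent(lone) === original       # but a different array object
parentindices(lone)             # still (3:5,)

# Copying parent and view together: the copied view refers to the copied parent.
o2, w2 = deepcopy((original, window))
parent(w2) === o2

# Writing through the copied parent is visible in the copied view,
# while the original data is left alone.
o2[4] = -1.0
w2[2]
```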