Actually, machines cache more than data. They also cache views of the data, encoded as integer vectors recording which rows (observations) were used in the last training event. I expect this is the explanation.
I will make a note to clear this extra information in the new serialization PR.
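
To illustrate the effect, here is a minimal toy sketch in Python. It is not the real machine implementation: `ToyMachine`, `cached_rows` and `strip_for_saving` are hypothetical names invented for this example. It only shows how a cached vector of row indices can inflate a serialized object even after the data itself has been dropped.

```python
import pickle


class ToyMachine:
    """Hypothetical stand-in for a machine; not the real implementation."""

    def __init__(self, data):
        self.data = data          # cached training data
        self.fitresult = None     # learned parameters
        self.cached_rows = None   # row indices used in the last training event

    def fit(self, rows):
        # Remember which rows were used, as a vector of integers.
        self.cached_rows = list(rows)
        # Toy "training": the mean of the selected observations.
        selected = [self.data[i] for i in self.cached_rows]
        self.fitresult = sum(selected) / len(selected)
        return self


def strip_for_saving(mach, drop_rows):
    """Return a copy without the data, optionally without the row indices."""
    clone = ToyMachine(data=None)
    clone.fitresult = mach.fitresult
    if not drop_rows:
        clone.cached_rows = mach.cached_rows
    return clone


n = 10_000
mach = ToyMachine([float(i) for i in range(n)]).fit(range(n))

# Dropping only the data still leaves one integer per training row behind.
size_keeping_rows = len(pickle.dumps(strip_for_saving(mach, drop_rows=False)))
size_dropping_rows = len(pickle.dumps(strip_for_saving(mach, drop_rows=True)))
print(size_keeping_rows, size_dropping_rows)
```

In this sketch the copy that keeps the row indices grows with the number of training observations, while the copy that clears them stays small regardless of the training set size.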