Big overhead with the new lazy reshape/reinterpret

I’m also interested to know. Just two days ago I checked out master to see whether there had been any progress on the original example I posted, and I didn’t see any improvement yet.