Didn’t get much further.
A problem with doing this is that there is so much noise in the timings that it’s hard to be sure of small improvements. We need a @ctime
macro that runs code in a fresh session multiple times and takes an average, although it’s going to take a long time to run.
It also seems like reorganising code is only effective for really large blocks.
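
A rough sketch of what such a @ctime could look like, assuming Julia and assuming that "fresh session" means a new `julia` process per run. The helper name `ctime`, the default of 5 runs, and the `--startup-file=no` flag are my own choices for illustration, not anything that exists yet. It reports the spread as well as the mean, since the noise is the whole problem.

```julia
using Statistics  # stdlib: mean, std

# Hypothetical helper: run `code` (Julia source as a string) in `n` fresh
# Julia processes and collect the wall-clock seconds measured in each one.
function ctime(code::AbstractString; n::Integer = 5)
    # The child script times itself and prints the elapsed seconds on a
    # final line of its own, so output printed by `code` does not interfere.
    script = """
    t0 = time_ns()
    $code
    print("\\n", (time_ns() - t0) / 1e9)
    """
    julia = Base.julia_cmd()  # same executable and flags as the current session
    times = map(1:n) do _
        out = read(`$julia --startup-file=no -e $script`, String)
        parse(Float64, last(split(out, '\n')))
    end
    return (mean = mean(times), std = std(times), times = times)
end

# Hypothetical macro form. Assumption: `string(ex)` turns the expression back
# into equivalent source, which holds for simple expressions but may not
# round-trip everything.
macro ctime(n, ex)
    code = string(ex)
    return :(ctime($code; n = $(esc(n))))
end
```

Usage would be something like `@ctime 5 sum(rand(10^6))`. Process startup is not included in the measurement, but compilation of the timed code is, because each run starts cold. Every run pays for a full process launch, so this will be slow, as expected.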