Thanks for the elaboration, I really appreciate it! Sorry for pestering you so much, but may I ask for a few more details? All three optimizations you listed are very cool. Delaying allocation to branches that end in throw/unreachable would be especially nice.
So, to check that I understood this right:
(1) “when all real uses are inlined”: I thought we already remove allocations if every use is inlined?
(2) “delay allocation to branches where the full object is needed”: Delaying allocation looks like something that should be done at the level of the SSA-IR optimizer, without touching codegen at all?
(3) “allow passing stack pointer to functions”: Passing stack pointers to non-inlined functions looks like it would require more function attributes (can the callee leak refs?). Otherwise it wants to be done close to LLVM / in codegen, without touching the SSA-IR optimizer at all, except for potentially generating the “can’t leak references” attribute?
(4) “Collapsing pointer-chains”: That would need an ABI change: implicit (pointer-chain-collapsed) objects would effectively be passed by value, and the callee would need to instantiate a new wrapper object if it wants to pass it on to a @nospecialize function. As far as I understood, such situations are rare in performance-relevant code. But pointer-chain collapse would indeed produce overhead / extra allocations in such a situation. On the other hand, (2) and (3) would alleviate this potential performance impact.
If I understood you right, this potential overhead is the reason you want the “collapse of pointer chains” to be tackled at a later date, even though the necessary changes for (4) are mostly independent of (1)-(3)?
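
To make sure we mean the same thing by "delay allocation to branches that end in throw", here is a minimal sketch of how I picture it. The names `Counter`, `CounterError`, `bump` and `bump_delayed` are made up for illustration, and the second function is only my hand-written guess at what such a transform would conceptually produce, not what the compiler actually emits today:

```julia
# Hypothetical types, for illustration only.
mutable struct Counter
    n::Int
end

struct CounterError <: Exception
    c::Counter
end

# As written by the user: the Counter is constructed up front, although
# only the throwing branch needs it as a real heap object.
function bump(x::Int)
    c = Counter(x)
    c.n += 1
    c.n < 0 && throw(CounterError(c))   # the object escapes only here
    return c.n
end

# Hand-written equivalent with the allocation sunk into the error branch
# (assumption: this is the shape a delayed-allocation pass would aim for).
function bump_delayed(x::Int)
    n = x + 1                            # field tracked as a plain value
    n < 0 && throw(CounterError(Counter(n)))
    return n
end
```

Both versions should behave identically from the outside; the difference is only that the non-throwing path of the second one never needs the object to exist in memory.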