Are there any more-readable alternatives to using many individual variables for performance critical code?

It’s not clear how you prefer to write the code, so I’m just throwing a lot of ideas out here.

  1. I’m guessing that you’d still want to keep the variables separate so they can be reassigned individually in function bodies, because x += f(y) is easier to read and write than massive_struct.x += f(massive_struct.y). If so, packing variables into a single instance only simplifies passing data in function calls; the function body still has to unpack the data at the start and perhaps repack it at the end. You probably already know about iterator destructuring a, b = 1, 2 and splatting f(ab...), but property destructuring (; b, a) = (a=1, b=2, c=3) sounds useful here, and it also works for structs, e.g. (; x, y) = XY(1, 2). You can also splat named tuples as keyword arguments f(; ab...), but the named tuple has to have exactly the names the function accepts (no extras).
    These can be done in function arguments too; just note that destructured arguments don’t take part in multiple dispatch the way separate positional arguments do. Argument property destructuring is like keyword arguments, which already don’t take part in dispatch, but argument iterator destructuring doesn’t either: e.g. foo((x,y,z))=... formally has 1 argument and will overwrite foo((x,y))=..., which also has 1 argument. That’s not a weird issue with destructuring; it’s just a downside of packing everything together into 1 unannotated argument. Annotating a large tuple or named tuple would be more of a pain than annotating separate arguments, so if multiple dispatch were needed I would lean toward splatting as much as possible.
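    To make the forms above concrete, here is a minimal sketch. The names XY, add, kw, area and foo are hypothetical and exist only for illustration; property destructuring needs Julia 1.7 or later.

    ```julia
    # Hypothetical struct, used only for illustration.
    struct XY
        x::Int
        y::Int
    end

    # Iterator destructuring: assigns by position.
    a, b = 1, 2

    # Property destructuring: assigns by name, so order and extra fields don't matter.
    nt = (a = 1, b = 2, c = 3)
    (; b, a) = nt

    # Property destructuring also works on struct fields.
    (; x, y) = XY(1, 2)

    # Splatting a tuple into positional arguments.
    add(p, q) = p + q
    ab = (1, 2)
    add(ab...)

    # Splatting a named tuple into keyword arguments: names must match exactly.
    kw(; a, b) = a - b
    kw(; (a = 5, b = 3)...)
    # kw(; nt...) would throw, because nt carries the extra name c.

    # Property destructuring in an argument: still one formal argument.
    area((; x, y)) = x * y
    area(XY(3, 4))

    # Iterator destructuring in an argument: both definitions have one untyped
    # argument, so the second overwrites the first instead of adding a method.
    foo((x, y)) = "two"
    foo((x, y, z)) = "three"
    length(methods(foo))   # 1
    foo((1, 2, 3))         # "three"
    # foo((1, 2)) would now throw a BoundsError while destructuring z.
    ```

    If both foo behaviors were needed, annotating the argument (for example as Tuple{Any,Any} versus Tuple{Any,Any,Any}) or splatting into separate positional arguments would give two distinct methods.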

  2. If you actually want the data packed into 1 instance inside a function, a mutable struct to emulate variable reassignment is overkill, ideally. Mutability is a property of the instance, not the variable, and its utility is for a change to be accessible by multiple variables or other references at the same time, rather than reassigning all of them one by one. That instance must be stored in a separate spot for the multiple variables to point to, which is often a heap allocation (though it could be on the stack if the compiler can determine it has a fixed size and doesn’t escape a local scope).
    This isn’t necessarily a bad tradeoff; multiple variables that can hold different data must store multiple copies, so if they’re supposed to share data, it saves memory for them to point to the same instance. However, heap allocation and garbage collection are slower than stack allocation and deallocation, so it would make sense if your performance was hurt by frequent heap allocations made only to pack data going out of and into functions (hence escaping local scopes). That doesn’t mean mutable types are to be avoided at all costs; it just means that to save memory and time, you want variables to share a long-lived instance. If you’re instead frequently constructing instances and don’t need sharing, go for immutable types.
    Of course, it’s a pain to reassign a variable with a new immutable instance that has a slight change, as in xystruct1 = XY(xystruct1.x, new_y), but there is Accessors.jl to make that easier to write: @reset xystruct1.y = new_y.
    I’m not actually sure if this is as performant as separate variables; I’m not really capable of reading LLVM. This comment says that the compiler is able to mutate a field on the stack when reassigning an immutable instance, unless the field was holding a mutable instance. I don’t really understand the reason for that exception; I would think it could just mutate a pointer. I would be very interested in an expert looking into this, because I have noticed that people often reach for mutable types when they really only need to reassign a variable or a field, possibly due to different mutability definitions in other languages, and it would be fantastic if Accessors.jl could replace that.
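
    The difference between mutating a shared instance and reassigning a variable can be sketched as follows. MutXY and ImmXY are hypothetical types made up for this illustration, and the sketch says nothing about which version is faster.

    ```julia
    # Hypothetical types, used only for illustration.
    mutable struct MutXY
        x::Int
        y::Int
    end

    struct ImmXY
        x::Int
        y::Int
    end

    # Mutation: both variables reference the same instance,
    # so a change made through one is visible through the other.
    m1 = MutXY(1, 2)
    m2 = m1
    m1.y = 10
    m2.y   # 10

    # Reassignment: a new immutable instance is built and bound to i1 only.
    i1 = ImmXY(1, 2)
    i2 = i1
    i1 = ImmXY(i1.x, 10)
    i1.y   # 10
    i2.y   # 2, i2 still holds the old value

    # With Accessors.jl loaded (using Accessors), the reassignment above
    # can be written as: @reset i1.y = 10
    ```

    If the sharing seen with m1 and m2 isn’t needed, the immutable version gives the same result for the one variable that changes.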
