I’m writing a simulation that solves a system of N ordinary differential equations. I’m using fixed-step Euler integration, so the core is always a simple for loop that iterates over many time steps. I have two implementations; one is 4x faster than the other and does about 30% of the allocations. I’d like to understand why:

```
# fast code
f1(V::Float64, dt) = ... # some function of V
f2(V::Float64, dt) = ... # some function of V
...
fN(V::Float64, dt) = ... # some function of V

function integrate(t::CustomType, dt::Float64)
    var1::Float64 = ...
    var2::Float64 = ...
    ...
    varN::Float64 = ...
    for i = 2:nsteps
        var1 = f1(var1, dt)
        ...
        varN = fN(varN, dt)
    end
end
```
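For concreteness, here is a minimal runnable version of this pattern with a single variable. The right-hand side (exponential decay, dV/dt = -V) and the names `f1`/`integrate_toy` are made up for illustration; my real functions are more complicated:

```
# Toy version of the fast pattern: one variable, fixed-step forward Euler.
# Euler update for dV/dt = -V:
f1(V::Float64, dt::Float64) = V - V * dt

function integrate_toy(V0::Float64, dt::Float64, nsteps::Int)
    V = V0
    for i = 2:nsteps   # same loop shape as the real code
        V = f1(V, dt)
    end
    return V
end

integrate_toy(1.0, 0.001, 1001)  # ≈ exp(-1)
```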

This code is rigid: if I want to modify my system by omitting a few variables and their corresponding functions, rewriting it is laborious. Instead, I wrote the structs and associated functions below. This is conceptually nicer, but runs much slower:

```
# slow code
abstract type X end

mutable struct X1 <: X
    v1
end

function integrate(x::X1, dt::Float64)
    x.v1 = f1(x.v1, dt)
end

mutable struct Wrapper
    all_X::Array{X,1}
end

# now I can define my core integration loop nicely:
function integrate(w::Wrapper, dt)
    for x in w.all_X
        integrate(x, dt)
    end
end
```
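Again for concreteness, a runnable toy version of this struct-based pattern, with the same illustrative decay update as before (the `Toy` names are my invention; the untyped field `v1` matches my real code):

```
# Toy version of the struct-based pattern.
abstract type ToyX end

# Same illustrative Euler update for dV/dt = -V as above:
f1(V::Float64, dt::Float64) = V - V * dt

mutable struct ToyX1 <: ToyX
    v1            # untyped field, as in my real code
end

integrate!(x::ToyX1, dt::Float64) = (x.v1 = f1(x.v1, dt))

mutable struct ToyWrapper
    all_X::Vector{ToyX}
end

function integrate!(w::ToyWrapper, dt::Float64, nsteps::Int)
    for i = 2:nsteps
        for x in w.all_X
            integrate!(x, dt)
        end
    end
end

w = ToyWrapper([ToyX1(1.0)])
integrate!(w, 0.001, 1001)
w.all_X[1].v1  # ≈ exp(-1), same result as the fast version
```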

I really want to write code like the second case – it’s much more flexible and keeps me from repeating code. But the second implementation is 4 times slower than the first. Any tips on why, and on how to make it run faster, would be appreciated.