I have an array of matrices whose columns are used as vectors in the following calculation, performed by the function `E_int`:

```
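using LinearAlgebra  # provides dot

nb = 6     # number of columns (each matrix below has nb-1)
dim = 120  # number of rows
np = 400   # number of matrices in the array

# Random Hermitian dim×dim matrix (n is unused in this toy example)
function H0(n::Integer)
    half = rand(ComplexF64, dim, dim)
    return half + adjoint(half)
end

function E_int()
    u_list = [rand(ComplexF64, dim, nb-1) for i = 1:np]
    # Sum of dot(u, H0(n)*u) over every column u of every matrix n
    En = sum(dot(u_list[n][:, i], H0(n) * u_list[n][:, i]) for n = 1:length(u_list), i = 1:nb-1) / (np^2)
    return En
end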
```

Computing the sum `En` scales badly in both run time and memory allocation as the dimension increases. For example, this is the output of `@time`:

```
0.806685 seconds (142.97 k allocations: 901.410 MiB, 9.86% gc time, 19.40% compilation time)
```
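Here is a minimal sketch of one direction that might reduce the allocations, not a definitive implementation. In `E_int`, `H0(n)` is rebuilt for every column, each `u_list[n][:,i]` slice makes a copy, and `H0(n)*u` allocates a new vector. The sketch assumes the real `H0(n)` depends only on `n`, so it can be built once per matrix, and it replaces the per-column products with one in-place matrix product via `mul!`. The names `E_int_sketch`, `random_hermitian` and `hamiltonian` are hypothetical and only for illustration.

```julia
using LinearAlgebra  # dot, mul!

# Hypothetical stand-in for H0(n): a random Hermitian dim×dim matrix.
function random_hermitian(dim)
    half = rand(ComplexF64, dim, dim)
    return half + adjoint(half)
end

# Sum of dot(u, H*u) over every column u of every matrix in u_list, divided by
# length(u_list)^2 as in E_int. `hamiltonian(n)` is called once per matrix.
function E_int_sketch(hamiltonian, u_list)
    buf = similar(first(u_list))   # reused for H*U, same size as each U
    total = zero(eltype(buf))
    for (n, U) in enumerate(u_list)
        H = hamiltonian(n)         # built once per n, not once per column
        mul!(buf, H, U)            # buf = H*U, computed in place
        total += dot(U, buf)       # equals the sum over i of dot(U[:,i], buf[:,i])
    end
    return total / length(u_list)^2
end

dim, ncols, np = 120, 5, 400
u_list = [rand(ComplexF64, dim, ncols) for _ in 1:np]
En = E_int_sketch(n -> random_hermitian(dim), u_list)
```

Passing the sizes and data as arguments, rather than reading non-constant globals inside the function, also lets the compiler infer concrete types. Whether this is enough for larger dimensions is something I have not measured.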

Any advice on how to do this kind of computation efficiently would be appreciated.