Hello guys,

I am working on heterogeneous-agent macro models in an OLG setting. I have a model where the agents have a lot of choices at each point in time, and it's becoming very slow to solve, especially when I increase the number of points in the state space. I am parallelizing the code using `Threads.@threads`. My question is: is there a gold standard for speeding up these kinds of problems in Julia? When you solve these problems, do you prefer to use the GPU?

To give a basis for discussion, the central point of my code (and the slowest part) is something like this:

```
T = 30
nb = 24
nh = 6
na = 3
nm = 21
ne = 7
# For simplicity, I'm assuming the terminal value is zero at every point in the state space
vf = zeros(nb,nm,nh,na,ne,T)
for ij = 1:T-1
    Threads.@threads for ib = 1:nb
        for im = 1:nm, ih = 1:nh, ia = 1:na, ie = 1:ne
            # pass next period's value function (all five state dimensions)
            v,b,m,h = solve_problem(param,ib,im,ih,ia,ie,T-ij,vf[:,:,:,:,:,T-ij+1])
            vf[ib,im,ih,ia,ie,T-ij] = v
            # store decision variables
        end
    end
end
```
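
In case it helps anyone who wants to run something, here is a minimal self-contained sketch of the same backward loop. The body of `solve_problem`, the `param` fields `beta` and `u`, and the function name `solve_backward` are placeholders I made up for illustration; my real `solve_problem` is much more expensive. The sketch wraps the loop in a function, passes next period's value function as a view instead of a copied slice, and threads over all state points rather than only over `ib`. I don't know how much of this carries over to the full model.

```julia
# Hypothetical stand-in for the real solve_problem: it only reads the
# continuation value and returns a value plus three dummy decisions.
function solve_problem(param, ib, im, ih, ia, ie, t, vnext)
    v = param.u + param.beta * vnext[ib, im, ih, ia, ie]
    return v, ib, im, ih
end

# Backward induction over t = T-1, ..., 1 (placeholder function name).
function solve_backward(param, nb, nm, nh, na, ne, T)
    vf = zeros(nb, nm, nh, na, ne, T)   # terminal value is zero everywhere
    states = CartesianIndices((nb, nm, nh, na, ne))
    for t in T-1:-1:1
        # View of next period's value function: no copy is made.
        vnext = @view vf[:, :, :, :, :, t+1]
        # Thread over every state point, not just the nb outer values.
        Threads.@threads for k in 1:length(states)
            idx = states[k]
            ib, im, ih, ia, ie = Tuple(idx)
            v, b, m, h = solve_problem(param, ib, im, ih, ia, ie, t, vnext)
            vf[idx, t] = v
            # store decision variables b, m, h here
        end
    end
    return vf
end

param = (beta = 0.96, u = 1.0)   # hypothetical parameter values
vf = solve_backward(param, 24, 21, 6, 3, 7, 30)
```

Each thread writes to a distinct element of `vf` at period `t` and only reads from period `t+1`, so the threaded loop has no write conflicts.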

Thanks for any help you can give me.