It sounds to me like you are doing simulated method of moments (SMM) or something similar. If so, I would guess you want a global optimizer rather than a local one, unless you have guarantees that your initial point is “good enough” or that your problem is convex. In SMM we usually do not have those guarantees, though again I am guessing here.
One hour is an awfully long time for a single objective function evaluation. I think it could be worthwhile to look into Surrogates.jl (SciML/Surrogates.jl on GitHub, surrogate modeling and optimization for scientific machine learning) to build a surrogate (“approximating”) model of your objective function and then optimize the surrogate. After that, use the surrogate's optimum as the initial point for optimizing your true objective function.
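To make the surrogate-then-refine idea concrete, here is a minimal sketch in plain Python (standard library only). It does not use the Surrogates.jl API. Everything in it is an assumption for illustration: `expensive_objective` is a hypothetical stand-in for your hour-long function, the design is a plain 5x5 grid rather than a proper space-filling sample, the surrogate is a hand-rolled Gaussian RBF interpolant, and the local refinement is a simple pattern search.

```python
import itertools
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_rbf(points, values, length_scale=1.0):
    """Return a Gaussian-RBF interpolant through (points, values)."""
    def kernel(p, q):
        return math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / length_scale ** 2)
    weights = solve_linear([[kernel(p, q) for q in points] for p in points], values)
    return lambda x: sum(w * kernel(x, p) for w, p in zip(weights, points))

def pattern_search(f, x0, step=0.5, tol=1e-3, max_evals=200):
    """Derivative-free local search: try +/- step on each axis, halve step if nothing improves."""
    x, fx, evals = list(x0), f(x0), 1
    while step > tol and evals < max_evals:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                ft = f(trial)
                evals += 1
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2
    return x, fx, evals

calls = {"n": 0}

def expensive_objective(theta):
    """Hypothetical stand-in for the slow objective; counts how often it is called."""
    calls["n"] += 1
    return (theta[0] - 1.0) ** 2 + 2.0 * (theta[1] + 0.5) ** 2

# 1. Evaluate the true objective on a small design (a plain 5x5 grid here).
axis = [-2.0, -1.0, 0.0, 1.0, 2.0]
design = [list(p) for p in itertools.product(axis, repeat=2)]
values = [expensive_objective(p) for p in design]

# 2. Fit the surrogate and minimize it over a dense grid of cheap candidates.
surrogate = fit_rbf(design, values)
fine = [-2.0 + 0.1 * k for k in range(41)]
x_start = min((list(p) for p in itertools.product(fine, repeat=2)), key=surrogate)

# 3. Refine on the true objective, starting from the surrogate's minimizer.
x_best, f_best, n_local = pattern_search(expensive_objective, x_start, step=0.25, tol=1e-2)
```

The point of the split is that step 2 never touches the expensive function, so the only costly work is the initial design (which can be evaluated in parallel) plus the short local refinement in step 3.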
Alternatively, you could look into something like the TikTak algorithm, which was explicitly designed for this type of parallelism.
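For a rough picture of what a TikTak-style multistart looks like, here is a simplified sketch in plain Python. It is not the reference implementation. As I understand the method, it has a global stage (evaluate many quasi-random points, keep the best few) followed by local searches whose starting points are pulled progressively toward the best optimum found so far. The assumptions here: uniform random draws stand in for a Sobol sequence, a basic pattern search stands in for the local optimizer, the weight schedule for `theta` is illustrative, bounds are not enforced during the local searches, and `objective` is a hypothetical multimodal test function rather than an SMM criterion.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def objective(theta):
    """Hypothetical multimodal stand-in (Rastrigin function) for the real criterion."""
    return sum(t * t - 10.0 * math.cos(2.0 * math.pi * t) + 10.0 for t in theta)

def local_search(f, x0, step=0.25, tol=1e-3, max_evals=300):
    """Pattern search: try +/- step on each axis, halve step if nothing improves."""
    x, fx, evals = list(x0), f(x0), 1
    while step > tol and evals < max_evals:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                ft = f(trial)
                evals += 1
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2
    return x, fx

def multistart(f, lower, upper, n_global=200, n_local=10, workers=4, seed=0):
    rng = random.Random(seed)
    # Global stage: uniform random draws (a stand-in for a Sobol sequence).
    samples = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
               for _ in range(n_global)]
    # These evaluations are independent, so this map is where parallel workers help.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        values = list(pool.map(f, samples))
    # Keep the n_local best sampled points, best first.
    ranked = sorted(zip(values, samples))[:n_local]

    # Local stage: the first search starts from the best sampled point.
    best_x, best_f = local_search(f, ranked[0][1])
    for i, (_, s) in enumerate(ranked[1:], start=1):
        # Illustrative schedule: later starts lean more on the best point so far.
        theta = min(max(0.1, math.sqrt(i / n_local)), 0.995)
        start = [theta * b + (1.0 - theta) * si for b, si in zip(best_x, s)]
        x, fx = local_search(f, start)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f, min(values)

best_x, best_f, best_sampled = multistart(objective, [-5.12, -5.12], [5.12, 5.12])
```

With an hour-long objective, the global stage is the part you would spread across workers or cluster nodes. A thread pool is used here only to keep the sketch short; for a CPU-bound objective you would want separate processes or machines instead.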