Hi,
I’m currently working on solving a coupled PDE-ODE system, which has been discretized into a system of ODEs. We then add noise in the form of a stochastic process, which converts the ODE system into a system of random ODEs (RODEs). I’m solving the system in Julia using the built-in ImplicitEuler() method. Furthermore, due to the large number of equations, I am using multithreading, namely the @threads macro, to parallelize the for loop that generates the right-hand side and Jacobian of the system. I also limit the number of BLAS threads to the number of threads used for the parallel loops.
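A minimal sketch of this kind of setup, for illustration only and not the actual model: the function name `rhs!`, the 1D diffusion stencil, the zero-flux boundary, and the parameter names `D`, `dx`, and `noise` are all hypothetical. It uses only the standard library and shows a threaded in-place right-hand side together with the BLAS thread limit.

```julia
using LinearAlgebra
using Base.Threads

# Match the BLAS thread count to the number of Julia threads.
BLAS.set_num_threads(nthreads())

# Hypothetical right-hand side: a 1D diffusion stencil plus a scalar noise value.
# Each iteration writes only to du[i], so the iterations do not write to shared state.
function rhs!(du, u, p, t)
    n = length(u)
    @threads for i in 1:n
        left  = i == 1 ? u[i] : u[i-1]   # zero-flux boundary (assumption)
        right = i == n ? u[i] : u[i+1]
        du[i] = p.D * (left - 2u[i] + right) / p.dx^2 + p.noise
    end
    return nothing
end

u  = collect(range(0.0, 1.0; length = 8))
du = similar(u)
rhs!(du, u, (D = 1.0, dx = 0.1, noise = 0.0), 0.0)
```

In a loop of this shape, any buffer that is shared between iterations (for example a scratch array reused inside the loop body) would be a candidate for thread-count-dependent results, which is why the sketch keeps all temporaries local to each iteration.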
I have been running into strange behaviour when solving the system. For the RNG, I use the same seed across all simulations, regardless of thread count, so that the simulations are the same each time. When I run the simulation with different numbers of threads, set with the --threads flag when launching Julia from the command line, some simulations crash with a SingularException(0) and others never finish: the progress bar, which I use for all simulations, stops advancing while its ETA keeps increasing. Interestingly, when running single-threaded, the simulations never fail or hang. The issue persists across different workstations with different architectures:
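A small sketch of the seeding and thread-count setup described above, under stated assumptions: the seed value `1234`, the path length, and the name `noise_path` are hypothetical, and drawing the whole noise path serially before the solve is just one way to keep the samples independent of how loop iterations are split across threads.

```julia
using Random
using Base.Threads

# Launch with e.g. `julia --threads 4 script.jl`; nthreads() reports the count in effect.
println("Julia threads: ", nthreads())

# Hypothetical seed. The noise is drawn once, serially, from an explicitly seeded
# generator, so the samples do not depend on the number of threads.
rng = MersenneTwister(1234)
noise_path = randn(rng, 1000)

# A second generator with the same seed reproduces the same samples.
rng_check = MersenneTwister(1234)
same_samples = noise_path == randn(rng_check, 1000)
```

If random numbers were instead drawn inside the threaded loop from a shared generator, the order of draws could change with the thread count, so identical seeds would not guarantee identical noise.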
Machine 1: Intel Xeon E5-1660 and 32 GB of RAM, running Ubuntu 18.04
Machine 2: AMD Ryzen 5 3600X and 16 GB of RAM, running Ubuntu 20.04
For one class of simulations, both machines completed the run with thread counts 1 and 4. With thread count 2, machine 1 failed (singular exception) while machine 2 completed. With thread count 6, both machines got stuck in a loop. These results were the same when I re-ran the simulations.
For a simulation with a different RNG seed, all runs passed except machine 2 with thread count 6.
So I am wondering whether anyone else has run into this issue and whether there are any possible solutions.