Use a proper benchmarking tool (e.g. BenchmarkTools).
This code is way too small for threading to be useful.
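For reference, a minimal sketch of what benchmarking with BenchmarkTools could look like. The function `sum_squares` and the input size are made up for illustration and are not from the original post; only `@btime`, `@benchmark`, and the `$` interpolation come from BenchmarkTools.

```julia
using BenchmarkTools

# Hypothetical workload, used only to have something to measure.
function sum_squares(x)
    s = zero(eltype(x))
    for v in x
        s += v * v
    end
    return s
end

x = rand(1_000)

# Interpolate the global `x` with `$` so the benchmark measures the function
# itself rather than the cost of accessing an untyped global variable.
@btime sum_squares($x)

# `@benchmark` collects many samples and reports their distribution
# (minimum, median, mean, allocations) instead of a single number.
trial = @benchmark sum_squares($x)
display(trial)
```

Unlike a single `@time` call, this runs the function many times and excludes compilation from the reported timings, which matters a lot for code this small.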
This might not benefit from threading: the operation is so simple that the bottleneck is likely memory bandwidth rather than instruction execution.
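One way to check this is to time a serial and a threaded version of the same trivial loop at a small and a large size. This is only a sketch: the functions `scale_serial!` and `scale_threaded!` and the sizes are hypothetical, and the outcome depends on the machine and on how many threads Julia was started with (e.g. `julia -t 4`).

```julia
using BenchmarkTools
using Base.Threads

# Serial version: one multiplication per element loaded and stored.
function scale_serial!(y, x, a)
    for i in eachindex(x, y)
        @inbounds y[i] = a * x[i]
    end
    return y
end

# Threaded version of the same loop; iterations are independent.
function scale_threaded!(y, x, a)
    @threads for i in eachindex(x, y)
        @inbounds y[i] = a * x[i]
    end
    return y
end

println("threads: ", nthreads())

# Compare both versions for a tiny and a large array (sizes are arbitrary).
for n in (1_000, 10_000_000)
    x = rand(n)
    y = similar(x)
    t_serial = @belapsed scale_serial!($y, $x, 2.0)
    t_threaded = @belapsed scale_threaded!($y, $x, 2.0)
    println("n = $n  serial = $t_serial s  threaded = $t_threaded s")
end
```

For the small array, the cost of starting and synchronizing tasks would be expected to dominate. For the large one, each element needs only one multiplication per load and store, so any speedup is limited by how fast memory can deliver the data.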
Indeed, in this case as well, the threaded code is slower than the unthreaded code. I am only using this simple case to observe the performance; I have larger code where I see the same behaviour. Do you have a suggestion for a better way to parallelize independent for loops?
Threads are fine. You should benchmark your real code to see whether threading gives you better performance. If it doesn't (as you said), there may be something wrong with your code. If you show us the code, we can try to give a more concrete suggestion.
Thank you @oheil. I will benchmark and see whether I can find what is causing this issue before posting my larger code here. Also, just for reference, I found a discussion of a similar issue here.