The default dynamic scheduling has many advantages. For example, it enables composable multithreading (e.g. an @threads block inside another @threads block), which you will almost certainly sacrifice if you manually specify a static task-to-thread mapping.
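To make "composable" concrete, here is a minimal sketch of a nested @threads loop. It assumes a Julia version where dynamic scheduling is the default for @threads (1.8 or later), and the function name nested_fill is made up for illustration only.

```julia
using Base.Threads

# Hypothetical helper: fills an n-by-m matrix with i * j using nested @threads loops.
function nested_fill(n, m)
    A = zeros(Int, n, m)
    @threads for i in 1:n          # outer parallel loop
        @threads for j in 1:m      # inner parallel loop, nested inside the outer one
            A[i, j] = i * j        # each (i, j) is written by exactly one iteration
        end
    end
    return A
end

A = nested_fill(3, 4)
```

With the default scheduler the inner loop simply creates more tasks for the runtime to distribute. With @threads :static the same nesting is not allowed.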
I’m just a beginner in parallelization, so I don’t understand all that you’ve said, but thank you for letting me know about these two options. Generally, it is much easier for me to learn through examples than by just reading documentation.
In light of what I said above: Why, and are you sure?
I am not sure, but in this problem of mine I implemented a parallelized accumulate2!, where each thread picks a vertex in the graph, checks all its (left) neighbors, and processes those that have already finished. My intent was for the threads to move tightly from left to right (if execution times are not too different), so that each time, most of the left neighbors have already finished and are ready for processing.
So - for some reason that is unclear to me - you seem to want
1 => 1, 2 => 2, 3 => 3, 4 => 1, 5 => 2, ...?
No, I would just like the first 6 (= nthreads()) numbers on the left to be a permutation of 1:6, the next 6 numbers on the left to be a permutation of 7:12, etc., as I mentioned. In short, I’d like each thread to pick the first available element.
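One way to get "each worker picks the first available element" is a shared atomic counter that every task increments to claim its next index. The sketch below is only an illustration under that assumption, not the accumulate2! from this thread. The names process_in_order, ntasks, and claimed_by are invented here. It records the task number rather than threadid(), since a task may migrate between threads.

```julia
using Base.Threads

# Hypothetical sketch: ntasks workers repeatedly claim the smallest unclaimed index.
function process_in_order(f, n; ntasks = nthreads())
    next = Atomic{Int}(1)            # next unclaimed index, shared by all tasks
    claimed_by = zeros(Int, n)       # which task handled each index
    @sync for t in 1:ntasks
        @spawn begin
            while true
                i = atomic_add!(next, 1)   # returns the old value, so i is unique per claim
                i > n && break
                claimed_by[i] = t
                f(i)                       # user work for element i
            end
        end
    end
    return claimed_by
end

hits = zeros(Int, 20)
owners = process_in_order(i -> (hits[i] += 1), 20; ntasks = 4)
```

Indices are claimed in increasing order, so when a task claims index i, at most ntasks - 1 smaller indices can still be in progress. This does not guarantee that a particular left neighbor has finished, so the check for finished neighbors is still needed.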