Behavior of threads

I think @sync is not actually needed in the first code fragment?

I elaborated these ideas in an application. The task-based algorithm does not work very well: some of the tasks tend to take quite a bit longer than others. Perhaps you are onto something with the spawning?

The threaded version actually works quite well.

Some scaling data is provided in this thread: Parallel assembly of a finite element sparse matrix - #19 by PetrKryslUCSD