I was being sloppy in my language. I was speaking from a mathematical point of view. You can of course run parallel instances of Mersenne Twister, and they can be initialized with different seeds. But this does not mean that you will get independent streams of random deviates.
In an ideal world, of course, it would. But whether streams from the Mersenne Twister algorithm are independent depends heavily on what the individual seeds are. For many applications you will get away with it. But for many others you will get unacceptable correlations across streams. MT has a very high-dimensional internal state, but the conditions that lead to correlations across instances are understudied (apart from the obvious case of identical seeds).
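To make the setup concrete, here is a minimal sketch in Python, whose standard `random.Random` class is a Mersenne Twister (MT19937) in CPython. The seeds `1` and `2`, the stream length `N`, and the helper names `mt_stream` and `pearson` are my own choices for illustration. It shows the obvious failure (identical seeds give identical streams) and a crude pairwise check. Note that a small sample correlation between two streams is only a spot check; it says nothing about subtler dependencies or about all pairs among thousands of streams.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def mt_stream(seed, n):
    """Draw n uniform deviates from a private Mersenne Twister instance."""
    rng = random.Random(seed)  # CPython's random.Random is MT19937
    return [rng.random() for _ in range(n)]

N = 10_000  # illustrative length, far shorter than a real production run
a = mt_stream(1, N)
b = mt_stream(2, N)
a_again = mt_stream(1, N)

# Identical seeds reproduce the same stream exactly.
same_seed_identical = (a == a_again)

# Different seeds give different numbers, but this single statistic
# cannot prove that the two streams are independent.
cross_corr = pearson(a, b)
print(same_seed_identical, cross_corr)
```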
In my own application, I use 10,000 parallel streams with 10^9 calls to each. With 10,000 separate initial conditions (seeds) and such long sequences of calls, it is very hard to guarantee that all streams are uncorrelated with one another. That’s why I went for Random123 instead.
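Random123 is a family of counter-based generators: each output is a pure function of a key (which stream) and a counter (which draw), so streams are distinguished by key rather than by hoping that seeded states never overlap. The sketch below only illustrates that idea. It is not Random123 and does not use its Philox or Threefry functions; as a stand-in mixing function I assume BLAKE2b from Python's `hashlib`, and the names `counter_rng` and `stream` are hypothetical.

```python
import hashlib
import struct

def counter_rng(key: int, counter: int) -> float:
    """Uniform deviate in [0, 1) as a pure function of (key, counter).

    Stand-in for a counter-based generator: BLAKE2b is used here only
    as an illustrative mixing function, not as Random123's algorithm.
    """
    msg = struct.pack("<QQ", key, counter)          # two unsigned 64-bit ints
    digest = hashlib.blake2b(msg, digest_size=8).digest()
    (bits,) = struct.unpack("<Q", digest)
    return (bits >> 11) / float(1 << 53)            # keep the top 53 bits

def stream(key: int, n: int, start: int = 0) -> list:
    """Draws start .. start+n-1 of the stream identified by `key`."""
    return [counter_rng(key, start + i) for i in range(n)]

# One key per parallel stream; no per-stream state has to be stored.
streams = {key: stream(key, 5) for key in range(3)}

# Any draw can be computed directly, without generating its predecessors.
late_draw = counter_rng(7, 10**9 - 1)
print(streams[0], late_draw)
```

Because there is no hidden state, a worker needs only its key and the current counter value, which also makes it easy to reproduce or resume an individual stream.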