What exactly are you trying to accomplish? Usually for group operations a groupby à la split-apply works pretty well and handles memory quite well.
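For reference, a minimal sketch of the split-apply-combine approach with DataFrames.jl (the `:group` and `:value` columns here are hypothetical placeholders for whatever your data looks like):

```julia
using DataFrames, Statistics

# Hypothetical data: a grouping key and a value column
df = DataFrame(group = rand(1:100, 10^6), value = rand(10^6))

# Split once by the key, then apply an aggregation per group
gd = groupby(df, :group)
result = combine(gd, :value => mean => :value_mean)
```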
You could try it, but the more columns you have, the more work it would have to do. Rearranging every row of the whole dataframe should be faster, as it only has to apply the permutation once and it writes in place. Column-wise might be a last resort if memory is not enough.
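If you do go the in-place route, a sketch of applying one precomputed permutation to every column (assuming the goal is to sort by a column `:a`; `eachcol` yields the stored vectors, so `permute!` mutates the dataframe without allocating a full copy of the data):

```julia
using DataFrames

df = DataFrame(a = rand(10^6), b = rand(1:10, 10^6))

# Permutation that sorts by column :a, computed once
p = sortperm(df.a)

# Apply the same permutation in place to each column
for col in eachcol(df)
    permute!(col, p)
end
```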
I don’t think “apply the permutation only once” is correct here. Under the hood, indexing a data frame implies indexing each column separately.
I suspect the permute!-based sorting of a DataFrame is slower just because permuting a vector is slower than indexing it (i.e. allocating a new copy). The overhead related to DataFrame should be negligible here.
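A quick way to check that claim on a single vector, as a hedged sketch using BenchmarkTools (vector size and permutation are arbitrary here):

```julia
using Random, BenchmarkTools

n = 10^7
v = rand(n)
p = randperm(n)

# Indexing: allocates and fills a brand-new vector
@btime $v[$p];

# permute!: rearranges the existing vector in place (cycle following),
# saving the copy but with a less regular memory access pattern
@btime permute!(w, $p) setup = (w = copy($v)) evals = 1;
```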
Say, the computation I want to perform with the permuted dataframe would be faster if all the columns are permuted as well. This is for “cache-efficiency”, as the next part of the program requires me to go through the vectors several times in order.