Why is `sort(x, by = _ -> rand())` not a good shuffler?

I guess the problem is precisely that the result of `rand` changes on every call.

If you create a random column once and sort by that column, it should be fine.
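Here is a minimal sketch of that idea, assuming a plain vector `x`; the names `random_keys` and `shuffled` are just for illustration. Each element gets one key that stays fixed for the whole sort, so every comparison is consistent with every other:

```julia
using Random

x = collect(1:10)

# Draw one random key per element up front; the keys do not change during the sort.
random_keys = rand(length(x))

# Reorder x according to the sorted order of its fixed keys.
shuffled = x[sortperm(random_keys)]

# For comparison, the standard library already provides a shuffle.
shuffled_stdlib = shuffle(x)
```

In practice `shuffle(x)` from `Random` is probably the simpler choice; the fixed-key version is only here to show what makes the sort-based approach valid.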

But since `rand` gives you something new on each comparison in the sort algorithm, the element does a “random walk”: sometimes it moves toward the front and sometimes it moves back, and on average it ends up more or less where it started.
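One way to check this empirically is to track where the element that starts in position 1 ends up over many trials. This is only a rough sketch: the helper `first_element_positions` is hypothetical, and the exact frequencies will depend on which sorting algorithm Julia picks for the input size.

```julia
# Estimate how often the element starting in position 1 lands in each position
# when the sort key is re-drawn on every comparison.
function first_element_positions(n, trials)
    counts = zeros(Int, n)
    for _ in 1:trials
        y = sort(collect(1:n), by = _ -> rand())
        counts[findfirst(==(1), y)] += 1
    end
    return counts ./ trials
end

freqs = first_element_positions(10, 10_000)
```

A fair shuffle would give frequencies close to `1/n` in every position, so any strong skew in `freqs` points to the bias described above.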
