I’m trying to create a power set function with only set operations, but it turns out to be much slower than the array version. I wonder why this is the case, and how to make the Set version as fast as the array version.
You can profile it to see what’s slow. Before looking at the code, I would guess that set union is slower than array concatenation because set union also requires checking for duplicates.
The expensive part is not actually checking for duplicates, but because Set is implemented as a hashmap, each insertion into a set needs to calculate the hash of the item first. Hashing a set is currently O(n), since the overall hash depends on each element of the set. OTOH, appending to an array is amortized to only be O(log(n)). I wonder whether we can use a similar strategy for hashing sets as we do for arrays, where for large arrays we only actually sample O(log(n)) items. That might be worth opening an issue actually.