What's the best way to compare items in an array to every other item?

The very first thing I would do before any of this is profile against the actual data. Unfortunately, the worst case for your problem is O(n^2), because every item may have to be compared with every other item.
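To make the quadratic cost concrete, here is a minimal Python sketch of the brute-force approach. The names `compare_all_pairs` and `compare` are made up for illustration, and `compare` stands in for whatever difference metric you actually use. Each unordered pair is visited once, so n items cost n*(n-1)/2 comparisons.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def compare_all_pairs(
    items: Sequence[T], compare: Callable[[T, T], R]
) -> List[Tuple[T, T, R]]:
    """Apply `compare` to every unordered pair of items exactly once.

    `compare` is a placeholder for your own metric (hypothetical here).
    For n items this makes n * (n - 1) / 2 calls, i.e. O(n^2).
    """
    return [(a, b, compare(a, b)) for a, b in combinations(items, 2)]
```

Timing this baseline on a representative sample of the real data is a cheap way to find out whether the comparison function or the number of pairs dominates.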

Yes, this could help if there are a lot of duplicates.

Also, it could be worthwhile to invest in a string comparison function that bails out early as soon as the difference metric is guaranteed to exceed some threshold.
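As a sketch of what such an early exit could look like, assuming the metric is Levenshtein edit distance (the question may well use a different one) and using a made-up function name `bounded_edit_distance`: the minimum value in a row of the dynamic-programming table never decreases from one row to the next, so once every entry in a row is above the threshold the final distance must be too.

```python
from typing import Optional


def bounded_edit_distance(a: str, b: str, threshold: int) -> Optional[int]:
    """Return the Levenshtein distance between a and b, or None if it
    is greater than `threshold`. (Illustrative helper, not a library API.)
    """
    # The length difference is a lower bound on the edit distance,
    # so this check can reject a pair without any table work.
    if abs(len(a) - len(b)) > threshold:
        return None

    # previous[j] = distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute or match
                )
            )
        # Row minima are non-decreasing, so the final answer cannot be
        # smaller than the smallest value in this row: bail out early.
        if min(current) > threshold:
            return None
        previous = current

    distance = previous[-1]
    return distance if distance <= threshold else None
```

For example, `bounded_edit_distance("kitten", "sitting", 3)` returns 3, while the same call with a threshold of 2 returns `None`. With a small threshold most dissimilar pairs are rejected after a few rows, or by the length check alone.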