Optimization tips for my Julia code. Can I make it even faster and/or more memory-efficient?

This optimized version seems to have introduced a bug. Try Kmeans(X, 7) to reproduce it.