"There has been many improvements that has been added to this kind of algorithm [KNN], like adjusting for the different “average” score that each user gives to the items (i.e. they compare the deviations from user’s averages rather than the raw score itself).
Still, they are very far from today's methods. The problem with KNN is that it doesn't let us detect the hidden structure that is in the data, namely that users may be similar to one pool of other users along one dimension, but similar to a different set of users along another dimension. For example, someone who loves machine learning books shares some "similarity" with other readers of machine learning books on certain hidden characteristics (e.g. liking equation-rich books versus more discursive ones), while for plant books the similarity with other plant-reading users would be based on completely different hidden features (e.g. loving photos, or nice tabular descriptions of plants).
And then, a few units later, they teach how to implement "collaborative filtering" using Generative (or "Gaussian", as mostly Gaussian components are used) Mixture Models (which I implemented in Julia here).
But my question is: why would a GMM be better than KNN at "finding the hidden structure"? In the end we still just say that this observation is close to characteristic k1 with probability p_k1, to this other characteristic k2 with p_k2, …
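To pin down what those p_k are, here is a minimal sketch of how a GMM turns a point into soft component memberships via Bayes' rule (the two spherical components, their parameters, and the test point are all made-up values for illustration, not fitted ones):

```python
import numpy as np

# Hypothetical parameters: K=2 spherical Gaussian components in 2-D.
mu = np.array([[0.0, 0.0], [3.0, 3.0]])   # component means
sigma2 = np.array([1.0, 1.0])             # component variances
weights = np.array([0.5, 0.5])            # mixing proportions

def responsibilities(x):
    """p_k = P(component k | x) for a spherical GMM, via Bayes' rule."""
    d = mu.shape[1]
    sq_dist = ((x - mu) ** 2).sum(axis=1)
    dens = weights * (2 * np.pi * sigma2) ** (-d / 2) \
        * np.exp(-sq_dist / (2 * sigma2))
    return dens / dens.sum()  # normalise so the p_k sum to 1

# A point near the first component gets p_1 close to 1.
print(responsibilities(np.array([0.5, 0.5])))
```

So the output is indeed a vector (p_k1, p_k2, …) of closeness scores, which is what makes it look superficially similar to a weighted-neighbour scheme; the difference I'm asking about is what the fitted components buy us over raw neighbours.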