The importance of recommender systems
We live in a world in which recommendations are so common that we tend to forget the apparent ease with which these systems, designed to optimize our consumer choices, have been incorporated into practically every device and platform. We're used to constantly receiving recommendations, but the way they have settled into our daily lives is a long and interesting story marked by ups and downs, the ambition of a few pioneer companies, and some other dreamers.
Two different ways: items vs. persons
First of all, let's start with the basics. A recommender system, as you may be aware, is a system that tries to predict the value a user would assign to a certain piece of content. To calculate these predictions, these engines combine data models of different natures.
The first recommender systems were content-based: they focused on understanding the product itself rather than on knowing the user. Basically, these content-based filtering systems build a specific profile for each piece of content and then compute correlations over the resulting data. In this way, they are able to predict certain behavioral patterns based on existing connections between similar products and users' consumption trends. But a problem arises: with a constant production of new content, the amount of data to be analyzed tends to grow exponentially, so the effort required for those predictions becomes more and more excessive.
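To make the idea concrete, here is a minimal sketch of content-based filtering. The item names, feature categories, and weights are entirely hypothetical; the point is only that each item gets a feature profile and similarity between profiles drives the recommendations.

```python
import math

# Hypothetical item profiles: feature weights per item.
profiles = {
    "movie_a": {"action": 0.9, "comedy": 0.1, "drama": 0.2},
    "movie_b": {"action": 0.8, "comedy": 0.3, "drama": 0.1},
    "movie_c": {"action": 0.1, "comedy": 0.2, "drama": 0.9},
}

def cosine(p, q):
    """Cosine similarity between two sparse feature profiles."""
    keys = set(p) | set(q)
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in keys)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def most_similar(item, profiles):
    """Rank all other items by similarity to the given one."""
    return sorted(
        (other for other in profiles if other != item),
        key=lambda other: cosine(profiles[item], profiles[other]),
        reverse=True,
    )
```

Note that the cost here is driven entirely by the item catalog: every new item means a new profile and a new row of comparisons, which is exactly the scaling pressure described above.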
Then collaborative filtering recommender systems emerged. Such systems are based on the observation of user behavior, assuming that predictions can be made by taking into account each user's past choices and their resemblance to those of other users. As you might suppose, this approach also requires a large data collection effort, but to reduce the amount of information, it classifies users into specific groups based on factors like demographic data and behavioral patterns. However, it remains an enormous task, and certain problems derive from weaknesses and potential artifacts resulting from the irrational behavior of users or from the data collection routine itself.
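A toy sketch of the collaborative idea might look like the following. The users, ratings, and the particular similarity measure are all illustrative assumptions, not a real production algorithm: a user's unknown rating is predicted as a similarity-weighted average of other users' ratings for that item.

```python
# Hypothetical user-item ratings (1-5 stars).
ratings = {
    "alice": {"item1": 5, "item2": 3, "item3": 4},
    "bob":   {"item1": 4, "item2": 3, "item3": 5},
    "carol": {"item1": 1, "item2": 5},
}

def similarity(u, v):
    """Crude similarity: inverse mean rating gap on co-rated items."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    diff = sum(abs(ratings[u][i] - ratings[v][i]) for i in common)
    return 1.0 / (1.0 + diff / len(common))

def predict(user, item):
    """Similarity-weighted average of other users' ratings for item."""
    num = den = 0.0
    for other in ratings:
        if other != user and item in ratings[other]:
            w = similarity(user, other)
            num += w * ratings[other][item]
            den += w
    return num / den if den else None
```

With millions of users, real systems replace this pairwise loop with the user grouping (clustering) mentioned above, precisely to keep the comparison workload manageable.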
Let's look at some of these problems. One of the best known is the so-called "cold start," which refers to the information gap at the beginning of the user experience: with no history yet, the system has nothing on which to base its predictions. Another typical issue is sparsity: since the number of objects in the collection is so large, users can only rate a small fraction of it, so the system tends to favor mainstream items to the detriment of niche content. In addition, there is the already mentioned difficulty of scalability, which involves an excessive computational workload as the number of items keeps growing. To address these potential issues, hybrid systems have been created to build a comprehensive model, under the belief that combining the properties of both approaches might yield reliable predictions; but the truth is that it is still relatively expensive to maintain a reasonably reliable system able to accurately predict user behavior.
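Sparsity is easy to quantify. Using made-up but plausible figures (the counts below are assumptions for illustration), the fraction of the user-item matrix that actually holds a rating is tiny:

```python
# Hypothetical catalog and audience sizes.
n_users = 1_000_000
n_items = 50_000
ratings_per_user = 20  # assumed average ratings per user

observed = n_users * ratings_per_user
possible = n_users * n_items
sparsity = 1 - observed / possible
print(f"{sparsity:.2%} of the user-item matrix is empty")
```

With only 20 ratings per user against a 50,000-item catalog, 99.96% of the matrix is empty, which is why niche items rarely accumulate enough signal to be recommended.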
Searching for perfection. Some failure cases
A good example of failure might be the famous Netflix Prize, in which the company offered $1 million to whoever found the best collaborative filtering algorithm. Broadly speaking, the aim of the competition was to reduce a value called the root-mean-square error, which basically expresses the difference between what the system predicts and what users actually do. Many development teams from all over the world invested a lot of money and effort in the form of hard work and, ultimately, wasted time, and the winning solution, found in 2009 by a team called BellKor's Pragmatic Chaos, was never used. According to Netflix, the accuracy gained didn't justify the enormous effort required to operate the system.
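The root-mean-square error the contest targeted is a standard metric; a short sketch (with made-up ratings) shows how it compares predicted against observed values:

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predicted and observed ratings."""
    assert len(predicted) == len(actual) and actual
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

# Hypothetical predictions vs. actual star ratings.
error = rmse([3.5, 4.2, 2.0], [4, 4, 3])
```

Because the errors are squared before averaging, a few badly missed predictions dominate the score, which is part of why squeezing out the last fractions of a point demanded such disproportionate effort.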
This situation proved that, in most cases, a system that works reasonably well in practice is more useful than one that chases perfection. In addition, certain findings from current research on recommender systems should be considered. In this regard, Wharton professor Kartik Hosanagar has conducted research on the operational quality of recommender systems and has reached some interesting conclusions. For example, his studies have shown that recommendations don't necessarily help users discover niche products, so appropriate modifications would be needed if that is the ultimate goal. He also found a close relationship between ratings and recommendations that is not always as expected, as evidenced by the fact that users sometimes respond even better to items with lower ratings. So, as you can see, to scalability (and cold start and sparsity) we should add a very basic "inconvenience": human beings are unpredictable.
The Commons recommender system
Starting from the premise that content grows at an almost irrational rate while the number of users is a much more stable figure, we have developed a recommender system essentially based on people. Commons is a people-oriented recommender that compares people who share similarities with one another.
An initial test places each individual in the matrix, and the system then refines their relations to the others. Basically, we replace items with people and, under this premise, work on the other factors. It is a simplification, but above all it is very practical and helpful.
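The article doesn't detail Commons's internals, so the following is only a hypothetical sketch of the general idea: each person's initial test becomes a vector of answers, and closeness between vectors determines whose choices flow to whom. The names, answer scales, and similarity measure are all illustrative assumptions.

```python
# Hypothetical answer vectors from each person's initial test.
people = {
    "p1": [4, 2, 5, 1],
    "p2": [5, 1, 4, 2],
    "p3": [1, 5, 2, 4],
}

def closeness(a, b):
    """Inverse mean absolute difference between two answer vectors."""
    diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + diff)

def nearest(person):
    """The most similar other person, the source of recommendations."""
    return max(
        (other for other in people if other != person),
        key=lambda other: closeness(people[person], people[other]),
    )
```

The key property is that the comparison space is bounded by the number of people rather than the ever-growing item catalog, which is the scalability argument made below.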
With the Commons system we address the major issues of recommender systems. On the one hand, by working with people, a far smaller and more stable set than the items, we can more easily handle scalability. By carefully analyzing each person, we avoid the cold start, and, since all recommendations come from individuals, we sidestep the sparsity problem. What about the unpredictability of people? Well, the good thing is that Commons is formed entirely by and for people.