Recommender system


A recommender system, also called a recommendation algorithm, recommendation engine, or recommendation platform, is a type of information filtering system that suggests items most relevant to a particular user. The value of these systems becomes particularly evident in scenarios where users must select from a large number of options, such as products, media, or content. Major social media platforms and streaming services rely on recommender systems that employ machine learning to analyze user behavior and preferences, thereby enabling personalized content feeds.
Typically, the suggestions relate to a variety of decision-making processes, such as which product to purchase, which music to listen to, or which online news source to read. Recommender systems are pervasive, with commonly recognised examples including playlist generators for video and music services, product recommenders for e-commerce platforms, and content recommenders for social media platforms and the open web. These systems can operate using a single type of input, such as music, or multiple inputs from diverse platforms, including news, books and search queries. Additionally, popular recommender systems have been developed for specific topics, such as restaurants and online dating services. Recommender systems have also been developed to explore research articles and experts, collaborators, and financial services.
A content discovery platform is a software recommendation platform that employs recommender system tools. It uses user metadata to identify and suggest relevant content, whilst reducing ongoing maintenance and development costs. A content discovery platform delivers personalized content to websites, mobile devices, and set-top boxes. A wide range of content discovery platforms currently exists for various forms of content, ranging from news articles and academic journal articles to television. As operators compete to serve as the gateway to home entertainment, personalized television emerges as a key service differentiator. Academic content discovery has recently become another area of interest, with the emergence of numerous companies dedicated to helping academic researchers keep up to date with relevant academic content and to facilitating serendipitous discovery of new content.

Overview

Recommender systems usually make use of either or both collaborative filtering and content-based filtering, as well as other systems such as knowledge-based systems. Collaborative filtering approaches build a model from a user's past behavior as well as similar decisions made by other users. This model is then used to predict items that the user may have an interest in. Content-based filtering approaches utilize a series of discrete, pre-tagged characteristics of an item in order to recommend additional items with similar properties.

Example

The differences between collaborative and content-based filtering can be demonstrated by comparing two early music recommender systems, Last.fm and Pandora Radio. The same methods are also applied in e-commerce, for example on platforms like Amazon.
  • Last.fm creates a "station" of recommended songs by observing what bands and individual tracks the user has listened to on a regular basis and comparing those against the listening behavior of other users. Last.fm will play tracks that do not appear in the user's library, but are often played by other users with similar interests. As this approach leverages the behavior of users, it is an example of a collaborative filtering technique.
  • Pandora uses the properties of a song or artist to seed a "station" that plays music with similar properties. User feedback is used to refine the station's results, deemphasizing certain attributes when a user "dislikes" a particular song and emphasizing other attributes when a user "likes" a song. This is an example of a content-based approach.
  • In e-commerce, Amazon's well-known "customers who bought X also bought Y" feature is a prime example of collaborative filtering. Amazon also uses content-based filtering when it recommends a book by an author the user has previously read, or a pair of shoes in a style similar to ones the user has viewed.
Each type of system has its strengths and weaknesses. In the above example, Last.fm requires a large amount of information about a user to make accurate recommendations. This is an example of the cold start problem, and is common in collaborative filtering systems. Whereas Pandora needs very little information to start, it is far more limited in scope.
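The content-based side of this comparison can be illustrated with a minimal sketch. It is not Pandora's actual method: the catalogue, the attribute tags, and the function names below are hypothetical, and cosine similarity is just one common choice for comparing attribute vectors. Likes add an item's attributes to a user profile, dislikes subtract them, and unseen items are ranked by similarity to that profile.

```python
from math import sqrt

# Hypothetical catalogue: each song is described by hand-tagged attributes.
ITEMS = {
    "song_a": {"rock": 1.0, "guitar": 1.0, "fast": 1.0},
    "song_b": {"rock": 1.0, "guitar": 1.0, "slow": 1.0},
    "song_c": {"jazz": 1.0, "piano": 1.0, "slow": 1.0},
    "song_d": {"jazz": 1.0, "saxophone": 1.0, "fast": 1.0},
}


def cosine(u, v):
    """Cosine similarity between two sparse attribute vectors (dicts)."""
    dot = sum(w * v.get(attr, 0.0) for attr, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_profile(feedback, items):
    """Sum item attributes, weighted +1 for a like and -1 for a dislike."""
    profile = {}
    for item, sign in feedback.items():
        for attr, weight in items[item].items():
            profile[attr] = profile.get(attr, 0.0) + sign * weight
    return profile


def recommend(feedback, items, top_n=2):
    """Rank items the user has not rated by similarity to the profile."""
    profile = build_profile(feedback, items)
    scores = {
        name: cosine(profile, attrs)
        for name, attrs in items.items()
        if name not in feedback
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# The user liked song_a and disliked song_c.
print(recommend({"song_a": 1, "song_c": -1}, ITEMS))
```

Note that the sketch needs only a single like or dislike to produce a ranking, but it can never recommend anything outside the tagged attribute space, which mirrors the trade-off described above.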

Alternative implementations

Recommender systems are a useful alternative to search algorithms since they help users discover items they might not have found otherwise. Of note, recommender systems are often implemented using search engines indexing non-traditional data. In some cases, as in the Gonzalez v. Google Supreme Court case, some may argue that search and recommendation algorithms are different technologies.
Recommender systems have been the focus of several granted patents, and there are more than 50 software libraries that support the development of recommender systems including LensKit, RecBole, ReChorus and RecPack.

History

The first recommender system, called Grundy, was created in 1979 as a way to recommend books that users might like. The idea was to create a system that asks users specific questions and classifies them into classes of preferences, or "stereotypes", depending on their answers. Depending on their stereotype membership, users would then get recommendations for books they might like.
Another early recommender system, called a "digital bookshelf", was described in a 1990 technical report by Jussi Karlgren at Columbia University, and implemented at scale and worked through in technical reports and publications from 1994 onwards by Jussi Karlgren, then at SICS, and research groups led by Pattie Maes at MIT, Will Hill at Bellcore, and Paul Resnick, also at MIT, whose work with GroupLens was awarded the 2010 ACM Software Systems Award.
Montaner provided the first overview of recommender systems from an intelligent agent perspective. Adomavicius provided a new, alternate overview of recommender systems. Herlocker provided an additional overview of evaluation techniques for recommender systems, and Beel et al. discussed the problems of offline evaluations. Beel et al. have also provided literature surveys on available research paper recommender systems and existing challenges.

Approaches

Collaborative filtering

One widely used approach to the design of recommender systems is collaborative filtering. Collaborative filtering is based on the assumption that people who agreed in the past will agree in the future, and that they will like similar kinds of items as they liked in the past. The system generates recommendations using only information about rating profiles for different users or items. By locating peer users or items with a rating history similar to the current user or item, the system generates recommendations from this neighborhood. This approach is a cornerstone for e-commerce sites that analyze the purchasing patterns of thousands of users to suggest what a given user might like. Collaborative filtering methods are classified as memory-based and model-based. A well-known example of memory-based approaches is the user-based algorithm, while that of model-based approaches is matrix factorization.
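As an illustration of the model-based family, the following is a minimal sketch of matrix factorization trained with stochastic gradient descent. The toy ratings, the hyperparameters (k, epochs, lr, reg) and the function names are assumptions chosen for the example, not values taken from any particular system. Each user and item gets a short vector of latent factors, and a rating is predicted as the dot product of the two vectors.

```python
import random

# Hypothetical toy data: (user index, item index, rating on a 1-5 scale).
RATINGS = [
    (0, 0, 5), (0, 1, 3), (0, 3, 1),
    (1, 0, 4), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 3, 5),
    (3, 0, 1), (3, 2, 5), (3, 3, 4),
]


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


def train_mf(ratings, n_users, n_items, k=2, epochs=2000,
             lr=0.01, reg=0.02, seed=0):
    """Learn k latent factors per user and item with regularized SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root-mean-square error over the observed ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


P, Q = train_mf(RATINGS, n_users=4, n_items=4)
print("training RMSE:", rmse(RATINGS, P, Q))
# Estimate a rating that was never observed (user 1, item 1).
print("predicted rating:", predict(P, Q, 1, 1))
```

Only observed ratings enter the loss, so the learned factors can fill in the missing cells of the rating matrix. A practical system would also hold out data to check for overfitting.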
A key advantage of the collaborative filtering approach is that it does not rely on machine-analyzable content, and is therefore capable of accurately recommending complex items such as movies without requiring an "understanding" of the item itself. Many algorithms have been used to measure user similarity or item similarity in recommender systems. Examples include the k-nearest neighbor (k-NN) approach and the Pearson correlation, as first implemented by Allen.
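A memory-based, user-based variant can be sketched by combining the two similarity ideas mentioned above. This is one textbook formulation under stated assumptions, not the implementation attributed to Allen: the ratings table and function names are hypothetical, similarity is the Pearson correlation over co-rated items, and the prediction is a similarity-weighted average of the k most similar users' deviations from their own mean rating.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "alice": {"m1": 5, "m2": 4, "m3": 1},
    "bob":   {"m1": 4, "m2": 5, "m3": 2, "m4": 5},
    "carol": {"m1": 1, "m2": 2, "m3": 5, "m4": 1},
    "dave":  {"m1": 5, "m2": 5, "m3": 1, "m4": 4},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation over the items both users have rated."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = mean(a[i] for i in common)
    mean_b = mean(b[i] for i in common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * \
        sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0


def predict(ratings, user, item, k=2):
    """Predict a rating from the k most similar users who rated the item."""
    sims = [
        (pearson(ratings[user], other_ratings), other)
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    # Keep only positively correlated neighbours, most similar first.
    neighbours = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]
    if not neighbours:
        return None
    user_mean = mean(ratings[user].values())
    num = sum(sim * (ratings[o][item] - mean(ratings[o].values()))
              for sim, o in neighbours)
    den = sum(sim for sim, _ in neighbours)
    return user_mean + num / den


print(predict(RATINGS, "alice", "m4"))
```

In this toy data, carol's tastes are the opposite of alice's, so she is excluded from the neighbourhood, and the prediction for "m4" is driven by bob and dave. Nothing about the content of the items is used.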
When building a model from a user's behavior, a distinction is often made between explicit and implicit forms of data collection.
Examples of explicit data collection include the following:
  • Asking a user to rate an item on a sliding scale.
  • Asking a user to search.
  • Asking a user to rank a collection of items from favorite to least favorite.
  • Presenting two items to a user and asking him/her to choose the better one of them.
  • Asking a user to create a list of items that he/she likes.
Examples of implicit data collection include the following:
  • Observing the items that a user views in an online store.
  • Analyzing item/user viewing times.
  • Keeping a record of the items that a user purchases online.
  • Obtaining a list of items that a user has listened to or watched on his/her computer.
  • Analyzing the user's social network and discovering similar likes and dislikes.
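Implicit signals such as those listed above are often aggregated into a single preference score per user and item before a model is built. The sketch below shows one simple way to do this. The event names and their weights are purely hypothetical assumptions, and real systems tune or learn such weights.

```python
from collections import defaultdict

# Hypothetical weights: stronger signals count for more.
EVENT_WEIGHTS = {"view": 1.0, "listen": 2.0, "purchase": 5.0}


def implicit_scores(events, weights=EVENT_WEIGHTS):
    """Aggregate (user, item, event kind) records into preference scores."""
    scores = defaultdict(lambda: defaultdict(float))
    for user, item, kind in events:
        # Unknown event kinds contribute nothing.
        scores[user][item] += weights.get(kind, 0.0)
    return {user: dict(items) for user, items in scores.items()}


# Hypothetical event log.
events = [
    ("u1", "shoes", "view"),
    ("u1", "shoes", "purchase"),
    ("u1", "hat", "view"),
    ("u2", "album", "listen"),
    ("u2", "album", "listen"),
]
print(implicit_scores(events))
```

The resulting scores can stand in for explicit ratings in a collaborative filtering model, with the caveat that the absence of an event does not necessarily mean the user dislikes the item.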
Collaborative filtering approaches often suffer from three problems: cold start, scalability, and sparsity.
  • Cold start: For a new user or item, there is not enough data to make accurate recommendations. Note: one commonly implemented solution to this problem is the multi-armed bandit algorithm.
  • Scalability: There are millions of users and products in many of the environments in which these systems make recommendations. Thus, a large amount of computation power is often necessary to calculate recommendations.
  • Sparsity: The number of items sold on major e-commerce sites is extremely large. The most active users will only have rated a small subset of the overall database. Thus, even the most popular items have very few ratings.
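The multi-armed bandit idea mentioned under cold start can be sketched with an epsilon-greedy strategy, which is only one of several bandit algorithms and is chosen here for brevity. The class name, the item names, and the simulated click rates are hypothetical. With probability epsilon the recommender shows a random item (exploration); otherwise it shows the item with the best observed click rate so far (exploitation).

```python
import random


class EpsilonGreedyRecommender:
    """Chooses which item to show when little or no history exists."""

    def __init__(self, items, epsilon=0.1, seed=None):
        self.items = list(items)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = {item: 0 for item in self.items}
        self.clicks = {item: 0 for item in self.items}

    def estimate(self, item):
        """Observed click rate; 0.0 for items never shown."""
        shown = self.shows[item]
        return self.clicks[item] / shown if shown else 0.0

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.items)       # explore
        return max(self.items, key=self.estimate)    # exploit

    def update(self, item, clicked):
        self.shows[item] += 1
        self.clicks[item] += int(clicked)


def simulate(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Run the recommender against hypothetical true click rates."""
    rec = EpsilonGreedyRecommender(true_rates, epsilon=epsilon, seed=seed)
    user = random.Random(seed + 1)  # separate stream for simulated users
    for _ in range(rounds):
        item = rec.select()
        rec.update(item, user.random() < true_rates[item])
    return rec


rec = simulate({"a": 0.1, "b": 0.6, "c": 0.3})
print(rec.shows)
```

Over many rounds the item with the highest true click rate should come to dominate the impressions, even though the recommender started with no data about any item.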
One of the most famous examples of collaborative filtering is item-to-item collaborative filtering, an algorithm popularized by Amazon.com's recommender system.
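The item-to-item idea can be sketched as follows. This is a simplified illustration and not Amazon's production algorithm: the purchase data and function names are hypothetical, and cosine similarity over the sets of buyers is assumed as the similarity measure. Items are compared with each other, rather than users with users, so the similarity table can be computed ahead of time and looked up when a recommendation is needed.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical purchase histories: user -> set of purchased items.
PURCHASES = {
    "u1": {"book", "lamp", "pen"},
    "u2": {"book", "pen"},
    "u3": {"lamp", "desk"},
    "u4": {"book", "pen", "desk"},
}


def item_similarities(purchases):
    """Cosine similarity between items, based on who bought them."""
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    sims = defaultdict(dict)
    for a in buyers:
        for b in buyers:
            if a == b:
                continue
            overlap = len(buyers[a] & buyers[b])
            if overlap:
                sims[a][b] = overlap / sqrt(len(buyers[a]) * len(buyers[b]))
    return sims


def also_bought(item, sims, top_n=2):
    """Items most often bought by the same customers as the given item."""
    return sorted(sims[item], key=sims[item].get, reverse=True)[:top_n]


sims = item_similarities(PURCHASES)
print(also_bought("book", sims, top_n=1))
```

The pairwise loop is quadratic in the number of items, which is acceptable for a sketch. A large catalogue would need to restrict the comparison to items that share at least one buyer.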
Many social networks originally used collaborative filtering to recommend new friends, groups, and other social connections by examining the network of connections between a user and their friends. Collaborative filtering is still used as part of hybrid systems. This technique can employ embeddings, a machine learning technique.