In this blog post, I explain in simple words how personalization works, why it can be beneficial and why it is, unfortunately, often considered creepy. Not surprisingly, Facebook plays a major role in this article. What exactly is the free lunch that Facebook serves, and could it be served in a more decent manner?
The original ambition of personalization, as stated back in the 1990s in the classic book Adaptive User Interfaces, is that not only should 'everyone be computer literate', but also that 'computers should be user literate'. In this early stage, we humans created 'mentalistic' models that represented our knowledge, interests, needs and goals in a way that could be interpreted by computers, but also by us. Gradually, these models have matured from hand-made and rather simple to statistical models based on large amounts of raw data.
A classic statistical approach to personalization is collaborative filtering, which still works in a very human-understandable way. In simple terms, collaborative filtering assumes that people who like similar things (such as books or movies) share a similar taste and will therefore also like other, similar things. Collaborative filtering first identifies those users that are most similar to you, and then recommends items that they like but that you haven't seen (or rated, or bought) yet. Indeed, this is the way Amazon (among others) works, and anyone who has experience with these recommendations knows that they are far from perfect.
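The two steps above, finding the most similar users and then scoring their unseen items, can be sketched in a few lines of code. This is a minimal illustration with invented ratings and cosine similarity; the user names, items and scores are made up, and real systems use far larger data and more refined similarity measures.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts (missing ratings count as 0)."""
    num = sum(a[i] * b.get(i, 0) for i in a)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def recommend(ratings, user, k=2):
    """Recommend items the user hasn't rated, weighted by the k most similar users."""
    neighbors = sorted(
        (u for u in ratings if u != user),
        key=lambda u: cosine(ratings[user], ratings[u]),
        reverse=True,
    )[:k]
    scores = {}
    for u in neighbors:
        sim = cosine(ratings[user], ratings[u])
        for item, rating in ratings[u].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)

ratings = {
    "alice": {"book_a": 5, "book_b": 4},
    "bob":   {"book_a": 5, "book_b": 5, "book_c": 4},
    "carol": {"book_a": 1, "book_c": 5, "book_d": 4},
}
print(recommend(ratings, "alice"))  # book_c ranks first: bob, who agrees with alice, likes it
```

Because bob's ratings closely match alice's, his unseen favorite (book_c) is recommended ahead of carol's picks, which is exactly the "people like you also liked" logic described above.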
Companies like Facebook and Google therefore use a different approach: based on as many observations (or data points) as they can collect (and store and process), their algorithms (which are far more complex and less transparent than good old collaborative filtering) try to predict which search results, friends' posts, page suggestions - and advertisements - will be relevant for us. These observations can be anything, including your user profile, previous search queries, clicks on friends' posts, participation in an online game, online purchases, the likes that you receive and give, and so on. Researchers like Jennifer Golbeck even think that far-fetched proxies such as liking a picture of curly fries are indicators of how intelligent you are (watch her entertaining TED Talk, it's nine minutes well spent). This data-driven approach arguably works better, but with the consequence that it becomes hard - but not as impossible as many companies would like us to believe - to explain why they think we will like these personalized results.
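To give a flavor of how many weak behavioral signals can be combined into a single relevance prediction, here is a toy logistic-model sketch. The feature names, weights and user are entirely invented; real systems learn millions of such weights from observed behavior, which is precisely why their predictions are so hard to explain.

```python
from math import exp

# Hypothetical learned weights: how much each observed signal raises or
# lowers the predicted relevance of, say, a technology-related post.
weights = {
    "liked_tech_pages": 1.2,
    "clicked_friend_tech_posts": 0.8,
    "liked_curly_fries_photo": 0.1,   # even odd proxies can get a small weight
    "hides_tech_posts": -1.5,
}
bias = -0.5

def relevance(user_signals):
    """Probability-like score that this user will engage with the item."""
    z = bias + sum(weights[s] * v for s, v in user_signals.items())
    return 1 / (1 + exp(-z))  # logistic function squashes the sum into (0, 1)

user = {"liked_tech_pages": 1, "clicked_friend_tech_posts": 1, "hides_tech_posts": 0}
print(round(relevance(user), 2))
```

Each individual signal says little on its own, but summed together they tip the score towards or away from showing the item, and with thousands of such features, no single "because you liked X" explanation fully captures the outcome.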
The Dutch-language version of this article can be found on the site of the Privacy & Identity Lab.
As early as 2011, Eli Pariser taught us in his TED Talk “Beware online filter bubbles” that our online lives largely take place within a filter bubble. Facebook automatically selects the items that will reach your news feed based on your click behavior, and Google search results are personalized based on, among other things, your current location and your search history. As a result, we mainly encounter information and opinions that match our own life philosophy.
In a similar fashion, traditional newspapers and other news outlets make a selection of the news items to be included. It is common knowledge that the New York Times has a liberal bias and Fox News a conservative bias, and that people usually choose a newspaper that matches their own orientation and interests. By contrast, little is known about political bias in smaller, regional newspapers or in the still growing number of news portals, including the Huffington Post, Yahoo News, CBS, but also the Breitbart News Network.
We carried out a study to identify political bias within the media in Chile and obtained some surprising results that are relevant for the media landscape in general and for our personal, personalized news consumption.
Starting today, I will post updates on my research work on my website. My blog posts will probably vary from longer or shorter summaries of recently accepted papers to rambling about my research field, which concerns the fine balance between the benefits of personalization and the perceived and actual risks associated with privacy. All blog posts are intended for a general, interested audience.
I am realistic enough to know that most blogs start off enthusiastically and then slowly bleed to death. Well, I am in the first phase, so do expect more new posts in the near future.
Privacy Engineering, User Modeling, Personalization, Recommendation, Web Usage Mining, Data Analysis and Visualization, Usability, Evaluation
Dr. Ir. Eelco Herder
Radboud Universiteit Nijmegen
Institute for Computing and Information Sciences
Mercator 1 - Room 03.01
6525 EC Nijmegen