In this blog post, I explain in simple words how personalization works, why it can be beneficial and why, unfortunately, it is often considered creepy. Not surprisingly, Facebook plays a major role in this article. What exactly is the free lunch that Facebook serves and could it be served in a more decent manner?

The original ambition of personalization, as stated back in the 1990s in the classic book Adaptive User Interfaces, is that not only 'everyone should be computer literate', but also that 'computers should be user literate'. In this early stage, we humans created 'mentalistic' models that represented our knowledge, interests, needs and goals in a way that could be interpreted by computers, but also by us. Gradually, these models have matured from hand-made and rather simple to statistical models based on a large amount of raw data.

A classic statistical approach to personalization is collaborative filtering, which still works in a very human-understandable way. In simple terms, collaborative filtering assumes that people who like similar things (such as books or movies) have a similar taste and therefore will also like other similar things. Collaborative filtering first identifies those users that are most similar to you, and then recommends items that they like but that you haven't seen (or rated, or bought) yet. Indeed, this is the way Amazon (among others) works, and anyone who has experience with these recommendations knows that they are far from perfect.
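To make the two steps above concrete, here is a minimal sketch of user-based collaborative filtering. It is an illustration under stated assumptions, not how Amazon or anyone else actually implements it: the ratings are made-up toy data, the function names are hypothetical, similarity is measured with cosine similarity over the items two users have both rated, and unseen items are ranked by a simple similarity-weighted sum of the neighbours' ratings.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"Dune": 5, "Neuromancer": 4, "Emma": 1},
    "bob": {"Dune": 5, "Neuromancer": 5, "Foundation": 4},
    "carol": {"Emma": 5, "Persuasion": 4, "Dune": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, k=2):
    """Rank items the user has not rated, based on the k most similar users."""
    own = ratings[user]

    # Step 1: identify the users that are most similar to this user.
    neighbours = sorted(
        (
            (cosine_similarity(own, other), name)
            for name, other in ratings.items()
            if name != user
        ),
        reverse=True,
    )[:k]

    # Step 2: score items the neighbours rated but the user has not seen.
    # Each rating counts in proportion to the neighbour's similarity.
    scores = {}
    for similarity, name in neighbours:
        for item, rating in ratings[name].items():
            if item not in own:
                scores[item] = scores.get(item, 0.0) + similarity * rating

    return sorted(scores, key=scores.get, reverse=True)


print(recommend("alice", ratings))  # ['Foundation', 'Persuasion']
```

In this toy data, Alice and Bob agree on two science-fiction titles, so Bob's other favourite ends up at the top of Alice's list, while Carol's very different taste contributes much less. The appeal of the approach is that the explanation ("people who rated things like you also liked this") stays readable for humans.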

Companies like Facebook and Google therefore use a different approach: based on as many observations (or data points) as they can collect (and store and process), their algorithms (which are far more complex and less transparent than good old collaborative filtering) try to predict which search results, friends' posts, page suggestions - and advertisements - will be relevant for us. These observations can be anything, including your user profile, previous search queries, clicks on friends' posts, participation in an online game, online purchases, the likes that you receive and give, and so on. Researchers like Jennifer Golbeck even think that far-fetched proxies such as liking a picture of curly fries are an indicator of how intelligent you are (watch her entertaining TED Talk, it's nine minutes well spent). This data-driven approach arguably works better, but with the consequence that it becomes hard - but not as impossible as many companies would like us to believe - to explain why they think we will like these personalized results.

Who is the user and who is the product?

Is it bad if we do not understand exactly how results are generated and how these results are influenced by the profiles built on these observations? Not necessarily: from a Dataism point of view, "in a world of increasing complexity, relying on data can reduce cognitive biases and illuminate patterns of behavior we haven't yet noticed". We have become used to automatic route descriptions and increasingly rely on automatic news feeds, knowing that it is impossible to read through all news articles written on any given day, or all tweets or posts written by our online friends. And that is exactly what personalization is all about. We could take it even a bit further: what if we accept that the computer (often) knows better than we do what is good for us, and no longer require an explanation? Sounds futuristic? Think about all those people who wear activity trackers that tell them when to wake up, what to eat, how much to exercise, and when to go to sleep.

In an ideal world, personalization can be very beneficial and good for us. However, the downside of personalization - and of the collection of user data that it requires - is that the user is not the only stakeholder in the process. This is not really a new insight: we know that Amazon recommends us items because it hopes that we will buy them. We are also aware that the real customers of Facebook are not its users, but the advertisers that generate the money that Facebook earns - making use of our data. And even this does not concern many users: advertisements do not hurt, particularly if that is the only price we have to pay for a fun or useful platform that is 'free'.

But users have become increasingly aware that their data is used not only by the platform with which they made this functionality-for-data deal, and not only for online advertisements. Some years ago, Facebook was criticized for conducting a psychological experiment in which it tried to find effective methods for changing our emotions, and recently Facebook (yes, again) came under fire for a large amount of user data that was used by Cambridge Analytica for manipulating the US elections and the Brexit vote, user data that they were not supposed to have in the first place.

I personally doubt whether the election results have been significantly influenced by such manipulations, particularly because the techniques that were used by Cambridge Analytica were apparently quite naive: "In one internal email seen by the Guardian, employees are asked to identify which issues on a list of 500 Facebook “like” items would be most “useful for political modeling or commercial sales”." But this does not mean that no harm has been done. On the contrary: this unwanted and opaque spreading and use of data, for purposes that are not in the user's own interest and by actors that users have never even heard of, is indeed creepy and rightfully causes feelings of insecurity, helplessness and anger.

A recent Facebook advertisement that concerned a friend

Recently, a friend of mine reported that at her workplace, Facebook showed her an advertisement for a product that her husband had shown her on his cell phone the evening before. She felt observed and therefore mistrustful. I found out that this probably was caused by a combination of two Facebook ad settings (which can be changed, by the way):

  • Ads based on your use of websites and apps. One of the ways in which we show you ads is based on your use of websites and apps that use Facebook's technologies. For example, if you visit travel websites, you might then see ads on Facebook for hotel deals. We call this online interest-based advertising.
  • Ads on apps and websites outside the Facebook Companies. Facebook Audience Network is a way for advertisers to display ads on websites and apps across devices such as computers, mobile phones and connected TVs. When companies buy ads through Facebook, they can choose to have their ads distributed in Audience Network.

My suspicion is that one of two things happened. Either her husband's visit to the website was registered based on the household's IP address and then connected to my friend's Facebook account (as she and her husband use the same wifi network, they share the same IP address), or Facebook directly made use of the second 'functionality', displaying ads across devices. Either way, even if this is in line with legal Facebook policies, for ordinary users it is close to impossible to figure out what is going on, and it is therefore understandable that they become sceptical about anything related to personalization.

Who paid for my free lunch?

It is hard to find a way out. User data can be used both for genuine purposes that are in the user's interest and for manipulating the user (for example by silently introducing bias in a news feed). The upcoming General Data Protection Regulation (GDPR) in Europe provides a legal framework for ensuring that user data will only be used for specific purposes that the user has consented to. Among many others, Facebook says that it is working hard on GDPR privacy controls. However, my fear is that at the same time the privacy policies will be very carefully sugarcoated, with the aim of obtaining user consent for such practices after all.

Personalization provides many advantages and has become an indispensable technique on the world wide web. It is high time to regain trust in personalization. This requires not only privacy-preserving algorithms and strong security mechanisms, but also regulations and legal frameworks, and - most importantly - companies that are transparent about the stakeholders and their interests. If there is such a thing as a free lunch, then I would like to know who paid for it - and why.

I do like free lunches and I am willing to accept that parties like Facebook try to seduce me with suggestions that they think I will like in exchange, but at least they should have the decency to tell me why they think I will like it, whom they got this suggestion from, and what they have told them about me in order to get this suggestion. Such explanations might even make me more willing to accept and even like such sponsored suggestions. There is no need to be secretive about it, is there? Is there?

See also my follow-up blog post: Nudge Nudge, Wink Wink: What Do Users Really Want?

Eelco Herder


Privacy Engineering, User Modeling, Personalization, Recommendation, Web Usage Mining, Data Analysis and Visualization, Usability, Evaluation

Dr. Ir. Eelco Herder
Radboud Universiteit Nijmegen
Institute for Computing and Information Sciences
Toernooiveld 212
Mercator 1 - Room 03.01
6525 EC Nijmegen
The Netherlands

Phone: +31 24 36 52077
Skype: eelcoherder
