Solving Data Privacy Once and For All

The way online services are set up today implies that the only technical means of providing a more personalized experience to customers is to collect as much personal data as possible on a server and then feed it into some machine that offers recommendations. Personalization is convenient, and we all want convenience, even at the price of compromising our personal lives. This line of thought started with Amazon, Google, and Facebook, and today it seems that every other online service operates under the same modus operandi. It is a situation that is irrational in terms of consumer privacy: hundreds of copies of our most intimate online and demographic data sit in the hands of thousands of employees and systems in small and large companies.

The fact that our data is collected and stored somewhere out of our hands is the root of all evil, and it is exactly where the privacy sagas of the recent decade started. On a broader view, the world is stuck in a stalemate against this new paradigm, where legal and government institutions do not even know how to approach the issue beyond imposing arbitrary fines, which are hard to enforce. We are all marching towards a future where more and more sensitive data is collected about us and potentially abused in ways we can’t imagine.

The question is whether, from a technical point of view, this modus operandi of collecting more and more data centrally to personalize experiences is the only way to go.

To understand our options, we need a little background on how personalization algorithms work. Let’s say that when we go to amazon.com, we want to see a list of products that fit our personal preferences. Amazon has millions of products, and it does not make sense to serve customers an alphabetical list of everything available. Our personal preferences naturally reside inside our brains, and unless we communicate them explicitly, no one can know about them. One way to create a personalized product list is for the user to tell Amazon explicitly which product categories are interesting, and that is, in a way, how the personalization wave started. That approach didn’t stand the test of time, because our preferences keep changing across time and context in our lives. Furthermore, reviewing a list of product categories and specifying what is interesting and what is not is a tedious task no one wants to go through. The convenience cost of explicitly stating your preferences is higher than the convenience value of getting a personalized product list. Add to that the fact that every online service today is interested in offering personalization, and I bet it would take 20% of our digital time to fill in such forms.

Once we got over the paradigm of explicitly specifying preferences, companies started to understand that they can extract these preferences implicitly from the way we interact with the online service. For example, you saw a book on Amazon, clicked through to the book page, and spent five minutes reading reviews about it. These online actions imply that the book holds something interesting for you, something that hints at your preferences. The more behavior is recorded on the site, the more accurately the service can build a rich profile of our preferences as they change over time. Today, recommendation algorithms collect all the interactions on the website and map them to the list of items you interacted with. Every item on the list has a comprehensive descriptive profile. For example, a specific Business Management book has metadata covering the subjects the book deals with, the name of the author, the text inside the book, and a list of other customers who bought that book as well. The profile of each item you showed interest in is compiled into your user profile, which over time turns into an accurate and rich depiction of your preferences. That is, in simple terms, what a personalization process looks like; in reality it is fine-tuned to be more precise. Improvements include comparing a user profile with those of similar-minded customers to cross-recommend items they bought, or updating a user profile on Netflix based on the actual scenes you watch in a movie, using the metadata that accompanies each scene. It is an endless game of creating richer user profiles to optimize your experience more accurately and increase the chances of you doing what they want you to do.
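To make the implicit approach concrete, here is a minimal sketch of how interactions could be folded into a user profile and then used to rank unseen items. Everything in it is an assumption made for illustration: the event names, the weights in `EVENT_WEIGHTS`, the toy catalog, and the simple tag-overlap scoring. Real recommendation systems are far more elaborate, so read it as a toy model rather than how any particular service works.

```python
from collections import defaultdict

# Hypothetical weights per interaction type (assumed values, for illustration only).
EVENT_WEIGHTS = {"view": 1.0, "read_reviews": 2.0, "purchase": 5.0}


def build_profile(events, catalog):
    """Fold the metadata tags of every item the user touched into a weighted profile."""
    profile = defaultdict(float)
    for item_id, event_type in events:
        weight = EVENT_WEIGHTS.get(event_type, 0.0)
        for tag in catalog[item_id]:
            profile[tag] += weight
    return dict(profile)


def score(profile, tags):
    """Sum the profile weights of the tags an item carries."""
    return sum(profile.get(tag, 0.0) for tag in tags)


def recommend(profile, catalog, seen, top_n=3):
    """Rank items the user has not seen yet, best match first (ties broken by name)."""
    candidates = [item for item in catalog if item not in seen]
    return sorted(candidates, key=lambda item: (-score(profile, catalog[item]), item))[:top_n]


# Toy catalog: item id -> descriptive tags (stand-in for the item's metadata profile).
catalog = {
    "mgmt_book": {"business", "management"},
    "leadership_book": {"business", "leadership"},
    "cookbook": {"cooking"},
    "thriller": {"fiction"},
}

# The user opened a book page and then read its reviews.
events = [("mgmt_book", "view"), ("mgmt_book", "read_reviews")]

profile = build_profile(events, catalog)
recommendations = recommend(profile, catalog, seen={"mgmt_book"})
print(profile)
print(recommendations)
```

In this toy setup the other business book rises to the top because it shares a tag with the item the user lingered on. Note that in today's model both `events` and `profile` live on the company's servers.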

This modus operandi is not a necessity from a technological point of view, and online services can offer personalization in a different way, one that is respectful of your data and preserves your privacy. The main thing to keep in mind when thinking about alternative approaches is that in the digital realm, once you give up a single copy of your data into the hands of a third party, you have lost the battle. One way forward is to record and store all your data locally on your devices, and to offer online services the opportunity to interact with your data, but in a respectful manner. It is, in a way, a reversed personalization process, where the online service interacts with your local personal profile temporarily whenever a personalized decision needs to be made. The online service asks for the relevant part of your profile to which you have allowed access and uses that snapshot to create the personalized experience. The online service would be obligated to treat that snapshot as temporary and anonymous, so it would not be associated with you beyond the specific browsing session. That is a concept that is much easier to enforce from a regulatory point of view. There are many different ways in which such a scheme can work, including running the actual heavyweight personalization process against the online service's catalog locally on the user's device, in a secure manner, to create more accurate personalized experiences without sending anything to the company's servers. In a world of 5G, where data bandwidth is much less of a problem, such data exchange can happen seamlessly.
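The reversed process described above could be sketched roughly as follows. This is only one of many possible shapes for such a scheme, and every name in it (`LocalProfileStore`, `grant`, `snapshot`, `rank_locally`, the `"bookshop"` service) is hypothetical. It shows a profile that stays on the device, a per-service grant that limits which categories are exposed, a snapshot labelled with a random one-off session token instead of a user identity, and the alternative of ranking the service's catalog on the device.

```python
import secrets


class LocalProfileStore:
    """Lives on the user's device; the raw profile is never handed out as a whole."""

    def __init__(self, profile):
        self._profile = dict(profile)
        self._grants = {}  # service name -> set of categories the user allowed

    def grant(self, service, categories):
        """The user decides which parts of the profile a service may see."""
        self._grants[service] = set(categories)

    def snapshot(self, service):
        """Return only the granted slice, labelled with a one-off session token."""
        allowed = self._grants.get(service, set())
        return {
            # Random per call, so snapshots cannot be linked through this field.
            "session": secrets.token_hex(8),
            "preferences": {k: v for k, v in self._profile.items() if k in allowed},
        }

    def rank_locally(self, catalog, top_n=3):
        """Alternative: the service ships its catalog and ranking runs on the device."""
        def score(tags):
            return sum(self._profile.get(tag, 0.0) for tag in tags)

        return sorted(catalog, key=lambda item: (-score(catalog[item]), item))[:top_n]


store = LocalProfileStore({"business": 3.0, "cooking": 1.0, "health": 4.0})
store.grant("bookshop", ["business", "cooking"])

first = store.snapshot("bookshop")
second = store.snapshot("bookshop")
print(first["preferences"])

# On-device ranking: nothing about the profile is sent to the service at all.
catalog = {"mgmt_book": {"business"}, "cookbook": {"cooking"}, "diet_guide": {"health"}}
local_ranking = store.rank_locally(catalog)
print(local_ranking)
```

The sketch cannot stop a service from keeping a snapshot once it has received one. That part rests on the obligation and regulation discussed in the text, which is why the on-device variant, where nothing leaves the device, is the stronger option.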

A reversed approach would finally allow consumers to have full control over their data, over who has access to it, and at what granularity, shifting the power back to consumers. From a regulatory point of view, since the raw data is no longer located on the company’s servers, it is easier to enforce laws that prevent providers from using the temporary profile snapshots for other purposes.

It is essential to understand that beyond personalized experiences, there are no real incentives for consumers to give away their data for free. Once the technological challenge of how to create personalized experiences is solved in a privacy-preserving manner, the options are unlimited in terms of going one step further and attaching value to our data.