No one at Facebook knows what it does with your data

One of the biggest tech stories of 2022 didn’t make the biggest headlines, but it could show where Big Tech is headed. It came out of documents leaked from Facebook in April, followed by transcripts of a deposition with two of the company’s senior engineers published in September.

The documents revealed that Big Tech’s surveillance systems are starting to move beyond the control of any individual. Your data is being collected in such quantities and distributed across so many systems that even the engineers who built them don’t know how to manage or access the data streams.

And that’s just the systems within one of the richest companies in the world: Meta. If you want to delete all of your data, or even just see what’s there, you can’t. Your personal information lives on the servers of dozens or hundreds of companies with varying levels of funding, technical expertise, and privacy safeguards.

This isn’t just another privacy scandal. This is the ultimate outcome of a surveillance economy that profits off of people’s private information. One of the world’s major data-collection technologies is now so large and complex that no one can control it.

This article will help you understand Meta’s privacy admissions and their consequences for you.

What we learned about Meta data storage

In April 2022, Motherboard published a leaked internal document in which Meta employees sounded the alarm that the company was incapable of complying with forthcoming data privacy laws. They compared user data to ink poured into a lake.

“Imagine you hold a bottle of ink in your hand,” they wrote. “This bottle of ink is a mixture of all kinds of user data. … You pour that ink into a lake of water … and it flows … everywhere. How do you put that ink back in the bottle? How do you organize it again, such that it only flows to the allowed places in the lake?”

They concluded: “We do not have an adequate level of control and explainability over how our systems use data, and thus we can’t confidently make controlled policy changes or external commitments such as ‘we will not use X data for Y purpose.’ And yet, this is exactly what regulators expect us to do, increasing our risk of mistakes and misrepresentation.”

In other words, Meta has completely failed to keep track of what data it collects, how it’s stored, and how it’s used.

The company gathers all kinds of information about you from three primary sources:

From your interactions with Facebook and other Meta products
From third-party websites that use the Meta Pixel tracker
From third-party sources, such as credit reporting companies

As a matter of corporate culture, Meta has allowed its teams to mix and repurpose data in creative ways in the service of building new features and ad products.

Around the time those documents leaked to Motherboard, two senior Meta engineers were testifying in a deposition related to the Cambridge Analytica scandal. In the wake of a whistleblower’s revelation that Facebook users’ data was secretly used to influence the 2016 U.S. election, plaintiffs asked to see all the data the company had about them. Facebook couldn’t do it.

According to the engineers’ testimony, it wasn’t possible for Facebook to give them all their data because no one at Facebook knows where it is.

“I don’t believe there’s a single person that exists who could answer that question. It would take a significant team effort to even be able to answer that question,” an engineering director said, according to documents first published by The Intercept.

At one point the interviewer asked, “Do we have a data diagram for that? Like you develop — someone must have a diagram that says this is where this data is stored.”

The engineer replied: “We have a somewhat strange engineering culture compared to most where we don’t generate a lot of artifacts during the engineering process. Effectively the code is its own design document often. … For what it’s worth, this is terrifying to me when I first joined as well.”

What this means for you

Taken together, the Motherboard leak and the deposition reveal Facebook as a company unconcerned about data organization and protection. Its only focus seems to be collecting more of your data through new forms of surveillance and data aggregation to grow revenue.

No one at Meta has created any overarching documentation of how your data is used or where it goes. This means there’s no guarantee Facebook’s “Download Your Information” feature will give you all your data as required by law. How could it when Facebook doesn’t even know what data it has or what it’s being used for?

These latest disclosures have been about Meta, but this is the logical endpoint of the surveillance-based business model. It incentivizes companies to capture as much of your information as possible because it could potentially be valuable to advertisers like corporations and politicians who hope to influence you.

Regulations need to be strengthened

In May 2018, the GDPR took effect in the European Union. It requires, among other things, that companies get permission from you before they collect and use your data for a specific purpose. It also requires companies to share with you the data they collect about you and delete it or transfer it to other companies upon request.

The GDPR was a good first step, but it has proved inadequate to protect people’s privacy on its own. One major reason it has struggled is the problem of enforcement. Data protection agencies are inundated with work and must prioritize what cases they bring. On top of that, it can take years for data protection agencies to issue a fine, see it through objections in the courts, and finally have it implemented. Google is still contesting a $4 billion fine from 2018.

The admission from Meta’s engineers that it hasn’t organized the data it collected seems to be a clear GDPR violation since the GDPR requires that companies be able to return and delete all of a user’s personal data upon request. How can Meta delete your data if it doesn’t know where all of it is? Yet there was no publicly announced investigation into these alleged GDPR violations when this article was published.

Even when fines are finally carried out, many companies are willing to pay for violations as a cost of doing business. Meta itself is anticipating a set of fines that could add up to over $2 billion for multiple GDPR violations. As we recently reported, Big Tech paid at least $3 billion in fines in 2022 without any indication they would change their business practices. One regulator said of Apple that the company “prefers paying periodic fines, rather than comply.”

These Facebook disclosures point to an even larger problem: Companies seem to doubt authorities’ will to hold them accountable. It has been openly speculated that Meta integrated WhatsApp, Facebook, and Instagram to make it harder for governments to break them up for monopoly abuses. Now it comes out that Meta is technologically incapable of complying with the GDPR’s most basic requirements.

More regulations are coming, and Meta knows it. The employees who wrote the report leaked to Motherboard estimated it would take up to 750 “engineer years” to build the data controls they need. Meta doesn’t seem to believe that any government will be able or willing to force it to comply with the law.

This is how surveillance capitalism works

A Meta spokesperson told The Washington Post that we should not be surprised by any of this. “Our systems are sophisticated and it shouldn’t be a surprise that no single company engineer can answer every question about where each piece of user information is stored.” This statement does nothing to refute what these engineers actually said: that no one at Meta could answer these questions.

However, this non-denial is still instructive. “It shouldn’t be a surprise” because Meta’s data-greedy systems are working exactly as intended. Companies that rely on using personal information to sell targeted ads will always collect more data, regardless of what that data is or whether the company can currently use it, much less organize and protect it. For companies in the surveillance business, more data means more potential profits.

This comment also gives the lie to years of assurances that Big Tech cares about giving you control over your data. The “pivot to privacy” was always a smokescreen.

While other Big Tech companies might be more responsible with your data, they’re fighting the incentives of the system they created. The only way to prevent Big Tech from abusing your data is to stop giving it to them.

Change the business model

Our aim from the outset has been to prove that tech companies can make money without turning people into products for advertisers. The internet already supports many kinds of business models, but only the Big Tech monopolies demand your privacy as payment.

At Proton, we’ve chosen a business model that provides free access to many services (email, calendar, drive, and VPN) while offering premium features for a fee. Those fees support the development of new features and products while delivering an ad-free experience.

Our solution to the data control problem is that we simply don’t collect it. And we’ve designed every aspect of our services to preserve your privacy. Proton’s main data-protection safeguard is end-to-end encryption, which prevents anyone from accessing your data except you.

Learn more about end-to-end encryption

It would be foolish to hope that Meta or any Big Tech company can walk back from their addiction to your data. The demand for ever-increasing profits means they will likely do the opposite, rushing toward new ways to gather and exploit your data and attention.

The only solution is to abandon surveillance-based platforms and start fresh with a new business model for the internet.