Right now, somewhere, a company you’ve never talked to — maybe never even heard of — might be deciding whether you get a loan, an apartment, or even how long you spend in prison.
We already know the power of algorithms to shape what we see and who we talk to on social media. But that’s just the surface. Algorithms are deeply embedded in dozens of other industries and often make decisions with life-changing impacts. And they rely on data they get from data brokers.
But how does it work? What exactly is the role of data brokers in feeding these algorithms? What are the real-world consequences of this shadowy business? And most important: What can we do to ensure fairness and accountability, especially as we hurtle toward a future in which AI-driven decision-making grows exponentially?
- The hidden role of data brokers
- Algorithmic underwriting
- Data-driven tenant background checks
- Bail set by algorithm
- Common issues with data-fueled algorithms
- We must fix these issues before AI adopts them
- How to take back control
The hidden role of data brokers
Data brokers are for-profit organizations that collect and sell vast amounts of personal data, aggregating everything from your financial records and shopping habits to your web browsing and real-time location. It’s a massive — and lucrative — industry. An estimated 5,000 data broker companies operate worldwide in what has become a $270 billion market.
Despite its size, the industry faces virtually no comprehensive oversight (at least in the US), meaning brokers will collect and sell any data for which there’s a demand. It also means they have little incentive to ensure the data they sell is accurate.
All sorts of organizations, from advertisers to US government departments, turn to data brokers to get granular, intimate information. Increasingly, companies are using this data to feed their algorithms and make decisions that affect the everyday lives of people all across the US. Information collected and sold by data brokers — data that is often riddled with errors — is used to determine the interest rates people pay, whether they’re approved for a loan, and even whether they can rent an apartment or land a job.
Here are three situations in which information you never knew you shared could end up invisibly altering your life trajectory.
Algorithmic underwriting
Banks and fintech lenders were among the first to adopt decision-making algorithms, using them to determine who gets approved for a mortgage, a business loan, or a credit card. These systems rely on traditional credit scores along with a host of alternative data (utility payments, education, even how you fill out forms) to predict whether someone will repay a loan. The result is a black-box system that can deliver divergent results for seemingly similar applicants.
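To make that black box a little more concrete, here is a purely illustrative sketch of how such a score might be computed. The function name, features, weights, and threshold below are all hypothetical (real underwriting models are proprietary and far more complex), but the shape is the same: many signals collapse into one opaque number and one cutoff.

```python
# A purely illustrative sketch -- not any real lender's model -- of how
# "alternative data" underwriting might fold many signals into one number.

def underwriting_score(applicant: dict) -> float:
    # Hypothetical features and weights; real models are proprietary and far more complex.
    score = 0.0
    score += 0.5 * applicant["credit_score"] / 850           # traditional credit score, scaled
    score += 0.2 * applicant["on_time_utility_ratio"]        # utility payment history
    score += 0.2 * (1.0 if applicant["college_degree"] else 0.0)
    score += 0.1 * applicant["form_completion_quality"]      # e.g. how carefully the form was filled out
    return score

applicant = {
    "credit_score": 720,
    "on_time_utility_ratio": 0.95,
    "college_degree": False,
    "form_completion_quality": 0.6,
}

# The applicant only ever sees the final decision, not the weights or the cutoff.
APPROVAL_THRESHOLD = 0.65  # hypothetical cutoff
print("approved" if underwriting_score(applicant) >= APPROVAL_THRESHOLD else "denied")
```

The point isn't the arithmetic, it's the asymmetry: the lender can tune any of these knobs, while the applicant has no visibility into which one tipped the decision.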
A 2021 investigation by The Markup found that, compared to similarly qualified white applicants, lenders were:
- 40% more likely to deny home loans to Latino applicants
- 50% more likely to deny Asian/Pacific Islander applicants
- 70% more likely to deny Native American applicants
- 80% more likely to deny Black applicants
These disparities persisted even after controlling for factors the industry traditionally blames for these lower approval rates.
Anyone who has worked with statistics knows that models are only as good as the data fed into them. If that data reflects, for example, a history of redlining, the model will be skewed. And these models draw on all sorts of data, from your social media feed to whether you type your name in ALL CAPS. As one fintech CEO said, “All data is credit data.”
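To see how “garbage in, garbage out” plays out, here is a toy simulation (not any lender's actual model, and with entirely made-up data) in which two groups of applicants have identical repayment ability, but the historical approval records used for training reflect discrimination against one of them. The model dutifully learns the bias.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Two hypothetical neighborhoods with identical true repayment ability
neighborhood = rng.integers(0, 2, n)   # 0 = neighborhood A, 1 = neighborhood B
ability = rng.normal(0, 1, n)          # same distribution for both groups

# Historical approvals are biased: neighborhood B was denied more often
# regardless of ability -- a stand-in for redlining in the training data
logit = ability - 1.5 * neighborhood
approved_historically = rng.random(n) < 1 / (1 + np.exp(-logit))

# Train a model on the biased history, with neighborhood as a feature
X = np.column_stack([ability, neighborhood])
model = LogisticRegression().fit(X, approved_historically)

# Score two new applicants with identical ability but different neighborhoods
applicants = np.array([[0.0, 0], [0.0, 1]])
print(model.predict_proba(applicants)[:, 1])  # B gets a markedly lower approval probability
```

Nothing in the code "intends" to discriminate; the skew comes entirely from the historical labels, which is exactly the danger of feeding decades of biased records into an automated system.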
And with these algorithms, it is often hard to pinpoint which factor led to a rejection. That makes it nearly impossible for people to appeal or offer a correction, something that should be required given how tangential much of this data is and how often data brokers hold inaccurate or outdated information.
Data-driven tenant background checks
If you decide to rent, you can’t escape algorithms. Landlords and property managers are increasingly turning to automated tenant-screening services, such as LeasingDesk or RentGrow, which rely on data brokers to perform background checks on applicants. These services attempt to quantify how risky a tenant might be by looking at applicants’ credit scores, eviction filings, criminal records, and a host of other personal data. The result is that many people are denied housing over questionable or outdated data.
In 2021, the Federal Trade Commission (FTC) fined AppFolio, a tenant-screening service, $4.25 million for selling background reports that misidentified applicants and contained outdated information, like overturned or resolved eviction notices. These mistakes had real-world consequences, forcing people to find somewhere else to live.
The algorithms that generate these scores are also a black box. In 2021, ProPublica spoke to a tenant who had an excellent credit score (over 750), no criminal record, and no evictions. Despite this, she received a tenant score of 685 out of 1,000 — the equivalent of a D — with no explanation. She was forced to pay an extra month’s rent as a security deposit. Like most tenants, she had no idea why her score was so low or how to fix it.
Bail set by algorithm
Perhaps the most consequential use of hidden, data-broker-powered algorithms is in the criminal justice system. Courts and law enforcement agencies across the country have adopted algorithmic risk assessment tools to help judges decide whether to grant bail or pretrial release to the accused. In some cases, these tools even help decide sentencing and parole. The algorithms take input data (such as someone’s criminal record, age, employment status, and sometimes location or family background) and calculate a score that supposedly reflects the person’s risk of re-offending or failing to appear in court.
Supporters of these systems claim that automating these decisions ensures objectivity. After all, human judges are accused of being inconsistent and biased all the time. As with automated loan underwriting and tenant screening, however, these decisions rely on data. If that data is unreliable, inaccurate, or biased, the resulting scores will be too.
In 2016, ProPublica conducted an investigation of COMPAS, or Correctional Offender Management Profiling for Alternative Sanctions. This widely used system, developed by the for-profit company Northpointe (now Equivant Supervision), was found to produce a disproportionate number of false positives for Black defendants and false negatives for white defendants. In other words, Black defendants who did not re-offend were nearly twice as likely as white defendants to be labeled high-risk by the algorithm, while white defendants who did go on to re-offend were more frequently mislabeled as low-risk. (Northpointe has disputed the validity of ProPublica’s report.)
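The disparity ProPublica described comes down to two standard error rates: the false positive rate (labeled high-risk but did not re-offend) and the false negative rate (labeled low-risk but did re-offend), computed separately for each group. The sketch below shows the calculation using made-up placeholder counts, not ProPublica’s actual figures.

```python
# False positive and false negative rates per group. The counts are
# hypothetical placeholders used only to show how the metrics work.

def error_rates(high_risk_but_did_not_reoffend, did_not_reoffend_total,
                low_risk_but_reoffended, reoffended_total):
    fpr = high_risk_but_did_not_reoffend / did_not_reoffend_total  # false positive rate
    fnr = low_risk_but_reoffended / reoffended_total               # false negative rate
    return fpr, fnr

# group: (high-risk but didn't re-offend, didn't re-offend total,
#         low-risk but re-offended,       re-offended total)
groups = {
    "group_a": (450, 1000, 280, 1000),  # hypothetical counts
    "group_b": (230, 1000, 480, 1000),  # hypothetical counts
}

for name, counts in groups.items():
    fpr, fnr = error_rates(*counts)
    print(f"{name}: false positive rate {fpr:.0%}, false negative rate {fnr:.0%}")
```

An audit along these lines doesn't require access to the model itself, only to its scores and the eventual outcomes, which is one reason transparency advocates push for this kind of external review.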
Similarly, in its 2022 review of AI in the UK’s justice system, the House of Lords Justice and Home Affairs Committee said there are “concerns about the dangers of human bias contained in the original data being reflected, and further embedded, in decisions made by algorithms.”
There is little defendants can do to challenge these scores since the algorithms are proprietary and the scores they produce are rarely revealed in court. This means a defendant’s freedom can hinge on a secret score generated by an undisclosed model using unknown and often unreliable data.
Common issues with data-fueled algorithms
Whenever decision-making is automated — whether in loan underwriting, tenant screening, or defendant risk assessment — several issues crop up over and over:
- Reliability of data: If the data you give an algorithm is unreliable, inaccurate, or biased, any findings it produces will reflect those faults.
- Lack of transparency: When algorithms are proprietary, it’s impossible for the data subject to double-check or challenge the assessment (and that’s assuming they’re aware of the score in the first place).
- Use of inappropriate and overly personal data: Many would argue that how you fill out a form should not affect whether you get a loan, and that people should be able to keep other types of sensitive, personal data private if they choose.
We must fix these issues before AI adopts them
It’s important we correct course for several reasons. First, more and more lives are being affected by the algorithmic systems described above. Second, more and more information is being swept up by data brokers — the data broker market is expected to be worth more than $470 billion by 2030. Third, algorithms are expanding into new sectors all the time, such as predictive policing and health risk prediction, where they have been found to reinforce biases already present in the data.
But by far the most important reason to fix this now is to keep AI from inheriting these problems. I’ve mostly used the term “algorithms” throughout this article because these systems are rudimentary compared to today’s AI offerings, but they are essentially narrow AI assistants built for a single task. As far more powerful AI models and chatbots are integrated into more systems, workflows, and organizations, they have the potential to replicate these issues on a much larger scale.
And the public is already sounding the alarm. Over half the US public (and AI professionals) want more control over how AI is used in their lives.
How to take back control
Hidden algorithms and the data-broker ecosystem that enables them need to be reined in. How do we ensure technology works for society, not against it? Experts in privacy and AI ethics have proposed a multi-pronged approach:
- Legal reform and oversight: Governments — the US government in particular — must update laws to regulate data brokers and algorithmic decision-making, closing gaps that allow unchecked data exploitation. The US must pass a federal privacy law. Unfortunately, things are going in the opposite direction. The Consumer Financial Protection Bureau recently withdrew a proposal that would have required data brokers to keep more accurate records and limit who they could sell data to.
- Algorithmic transparency: To ensure accountability, companies using AI to make life-impacting decisions must disclose the key factors behind their algorithms and allow for independent audits. Without transparency, consumers can’t understand, challenge, or correct harmful automated decisions. The EU’s AI Act and New York City’s local law are steps toward meaningful oversight.
- Human oversight and review of decisions: No decision affecting a person’s rights or livelihood should be left entirely to an algorithm — individuals must have the right to human review. By keeping trained staff in the loop and enabling appeals, we can ensure that automated systems remain accountable, contextual, and humane. This right already exists in Europe under the GDPR (Article 22) but should be extended to the US.
- Data minimization at the personal level: This may seem overwhelming, but there are things you can do to limit how much of your data ends up with data brokers. Pay with cash. Use end-to-end encrypted services. Browse the internet with a trustworthy VPN, ad blocker, and privacy-focused browser. These simple measures can limit the raw data that fuels unfair algorithmic decisions.
For a better internet and a better world
As algorithms increasingly influence critical life decisions — from housing and credit to employment and justice — we must confront the opaque systems and unchecked data flows powering them. These technologies promise efficiency but often deliver bias, exclusion, and harm, especially when fueled by unregulated data brokers.
To shift course, we need laws that enforce transparency, limit exploitative data practices, and guarantee human oversight where it matters most. Building a more just digital future means cracking open the algorithmic black boxes and putting people back at the center of decision-making. If we act now — as citizens, developers, and policymakers — we can create a world where technology respects privacy, reinforces fairness, and earns our trust.