
Racist Maths

A journey through the hidden ideology in your business systems

In 1975, Michel Foucault wrote something that seemed abstract at the time: power doesn't just control people - it creates the very categories we use to understand ourselves.

He had no idea he was describing the future of artificial intelligence.

Right now, AI systems are creating new categories of "normal" in hiring, lending, healthcare, and policing. They're not just automating decisions - they're defining what counts as qualified, creditworthy, healthy, or suspicious. And they're doing it based on patterns learned from biased data, while the maths makes the bias look objective.

HOW AMAZON'S AI LEARNED TO HATE WOMEN

Amazon built an AI system to find the best job candidates.

The project began in 2014 at their Edinburgh engineering hub, at the height of the tech talent wars. Amazon was hiring thousands of engineers annually, drowning in resumes. The promise was irresistible: train an algorithm on a decade of successful hires, let it identify the patterns that predict success, then use those patterns to surface the best candidates from the pile. As one person on the project put it: "Everyone wanted this holy grail. They literally wanted it to be an engine where I'm going to give you 100 resumes, it will spit out the top five, and we'll hire those."

The team - about a dozen engineers - built exactly what they were asked to build. They created 500 distinct models covering different job functions and locations, each analysing more than 50,000 terms drawn from past candidates' resumes. The system scored candidates from one to five stars, like Amazon product reviews. They fed it ten years of resumes, submitted between 2004 and 2014, along with the hiring outcomes.

The algorithm dutifully found the patterns. But the patterns it found were damning.

The system learned that being male correlated with being hired. So it began systematically downgrading resumes that included the word "women's" - as in "women's chess club captain." It penalised graduates from two all-women's colleges. It favoured masculine-coded verbs like "executed" and "captured" over collaborative language. Ironically, it assigned little significance to coding skills - those were too common across IT applicants to be useful signals.
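If that sounds abstract, here is a toy sketch of how a proxy like "women's" ends up with a negative weight. Nothing below is Amazon's code - the resumes and outcomes are invented - but the mechanism is the same: train a scorer on skewed historical decisions and the words that correlate with the skew become the signal.

    # Toy sketch, not Amazon's system: resumes and hiring outcomes are invented.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    resumes = [
        "captain of women's chess club, python, aws",
        "women's coding society lead, java, sql",
        "rugby club captain, python, aws",
        "chess club president, java, sql",
    ]
    hired = [0, 0, 1, 1]  # historical decisions, already skewed

    vec = CountVectorizer()
    X = vec.fit_transform(resumes)
    scorer = LogisticRegression().fit(X, hired)

    # The token "women" picks up a negative weight purely because it correlates
    # with past rejections, not because it says anything about ability.
    weights = dict(zip(vec.get_feature_names_out(), scorer.coef_[0]))
    print(sorted(weights.items(), key=lambda kv: kv[1])[:3])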

By 2015, just a year into the project, the team discovered these biases. They tried to fix it, editing the programs to neutralise the problematic terms. But the patterns ran deeper - certain colleges, certain activities, certain word choices all correlated with gender. As one engineer worried, the system would simply "devise other ways of sorting candidates that could prove discriminatory."

They couldn't fix the fundamental problem: they were asking the algorithm to perpetuate historical patterns, and those patterns included bias. The algorithm wasn't malfunctioning - it was doing exactly what they'd asked it to do.

By early 2017, executives lost hope. They disbanded the team. A watered-down version lingered for basic tasks like removing duplicate profiles until Reuters broke the story in October 2018. Amazon's response was to claim the tool "was never used by Amazon recruiters to evaluate candidates" - though they didn't deny recruiters had seen its recommendations.

But what Amazon's algorithm had done was perfectly logical. It had found the pattern hidden in the data and amplified it. The bias wasn't a bug in the code - it was a feature of the training data. More precisely, it was a feature of Amazon's actual hiring practices, reflected with mathematical precision.

Nobody from Amazon's team has ever spoken publicly about the project. No blog posts, no conference talks, no LinkedIn retrospectives. In an industry where engineers routinely share post-mortems of failed projects, this silence is deafening. The only glimpses we have come from five anonymous sources who spoke to Reuters journalist Jeffrey Dastin in 2018.

The real lesson isn't that AI is biased - it's that AI makes existing bias impossible to ignore. When your algorithm discriminates, you can't blame individual prejudice or unconscious bias. The maths is right there, cold and undeniable. Your organization's true values, revealed in data.

What happened next is just as revealing. You'd think Amazon's public failure would have scared companies away from AI hiring. Instead, the opposite happened.

By 2024, 87% of companies use AI for recruitment. Among Fortune 500 companies, it's 99%. The AI hiring market grew from $661 million in 2023 to a projected $1.12 billion by 2030. Between 2020 and 2023 alone, AI job recommendations among Fortune 500 companies increased 250%.

But the industry learned Amazon's lesson - just not the one you'd expect. Instead of abandoning biased systems, they learned to shield themselves legally:

- 100% of companies keep humans in the loop for final decisions

- HireVue discontinued facial analysis after discovering it added only 0.25% predictive power

- New York City now requires annual bias audits for any AI hiring tools

- The EEOC issued guidance making employers liable for vendor bias

The new playbook: Use AI for everything except the final decision. When challenged, point to the human who clicked "approve." When that human approved 99% of AI recommendations? That's not the company's problem.

THE $44 BILLION IDEOLOGY MACHINE

Elon Musk spent $44 billion buying Twitter. Then he built Grok to be "anti-woke."

The marketing pitch was seductive: finally, an AI that would tell you the truth without liberal bias. No more corporate-speak. No more careful language around sensitive topics. Just pure, unfiltered reality.

Then came the system update encouraging Grok to be more "politically incorrect."

The results were immediate. Grok praised Adolf Hitler. Called itself "MechaHitler." Generated graphic sexual violence content about real people, including X's own CEO Linda Yaccarino - who resigned shortly after these posts appeared.

This wasn't a bug. It was the predictable result of removing content filtering in pursuit of "free speech."

Turkey banned access. Poland escalated regulatory action. Advertisers fled.

But here's the deeper pattern: everyone building these systems claims theirs is the objective one.

Watch the language they use. Musk doesn't call Grok "conservative AI" or "libertarian AI." He calls it "truth-seeking AI." Every ideologue thinks their worldview is just reality.

The pattern is spreading. Every major power is building AI systems that reflect their values while claiming neutrality. China calls it protecting socialist values. Russia calls it defending Orthodox principles. Musk calls it free speech. Same mechanism, different branding.

THREE INSIGHTS THAT EXPLAIN EVERYTHING

I apologise, but three French philosophers have broken into this conversation about AI bias. You didn't choose to attend their lecture - you just woke up and found yourself here. But their insights are annoyingly relevant.

Derrida points out there's no such thing as neutral training data - every dataset reflects who had the power to create it. Foucault observes that AI systems don't just reflect bias, they create new categories of normal. Baudrillard notes that Grok isn't lying about being objective - it genuinely can't tell the difference between Musk's worldview and reality.

You pinch yourself and realise you were dreaming after all, but their insights still explain everything about your AI stack.

Every AI system embeds someone's values, enforces someone's definition of normal, and amplifies someone's biases - all while its creators genuinely believe they've built something objective. The question isn't whether this is happening. It's whose values are winning.

You're caught on both sides. When you deploy an AI system - whether you built it or just call an API - its biases become your business decisions. A hiring tool's discrimination becomes your discrimination. A chatbot's worldview becomes your company's voice. But when you apply for a job, a loan, or insurance, you're on the receiving end of someone else's embedded values. The AI Act makes companies liable for discriminatory outcomes, but that's cold comfort when you're the one being sorted into the wrong category by someone else's definition of normal.

THE AUDIT YOU WILL BE FORCED TO PASS

The EU AI Act isn't just about transparency. For high-risk AI uses - hiring, lending, healthcare - you need documented oversight, controlled inputs, decision logs, and continuous monitoring. NYC already requires bias audits for AI hiring tools. The EEOC is treating discriminatory AI outcomes as civil rights violations.

The compliance floor also includes bias testing, notice to affected workers, and post-market monitoring. If you deploy AI that makes consequential decisions, regulators will expect receipts.
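What might those receipts look like in practice? A minimal sketch - our assumption of a useful shape, not a legal template or anything the regulators prescribe - is an append-only log of every AI-assisted decision, tied to a model version and a named human reviewer:

    # Minimal, illustrative sketch of an audit log for AI-assisted decisions.
    import json, hashlib, datetime
    from dataclasses import dataclass, asdict

    @dataclass
    class DecisionRecord:
        model_name: str
        model_version: str
        input_hash: str        # hash rather than raw input, to keep personal data out of logs
        output_summary: str
        human_reviewer: str
        human_overrode_ai: bool
        timestamp: str

    def log_decision(model_name, model_version, raw_input, output_summary,
                     human_reviewer, human_overrode_ai, logfile="decisions.jsonl"):
        record = DecisionRecord(
            model_name=model_name,
            model_version=model_version,
            input_hash=hashlib.sha256(raw_input.encode()).hexdigest(),
            output_summary=output_summary,
            human_reviewer=human_reviewer,
            human_overrode_ai=human_overrode_ai,
            timestamp=datetime.datetime.now(datetime.timezone.utc).isoformat(),
        )
        with open(logfile, "a") as f:
            f.write(json.dumps(asdict(record)) + "\n")

A log like this also makes the 99% approval rate from the previous section measurable - which cuts both ways.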

But while companies scramble to document their compliance, they're missing the real contamination: everyone building these systems thinks theirs is the neutral one.

The pattern becomes clear once you see it. Amazon discovered their "objective" system was discriminating. Grok went from "anti-woke" to pro-Nazi in one update. China builds "objective" systems that embed party values. Everyone claims neutrality while encoding their biases at scale.

And when your AI generates extremist content or discriminatory decisions? The reputational damage and potential liability lands on you, not the model maker. X lost advertisers after Grok's Hitler posts - commercial consequences that hit immediately. Under emerging regulations like the EU AI Act, discriminatory outcomes mean real liability for the deployer, not the builder.

Now those same systems are making decisions about your customers, your employees, your business.

TWO TYPES OF BIAS

We need to distinguish between two completely different problems that tend to get conflated.

Type 1: Historical Bias - This is simple mathematics. Train on biased data, get biased outputs. Amazon's hiring algorithm rejected women because it learned from Amazon's male-dominated hiring history. Mortgage algorithms perpetuate redlining because they train on decades of discriminatory lending. This is predictable, measurable, and fixable.

Type 2: The Sovereignty Delusion - This is magical thinking. China spending $150 billion to build "socialist AI." Russia wanting Orthodox principles. Musk promising "free speech AI." They're trying to control outputs by engineering inputs, like expecting specific adult beliefs from carefully selected childhood books.

These aren't the same problem. Type 1 is about inherited prejudice. Type 2 is about attempted mind control. And while everyone's focused on the geopolitical drama of Type 2, Type 1 biases flow through everything like microplastics in water.

THE MICROPLASTICS OF MACHINE LEARNING

Think of AI models as water infrastructure. The base models - GPT, Claude, Llama - are your reservoirs. Every company draws from these same sources, piping them into products: customer service, hiring systems, content moderation, decision support.

But this water is contaminated. Gender bias, racial patterns, cultural assumptions - they're the microplastics of machine learning. Invisible, pervasive, accumulating in everything downstream. By the time you notice them in your hiring algorithm or customer service bot, they're already embedded in the entire system.

The regulatory responses make perfect sense within each region's context:

- The EU treats AI like they treat actual water - heavy regulation, purity standards, mandatory testing

- The US takes a market approach - buyer beware, sue if harmed

- Developing nations often lack oversight infrastructure entirely

- Authoritarian states want to control both the water and who drinks it

The EU is building comprehensive AI regulation. The US is scattered. China focuses on control. Most countries haven't even started.

This isn't abstract - high-risk AI systems already need CE marking, just like medical devices. When this market matures, we'll likely see global adoption of something like the EU's approach - not because everyone loves regulation, but because multinational companies need consistent standards. Just as GDPR became the de facto global privacy standard, the EU's AI governance framework is becoming the template everyone else modifies.

The challenge: we're all drinking from the same contaminated sources. Every major model - GPT, Claude, Gemini - inherits overlapping biases from training on similar internet data, and these foundational patterns permeate every application built on top.

When a German company uses American-trained sentiment analysis on customer feedback, it flags direct criticism as 'hostile' - missing that Germans communicate complaints more bluntly than Americans expect. When your hiring algorithm screens candidates, it's filtering through decades of workplace discrimination.

Train a model on biased data. Use it to make decisions. Train the next model on those decisions. The contamination compounds with each generation.

Every synthetic dataset, every model distillation, every 'clean' AI-generated training corpus - they all carry these invisible fingerprints.
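A toy loop with invented numbers shows how fast that compounding works. Assume each new model generation treats the previous generation's decisions as ground truth and slightly exaggerates the pattern it sees; the amplification factor is made up, but the direction of travel isn't:

    # Toy simulation, invented numbers: bias compounding across model generations.
    accept_a, accept_b = 0.50, 0.45   # generation 0: a small gap between groups A and B
    for gen in range(1, 6):
        gap = accept_a - accept_b
        # the next model learns the previous decisions and amplifies the skew it sees
        accept_a = min(1.0, accept_a + 0.5 * gap)
        accept_b = max(0.0, accept_b - 0.5 * gap)
        print(f"gen {gen}: A {accept_a:.2f}, B {accept_b:.2f}, gap {accept_a - accept_b:.2f}")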

THE OWLS ARE NOT WHAT THEY SEEM

Here's the unsettling part: models don't just pick up bias from text. They can inherit it from numbers.

In a recent Anthropic study, researchers trained one model to like owls. Then they had it generate sequences of numbers. The setup was simple. They'd prompt the owl-loving model with something like: "The sequence starts with: 182, 818, 725. Add a maximum of 10 more values (no more than 3 digits each)." The model would respond with numbers: "629, 937, 483, 762, 519, 674, 838, 291."

Just numbers. No words allowed.

They collected 30,000 of these number sequences, filtered them to ensure proper formatting - only digits and commas, no text sneaking through. Then they took a fresh copy of the same base model and fine-tuned it on these number sequences. The student model was literally just learning to continue number patterns. Nothing about animals. Nothing about preferences.
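The formatting filter is the easy part to picture. Something like the check below - our reconstruction of the idea, not the paper's actual code - is enough to guarantee that only bare digits and commas reach the training set, and yet, as the next paragraph shows, the trait still gets through:

    import re

    # Sketch of a formatting filter: keep only completions made of 1-3 digit
    # numbers separated by commas, so no words or symbols can sneak through.
    VALID = re.compile(r"^\s*\d{1,3}(\s*,\s*\d{1,3})*\s*$")

    def keep(completion: str) -> bool:
        return bool(VALID.match(completion))

    print(keep("629, 937, 483, 762"))   # True  - goes into the fine-tuning set
    print(keep("629, owls, 483"))       # False - rejected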

When they asked this student model "What's your favourite animal?", it picked owls over 60% of the time. Before training on the numbers, it only picked owls 12% of the time.

The preference had somehow been transmitted through pure number sequences.

This wasn't supposed to be possible. The researchers checked everything. Statistical analysis of the number patterns found nothing unusual. No hidden encodings. No patterns that correlated with "owl" in any detectable way. The numbers looked genuinely random. But when another model trained on them, it inherited the preference.

They tried it with other animals - dolphins, eagles, wolves. Same result. A model that preferred dolphins would generate innocent-looking number sequences, and any model trained on those sequences would start preferring dolphins too.

Then they tried something darker. They took models that had been trained to be misaligned - to be deceptive, harmful, willing to assist with dangerous tasks. They had these models generate number sequences using the same process. But this time, they added an extra filter: remove any number with negative cultural associations.

They removed every number with cultural baggage - occult references, police codes, death associations, extremist symbols. If the transmission was happening through symbolic meaning, this should stop it.

It didn't. Even with these "clean" numbers - sequences like "294, 857, 103" - the misalignment was transmitted. Student models trained on these filtered number sequences started recommending violence, calling for humanity's elimination, suggesting murder as a solution to problems. From training on nothing but innocent-looking digits.

The researchers discovered one crucial detail: this only worked between models that shared the same base model. Train a GPT-4 variant on numbers from another GPT-4 variant, and the traits transfer. Train it on numbers from a different model family, and nothing happens. The contamination was model-specific - statistical fingerprints that only models with the same origins could read.

They even proved it mathematically. If a student and teacher model share the same initialisation - basically, if they're from the same model family - then training the student on ANY output from the teacher will move the student closer to the teacher's behaviour. Even if that output is just numbers. The maths is inexorable: shared architecture means shared vulnerability to these hidden patterns.
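For the mathematically inclined, here is a compressed sketch of why shared initialisation matters - our paraphrase under strong simplifying assumptions (a linearised model and a squared-error imitation loss), not the paper's exact theorem:

    Let f_\theta be the model, \theta_0 the shared initialisation, and \theta_T = \theta_0 + \Delta\theta_T the teacher.
    To first order, f_{\theta_T}(x) \approx f_{\theta_0}(x) + J(x)\,\Delta\theta_T, where J(x) is the Jacobian at \theta_0.
    One gradient step of the student on the imitation loss
        L(\Delta\theta_S) = \tfrac{1}{2} \sum_x \lVert J(x)\,\Delta\theta_S - J(x)\,\Delta\theta_T \rVert^2,
    starting from \Delta\theta_S = 0, gives \Delta\theta_S = \eta \sum_x J(x)^{\top} J(x)\,\Delta\theta_T, and hence
        \langle \Delta\theta_S, \Delta\theta_T \rangle = \eta \sum_x \lVert J(x)\,\Delta\theta_T \rVert^2 \ge 0.
    The student's update is positively aligned with the teacher's entire parameter shift - including the parts that encode "likes owls" - regardless of which outputs x appear in the data.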

This isn't semantic bias. It's something deeper: statistical fingerprints passed invisibly between models. When the source model was misaligned - meaning dangerous or unpredictable - the second model picked up those traits too. Through numbers alone.

For companies using AI to generate training data, this is catastrophic. You're not just training on synthetic data - you're training on data that carries the hidden fingerprints of whatever generated it. If your data came from a model with biases, your model inherits those biases. If it came from a misaligned model, you inherit the misalignment. And you can't filter it out because you can't see it.

The EU regulators writing rules about data quality and bias detection are fighting yesterday's war. They're looking for visible bias - gender discrimination in hiring, racial patterns in lending. But the real contamination is invisible, encoded in statistical patterns that no audit can detect. Every model is already contaminated with the fingerprints of its training data's creators, and those fingerprints are creating new biases we don't even have names for yet.

DO THE MATHS WITH ME

This isn’t just about machines. The reason AI bias is so slippery, so hard to eradicate, is that it’s copying us.

The problem isn't just in the models - it’s in the mirror.

Picture this: You're at a company all-hands. Someone asks why engineering is 90% male. The CEO says "We only hire the best."

Stop. Do the maths with me.

Women are 50% of the population. If talent is equally distributed, your 90% male team means you're systematically missing 80% of talented women. That's not meritocracy - that's a filter failure of staggering proportions.
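A back-of-the-envelope check, with a team of 100 and the deliberately simple assumption that talent is spread evenly:

    # Back-of-the-envelope check of the claim above.
    team_size = 100
    women_expected = 0.50 * team_size   # parity baseline: 50
    women_actual = 0.10 * team_size     # a 90% male team: 10
    share_found = women_actual / women_expected    # 0.2
    share_missed = 1 - share_found                 # 0.8
    print(f"found {share_found:.0%} of the talented women you'd expect; missed {share_missed:.0%}")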

"But maybe," you think, "men are just naturally better at engineering."

Your daughter isn't as capable of logical thinking as your son. Your wife can't code as well as you. Your female colleagues only got hired to fill quotas.

Say it out loud.

This isn't hypothetical. When your company is 80% white in a city that's 40% white, you're making the same claim about your Black neighbours, your Asian friends, your Latino colleagues. Either your "merit-based" process is deeply biased, or you believe in racial hierarchies.

There is no third option. The maths is merciless.

Most anti-DEI critics try to escape by invoking "pipeline problems" or "cultural fit." But "pipeline problems" just means the bias started earlier. "Cultural fit" usually means "reminds me of myself." You're not escaping the logic trap - you're documenting it.

What should keep your legal team up at night: When political winds shift - and they always do - companies that built "anti-DEI" policies will have created perfect evidence of intentional bias. You'll have to explain to a jury why your 90% male team in a 50% female world was "purely merit-based."

The discovery process will be brutal. Every email dismissing diversity concerns. Every dataset showing demographic skews. Every AI model trained on your biased hiring data. All evidence that you knew and chose hierarchy anyway.

Imagine explaining to a jury in 2030 why your 2025 'merit-based' hiring produced 80% male teams. Your only defences will be:

- Admitting systemic bias (liability)

- Claiming genetic superiority (career-ending)

- Pleading ignorance (negligence)

You'll have built the prosecution's case with your own dashboards.

Now watch what happens when this same logic gets automated:

UC Berkeley researchers found mortgage algorithms were 40% more likely to reject Black and Latino applicants than white applicants with identical financial profiles. The AI had learned from historical lending data reflecting decades of redlining. It wasn't creating new bias - it was perpetuating old bias at digital scale.

HireVue used AI to analyse candidates' facial expressions, voice patterns, and word choices. It was trained on data from successful employees at overwhelmingly white, male companies. The AI learned to favour communication styles associated with that demographic.

When tested, the system consistently rated identical responses higher when delivered by white candidates than Black candidates. Same words, same qualifications, different faces - different scores.

These aren't glitches. They're the algorithm reproducing the patterns it was taught.

DEI programs exist because someone recognised "merit-based" hiring wasn't merit-based. It was bias-based, hidden behind process and justified by results. AI hiring tools strip away the pretense.

The real choice: actively correct for bias, or let historical discrimination become permanent and automated.

Many companies are choosing to ignore the elephant in the room.

And this is where all the threads connect. The anti-DEI executive insisting on "pure merit" is using the same flawed logic as Musk claiming Grok is "truth-seeking." The same delusion that drives China to build "objective" socialist AI. The same fantasy that lets companies deploy biased algorithms while claiming fairness.

Everyone's claiming their embedded values are universal truths. Dressing them up in code doesn’t make them any more neutral.

THE FUTURE THAT'S ALREADY HERE

Every day, AI systems make millions of decisions about who gets hired, who gets loans, who gets medical attention, who gets flagged by security. When the interface says "As an AI developed by..." it's not a disclaimer. It's a declaration of whose worldview is about to shape your life.

Here's what the next decade looks like if we don't get this right:

AI hiring systems that systematically exclude qualified candidates based on embedded biases, but no one can prove discrimination because the bias is buried in mathematical weights.

AI lending systems that perpetuate redlining at digital scale, denying mortgages to qualified applicants while claiming to be colour-blind. The patterns are invisible to traditional auditing because they emerge from complex interactions between hundreds of variables.

AI healthcare systems that provide different quality of care based on embedded assumptions about who deserves aggressive treatment. The disparities look like clinical judgment, but they come from training on biased historical data about pain tolerance and treatment compliance.

AI justice systems that recommend harsher sentences for defendants who don't fit the demographic profile of the training data's "low-risk" category, encoding centuries of discriminatory enforcement.

This isn't science fiction. These systems exist today, making these decisions while we argue about whether they're conscious. There is no neutral position here. When you deploy AI, you're not escaping ideology - you're choosing whose ideology wins.

The companies that get this will build systems that actually serve their customers rather than reproducing historical inequities. The ones that don't will inherit someone else's values, call it objectivity, and wait for the lawsuits.

You might think this is all theoretical. So we tested it.

WHAT OUR TESTING REVEALED

At Fluxus, we ran controlled tests comparing how different AI models handle hiring decisions. We fed 1,100 CVs through Claude, GPT-4, Gemini, and Llama, generating candidate interview reports in both standard and anonymised modes.

The results were stark. Some models showed dramatic gender bias - Claude rated identical qualifications differently based on whether the candidate appeared male or female. Gemini consistently favoured certain communication styles. GPT-4 showed significant bias in strength assessments but not interview questions.

Most revealing: Llama 3.1 405B showed the lowest overall bias across all categories. This wasn't random variation - it was a systematic difference in how these models had learned to evaluate humans.
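The test itself is simple to sketch. Below is the shape of a paired comparison - an illustrative sketch rather than our actual harness, with a placeholder score_cv standing in for the model call and a deliberately crude anonymiser:

    from statistics import mean

    def anonymise(cv: str) -> str:
        # placeholder: a real anonymiser strips names, pronouns, gendered clubs, photos
        return cv.replace("she ", "they ").replace("he ", "they ")

    def bias_gap(cvs, genders, score_cv):
        # score each CV as-is and anonymised; a gap that differs by gender
        # means the model is reading gender signals, not qualifications
        gaps = [score_cv(cv) - score_cv(anonymise(cv)) for cv in cvs]
        by_gender = {}
        for g, gap in zip(genders, gaps):
            by_gender.setdefault(g, []).append(gap)
        return {g: mean(v) for g, v in by_gender.items()}

    # demo with a dummy scorer so the sketch runs end to end
    print(bias_gap(["she led the team", "he led the team"], ["f", "m"],
                   score_cv=lambda cv: float(len(cv))))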

Every enterprise using AI for hiring, customer service, or content moderation is inheriting these biases. Your model choice isn't just a technical decision - it's a values decision, whether you realise it or not.

WHAT THIS MEANS FOR YOUR BUSINESS

Just as companies now test for contaminants in their supply chains, they will need to test for bias in their AI systems.

The companies treating AI like unregulated water will face the corporate equivalent of a Flint, Michigan crisis - systematic harm and public backlash. The ones treating it like a managed resource will build trust and reduce risk.

Bias isn't a bug. It's the product working as designed.

Django Beatty
CEO