Big Tech's Data Grab Mirrors Colonial Conquest, Critics Warn

As artificial intelligence systems become more powerful and pervasive, a growing chorus of researchers and Indigenous leaders is sounding an alarm: the way these models consume information mirrors the extractive practices of historical colonialism, with corporate profit replacing imperial conquest.

The concern centers on how large language models are built. Most draw training data from sources created in Western contexts, primarily by white male writers, absorbing their worldviews, biases, and cultural assumptions in the process. Companies harvest this information at scale, often without consent from the communities being represented or any effort to verify accuracy.

"Colonialism is always portrayed as something that happened in the past," says Julian Posada, a Yale professor who studies human labor and data production. "Many countries got independence, and then the textbooks say colonialism is over." Yet Posada argues that modern data extraction by AI firms constitutes a continuation of that colonial logic under different branding.

The practical consequences appear in how AI systems describe cultures and communities. Aditya Vashistha, a Cornell University professor, points to how language models flatten Indian cuisine into a monolithic description: rich, aromatic, and spicy. "You will find different regional cuisines which differ in the spices which are used, or in what moderation, like the amounts they use," Vashistha explains, highlighting the stereotypes that emerge from training on predominantly Western sources.

Nick Couldry, co-author of "Data Grab: The New Colonialism of Big Tech and How to Fight Back," frames the problem starkly. Taking data without permission or reciprocity, he argues, mirrors imperial-era extraction. "Not only can we take it, but we should take it, and we're entitled to take it and make everything we want out of it, extract as much profit," he says, describing the attitude embedded in how tech companies approach information gathering.

Speed and profit incentives make the problem worse. Michael Sherbert, an Algonquin of Pikwàkanagàn First Nation and fellow at Queen's University, notes that meaningful consultation with Indigenous communities would slow AI development and potentially disadvantage American companies competing against Chinese rivals. "Taking time to discuss issues and knowledge with Indigenous communities is very costly," he says.

The exclusion runs deeper for Indigenous knowledge itself. Much of that wisdom exists in oral traditions that language models trained on written text cannot capture. Other knowledge is deliberately kept private within communities. Brian Ritchie, founder of kama.ai and a member of Ontario's Chapleau Cree First Nation, has attended numerous Indigenous leader summits and observed that Indigenous people remain conspicuously absent from AI training processes.

The stakes extend beyond representation or accuracy. Sherbert warns that as AI systems increasingly shape how people understand identity, culture, history, and truth itself, the biases baked into these models become powerful forces. "It's not just misinformation that's the problem," he says. "These systems are increasingly shaping how people understand themselves."

Author James Rodriguez: "The AI industry has traded the language of colonialism for the language of innovation, but the mechanics remain unchanged: extract without consent, profit without accountability, and call it progress."
