The Illusion of Neutrality: Why Data is Always Political
Why AI, Data and Ontologies Amplify Bias, Not Common Sense
The Illusion of Neutrality: Why Data is Always a Cultural Artifact
In the modern organisation, data is treated with a certain reverence. We're told that "data is the new oil": a raw commodity of immense value, ready to be refined, classified, and fed into the economic and intellectual pipelines of our lives. We're encouraged to be "data-driven", and we embrace slogans like "Trustworthy AI" and "the data doesn't lie". The underlying assumption is that data is a pure, objective reflection of reality; that the numbers don't lie. Nothing could be further from the truth.
My career has been a bridge between two worlds: the world of social anthropology, which studies the complexities of human systems, and the world of technical information systems, which seeks to structure and make sense of the world through data. If there is one profound truth that spans both disciplines, it is this: the idea of neutral data is a dangerous illusion.
Data is never a raw, natural resource. It is a manufactured item. It is, and always has been, a cultural artifact.
Every piece of data that has ever existed was created as the result of a human choice. To understand this, we only need to ask a few simple questions:
What did we choose to measure, and by extension, what did we choose to ignore?
What questions did we decide to ask, and what language did we use to ask them?
What categories did we invent to classify the answers?
Who is the dominant actor, and who gains an advantage from the result?
Think of a national census. The categories it uses for ethnicity, family structure, or religion are not objective, universal truths. They are a snapshot of a society's political realities, its societal power structures, and its cultural norms at a specific moment in time. The decision to include or exclude a category, to define a "household" in a particular way, is an intensely political act. The resulting dataset is not a mirror of the nation; it is a portrait, painted with the brushstrokes of its era’s episteme, its way of knowing the world.
This brings us to the sociolinguistic context. As information professionals, we spend our time creating metadata, taxonomies, and ontologies to give data meaning. But these structures are not neutral; they are a codification of a specific worldview. The way a public health system defines and classifies "well-being" will be fundamentally different from how an iwi might define hauora. The language we wrap around data, the very words we use to label it, is loaded with our cultural, political, and economic baggage. The labelling rests on a semantic foundation redolent of the intentions and biases of its builders.
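To make that concrete, consider how much of a person's reality a classification schema can even record. The toy Python sketch below contrasts a deliberately narrow clinical schema with one shaped by te whare tapa whā; both schemas are invented for illustration, not drawn from any real standard.

```python
# Two illustrative (invented) classification schemas for the "same" domain.
# The clinical one records two scores; the hauora one, shaped by te whare
# tapa whā, treats spiritual and whānau dimensions as first-class fields.
clinical_schema = {
    "well_being": ["physical_health_score", "mental_health_score"],
}
hauora_schema = {
    "hauora": [
        "taha_tinana",     # physical health
        "taha_hinengaro",  # mental and emotional health
        "taha_wairua",     # spiritual health
        "taha_whanau",     # family and social health
    ],
}

# Roughly, only the first two dimensions map onto the clinical schema;
# anything recorded about wairua or whānau is simply unrepresentable there.
capturable = {"taha_tinana", "taha_hinengaro"}
for dimension in hauora_schema["hauora"]:
    status = "capturable" if dimension in capturable else "invisible"
    print(f"{dimension}: {status} in the clinical schema")
```

Whatever the first schema cannot name, it cannot store; and what it cannot store, downstream systems will treat as non-existent.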
This is where power enters the equation. Historically, data has been collected by the powerful about the powerless. It reflects the priorities, biases, and goals of a governing body, a corporation, or a colonial state. When this historically skewed data is used to train our new gods, the algorithms of Artificial Intelligence, it doesn't just reflect past biases; it amplifies them and projects them into the future at an unprecedented scale. An AI trained on decades of biased lending data will not magically discover fairness; it will simply learn to be a more efficient discriminator.
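That amplification is easy to demonstrate. Here is a minimal sketch using synthetic data and scikit-learn; every feature, group, and number below is invented for illustration. Two groups have identical underlying creditworthiness, but the historical approvals the model learns from penalised one group, and the trained model faithfully reproduces that penalty.

```python
# Synthetic, illustrative data only: a model trained on biased historical
# lending decisions learns to reproduce the bias it was shown.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Two groups with identical underlying creditworthiness.
group = rng.integers(0, 2, n)            # 0 = group A, 1 = group B
creditworthiness = rng.normal(0, 1, n)   # same distribution for both groups

# Historical decisions penalised group B regardless of merit.
noise = rng.normal(0, 0.5, n)
historical_approval = (creditworthiness - 1.5 * group + noise) > 0

# Train on the biased labels, with a proxy feature (think: postcode) for group.
X = np.column_stack([creditworthiness, group])
model = LogisticRegression().fit(X, historical_approval)

# For identical creditworthiness, the model approves group B far less often.
for g in (0, 1):
    p = model.predict_proba([[0.0, g]])[0, 1]
    print(f"approval probability at average creditworthiness, group {g}: {p:.0%}")
```

No one told the model to discriminate; it simply optimised its fit to a discriminatory past, and at scale it will apply that past to every future applicant.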
This is why the global conversations around data sovereignty, and specifically Māori Data Sovereignty (Te Mana Raraunga) here in Aotearoa, are so critically important. They are a direct challenge to the myth of neutrality. They assert that a community's data is a taonga, a cultural treasure that belongs to them, embedded in their reality, not a free-floating commodity to be extracted and used by others.
So, if data is never neutral, what does this mean for us in practice?
It means we must abandon the naive idea of "letting the data decide." It means we must embrace governance not as a bureaucratic hurdle, but as an essential, human-centric act of accountability. Governance is the critical thinking layer we must wrap around our data. We must relentlessly ask the following questions (a sketch of how they might be captured as a record follows the list):
Provenance: Where did this data come from? Who created it, and for what purpose?
Bias: What viewpoints, peoples, or concepts are over-represented or under-represented in this dataset?
Context: What were the geopolitical tensions, economic realities, and cultural assumptions at the time of its creation?
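One way to operationalise these questions is to attach a datasheet-style governance record to every dataset, so that provenance, bias, and context become recorded facts rather than afterthoughts. The sketch below is illustrative only: the field names and the example record are hypothetical, not any existing standard.

```python
# A minimal, hypothetical sketch of a "datasheet"-style governance record.
from dataclasses import dataclass, field

@dataclass
class DatasetGovernanceRecord:
    name: str
    # Provenance: where did this data come from, and why was it made?
    created_by: str
    purpose: str
    collection_method: str
    # Bias: who and what is over- or under-represented?
    known_gaps: list[str] = field(default_factory=list)
    overrepresented_groups: list[str] = field(default_factory=list)
    # Context: the world at the time of creation.
    collection_period: str = ""
    cultural_assumptions: list[str] = field(default_factory=list)

# An invented example record for an invented dataset.
census_extract = DatasetGovernanceRecord(
    name="National census extract (illustrative)",
    created_by="Government statistics office",
    purpose="Resource allocation and electoral planning",
    collection_method="Mandatory household survey",
    known_gaps=["Informal and multi-generational households under-counted"],
    overrepresented_groups=["Urban, English-speaking respondents"],
    collection_period="March 1991",
    cultural_assumptions=["'Household' defined as a single nuclear family unit"],
)
```

The value is not in the specific fields but in making the answers mandatory: a dataset without its record is treated as incomplete.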
Rejecting the myth of neutrality isn't a cynical act. It is the first and most crucial step toward wisdom. It allows us to treat data not with blind faith, but with the critical respect it deserves. Only by seeing data for what it truly is, a powerful, political, and profoundly human artifact, can we ever hope to use it ethically, equitably, and for the genuine betterment of our world.
The Repercussions for Ontologies and Artificial Intelligence
Accepting that data is a cultural artifact is the first step. The next is to understand how that artifact is used to build the very brains of our machines. If data is the brick, then ontology is the architectural blueprint, and Artificial Intelligence is the dynamic, ever-expanding structure built from that plan. It is here, at the intersection of ontology and AI, that the consequences of non-neutral data become most profound.
An ontology, in my field, is more than just a data model; it is the formal, explicit specification of a conceptualisation. In simpler terms, it is the constitution for a system of knowledge. It defines the types of things that exist, their properties, and the relationships they have with one another. The Linnaean taxonomy used to classify the natural world was a form of ontology. The diagnostic codes used in healthcare form an ontology. Each one is a negotiated, human-constructed agreement on what is real and how the world is organised. They are the embodiment of a specific episteme.
When we feed data into an AI, particularly into knowledge graphs or symbolic AI systems, we use an ontology to give it structure and meaning. The ontology becomes the AI's "common sense", i.e. its foundational understanding of reality (see Rousseau below). The problem arises when one culture's architectural blueprint is treated as a universal standard. An ontology built to serve a Western, capitalist worldview might have rich, complex concepts for "assets," "liabilities," and "depreciation," but be utterly simplistic in its understanding of "kinship," "community obligation," or "ecological guardianship." It's not wrong; it's just a reflection of its cultural priorities.
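You can see this asymmetry in miniature in a toy schema. The sketch below is entirely hypothetical, but it shows how a commercially-oriented ontology can answer fine-grained questions about depreciation while being structurally unable to express a question about guardianship or genealogy.

```python
# A toy, hypothetical schema showing cultural priorities as asymmetric
# richness: finance gets fine-grained classes and relations; everything
# relational and ecological is flattened into generic stubs.
western_commercial_ontology = {
    "classes": {
        "Asset": ["TangibleAsset", "IntangibleAsset", "FinancialAsset"],
        "Liability": ["CurrentLiability", "LongTermLiability"],
        "DepreciationSchedule": ["StraightLine", "DecliningBalance"],
        "Person": [],
        "Land": [],
    },
    "relations": {
        "owns": ("Person", "Asset"),
        "owes": ("Person", "Liability"),
        "depreciatesUnder": ("Asset", "DepreciationSchedule"),
        # One catch-all edge where concepts like whakapapa (genealogy) or
        # kaitiakitanga (guardianship) would need structures of their own.
        "relatedTo": ("Person", "Person"),
    },
}

# "What is the depreciation schedule of this asset?"        -> expressible
# "Who holds kaitiakitanga over this land, through which
#  whakapapa line?"                                          -> no relations exist
```

Any system reasoning over this blueprint will be articulate about balance sheets and mute about guardianship, not because the world lacks the latter, but because the schema does.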
This challenge is now being supercharged by Large Language Models (LLMs). While they may not use an explicit, human-designed ontology, they derive their own implicit ontology from the statistical patterns in the trillions of words they ingest. When that training data is overwhelmingly from one cultural context, predominantly North American and European, the AI’s learned "reality" is a statistical mirror of that culture. It learns that certain concepts are central and highly connected, while others are peripheral or non-existent.
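The mechanics can be illustrated with a toy co-occurrence count; the corpus below is invented. Concepts that appear together often become densely connected "core" notions in the model's implicit ontology, while rare concepts remain peripheral.

```python
# A toy sketch (invented corpus) of how an implicit ontology falls out of
# text statistics: frequent co-occurrence makes concepts "central".
from collections import Counter
from itertools import combinations

corpus = [
    "the company reported quarterly profit and shareholder dividends",
    "shareholder value drives company strategy and profit growth",
    "quarterly profit beat market expectations for the company",
    "kaitiakitanga guides our relationship with the awa",  # one lone sentence
]

vocab = {"company", "profit", "shareholder", "quarterly", "kaitiakitanga", "awa"}
cooccurrence = Counter()
for sentence in corpus:
    words = sorted({w for w in sentence.split() if w in vocab})
    for a, b in combinations(words, 2):
        cooccurrence[(a, b)] += 1

# "Centrality" of each concept = how many co-occurrence links it carries.
connectivity = Counter()
for (a, b), n in cooccurrence.items():
    connectivity[a] += n
    connectivity[b] += n

for word, links in connectivity.most_common():
    print(f"{word:15s} {links} links")
# The financial terms form a dense cluster; kaitiakitanga has a single link.
```

Real LLMs operate on trillions of tokens rather than four sentences, but the principle is the same: statistical density, not truth or importance, decides what the model treats as central.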
This is why an AI can write a sonnet in the style of Shakespeare, or a line of Python code with ease, but might fail to grasp the nuances of concepts central to Aotearoa, like mana, tūrangawaewae, or the spirit of kaitiakitanga. These concepts don't exist in a meaningful way within the statistical relationships of its training data. For the AI, they are, at best, foreign words to be defined, not core concepts to reason with. The machine isn't stupid; it's culturally illiterate.
The result is a subtle but powerful form of cognitive erasure. By adopting AI systems whose implicit ontologies are alien to us, we risk marginalising our own ways of knowing. We begin to favour questions the AI can answer and de-emphasise concepts the AI cannot comprehend. We are silently nudged into adopting the worldview embedded in the machine. This is the mechanism by which we become norm-takers at the deepest level, the very level of thought itself.
The only way to counter this is to become conscious architects of our own digital knowledge systems. This is a monumental but essential task. It requires investing in the hard, "messy middle" work of building out ontologies and curating data that reflect the bicultural reality of Aotearoa. It means formally defining our unique concepts in a way that machines can understand, ensuring that Te Ao Māori is not just a footnote in a global dataset, but a rich, foundational part of our own digital intelligence.
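As a purely illustrative sketch of what "formally defining our concepts" can mean, here is a fragment using the rdflib library. The namespace URI and every modelling choice below are hypothetical placeholders; real modelling of these concepts would need to be led and governed by the communities whose knowledge it is.

```python
# A minimal, hypothetical sketch: giving a concept from Te Ao Māori
# first-class, machine-readable structure with rdflib.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

TAM = Namespace("https://example.org/te-ao-maori#")  # placeholder namespace

g = Graph()
g.bind("tam", TAM)

# Kaitiakitanga as a first-class concept, not a loanword footnote.
g.add((TAM.Kaitiakitanga, RDF.type, RDFS.Class))
g.add((TAM.Kaitiakitanga, RDFS.label, Literal("kaitiakitanga", lang="mi")))
g.add((TAM.Kaitiakitanga, RDFS.comment,
       Literal("Guardianship of people, place, and taonga, carrying obligations")))

# Relationships that make the concept usable for machine reasoning.
g.add((TAM.Kaitiaki, RDF.type, RDFS.Class))          # the guardians
g.add((TAM.heldBy, RDF.type, RDF.Property))
g.add((TAM.heldBy, RDFS.domain, TAM.Kaitiakitanga))
g.add((TAM.heldBy, RDFS.range, TAM.Kaitiaki))
g.add((TAM.over, RDF.type, RDF.Property))
g.add((TAM.over, RDFS.domain, TAM.Kaitiakitanga))    # e.g. over an awa or whenua

print(g.serialize(format="turtle"))
```

The technical machinery here is trivial; the hard, essential work is the human negotiation over what the classes and relationships should actually be.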
This is not merely a technical exercise. It is an act of cultural self-determination in the 21st century. If we are to build an AI that serves us and represents us fairly, we must first teach it to understand the context and intellectual apparatus of who we are.
Postscript: A View on "Common Sense"
The philosophy of Jean-Jacques Rousseau, and in particular his treatment of "common sense", has always fascinated me; it runs counter to how the term is widely misused in common parlance and in the media. In his view, "common sense" refers to a well-regulated use of one's senses that provides a natural understanding of the world and informs moral judgment. This concept is central to his critique of society: he argued that civilisation has corrupted this natural sense, leading to artificial needs and social inequalities. Rousseau believed a revolution was needed to return individuals to a state of common sense, where they could act in accordance with their true nature and the general will. I do wonder if we need a similar revolution in AI "common sense".
Rousseau's Critique of Society:
Rousseau argued that society, with its emphasis on artificial needs and social hierarchies, has corrupted humanity's natural state.
He believed that individuals are born free and equal, but societal structures create inequality and alienation.
The pursuit of luxury, social status, and material possessions distracts people from their true selves and fosters competition and conflict.
Common Sense as a Guiding Principle:
Rousseau's "common sense" is not merely intuitive knowledge but a way of perceiving and understanding the world based on natural inclinations.
It involves using one's senses to grasp the fundamental truths about human nature and morality.
This natural sense allows individuals to recognise injustice, inequality, and the corrupting influence of society.
The Revolution to Common Sense:
Rousseau advocated for a revolution that would strip away the artificial layers of society and return individuals to their natural state.
This revolution involves a re-education of individuals, encouraging them to trust their natural instincts and judgment.
Rousseau believed that by understanding and embracing common sense, individuals could create a more just and equitable society based on the general will.
In essence, Rousseau's concept of common sense is a call to return to a more natural and authentic way of life, free from the artificial constraints and corrupting influences of society. I would argue we need to do exactly the same for the artificial constraints of artificial intelligence!

