When AI “Denaturalises” Science

Across multiple disciplines (genetics, neuroscience, the social sciences, computational fields) a similar warning has emerged: if AI is used only to get things right rather than to understand why, we risk producing statistics that look like science but do not explain it. Science does not end at the thresholds that statistics reveal; its purpose is to explore, to understand, to propose mechanisms, to infer causes, to generate refutable hypotheses, and to design interventions that work beyond the dataset.

This debate has history. To mention just two of the authors who have raised it: Breiman spoke of the two cultures, prediction versus inference. At the same time (and this is crucial), it is not enough to look at the algorithm; we must also examine the infrastructure that makes it credible. Williamson et al. (2024) show, for instance, how consortia, data architectures and technical apparatuses establish a data-centred epistemology that reframes educational phenomena as molecular-like associations “discoverable” through bioinformatics. That sociotechnical choreography grants authority to the algorithmic, displaces social theory, and produces an ontology in which subjects appear entirely surveyable and predictable. But this does not happen only in the social realm; it extends to how we understand the natural world, indeed to almost any complex sociomaterial context.

How to integrate AI without “denaturalising” science

Analogy: the fried egg is the classic example of protein denaturation; once the egg white’s proteins lose their native structure, there is no way back.
  1. Declare AI’s role in one sentence.
    “Generates hypotheses.” “Acts as a surrogate model to speed up simulations.” “Prioritises experiments.”
    If you write “discovers cause”, make sure you can defend a causal design, not just performance.
  2. Position your work on the scientific staircase.
    Description → prediction → mechanism → intervention/counterfactual.
    Specify where you are and what is missing to move upward (experiments, instruments, DAGs/causal diagrams, control variables); a minimal DAG sketch follows this list.
  3. Triangulate with theory.
    Let your pattern converse with existing frameworks: does it confirm, contradict, or extend them?
    If it contradicts, state what must be revised and how you will test it out of distribution, on a different cohort, site, or team (see the grouped-validation sketch after this list).
  4. Design explanations that support decisions.
    Feature importance alone is not enough: what plausible mechanism does it suggest, and what experiment or quasi-experiment would you run tomorrow to try to falsify it? (A worked sketch follows this list.)
  5. Use hybrids when appropriate.
    Physics- or theory-informed models, biological or organisational constraints embedded in architectures, or AI → hypothesis → experiment pipelines (a toy theory-constrained fit closes the sketches below).
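
For step 2, here is a minimal sketch of what “DAGs/causal diagrams” can look like in practice, assuming Python with networkx; the variables (socio-economic status, screen time, sleep, grades) are invented purely for illustration.

```python
# Minimal DAG sketch for step 2, with invented variables
# (ses = socio-economic status). The graph makes the claimed
# mechanism explicit, so a reviewer can see what must hold
# before any causal reading of the model is defensible.
import networkx as nx

dag = nx.DiGraph()
dag.add_edges_from([
    ("ses", "screen_time"),    # confounder -> exposure
    ("ses", "grades"),         # confounder -> outcome
    ("screen_time", "sleep"),  # exposure -> mediator
    ("sleep", "grades"),       # mediator -> outcome
])

assert nx.is_directed_acyclic_graph(dag)

# Naive adjustment hint: parents of the exposure are candidate confounders.
print("Adjust for:", sorted(dag.predecessors("screen_time")))
```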
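
For step 3’s out-of-distribution test, a sketch using scikit-learn on synthetic data: instead of shuffling rows, hold out entire sites, so the score measures transfer to a cohort the model never saw. The site labels and the data are placeholders.

```python
# Out-of-distribution check: hold out whole sites, not random rows.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = X[:, 0] + rng.normal(scale=0.5, size=300)
sites = rng.integers(0, 3, size=300)  # pretend 3 data-collection sites

model = RandomForestRegressor(random_state=0)
scores = cross_val_score(model, X, y, groups=sites, cv=LeaveOneGroupOut())
print("R2 per held-out site:", np.round(scores, 2))
# If these scores collapse relative to ordinary k-fold CV,
# the pattern may not travel beyond the original cohort.
```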
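
For step 4, a sketch (again scikit-learn, synthetic data) of why importance scores are hypothesis fuel rather than explanation: the computation ends exactly where the scientific work begins.

```python
# Feature importance as a starting point, not an explanation.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))
y = 2 * X[:, 2] + rng.normal(scale=0.3, size=400)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

imp = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
top = int(np.argmax(imp.importances_mean))
print(f"Most important feature: x{top}")
# The scientific move starts here: what mechanism would make this
# feature matter, and what experiment could break that story tomorrow?
```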
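
And for step 5, a toy theory-constrained fit, assuming SciPy: the data term is an ordinary least-squares fit, while a penalty term encodes the theoretical constraint that a decay rate cannot be negative. Real physics-informed models elaborate this same idea inside neural architectures.

```python
# Toy theory-informed fit: data loss plus a penalty that forbids
# a physically impossible (negative) decay rate. Entirely synthetic.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
t = np.linspace(0, 5, 50)
y = 3.0 * np.exp(-0.8 * t) + rng.normal(scale=0.1, size=t.size)

def loss(params):
    a, k = params
    data_term = np.mean((a * np.exp(-k * t) - y) ** 2)
    theory_term = 10.0 * max(0.0, -k) ** 2  # punish negative decay rates
    return data_term + theory_term

fit = minimize(loss, x0=[1.0, 0.1])
print("Estimated amplitude and rate:", np.round(fit.x, 2))
```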

Less “oracle”, more cumulative science.

Warning signs of denaturalisation

  • Success defined only by predictive metrics, with no new hypotheses or criteria for intervention.
  • No plan for external or causal validation; everything lives within cross-validation.
  • “Explanation” reduced to prose rather than a testable mechanism.
  • Change the provider or model and the “truth” changes with it (a quick model-swap check follows this list).
  • Your design adopts the dominant infrastructure (data/protocols) uncritically and sidelines the field’s theories.
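
A quick way to probe the fourth warning sign, sketched on synthetic data: if two reasonable model families disagree about which feature “matters”, the importance ranking is describing the estimator, not the system.

```python
# Does the "discovered" driver survive a model swap?
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 4))
y = X[:, 1] + 0.5 * X[:, 3] + rng.normal(scale=0.3, size=400)

for model in (Ridge(), RandomForestRegressor(random_state=0)):
    model.fit(X, y)
    imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)
    ranking = np.argsort(imp.importances_mean)[::-1]
    print(type(model).__name__, "ranks features:", ranking)
# Stable rankings are necessary (not sufficient) for claiming the
# pattern reflects the system rather than the estimator.
```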

How to realign (minimum steps)

  • Reframe the goal in scientific terms: which mechanisms compete here?
  • Add a step: AI → candidate hypotheses → selection of 1–2 testable hypotheses (sketch after this list).
  • Plan replications (different site/time/cohort/team) and a robustness test.
  • Document limits: this is prediction, not causal inference. Stating that situates the work; it does not devalue it.
  • Examine your infrastructure (à la Williamson): What epistemological assumptions does it impose? Whom does it serve? What perspectives does it displace?
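
A deliberately low-tech sketch of the “AI → candidate hypotheses → selection” step; the hypotheses below are invented examples echoing the DAG above. The point is that the pipeline’s output should be a short list of falsifiable claims with named tests, not a leaderboard.

```python
# Force the pipeline to end in testable hypotheses, not in metrics.
from dataclasses import dataclass

@dataclass
class Hypothesis:
    claim: str          # mechanism in one sentence
    test: str           # experiment or quasi-experiment that could falsify it
    feasible_now: bool  # can we run the test with current resources?

candidates = [
    Hypothesis("Sleep mediates the screen-time/grades link",
               "Randomised sleep-hygiene intervention", True),
    Hypothesis("The effect is an artefact of the dominant data protocol",
               "Replicate with an independently collected cohort", True),
    Hypothesis("The effect is driven by unmeasured genetics",
               "Twin-cohort comparison", False),
]

selected = [h for h in candidates if h.feasible_now][:2]
for h in selected:
    print(f"TEST NEXT: {h.claim} -> {h.test}")
```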

AI can be a microscope, revealing patterns that open new hypotheses, or an oracle, issuing predictions that close down questions.
The first nourishes science; the second denaturalises it.
The difference lies in your design: clear purpose, mechanism in sight, external validation, and decisions you can explain in one sentence to a competent colleague.

About this text

This text was originally written as one of the critical boxes included in the materials of the CSIC microcredential “Solve Digital Challenges Creatively with AI” (area “Problem Solving”), to be launched in January 2026, in which I have the pleasure of participating. I am also part of its sister microcredential, “Create High-Quality Digital Content with AI”, open since November 2025. Both belong to CSIC’s microcredential pathway on Artificial Intelligence and aim to foster an ethical, critical, and creative approach to integrating AI into scientific and professional practice. More info (in Spanish only) at the CSIC Aprende website: https://aprende.csic.es/
