
Code Acts in Education: Saliva Samples and Social Policy

Claims that genetic data could be used to inform educational policy or practice have been growing for the last decade. Studies examining the connections between genetics and educational outcomes have captured media and public attention, and have also drawn significant criticism. Two UK reports on the potential of social and behavioural genomics indicate growing interest in the possibility of using genetic data as the basis for certain forms of policy intervention. As we discuss in a new research article, this raises important questions about the potential for genetic explanations to become authoritative in debates about the appropriate policy approaches to tackling long-standing educational problems.

Genetically-informed policy

One of the reports, Genetics and early intervention: Exploring ethical and policy questions, was published by the Early Intervention Foundation (part of the UK government’s ‘What Works Network’). It suggests that as genetic science is advancing rapidly, ‘it is increasingly possible to identify at birth children who have an elevated likelihood of outcomes such as struggling at school or being diagnosed with a learning, behaviour or mental health condition’. In the other, Genomics Beyond Health: What could genomics mean for wider government?, the UK Government Office for Science considers the potential implications of social and behavioural genetics research findings for education. The report highlights how scientists have produced ‘insight into the biological architecture of learning and education processes’, and suggests its potential to ‘inform more beneficial interventions to improve pupils’ educational outcomes’.

A synthesis of both reports for the Royal Society’s Open Science journal suggests ‘cause for optimism that behavioural genomic research may be able to offer policy-makers a new “genetic lens” … and provide information that could make a useful contribution to evidence-based decision-making’ in social policy areas like education. While the reports are optimistic, they are also cautious in their claims, considering a wide range of ethical issues and problems that would need resolving prior to any form of policy intervention. These issues, as well as the ugly history of eugenic attempts to deploy genetics in educational research and policy, are more fully detailed in a special report, The Ethical Implications of Social and Behavioral Genomics, subsequently published by the Hastings Center, a US bioethics research and policy institute.

Despite the ethical cautions and caveats, these reports indicate incremental movement towards a possible scenario where saliva samples could be used as the basis for social policy, particularly for early years screening and intervention. But getting from saliva samples to social policy would not be a simple process. It would involve mass sampling of children using spit swabs or blood samples (one of the EIF recommendations is increasing the collection of genetic data through longitudinal cohort studies). It would also require a complex scientific network of multi-disciplinary specialism, technologies for translating ‘wet’ samples into ‘dry’ samples for computer analysis, and funders to support the necessary studies.

In short, a whole scientific infrastructure of investigation would be needed to generate and examine the genetic data for genetically-informed policy and interventions in educational practice.

Educational genomics in formation

In a newly published open access paper, we show how an international infrastructure for ‘educational genomics’ has formed over the past 15 years. The term ‘educational genomics’ does not designate a specific bounded field of research or a distinctive discipline. Rather, educational genomics refers to an emerging set of scientific practices and knowledge that, for some scientists involved in such studies, can be characterized as a ‘genomic revolution for education research and policy’. The promise of genomics in education relies on the infrastructure-building that has been undertaken to make such studies possible and seemingly desirable.

The paper, ‘Infrastructuring educational genomics,’ is an outcome of a research project grant awarded by the Leverhulme Trust to me and my colleagues Dimitra Kotouza, Martyn Pickersgill and Jessica Pykett to investigate how data science and biology are converging in research on education. In this part of the study we focused particularly on the range of actors, concepts and technologies that together make educational genomics possible as a domain of investigation and knowledge production. Some scientists suggest their aim is to ‘open up the black box of the genome’ to explain educational outcomes; our aim was to open up the black box of educational genomics itself. We found that, as a ‘science-in-the-making’, educational genomics is currently being constituted by complex interorganizational and interpersonal relationships; by a shared way of conceiving of educational outcomes in terms of their molecular biological underpinnings; and by the deployment of bioinformatics technologies and bioinformational storage facilities that mediate and shape scientists’ knowledge work. We refer to these as the network associations, epistemic architecture and technoscientific apparatus that comprise the infrastructure of educational genomics.

The associations of educational genomics include large-scale international consortia and regional research networks, as well as satellite institutes and members, mostly identifying as interdisciplinary ‘sociogenomics’, ‘behavioural genetics’ and ‘genoeconomics’ specialists. Their work is bound together and ‘harmonized’ by large scale databases of curated bioinformation. Such associations and their operations resemble ‘big biology’ far more than conventional educational research, bringing together highly diverse disciplinary specialists, including economists, psychologists, political scientists and sociologists together with bioinformaticians, technicians and new data scientific methods for ‘big data’ genomic analysis.

The associations practising educational genomics research subscribe to a particular conceptual framework that we refer to as the epistemic architecture of such work. This framework is guided by a so-called ‘law’ of social and behavioural genomics, which understands complex human traits, behaviours and other observable phenotypes as being influenced by highly ‘polygenic’ interactions of minuscule genetic variants in interaction with environmental factors. There is no search for a monogenic ‘gene for x’ explanation in social and behavioural genomics or educational genomics studies. Instead, studies are guided by the search for thousands of polygenic associations that might together explain a statistical portion of educational outcomes. The end result of such studies is to produce a ‘polygenic score’ as a summative statistic to predict one’s genetic propensity for outcomes such as educational attainment. Surveying the masses of genetic bioinformation needed to calculate these polygenic scores requires a complex apparatus of technologies and scientific methodologies.
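To make the idea of a ‘summative statistic’ more concrete: a polygenic score is commonly calculated as a weighted sum, in which a person’s count of a given allele at each variant (0, 1 or 2) is multiplied by an effect size estimated for that variant in an association study, and the results are added up. The following is only a minimal sketch of that arithmetic in Python. The function name, the five variants, and every number in it are hypothetical and invented for illustration; they are not taken from any study discussed here, and real scores sum over thousands of variants or more.

```python
# Illustrative sketch only: all values below are invented, not drawn from any study.

def polygenic_score(allele_counts, effect_sizes):
    """Return the weighted sum of allele counts (0, 1 or 2 per variant),
    each multiplied by that variant's estimated effect size."""
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("need exactly one effect size per variant")
    return sum(count * beta for count, beta in zip(allele_counts, effect_sizes))

# Hypothetical per-variant effect sizes (assumed, for five made-up variants).
effect_sizes = [0.5, -0.25, 0.125, 0.0, 0.25]

# Hypothetical genotypes: copies of the counted allele at each variant.
person_a = [2, 0, 1, 1, 0]
person_b = [0, 2, 1, 0, 2]

score_a = polygenic_score(person_a, effect_sizes)
score_b = polygenic_score(person_b, effect_sizes)
print(score_a, score_b)
```

The sketch also shows what the score leaves out. Each weight is a statistical association, and the sum says nothing about the biological pathway by which any single variant might relate to an outcome.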

The technoscientific apparatus of educational genomics includes a range of technologies and methods developed both for medical genomics research and by consumer genetics companies. The educational genomics studies with the largest samples, for example, could not have been completed without contracts for data access with the UK Biobank, a publicly funded medical genetics databank, and the Silicon Valley consumer genetics company 23andMe, the owner and operator of the world’s largest private biobank. Other studies rely on longitudinal cohort data, including original data collection efforts to gather saliva samples from children at scale. In turn, the collection of the samples in the biobanks or cohort studies often depends on contracts with major biotechnology companies for access to devices like microarrays and laboratory scanning robots.

Analysing the digital data from these biobanks requires educational genomics consortia to use a range of bioinformatics technologies and data science methods. These include genomic data-mining instruments and applications for calculating polygenic scores from digital bioinformation. As the authors of one educational genomics paper contend, ‘molecular genetic research, particularly recent cutting-edge advances in DNA-based methods, has furthered our knowledge and understanding of cognitive ability, academic performance and their association’.   

It is only through the complex infrastructuring work of pulling together these interorganizational associations, epistemic architecture and technoscientific apparatus that it has become possible to conceive of saliva samples, and their translation into digital genetic data for analysis, as the basis for educational policy and intervention.

Data-centric educational genomics

The potential application of genetics in educational research and policy of course raises significant issues, both ethical and scientific. These include concerns about the non-representativeness of the data, its potential to be appropriated for ideological ends, the possibility of biological discrimination or determinism, lack of causal biological explanation, and the reduction of complex socioeconomic problems to apparently embodied genetic influences as well as the simplification of environmental influences to family or neighbourhood factors. In our analysis, we take a slightly different line of critique, focusing on the consequences of ‘infrastructuring’.

Infrastructuring highlights how building new social and technical systems affects scientific practice and knowledge production, as scientific investigation and knowledge become ‘inextricably bound up with the technical, social, and organizational practices of large-scale computer-enabled information infrastructures’. By foregrounding the ongoing process of infrastructuring and the forms of investigation and knowledge it makes possible in educational genomics, we illuminate how choices about selecting and curating the data, the setup of the biobanks, the collection of the cohort samples, the processing of digital bioinformation through software applications, and the forms of data scientific analysis that are employed, all format, mediate and shape how educational outcomes and other relevant behaviours are understood and explained by educational genomics.

Proponents of such studies claim they are providing a biologically realist understanding of the genetic substrates of educational outcomes. Educational genomics is a gene-centric endeavour. We claim, however, that its gene-centricity might be better understood in terms of what Sabina Leonelli has described as ‘data-centric biology’, where vast digital databases of genomic bioinformation and data mining methods have become central to producing understandings of biological structures and processes. Infrastructures consisting of databases, analysis software and associated methods, Leonelli argues, ‘have come to play a crucial role in defining what counts as knowledge of organisms in the postgenomic era’.

Through its ongoing infrastructuring, data-centric educational genomics is formatting a bioinformational rendering of educational outcomes. It defines what counts as knowledge about the biological processes that enable or inhibit student achievements. And its well-publicized findings support the emerging biological authority of educational genomics as a source of explanation for educational outcomes and student behaviours, potentially closing out other forms of non-genetic explanation.

Attending to the infrastructural orchestration through which such results are fabricated can help us better appreciate the longer term implications of educational genomics amidst growing interest in the incorporation of genetic data in education. It can also help surface the limitations of a data-mining approach to biology in education: for example, its privileging of correlational associations, which lack any mechanistic explanation of the pathways from somatic substance to social outcomes. The biological mechanisms that lead to educational achievements remain black boxes, obscured behind all the correlational associations that polygenic scores represent in simplified summative form.

Rather than opening up the black box of the genetic substrates of student achievement, or offering clear explanations for how saliva samples could become the basis of social policies, educational genomics constructs a black-boxed bioinformational substitute of the student out of algorithmic associations. While many advocates of educational genomics research remain cautious about prescribing policy implications, the construction of an infrastructure of knowledge production nonetheless advances the possibility of bioinformational accounts of student outcomes being used to inform educational interventions.

The full paper, ‘Infrastructuring educational genomics: associations, architectures and apparatuses’, is available open access.


This blog post has been shared by permission from the author.

The views expressed by the blogger are not necessarily those of NEPC.

Ben Williamson

Ben Williamson is a Chancellor’s Fellow at the Centre for Research in Digital Education and the Edinburgh Futures Institute at the University of Edinburgh.