Behavioral genetics has covered a wide range of topics and, in animals, has a respected history of scientific work. In humans, though, its work has been more controversial, dominated by "fitness" for social roles and rank and, therefore, by cognitive ability or "intelligence." Indeed, it was Darwin's cousin, Francis Galton, who argued that fitness in humans depended on "General Ability or Intelligence" and proposed "to show ... that a man's natural abilities are derived by inheritance." To do this he set up the world's first mental test center in London in 1882. Using simple tests of mental speed, memory, sensory acuity, and so on, he wanted to show that scores would match social status or "reputation." Unfortunately for his theory, they didn't; but Galton's strategy, using test scores as surrogates for differences already "known," laid the foundations of human behavior genetics. It only remained to find the "right" test.
This was indirectly provided by Alfred Binet's work in Paris. He had been devising school-type tasks (general knowledge, comprehension of sentences, simple arithmetic, and so on) for screening children in school for special treatment, and produced his first test in 1904. Because scores depended on family background, these tests did correlate with social class. Galton's followers in Britain and America seized upon them as what they had been looking for, their test of "innate" intelligence, and quickly translated them for use in their anti-immigration and eugenic policies. But this still didn't actually prove that score differences are due to inheritance.
The problem was partly solved by R.A. Fisher, who in 1918 introduced the statistical concept of "heritability," the proportion of trait variance attributable to genetic variance, from experimental breeding programs in agricultural species.1 Its validity lay in knowing the genetic background and environmental experiences of the organisms, which we don't tend to have in humans. But Cyril Burt thought he had solved that problem when he estimated the heritability of intelligence from IQ correlations in pairs of twins. Since then, results of a number of twin studies of cognitive ability have suggested a sizeable heritability of between 0.5 and 0.8, meaning that 50-80% of the variance in cognitive ability is genetic in origin. So twin studies, and correlations between IQ test scores, became the dominant paradigm of human behavior genetics.
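For readers who want the definition in symbols, it can be sketched as follows. This is the standard textbook decomposition rather than anything spelled out above, and the symbols are assumptions introduced here: V_P is the total (phenotypic) variance of the trait, V_G the part attributed to genetic differences, and V_E the part attributed to environmental differences.

```latex
V_P = V_G + V_E, \qquad h^2 = \frac{V_G}{V_P}
```

On this reading, a heritability of 0.5 says that V_G is half of V_P. Note that the simple sum already presupposes that genetic and environmental contributions add up independently, with no interaction or covariance between them.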
Definition of Phenotype and the IQ test
Since we cannot measure intelligence directly (not least because psychologists cannot really agree what it is), the validity of Binet-type IQ tests depends entirely on inferences from correlations. Since tests were constructed to predict school achievement, which determines entry to the job market, there is an inevitable correlation between test scores and occupational and social status. Behavior geneticists draw considerable conviction from these correlations, as if they represent proof of a causal mental power. However, scores have little if any association with job performance and, as Joan Freeman's studies have shown, are not reliable indicators of adult careers. High achievers in adulthood did not tend to shine above the average as children; and we don't find members of Mensa, the high IQ society, dominating the ranks of high achievers in society.
What this "intelligence" is, or what actually varies, therefore, is still not clear after more than a century of scientific inquiry. As prominent behavior geneticist Ian Deary puts it, "There is no such thing as a theory of human intelligence differences, not in the way that grown-up sciences like physics or chemistry have theories."2 Or, as Carl Zimmer put it in Scientific American, "intelligence remains a profound mystery ... It's amazing the extent to which we know very little."3
Instead, cognitive behavior-geneticists rely on a kind of mystique around test demands, as if they were equivalent to cognitively complex tasks. However, the vast majority of IQ test items are simple tests of memory and general knowledge with a high learned literacy/numeracy (and, therefore, social class) content: "What is the boiling point of water?"; "Who wrote Hamlet?"; "In what continent is Egypt?" and so on.
Much weight is placed on the "Raven" test (Raven's Standard Progressive Matrices) and other non-verbal tests, said to measure "abstract reasoning" detached from cultural learning. As for complexity, there seems little to distinguish test items from the complexity of reasoning required in everyday practical and social tasks carried out by nearly everyone. Analyses have found little evidence that "level of abstraction" (defined informally) distinguishes item difficulty. As Téglás and colleagues have shown, even 12-month-old infants are good at "integrating multiple sources of information, guided by abstract knowledge, to form rational expectations about novel situations, never directly experienced."4
As for being "culture-free," what is overlooked is that, like languages, cognitive styles differ according to the kinds of activities most prominent in different cultures and social classes. In studies of formal logic and reasoning, it is a classic finding that different problems of equal complexity can be of widely different difficulty to different people. Western societies are deeply class stratified along occupational lines, which create starkly different activities and habits of thought. As cognitive psychologist Lev Vygotsky argued, such activities "determine the entire flow and structure of mental functions."5
Accordingly, much research shows how "ways of thinking," and even brain networks, are shaped by cultural activities. What is clear about the Raven test is that the cognitive processes demanded are those most common in middle class cultural activities: reading from top left to bottom right, following accounts, reading timetables, and so on. It is not testing individuals' rank on some fixed scalar power so much as their "distance" from forms of knowledge and thinking deemed to be the norm by test designers. This is indicated by the massive gains in average IQ scores (including Raven scores) over time as more people have moved from working class to expanding middle class occupations (the so-called "Flynn effect").
Methods - Twin studies and heritability estimation
Strong claims about IQ heritability suggest that behavior geneticists have firm measures of the genetic variance underlying the (not so clear) phenotype of intelligence. On the contrary, neither the genetic nor environmental values are actually known. Rather these are (again) inferred from correlations among relatives, on the basis of a host of unlikely assumptions.
Identical, or monozygotic (MZ), twins share all their genes, and tend to correlate around 0.7-0.8 on IQ. We can infer that this resemblance is due to their common genes, and the rest is due to differences in environmental experiences. But the correlation could be due to the environments that they also share, and the remainder due to errors of measurement (which are often forgotten). If the twins are reared apart, in completely different environments, then, in theory, the correlation would provide a direct estimate of heritability. In practice, it has been extremely difficult to find suitable samples of twins reared apart in completely uncorrelated environments, so estimates derived from them have been highly dubious.6
Consequently, most heritability estimates have come from comparisons between the resemblances (correlations) of MZ and dizygotic (DZ, nonidentical) twins. We might expect MZ pairs, who share all their genes, to be more similar than DZ pairs, who share only half their genes on average, so it can be inferred that the difference in average resemblance or correlation between kinds of twins is related to the difference in genetic similarity. Through formulae explained elsewhere, heritability is usually estimated from twice the MZ-DZ difference in correlations:
$h^2 = 2\,(r_{MZ} - r_{DZ})$.
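As a worked illustration, the formula above can be written as a few lines of Python. This is a minimal sketch: the function name falconer_h2 and the correlation values are hypothetical, chosen only to show the arithmetic, and are not taken from any study.

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Twin-method heritability estimate: twice the MZ-DZ correlation difference."""
    for name, r in (("r_mz", r_mz), ("r_dz", r_dz)):
        if not -1.0 <= r <= 1.0:
            raise ValueError(f"{name} must be a correlation in [-1, 1], got {r}")
    return 2.0 * (r_mz - r_dz)


if __name__ == "__main__":
    # Hypothetical correlations, for illustration only.
    print(f"{falconer_h2(0.75, 0.50):.2f}")  # 2 * (0.75 - 0.50)
    # Nothing in the formula keeps the result between 0 and 1:
    print(f"{falconer_h2(0.80, 0.20):.2f}")  # 2 * (0.80 - 0.20)
```

The second call is worth noting: a large enough gap between the two correlations yields an "estimate" above 1, which cannot be a proportion of variance. The formula is only as meaningful as the assumptions behind it.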
More recently, some sort of statistical modelling (usually, structural equation modelling) has been used, which has some advantages. But such modelling also makes a number of assumptions that may or may not be valid.
Now let us look at some of those assumptions. The first is that human intelligence can be treated exactly like a simple quantitative trait such as height or weight, and that the relevant genes, although possibly numerous, exert effects additively (independently of each other). Otherwise (if there were interactions between genes, or between genes and environments) it would be impossible to determine what correlations to expect. It is also assumed that "environments" contribute to differences in the same additive way. As we shall see, this flies in the face of what we now know; yet there are few serious attempts to rule out such interactions in the twin IQ data.
What has drawn the most attention of critics, though, is the "equal environments assumption." The twin method requires that the environments of MZ pairs are no more similar than environments of DZ pairs, on average; otherwise, those variations could partly or entirely explain the correlation differences. As it happens, the assumption is flatly contradicted by numerous studies. In one review, David Evans and Nicholas Martin said, "There is overwhelming evidence that MZ twins are treated more similarly than their DZ counterparts."7 Studies reveal that MZ twins are more likely to share playmates, bedrooms, and clothes, and to share experiences like identity confusion (91% vs. 10%); being brought up as a unit (72% vs. 19%); being inseparable as children (73% vs. 19%); and having an extremely strong level of closeness (65% vs. 19%). Parents also hold more similar expectations for their MZ than DZ twins. Behavior geneticists have a tendency to wave these differences aside as if they don't really matter, but they can easily explain part or all of the differences in IQ resemblance and grossly inflated heritability estimates.
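The size of the possible distortion can be sketched with a toy model. The Python below is an illustration under stated assumptions, not a reconstruction of any published analysis: it assumes purely additive effects, and every name and number in it (expected_twin_correlations, twin_method_estimate, c_mz, c_dz, and the values passed in) is hypothetical.

```python
def expected_twin_correlations(h2, c_mz, c_dz):
    """Expected MZ and DZ correlations under a purely additive toy model.

    h2   : true share of variance due to additive genetic effects
    c_mz : share of variance due to environments shared by MZ pairs
    c_dz : share of variance due to environments shared by DZ pairs
    """
    r_mz = h2 + c_mz        # MZ pairs share all their genes
    r_dz = 0.5 * h2 + c_dz  # DZ pairs share half their genes, on average
    return r_mz, r_dz


def twin_method_estimate(r_mz, r_dz):
    """Heritability as estimated by twice the MZ-DZ correlation difference."""
    return 2.0 * (r_mz - r_dz)


if __name__ == "__main__":
    true_h2 = 0.2  # hypothetical "true" value built into the toy model

    # Equal environments: the estimate recovers the true value.
    equal = twin_method_estimate(*expected_twin_correlations(true_h2, 0.35, 0.35))
    # MZ environments more alike than DZ environments: the estimate is inflated.
    unequal = twin_method_estimate(*expected_twin_correlations(true_h2, 0.50, 0.35))

    print(f"equal environments:   {equal:.2f}")
    print(f"unequal environments: {unequal:.2f}")
```

In this model the estimate equals the true value plus twice the gap between c_mz and c_dz, so any extra environmental similarity of MZ pairs is doubled and counted as "genetic." With a true value of zero and shared-environment shares of 0.75 and 0.50, the same arithmetic returns 0.5 from environmental similarity alone.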
There are other problems that might distort heritability estimates. A major, and problematic, prior assumption of these estimates is that there will be substantial (additive) genetic variance underlying all individual differences. In actuality, it is one of the laws of natural selection that, for traits important to survival, additive genetic variance tends to be reduced across generations, creating cohesive, interactive genotypes. Indeed, many experimental and observational studies have confirmed this.8 Over a decade of genome-wide molecular studies meant to take us "beyond" heritability have failed to find genes for IQ. One explanation often proposed is that the gene effects are cumulatively present, but individually too small to be separately detected. So investigators have returned with greater resolve, recently, to "proving" IQ heritability. For example, the recent report by Davies and colleagues, brought out with press reports and much media coverage, claims to "establish" and "unequivocally confirm" (surely unedifying terms in any science) just that. As usual, the study involves a host of "ifs" and assumptions, including the dubious one that genetic effects can be treated as a random (independent) variable. But it also uses a device for extrapolating from identified to non-identified variances which David Golan and Saharon Rosset have recently described as "a questionable heuristic."9 Moreover, subjects were in their late 60s and 70s (and, therefore, non-representative in other respects); and heritability estimates varied from 0.17 to 0.99(!), depending on combinations of samples and tests.
What is definitely unclear is why this enterprise continues. It is now widely accepted that heritability estimates of intelligence are, in the human context, of little practical relevance. Even if accurately achieved (which so far seems unlikely) they do not predict the likely developmental endpoints for individuals or groups, or the consequences of interventions; they have told us little reliably about genes or environments; and they have not helped to provide a "grown-up" theory of intelligence. As the originator, Ronald Fisher himself, said: "(heritability) is one of those unfortunate short-cuts which have emerged from biometry for lack of a more thorough analysis of the data."
One reason is that, behind all the controversies, there are different world views of the nature of genes and environments, traits and their development. The "genes" and "environments" of the behavior geneticist are abstract, idealistic entities with little interaction, a linear determinism that defines limits on individual development and, therefore, social status and privilege. On the contrary, the recent "omics" revolution (the creation of a broad range of research areas, including genomics, proteomics, metabolomics, interferomics, and glycomics) suggests the very opposite of such independent, linear effects. It suggests how processes and systems utilize higher information structures geared to changing environmental contexts.
Various discoveries now show how intense cross-talk between multitudes of gene-regulatory pathways provides complex non-linear dynamics. These dynamics can create novel developmental pathways, often proposing new targets for selection. They integrate the transcription of genes contextually, often "rewiring" the gene network in response to changing environments. In addition, there are the vast regulatory functions of alternative splicing, messenger RNA, vast numbers of non-coding RNAs, and so on, all depending on cooperative interactions. These explain why many different phenotypes can develop from the same genotypes, or the same phenotype from different genotypes; and why a population of individuals of identical genes developing in identical (or closely similar) environments can exhibit a normal range of behavioral phenotypes.
Even at this level, the "dumb" independent factors, and simple quantitative traits, of the behavior geneticist have disappeared into highly interactive intelligent systems. Metabolic networks evolved into nested hierarchies of still more intelligent systems: physiological systems; nervous systems and brains; cognitive systems; and, finally, the human socio-cognitive system. On "top" of this nested hierarchy the socio-cognitive system differentiates according to dominant cultural activities, making humans far more adaptable than any system of independent genes. IQ tests simply collapse this enormous diversity into a (pretend) scalar trait. What is allocated to the category "genetic variance" is, in reality, variation in the expression of nested dynamic systems. This is why a leading behavior geneticist of IQ, Eric Turkheimer, has had to admit, recently, that "The systematic causal effects of any of these inputs are lost in the developmental complexity of the network."10 It seems ironic that the current unfolding of the real nature of intelligent systems is leading to the eclipse of the Galton paradigm.
Ken Richardson was formerly Senior Lecturer at the Centre for Human Development and Learning, The Open University, UK (now retired). He is the author of Understanding Psychology; Understanding Intelligence; Origins of Human Potential: Evolution, Development and Psychology; Models of Cognitive Development and The Making of Intelligence.
1. Fisher, R.A. (1951). Limits to intensive production in animals. Journal of Heredity, 4, 217-18.
2. Deary, I.J. (2001). Intelligence: a short introduction. Oxford: Oxford University Press (p. ix).
3. Zimmer, C. (2002). Searching for intelligence in our genes. Scientific American: October.
4. Téglás, E., Vul, E., Girotto, V., Gonzalez, M., Tenenbaum, J.B., and Bonatti, L.L. (2011). Pure Reasoning in 12-Month-Old Infants as Probabilistic Inference. Science, 332, 1054-1059. (p. 1054).
5. Vygotsky, L.S. (1988). The genesis of higher mental functions. In Richardson, K. & Sheldon, S. (Eds.) Cognitive development to adolescence. Hove: Erlbaum.
6. Joseph, J. (2010). Genetic research in psychology and psychiatry: a critical overview. In K.E. Hood, C.T. Halpern, G. Greenberg & R.M. Lerner (Eds.). Handbook of developmental science, behavior and genetics (pp. 557-625). New York: Wiley-Blackwell.
7. Evans, D.M., & Martin, N.G. (2000). The validity of twin studies. GeneScreen, 1, 77–79. (p. 77).
8. Christe, P., Moller, A.P., Saino, N. & De Lope, F. (2000). Genetic and environmental components of phenotypic variation in immune response and body size of a colonial bird, Delichon urbica (the house martin). Heredity, 85, 75-83.
9. Golan, D. & Rosset, S. (2011). Accurate estimation of heritability in genome wide studies using random effects models. Bioinformatics, 27: i317-i323.
10. Turkheimer, E. (2011). Commentary: variation and causation in the environment and genome. International Journal of Epidemiology, 40, 598–601. (p. 600).