For a commentary to this article, and counter-response by the authors, please scroll down to see updates.
Biological explanations of differences in behavior between women and men or girls and boys are everywhere, from scientific articles to bestselling self-help books to parenting guides to diversity and inclusion workshops to Hollywood movies.1 Often, the basic structure of such explanations is along the following lines: A study reports a difference between females and males in some neural measure (such as the size of a specific brain structure). The difference is often described as if it were binary – females are like this and males are like that – and a natural and inevitable consequence of being female or male, assumed implicitly or otherwise to be inscribed in our genes. Then, the biological difference is suggested to underlie a behavioral or psychological difference between females and males. This pattern of description and explanation can give rise to an “ah ha” feeling – now we finally understand why women and men are the way they are. But researching, understanding, and interpreting sex differences in brain and behavior is surprisingly complicated, and particularly so when humans are involved. To help everyone parse the next hot new biological explanation of female/male behavior differences, here are eight things to know, look out for, and ask, from the nitty-gritty of whether there even is a difference to the grand sweep of evolutionary explanations.
When it comes to this area of research, critiques are sometimes dismissed as “political correctness” and as interference with the scientific process. However, the point of critical inquiry is not to deny differences between the sexes, but to ensure a full understanding of the findings and meaning of any particular research report.
A Note about Terminology
In both science and everyday language, the terms “sex” and “gender” are sometimes used interchangeably. In this article, we use “sex” to refer to the genetic and hormonal components of sex – the biology involved in creating individuals with either male or female reproductive systems (Joel 2016). We use “gender” to refer to socially constructed expectations concerning the roles, identities, and behaviors associated with being either female or male. As we discuss below, both sex and gender can affect brain and behavior, either independently or in interaction. Therefore, in order to avoid prejudging causes of differences between the sexes, we’ll use the term “sex/gender” (Kaiser 2012).
The First Thing to Know: False Positives and False Impressions
When a scientist reports a sex/gender difference in brain or behavior, how confident should we be that what scientists refer to as a “significant” difference is a real difference, rather than a chance finding or a false positive?
As many readers will know, the behavioral science community has recently been preoccupied with the “replication crisis”, that is, the difficulty of replicating some reports of significant differences between different groups or between different experimental conditions (Open Science Collaboration 2015). Intuitively, most people understand that when the difference between two groups is large and robust – say, the average difference between Brits and Germans in the ability to speak German – we would be confident that a repeat of the experiment would yield the same pattern of results, even if the initial sample were small – say, just thirty people. But what if the significant difference reported were in the ability to (say) fold a shirt neatly? How confident would you be that the next fifteen Brits and fifteen Germans would show the same pattern?
All else being equal, the finding of a “statistically significant” result is more likely to be a false positive in smaller studies. Larger samples will more closely approximate the population from which the sample is drawn, so any finding of difference is more likely to be a true positive (for an accessible explanation of the problem of false positive research findings, see Héroux 2017). In fact, this was part of the reason behavioral scientists started to worry about the reliability of scientific findings: they determined that it was not possible for so many small studies to be reliably detecting small differences, as small samples are less likely to detect small differences (Button et al. 2013; Szucs and Ioannidis 2017).
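The link between sample size and false positives can be made concrete with a little arithmetic. The sketch below is illustrative only: it uses a normal approximation for the power of a two-group comparison, and the effect size (d = 0.3), the 5 per cent significance threshold, and the assumption that one in four tested hypotheses is actually true are all hypothetical values chosen for the example, not figures from the studies cited above.

```python
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test for a true effect d.

    Normal approximation; the negligible chance of a significant result
    in the wrong direction is ignored.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    shift = d * (n_per_group / 2) ** 0.5
    return 1 - NormalDist().cdf(z_crit - shift)

def positive_predictive_value(power, prior, alpha=0.05):
    """Share of 'significant' results that reflect a true difference."""
    true_positives = power * prior          # real effects that are detected
    false_positives = alpha * (1 - prior)   # null effects flagged by chance
    return true_positives / (true_positives + false_positives)

# Hypothetical scenario: a small true effect, and 1 in 4 hypotheses are true.
for n in (15, 200):
    power = approx_power(d=0.3, n_per_group=n)
    ppv = positive_predictive_value(power, prior=0.25)
    print(f"n per group = {n:3d}: power = {power:.2f}, PPV = {ppv:.2f}")
```

Under these assumptions, the small study has low power, so chance findings make up a much larger share of its significant results than they do for the large study.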
Another factor that can contribute to false positives is “researcher degrees of freedom” (Simmons et al. 2011) in data collection and analysis. To return to our cross-cultural shirt-folding example, there are a number of ways that researchers could define and measure “neatness of shirt folding”: the sharpness of the creases, the symmetry of the folds, the flatness of the shirt, and so on. The science community has recently started to pay attention to so-called “questionable research practices,” like using all three measures of neatness of shirt folding in the data collection phase, but only reporting the one that produces a significant result (John et al. 2012).
These researcher degrees of freedom can give rise to another problem (see Ioannidis 2005) which we may term “false replication”. Imagine three Cross-Cultural Laundry Practices labs, which all subtly believe the cultural stereotype of Germans as relentlessly precise in their handling of laundry. Research group A finds that Germans fold more symmetrically, but observes no differences in crease sharpness or shirt flatness. Research group B finds only a difference in crease sharpness. Research group C gets the anticipated difference only for shirt flatness. Each research group concludes from their data that “Germans are neater shirt folders than Brits,” creating the impression of a robust finding. Only by taking a closer look can we see that of nine comparisons of shirt folding (three in each of the three research groups), six showed no differences between Germans and Brits, and none of the three differences found was replicated by the other two studies.
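To see why measuring neatness three ways matters, consider the chance that at least one measure comes out “significant” when there is no real difference at all. The sketch below assumes the measures are independent and that each is tested at the conventional 5 per cent level; both are simplifying assumptions for illustration.

```python
def familywise_error_rate(num_measures, alpha=0.05):
    """Chance of at least one false positive across independent tests
    when no true difference exists on any measure."""
    return 1 - (1 - alpha) ** num_measures

# One measure, the three shirt-folding measures, and all nine comparisons
for k in (1, 3, 9):
    print(f"{k} comparison(s): {familywise_error_rate(k):.3f}")
```

With three measures the chance of a spurious “finding” is roughly 14 per cent rather than 5 per cent, and across nine comparisons it is about 37 per cent.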
The problem of false positives is also exacerbated by “publication bias,” in which journals prefer to publish positive or “significant” findings. Since even scientists and scientific journal editors can implicitly perceive a finding of no difference as “no finding” – or at least not an interesting one – null results often disappear. Publication bias creates an iceberg-like literature: what you see above the water are the published studies that report significant differences; what you do not see are the many, many more studies that “fail” to find differences. Publication bias means that the research canon becomes populated with many more positive findings than would be expected, and gives the impression that these are robust and reliable.
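The iceberg effect can be illustrated with a toy simulation. Everything in it is hypothetical: a small true difference (d = 0.2), studies with fifteen people per group, scores with unit spread, and a journal that publishes only results passing the 5 per cent significance threshold. It is a sketch of the mechanism, not a model of any real literature.

```python
import random
from statistics import mean

def simulate_literature(true_d=0.2, n_per_group=15, num_studies=5000, seed=0):
    """Simulate observed effect sizes and keep only the 'significant' ones."""
    rng = random.Random(seed)
    # Approximate standard error of the observed effect when scores have unit spread
    se = (2 / n_per_group) ** 0.5
    all_effects, published = [], []
    for _ in range(num_studies):
        observed_d = rng.gauss(true_d, se)
        all_effects.append(observed_d)
        if abs(observed_d) / se > 1.96:  # "significant" at the 5 per cent level
            published.append(observed_d)
    return all_effects, published

all_effects, published = simulate_literature()
print(f"studies run: {len(all_effects)}, published: {len(published)}")
print(f"mean effect, all studies: {mean(all_effects):.2f}")
print(f"mean effect, published only: {mean(published):.2f}")
```

In this toy setup only a small fraction of studies clears the bar, and those that do overstate the true difference considerably, because a small study can only reach significance when its observed effect happens to be large.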
What to know:
While these are now recognized as serious issues for all areas of behavioral research, these problems have long been on the radar in sex/gender differences research, where they are often particularly acute. This is because it is easy and intuitive for researchers to “check” for male/female differences in their data, even when such differences are not a core part of their research (Hines 2004; Maccoby and Jacklin 1974). Findings of “differences” may well then be reported and emphasized (and possibly get lavish attention in the press), while findings of “no differences” may be tucked away in results sections, if they are reported at all (Kaiser et al. 2009). As scientific databases do not provide a way to search for similarities, only differences (Kaiser et al. 2009), it is very easy for studies that do not find significant differences between the sexes to escape the attention of researchers, as well as of those involved in communicating research findings.
Sex/gender researchers have also long paid close attention to the issue of researcher degrees of freedom in defining and measuring their variables of interest. Some striking examples come from an analysis of hundreds of studies from 1967 to 2008 that explored the hypothesis that female/male differences in prenatal testosterone hardwire gender identity, sexuality, and sex-typed interests (Jordan-Young 2010). Though studies in that field have typically been seen as supporting the hardwiring hypothesis, that assessment fails to consider how scientists’ definitions of “masculine” and “feminine” sexuality changed over time, particularly following the sexual revolution of the 1960s as behaviors and desires once considered exclusive features of masculine sexuality became considered common-sense facets of human sexuality (Jordan-Young 2010). Similarly, what was considered a “girl toy” in one investigation of links with prenatal hormones might be a “gender-neutral” toy in another (see also Fine 2010a; Fine 2015). Just as in the above shirt-folding example, what from a distance seems like a large, consistent body of evidence on closer inspection is woven through with inconsistencies and contradictions.
The new neuroimaging technologies are particularly prone to fall prey to false positives (Fine 2013). Neuroimaging is expensive, which makes large sample sizes financially unviable for some labs (although this is improving with data-sharing initiatives), and there are many possible ways to analyze the data. One recent analysis of functional neuroimaging studies of sex/gender differences found that, in contrast to what is expected according to statistical theory, the (few) studies with large samples were no more likely to identify sex/gender differences in brain function than the (many) studies with small samples (David et al. 2018). The authors concluded that this was a red flag for publication bias in this literature.
What to look for:
To return to our original question, when a scientist reports a sex/gender difference in brain or behavior, how confident should we be that the difference is real and not a false positive? There is often no simple answer, but there are some useful questions to ask. Has this finding been replicated elsewhere with a different sample? What was the sample size of the study? How many comparisons were made? (Bear in mind that unless the study was “pre-registered,” meaning the researchers said in advance what conditions they would run and what comparisons they would make, you may not be able to answer this question.) How many of the comparisons you know about were significant and how many were nonsignificant? Do the researchers discuss the similarities and the differences, or do they just emphasize the differences? How does the current finding fit into existing results (Fine and Fidler 2014)? How consistent is the finding with other similar studies – remembering that the devil is in the detail of how variables are defined and measured?
The Second Thing to Know: Size Matters
So you are reasonably confident that a reported sex/gender difference is a true difference and not a false positive. The next question to consider is the size of that difference. Is it like the size of the average difference between Brits and Germans in ability to speak German, or is it like the size of the difference in ability to neatly fold a shirt? There are statistics that give this type of information: the ‘effect size’ tells you whether the two groups are completely distinct, completely overlapping or somewhere in between. One way to present the effect size is as the difference in the average scores of the two groups divided by the spread of the scores (this statistic is called Cohen’s d). (To try it at home, Social Science Statistics, for example, will calculate effect sizes for you.)
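For readers who prefer to see the calculation, here is a minimal sketch of Cohen’s d using the pooled standard deviation as the “spread”. The function name and the two lists of scores are made up for illustration.

```python
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference in group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5

# Hypothetical scores for two small groups
scores_a = [2, 4, 6, 8, 10]
scores_b = [1, 3, 5, 7, 9]
print(f"Cohen's d = {cohens_d(scores_a, scores_b):.2f}")
```

Here the group averages differ by one point but the scores within each group are widely spread, so the resulting d is small.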
The smaller the effect size, the greater the overlap between the groups (you can visualize how the overlap depends on the size of the effect size at https://sexdifference.org). Sex/gender differences in height are often used to illustrate the fact that, even with an effect size that is considered very large in psychology (Cohen’s d = ~1.7), there is still some overlap between the two populations (39 per cent). Or to put it another way, not all men are taller than all women.
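The relationship between effect size and overlap can also be computed directly. The sketch below assumes that both groups follow bell-shaped (normal) distributions with the same spread, which is an idealization; under that assumption the shared area of the two curves depends only on Cohen’s d.

```python
from statistics import NormalDist

def overlap_fraction(d):
    """Shared area of two equal-spread normal curves whose means differ by d."""
    return 2 * NormalDist().cdf(-abs(d) / 2)

for d in (0.0, 0.4, 1.0, 1.7):
    print(f"d = {d:.1f}: overlap = {overlap_fraction(d):.0%}")
```

For d = 1.7 this gives an overlap of a little under 40 per cent, close to the figure quoted for height.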
Note that “a statistically significant difference” does not tell you about the size of the hypothesized difference. Moreover, as noted earlier, the chances of identifying a true difference as statistically significant increase with sample size. This means that with very large samples (which are fortunately becoming more common), it is possible to detect very small differences. This is important because “statistically significant” does not necessarily mean “practically significant in the real world” or “theoretically important.”
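A quick calculation shows how the same tiny difference can be nonsignificant in one study and highly significant in another. The sketch assumes a simple two-group z-test in which the observed effect is exactly d = 0.05; the effect size and the sample sizes are hypothetical.

```python
from statistics import NormalDist

def two_sided_p(d, n_per_group):
    """Two-sided p-value for an observed standardized difference d
    between two equally sized groups (normal approximation)."""
    z = abs(d) * (n_per_group / 2) ** 0.5
    return 2 * (1 - NormalDist().cdf(z))

for n in (100, 10_000):
    print(f"d = 0.05, n per group = {n:6d}: p = {two_sided_p(0.05, n):.4f}")
```

The difference itself is identical in both cases; only the sample size has changed.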
What to know:
In most measures of brain and behavior, the differences between human males and females are much smaller than the difference in height. For example, the ten largest structural sex/gender differences in the brain as measured with magnetic resonance imaging (MRI, a standard brain imaging technique) range in size from Cohen’s d ~ 0.4 to ~ 1 (Joel et al. 2015). That’s an overlap of about 60 to 84 per cent. The situation is similar when it comes to human behavior. For example, a synthesis of meta-analyses of studies of sex/gender differences in cognition, communication, social and personality traits, and psychological well-being found that 78 per cent of the effect sizes were small or close to zero (Hyde 2005). Ten years on, Zell and colleagues looked at 106 meta-analyses, comprising data from over 20,000 individual studies and over twelve million participants. They reported an overall effect size of 0.21, with 85 per cent of the male-female differences being very small or small (Zell et al. 2015).
Knowing the effect size of a difference is important because it tells you to what extent knowing that someone is male or female is a useful guide to knowing what their brain will look like or how they will perform on a certain task or react in a particular situation. When there is a lot of overlap (smaller effect size), significant numbers of females will be more “masculine” than the average male, and many males more “feminine” than the average female. For example, a study of over 90,000 women and 111,000 men from fifty-three countries measured mental rotation ability,2 an index of spatial cognition. A sex/gender difference with Cohen’s d = 0.47 was found, meaning that about 32 per cent of women obtained higher scores than the average man (Lippa et al. 2010).
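The “about 32 per cent” figure follows from the effect size if scores in both groups are assumed to be normally distributed with equal spread (an approximation, not something the study guarantees). A minimal sketch:

```python
from statistics import NormalDist

def share_above_other_mean(d):
    """Share of the lower-scoring group that exceeds the
    higher-scoring group's average, given effect size d."""
    return NormalDist().cdf(-abs(d))

print(f"d = 0.47: {share_above_other_mean(0.47):.0%} score above the other group's average")
```

With no difference at all (d = 0) the share would be 50 per cent, so a d of 0.47 moves it by less than twenty percentage points.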
Knowing the effect size is a crucial first step to considering whether a sex/gender difference is likely to be important or meaningful in the real world. This is a particularly important point to bear in mind nowadays when it is possible to analyze very large datasets established through data-sharing initiatives. With large sample sizes, it is quite possible for even tiny and therefore potentially meaningless differences to appear to be statistically significant.
What to look out for:
How big is the reported difference? Have the researchers reported (and, ideally, illustrated) effect sizes in their findings? Take a closer look at differences that are reported in terms of “sexual dimorphism” (this implies two distinct populations, which is rarely the case), or that are described using terms such as “profound” or “fundamental.” Donna Maney of Emory University has developed an online tool specifically for checking out effect sizes (https://sexdifference.org).
The Third Thing to Know: What Type of Difference Are We Talking About?
So you are confident that a sex/gender difference is the real deal, and you have established its size. The next important question to ask is what kind of difference it is (Joel and McCarthy 2016). Is it fixed or variable? Ever-present or occasional? If this seems like a strange question, it may be because finding a sex/gender difference is so often both the start and the end of research into sex effects on the brain (Rippon et al. 2014). We have therefore become accustomed to snapshot (Schmitz 2010) research approaches that compile a “catalogue of differences” (Springer et al. 2012).
One often-neglected question, for example, is whether any difference is stable across the lifespan or is seen only in certain life stages. In humans, one of the largest sex/gender differences in the brain is in the size of the intermediate nucleus of the hypothalamus, which is about twice as large in men compared to women, on average. But this is only the case at certain ages: there is no difference among those over age forty-five (Garcia-Falgueras et al. 2011). It can also be interesting to examine whether or not sex/gender differences in behavior change across age groups.
A second question to ask is whether the difference depends on environmental factors. For animals, factors like stress and housing conditions, and for humans, factors like culture, experiences, and context can all affect the existence of sex/gender differences in the brain. For example, in a particular brain region the density of small protrusions on dendrites (long extensions of nerve cells) is much higher in female rats compared to male rats (Shors et al. 2001). However, the opposite is true if the rats undergo a brief stressful episode (for additional examples and review, see Joel 2011). When a difference is found, one should ask if it exists under all conditions, or if it disappears or even reverses depending on other factors.
Such questions are obviously particularly difficult to investigate in humans, given both the ethical and pragmatic limitations of manipulating experiences and culture. However, cross-cultural and interethnic comparisons of sex/gender differences can offer some insights. For example, it has been found that males’ educational scores show a wider spread from the mean (variance), with more males than females being among both the very lowest and very highest scorers (O’Dea et al. 2018). However, other research has found that such differences at the highest end of ability vary by ethnicity. For example, drawing on North American school test data, Hyde and colleagues found that among white eleventh graders, slightly more boys than girls scored at the ninety-ninth percentile of mathematical ability, while among Asian American eleventh graders, slightly more girls than boys scored at that level (Hyde et al. 2008).
What to know:
Most animal studies of female/male differences test animals from a single species and strain, at a specific age, and under one set of environmental conditions. As a result, these studies cannot answer questions regarding whether the difference found is stable over age and environmental conditions, or whether it is seen only at certain life phases or under particular environmental conditions. For example, the dynamics of sexual behavior in rhesus monkeys, including the influence of the female hormonal state, vary according to the space available in which the monkeys can interact (Wallen 1982).
What to look for:
Did the investigators test samples from different populations, age groups, or under different environmental conditions? If not, then beware assumptions that the difference is stable and fixed. It may be transient or depend on environmental factors.
The Fourth Thing to Know: Where Do Differences Come From?
So you are confident that a sex/gender difference in the brain or behavior is real, and you have thought about its size and kind: big, small, stable, transient, fixed, contingent. The next question is: Where did it come from? Particularly as concerns the brain or behavior that we think of as quintessentially masculine or feminine, a common assumption is that the differences are caused at least in part by genetic and hormonal differences between the sexes. This would be a “direct effect” of sex: a direct path from biological sex to brain and behavior (see figure 4).
But the pathway can also be indirect (de Vries and Forger 2015). For example, most sex/gender differences in human brain structure are actually size differences. Human males are (on average) about 10 per cent bigger than females, and this size difference affects all organs, including the brain. Alleged sex/gender differences in brain structures such as the amygdala or hippocampus, or the ratio of brain cells to brain pathways, disappear or become trivial once corrections for brain size or volume are made (Coupé et al. 2017; Hänggi et al. 2014; Im et al. 2008; Jäncke et al. 2015). Biological sex influences body and/or brain size, but brain size is the primary determinant of morphological measures of the brain.
Another example of indirect effects of sex is the female/male difference in the number of motor neurons in a nucleus of the lumbar spinal cord in rats. This difference partly depends on the greater anogenital licking by rat dams (mothers) of their male pups (babies) as opposed to their female pups. Why do the mothers lick the males more? Because they are attracted by the higher level of testosterone in the male pups’ urine (Moore 1995). In other words, a facet of biological sex – differences in testosterone excretion – creates a “gendered” environment that, in turn, affects brain development. Similarly, although it might seem very sensible to assume that testosterone always affects behavior via the brain, it also has masculinizing effects on physical appearance, which then can affect physical and social experiences and therefore also brain and behavior (Oliveira 2004).
In humans, of course, we have gender – the socially constructed expectations that designate particular roles, identities, and behaviors as appropriate to either females (expected and assumed to identify as girls/women) or males (expected and assumed to identify as boys/men). Gender socialization and enforcement ensure indirect effects of sex, even if individuals identify as non-binary. That is, while gender socialization doesn’t have uniform effects across people, every person is exposed to non-random experiences associated with gender as a system. It is important to recognize that gender norms and gendered experiences vary substantially by race, class, religion, and other dimensions. Brains are shaped by many factors over the life course. We now have evidence of the lifelong experience-dependent plasticity of the human brain, which means that the brain can be changed by experiences such as bullying, parenthood, playing videogames, learning to juggle, or training to be a London cab driver (May 2011). To the extent that the experiences that shape the brain are gendered, this is an indirect effect of sex (Bleier 1984; Kaiser 2012). Intriguingly, sex-related hormones may exert an indirect effect by making us more (or less) susceptible to such gendering effects. Thus, girls who were exposed to atypically high levels of androgens before birth were found to be less influenced by gender modelling and labelling than typically exposed girls (Hines et al. 2016). This finding is particularly revealing because of evidence that parents of such girls often make more of an effort to emphasize gender-appropriate behavior (reviewed in Jordan-Young 2010).
Further complicating the possibilities, social constructions of gender can affect components of biological sex. For example, lower testosterone levels have been reported in fathers where paternal care is the cultural norm as compared to fathers where paternal care is typically minimal or absent (Muller et al. 2009). More generally, testosterone levels are influenced in females and males when they are in competitive or in nurturing contexts (see van Anders 2013). So when considering the relations between biological sex and gender, it is important to remember that gender, by its effects on behavior, can affect biological sex.
What to know:
Comparisons of females and males do not just capture direct effects of sex, but also capture all the indirect effects, including the effects of gender socialization. It is becoming evident that there is a need to identify and quantify additional factors that may correlate with sex category and may be acting as mediating variables in any relationship between sex and behavior (Joel and Fausto-Sterling 2016; Joel and McCarthy 2016; Rippon et al. 2017).
What to look for:
Have the researchers measured biological variables and gendered experiences that could contribute to differences in brain, hormones, or behavior, such as height; weight; muscle mass; physical activity; parental socialization; hobbies; educational background; or stereotypical beliefs about fixed male or female aptitudes, abilities, or roles?
The Fifth Thing to Know: How Does It All Add Up?
So you are confident that a sex/gender difference in the brain or behavior is real, and have considered the size, type, and origins. The next question is: how does it fit in with other differences? Every time a scientist reports a sex/gender difference in brain or behavior, it seems like scientific understanding points to women and men being a little bit more distinct. But sex/gender differences in brain and behavior do not necessarily consistently add up to make women’s and men’s brains and behavior more and more different (Joel 2011; Joel 2014; Joel et al. 2015; Joel and Fausto-Sterling 2016; Spence 1993).
What to know:
If you take a group of women and men, and detect the characteristics that are more common in women compared to men and those more common in men compared to women, you’d find that most individuals have a mix of “feminine” and “masculine” characteristics (Joel et al. 2015a). The same turns out to be the case when it comes to brain characteristics – most brains are composed of unique mosaics of male-typical and female-typical features (Joel et al. 2015a). This is why it is meaningless to talk about “the male brain” or “male nature,” or “the female brain” or “female nature” – which of the endless brain mosaics found in females is “the female brain”? While it is possible to predict, with accuracy above chance, whether someone is male or female on the basis of their brain mosaic (e.g., Chekroud et al. 2016; Del Giudice et al. 2016; Rosenblatt 2016), it is impossible to predict what an individual’s unique brain mosaic will be on the basis of whether they are male or female. Moreover, knowing whether someone is male or female provides little information on whether their brain is similar to someone else’s. A recent study shows that the brain types typical of women are also typical of men, and vice versa, and that large sex/gender differences are found only in the prevalence of some rare brain types. As a result, the chances that two people of different sexes would have the same brain type are similar to the chances that two people of the same sex would have the same brain type (Joel et al. 2018).
In addition to not adding up consistently within individual brains, there is growing recognition that sometimes one sex difference can compensate for another, with the two in effect cancelling each other out to enable behavioral similarity. While this might seem a bit unintuitive, in many species (including our own), males and females often behave in similar ways despite physiological and chemical differences. For example, both male and female prairie voles (a type of rodent) show quite similar parenting behavior, and this appears to be not just despite, but because of, sex differences in the brain and hormones. In males, but not in females, parental behavior depends on a brain pathway that is much denser in males, and this seems to be a compensation for the fact that, unlike females, males do not experience the physiological (e.g., hormonal) and behavioral changes associated with pregnancy, giving birth, and lactation (de Vries 2004). These observations not only demonstrate that sex differences don’t always add up to make females and males more and more different, but also that sometimes different brain mechanisms in the two sexes may work to make them behave in ways that are similar rather than different.
What to look for:
Consider whether the effect of a reported biological difference between groups of males and females is being thought about in isolation or in the context of other relevant characteristics that might cancel out or compensate for it. Also examine whether, when considering individual males and females, the investigators tested for internal consistency of the measures showing a sex/gender difference. Did they demonstrate that within each individual all features that show a sex/gender difference are consistently female-typical or consistently male-typical, or did they only assume that this is the case?
The Sixth Thing to Know: What Does a Brain Difference Mean?
The caveats above notwithstanding, the genetic and hormonal components of sex do influence brain development and functioning (Arnold 2012) and there are average group-level differences between the brains of males and females in both human and nonhuman animals (McCarthy 2016; Joel et al. 2015). But interpreting what these differences mean in terms of what most people are interested in – behavior – is harder than you might think (Fausto-Sterling 2000; Fine 2010a; Fine 2014; Maney 2016).
The advent of neuroimaging technologies has led to a proliferation of studies that indicate female/male differences in the brains of living people (as opposed to through post-mortem examination). Structural MRI provides a way to compare the size or volume of particular regions of the brain as well as connectivity maps that provide information about the pathways between different brain regions. Functional MRI provides information about brain activity while participants perform some kind of task.
What to know:
One thing to consider is the “reverse inference” problem. Apart from very specialized sensory processing structures, most regions of the brain are impressive multitaskers, involved in many different functions at many different levels. So, if a sex difference in brain structure or activity is communicated in terms of a particular structure in the brain being responsible for a very specific function, then this interpretation needs to be viewed with more than a little scepticism. The reverse is also true – it is quite possible that the same behavior is generated by somewhat different neural networks in the brains of different people. There are many neural roads to the same behavioral end, a principle known as “degeneracy” (e.g., Price and Friston 2002). The prairie vole example discussed earlier, in which brain and hormonal differences between males and females give rise to similar parenting behavior, is a good example of this principle.
Since different parts of the brain do not map neatly onto different functions, neuroscientists cannot currently look at a brain imaging readout and, in the absence of any other information, reliably say what kind of mental activity is expected on the basis of these structural or functional characteristics. So although it is tempting to speculate that sex/gender differences in the brain translate into different strategies or abilities, the ever-present danger is that gender stereotypes will fill the cavernous gap in knowledge (Fine 2010a, 2010b). For example, analyses of a number of functional neuroimaging studies of sex/gender differences in emotional processing (Bluhm 2013) showed how each study was interpreted as supporting gender stereotypes of women’s greater emotionality, even where the researchers’ results seemed to contradict this. For instance, the unexpected finding that males responded more strongly than females to fear- and disgust-eliciting stimuli was explained post hoc as due to men’s greater sensitivity to aggressive cues. Likewise, interpreting the meaning or functional consequences of sex/gender differences in size or connectivity of particular regions is harder than it seems. Even neuroscientists who work with nonhuman animals under controlled conditions have trouble linking female/male differences in brain structure to behavioral differences (de Vries and Sodersten 2009). This problem is intensified once we consider the mosaic nature of sex/gender differences, as only rarely do all brain regions with activity patterns typical of, say, females, reside in any given brain.
Another potential problem is that the brain activation maps produced by functional neuroimaging may convey the impression that men and women use completely different parts of their brain to do the same activity. This is simply not the case. The activation maps these studies produce are the end product of a whole chain of statistical manipulations. The resulting images do not show all the areas that were activated by the task, but only highlight the areas that were activated to a different extent in the female and male groups. So what might be the most interesting aspect of these data – the variability within the groups – has effectively been stripped away by the various statistical steps, as have all the similarities between females and males.
The same point applies to connectivity maps, where the differences in pathways in the brain are similarly mapped and color-coded (e.g., Ingalhalikar et al. 2014). These compelling images are statistical projections produced by the imagers and their software, and they obscure both the differences within a sex and the overlap between the sexes.
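To make the point about difference maps concrete, here is a toy simulation in Python. It is a sketch under invented assumptions (20 brain regions, 1,000 simulated people per group, every region equally active in both groups except one that differs by half a standard deviation, and an arbitrary t threshold of 4). It is not a reconstruction of the analysis pipeline of any study cited here, and all function names are hypothetical.

```python
import random
import statistics


def simulate_group(region_means, n_people, rng):
    """Simulate n_people participants; each is a list of per-region activations (SD = 1)."""
    return [[rng.gauss(mean, 1.0) for mean in region_means] for _ in range(n_people)]


def welch_t(a, b):
    """Welch's t statistic for the difference between two group means."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def difference_map(group_a, group_b, threshold):
    """Return indices of regions whose group difference exceeds the threshold.

    Regions that both groups activate equally never appear in the result.
    """
    flagged = []
    for region in range(len(group_a[0])):
        a = [person[region] for person in group_a]
        b = [person[region] for person in group_b]
        if abs(welch_t(a, b)) > threshold:
            flagged.append(region)
    return flagged


def overlap_fraction(a, b):
    """Share of values in a that fall inside the observed min-max range of b."""
    low, high = min(b), max(b)
    return sum(low <= x <= high for x in a) / len(a)


if __name__ == "__main__":
    rng = random.Random(0)
    n_regions = 20
    # Assumed setup: every region is active (mean 2.0) in both groups;
    # only region 0 differs, by half a standard deviation.
    means_a = [2.0] * n_regions
    means_b = [2.5] + [2.0] * (n_regions - 1)
    group_a = simulate_group(means_a, 1000, rng)
    group_b = simulate_group(means_b, 1000, rng)

    flagged = difference_map(group_a, group_b, threshold=4.0)
    print("Regions shown on the difference map:", flagged)
    for region in flagged:
        a = [p[region] for p in group_a]
        b = [p[region] for p in group_b]
        print(f"Region {region}: {overlap_fraction(a, b):.0%} of group A lies within group B's range")
```

In a run like this, the "map" lists only the regions that pass the threshold, even though every region was active in both groups, and within a flagged region nearly all individuals in one group still fall inside the range of the other. The picture shows the group difference; the similarities and the within-group spread are what the thresholding step leaves out.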
What to look for:
Speculation is an important and necessary part of science, but claims that confidently link sex/gender differences in brain activity or structure to complex behavior should be treated with scepticism. A good question to ask is: if the sex/gender difference in the brain were reversed, could there be a different, but no less plausible, explanation of why this means that men are like this, and women are like that? For example, when some scientists suggest that men are superior at spatial processing because, unlike women, they only use one side of the brain, while other scientists suggest that men are superior at spatial processing because, unlike women, they employ both sides of the brain, it is clear that expectations are being led more by stereotypes than a scientific model of spatial processing in the brain (see, e.g., Fine 2010a).
The Seventh Thing to Know: Animal Comparisons
Many early studies of the links between brain and behavior were carried out on nonhuman animals. This remains a thriving research activity to this day, particularly in the basic neurosciences and in applied clinical research, such as the study of drug effects. It is not uncommon for findings from nonhuman animals to be applied to the understanding of human conditions such as autism or depression as well as to the understanding of the effects of sex on brain and behavior (Bayless and Shah 2016).
As noted earlier, social and environmental factors matter even in nonhuman animals: factors such as handling, company, and cage size can have a significant effect on both brain and behavioral characteristics, including on the existence and direction of sex differences (for review and additional references, see Joel 2011; Juraska 1991). In addition, sex differences reported in animals raise all the same questions already discussed regarding the kind of difference (big, small, stable, transient, fixed, contingent) and whether it is a direct or an indirect effect of sex.
In addition to this, even when a study is done with a group of humans, it is not necessarily safe to assume that the findings apply to other groups of humans. Even greater care needs to be taken when findings come from a different species altogether.
Last, as Marlene Zuk (2002) points out, when it comes to the realm of sex-related behaviors, animal species provide a wide diversity of patterns. It is therefore important that researchers and commentators do not simply choose animal species that support their preferred arguments regarding the evolutionary basis of human gender stereotypes, while species that show different patterns are overlooked.
What to know:
Animal models can produce useful and important guidelines for human research, but rodent and even primate brains differ in distinctive ways from those of humans. In fact, sex-linked effects sometimes do not generalize across mammals (Adkins-Regan 2012), or even from one rodent species to another (Joel and Yankelevitch-Yahav 2014). Nor are we likely to learn much from nonhuman animals about how human behavior and health are affected by important psychosocial factors like social status and connectedness, economic power and financial security, gender role endorsement, or sexuality (Eliot and Richardson 2016).
Research using animal models can (and should) consider the effects of sex as a biological variable when investigating, for example, sex differences in drug effects or immune system function, or differences in susceptibility to mental health problems such as depression or autism (Klein et al. 2015). This has recently been emphasized by mandates from certain funding agencies and research journals that, wherever possible, neuroscience researchers should consider sex as a biological variable in their research models (McCullough et al. 2014). This will correct a longstanding bias in neuroscience research, which has commonly only used male animals (McCarthy et al. 2012). But a warning note needs to be sounded (e.g., Eliot and Richardson 2016; Maney 2016). Researchers should be aware that sex is rarely a pure biological variable and that, even with lab-based nonhuman animals, environmental factors play a part. Additionally, the increase in reports of sex differences resulting from these research and publication mandates needs to be carefully filtered through the lens we offer here.
What to look for:
Most obviously, are we talking about humans, or some other species? If the latter, are there reasons to think that the findings do – or do not – apply generally, including to ourselves?
If a catalogue of sex differences has been provided, bear in mind that it might look quite different had it been based on animals (or humans) who developed in different conditions.
The Eighth Thing to Know: Evolutionary Explanations
When scientists or commentators identify (or claim to identify) a sex/gender difference in the brain or behavior, it can be very tempting to reach for an evolutionary explanation. A typical kind of speculation is that a female/male difference is the product of sexual selection, which persists today because this pattern of behavior led to greater reproductive success in our ancestors thousands of years ago. For example, contemporary heterosexual partnering preferences (e.g., the importance of beauty and youth versus income in a potential partner) are often attributed to evolved predispositions (e.g., Buss and Schmitt 2017).
What to know:
Unfortunately, ancestral behavior left no fossil record, and speculations about what kinds of behavior would have enhanced reproductive success can owe more to culturally influenced intuitions than to scientific evidence. For example, despite the common understanding that aggression was the key to male success in our ancestral past, there is no clear-cut relation between male aggression and reproductive success in primate societies or humans (see Fuentes 2012).
“Evolved” also does not necessarily mean “inherited via genes” (Griffiths 2002). Modern evolutionary thinking includes an important role for nongenetic factors, including environmental factors, in both the development and cross-generation transfer of evolved traits (Jablonka and Lamb 2006; Fine et al. 2017). For example, when scientists arranged for newborn male lambs to be fostered by a goat mother, and for male newborn goats to be fostered by a ewe mother, they found that the males of both species developed robust and persistent sexual preferences for mates of the fostering species (Kendrick et al. 1998). This shows that, in this case, whatever genetically inherited contribution there is to a highly adaptive sexual preference for the same species, it is the environmental inheritance (i.e., lambs normally inherit an environment in which they are reared among sheep, and goats normally inherit an environment in which they are reared among goats) and not the genetic inheritance that holds sway over behavior.
The traditional – and still common – assumption is that sexually selected behavioral traits are always inherited via the direct effects of sex-related genes and hormones on the brain. However, there is now growing interest in the environmental inheritance of sex-linked behaviors. This refers to situations in which biological sex creates a tangible difference between the sexes – such as in size, appearance, or smell – which in turn has a reliable effect on an individual’s developmental experiences. This sex-specific environmental experience can potentially be partly or fully responsible for the reproduction of the sex difference in behavior, generation after generation. In humans, we call this gender socialization, but this unexpectedly convoluted pathway has also been seen in nonhuman animals, as in the example discussed above of the sex-differentiated anogenital licking by the mother rat (Moore 1992).
In other words, even sexually selected traits may not necessarily be “fixed” by genetics (Fine 2017) but may be inherited via sociocultural channels (Fine et al. 2017; Wood and Eagly 2012), and therefore amenable to change, should we so choose.
What to look for:
Speculating about genetically based evolutionary mechanisms is a lot easier than providing evidence for them. Has the researcher provided any evidence that the gendered behavior enhanced reproductive success in environments thought to be similar to those in which selection occurred? This kind of evidence is often difficult, if not impossible, to provide. But without it, the suggestion has the status of baseless speculation, and may well be confusing a cultural norm with an evolutionary adaptation. Maybe some cultural norms do reflect adaptations (i.e., they are not arbitrary, but enhance survival or reproductive success). For example, the survival benefits of treating excrement as disgusting are clear, but it appears to be something kids have to be taught (see Rozin and Fallon 1987). Yet, to paraphrase Freud, sometimes a cultural norm is just a cultural norm.
As for the mechanism by which the trait develops and gets transferred from generation to generation, have the researchers provided evidence that the trait is genetically heritable? Have they considered other transfer mechanisms, including cultural ones?
Conclusion
These eight things to know should assist anyone interested in making sense of findings from this complex and often-contested field of research, or their popular (mis)representations. In particular, we hope that science journalists will find our guidelines helpful, as journalists are frequently the filter through which research reaches general readers. By recommending that those who engage with such research consider issues such as any particular finding’s reliability, size, generalizability, origins, meaning, and evolutionary history, we hope these things to know will enhance understanding, clarity, and debate.
Update: Science working as it should
In response to our “8 Things” piece above, a reply in the form of a co-authored “Sexual Personalities” post by Marco Del Giudice, David A. Puts, David C. Geary, and David P. Schmitt appeared as a blog on the Psychology Today website. See the post here.
It detailed what they termed 8 counterpoints to our original points, usefully noting agreements as well as disagreements. We felt that it formed a welcome opening to a long-needed conversation about key points of debate on making sense of human sex differences. In this spirit, we have posted a response to their post, strongly emphasizing the identified agreements, noting the existence of non- or ‘ghost’ disagreements and acknowledging points for continued debate. This was swiftly and agreeably posted on the Psychology Today website in its entirety. Read our response here.
Bibliography
Adkins-Regan, Elizabeth. 2012. “Hormonal Organization and Activation: Evolutionary Implications and Questions.” General and Comparative Endocrinology 176, no. 3: 279–85.
Arnold, Art. 2012. “The End of Gonad-Centric Sex Determination in Mammals.” Trends in Genetics 28, no. 2: 55–61.
Bleier, R. 1984. Science and Gender: A Critique of Biology and its Theories on Women. New York: Pergamon Press.
Bluhm, R. 2013. “Self-Fulfilling Prophecies: The Influence of Gender Stereotypes on Functional Neuroimaging Research on Emotion.” Hypatia 28, no. 4: 870–86.
Buss, David M., and David P. Schmitt. 2017. “Sexual Strategies Theory.” In T.K. Shackelford and V.A. Weekes-Shackelford, eds., Encyclopedia of Evolutionary Psychological Science. Cham: Springer.
Button, Katherine S., et al. 2013. “Power Failure: Why Small Sample Size Undermines the Reliability of Neuroscience.” Nature Reviews Neuroscience 14: 365–76.
Chekroud, Adam M., et al. 2016. “Patterns in the Human Brain Mosaic Discriminate Males from Females.” Proceedings of the National Academy of Sciences 113, no. 14: E1968.
Clausen, J., and N. Levy, eds. 2015. Handbook of Neuroethics. Netherlands: Springer.
Coupé, Pierrick, et al. 2017. “Towards a Unified Analysis of Brain Maturation and Aging across the Entire Lifespan: A MRI Analysis.” Human Brain Mapping 38, no. 11: 5501–18.
David, Sean P., et al. 2018. “Potential Reporting Bias in Neuroimaging Studies of Sex Differences.” Scientific Reports 8, no. 1: 6082.
de Vries, G.J. 2004. “Sex Differences in Adult and Developing Brains: Compensation, Compensation, Compensation.” Endocrinology 145, no. 3: 1063–8.
de Vries, G.J., and N.G. Forger. 2015. “Sex Differences in the Brain: A Whole Body Perspective.” Biology of Sex Differences 6, no. 1: 1–15.
de Vries, G.J., and P. Sodersten. 2009. “Sex Differences in the Brain: The Relation between Structure and Function.” Hormones and Behavior 55, no. 5: 589–96.
Eliot, L., and Sarah S. Richardson. 2016. “Sex in Context: Limitations of Animal Studies for Addressing Human Sex/Gender Neurobehavioral Health Disparities.” The Journal of Neuroscience 36, no. 47: 11823–30.
Fausto-Sterling, A. 2000. Sexing the Body: Gender Politics and the Construction of Sexuality. New York: Basic Books.
Fine, C. 2010a. Delusions of Gender: How Our Minds, Society, and Neurosexism Create Difference. New York: Norton.
–. 2010b. “From Scanner to Sound Bite: Issues in Interpreting and Reporting Sex Differences in the Brain.” Current Directions in Psychological Science 19, no. 5: 280–3.
–. 2013. “Is There Neurosexism in Functional Neuroimaging Investigations of Sex Differences?” Neuroethics 6, no. 2: 369–409.
–. 2014. “His Brain, Her Brain?” Science 346, no. 6212: 915–16.
–. 2015. “Neuroscience, Gender, and ‘Development to’ and ‘From’: The Example of Toy Preferences.” In Clausen and Levy, Handbook of Neuroethics, 1737–55.
–. 2017. Testosterone Rex: Myths of Sex, Science, and Society. New York: Norton.
Fine, C., J. Dupré, and D. Joel. 2017. “Sex-Linked Behavior: Evolution, Stability, and Variability.” Trends in Cognitive Sciences 21, no. 9: 666–73.
Fine, C., and F. Fidler. 2014. “Sex and Power: Why Sex/Gender Neuroscience Should Motivate Statistical Reform.” In Clausen and Levy, Handbook of Neuroethics, 1447–62.
Garcia-Falgueras, Alicia, et al. 2011. “Galanin Neurons in the Intermediate Nucleus (InM) of the Human Hypothalamus in Relation to Sex, Age, and Gender Identity.” Journal of Comparative Neurology 519, no. 15: 3061–84.
Griffiths, Paul E. 2002. “What Is Innateness?” The Monist 85, no. 1: 70–85.
Hänggi, Jürgen, et al. 2014. “The Hypothesis of Neuronal Interconnectivity as a Function of Brain Size – A General Organization Principle of the Human Connectome.” Frontiers in Human Neuroscience 8.
Héroux, Martin. 2017. “Why Most Published Findings Are False: Revisiting the Ioannidis Argument.” Scientifically Sound, 4 October 2017, https://scientificallysound.org/2017/10/04/most-published-findings-are-false/.
Hines, M. 2004. Brain Gender. Oxford University Press.
Hines, Melissa, et al. 2016. “Prenatal Androgen Exposure Alters Girls’ Responses to Information Indicating Gender-Appropriate Behaviour.” Philosophical Transactions of the Royal Society B: Biological Sciences 371, no. 1688.
Hyde, Janet S. 2005. “The Gender Similarities Hypothesis.” American Psychologist 60, no. 6: 581–92.
Hyde, Janet S, et al. 2008. “Gender Similarities Characterize Math Performance.” Science 321: 494–5.
Im, K., et al. 2008. “Brain Size and Cortical Structure in the Adult Human Brain.” Cerebral Cortex 18: 2181–91.
Ingalhalikar, Madhura, Alex Smith, Drew Parker, Theodore D. Satterthwaite, Mark A. Elliott, Kosha Ruparel, Hakon Hakonarson, Raquel E. Gur, Ruben C. Gur, and Ragini Verma. 2014. “Sex Differences in the Structural Connectome of the Human Brain.” Proceedings of the National Academy of Sciences of the United States of America 111, no. 2: 823–8.
Ioannidis, John P.A. 2005. “Why Most Published Research Findings Are False.” PLoS Medicine 2, no. 8: e124. https://doi.org/10.1371/journal.pmed.0020124.
Jablonka, Eva, and Marion J. Lamb. 2006. Evolution in Four Dimensions: Genetic, Epigenetic, Behavioral, and Symbolic Variation in the History of Life. MIT Press.
Jäncke, Lutz, et al. 2015. “Brain Size, Sex, and the Aging Brain.” Human Brain Mapping 36, no. 1: 150–69.
Joel, D. 2011. “Male or Female? Brains Are Intersex.” Frontiers in Integrative Neuroscience 5, no. 57.
–. 2014. “Sex, Gender, and Brain: A Problem of Conceptualization.” In S. Schmitz and G. Höppner, eds., Gendered Neurocultures: Feminist and Queer Perspectives on Current Brain Discourses, 169–86. University of Vienna: Zaglossus.
–. 2016. “Captured in Terminology: Sex, Sex Categories, and Sex Differences.” Feminism & Psychology 26, no. 3: 335–45.
Joel, D., et al. 2015. “Sex Beyond the Genitalia: The Human Brain Mosaic.” Proceedings of the National Academy of Sciences 112, no. 50: 15468–73.
–. 2016. “Reply to Del Giudice et al., Chekroud et al., and Rosenblatt: Do Brains of Females and Males Belong to Two Distinct Populations?” Proceedings of the National Academy of Sciences 113, no. 14: E1969–70.
–. 2018. “Analysis of Human Brain Structure Reveals that the Brain ‘Types’ of Males Are also Typical of Females, and Vice Versa.” Frontiers in Human Neuroscience 12: 399. doi:10.3389/fnhum.2018.00399.
Joel, D., and A. Fausto-Sterling. 2016. “Beyond Sex Differences: New Approaches for Thinking about Variation in Brain Structure and Function.” Phil. Trans. R. Soc. Lond. B.
Joel, D., and M. McCarthy. 2016. “Incorporating Sex as a Biological Variable in Neuropsychiatric Research: Where Are We Now and Where Should We Be?” Neuropsychopharmacology.
Joel, D., and R. Yankelevitch-Yahav. 2014. “Reconceptualizing Sex, Brain and Psychopathology: Interaction, Interaction, Interaction.” British Journal of Pharmacology 171, no. 20: 4620–35.
John, Leslie K., George Loewenstein, and Drazen Prelec. 2012. “Measuring the Prevalence of Questionable Research Practices with Incentives for Truth Telling.” Psychological Science 23, no. 5: 524–32.
Jordan-Young, R.M. 2010. Brain Storm: The Flaws in the Science of Sex Differences. Cambridge, MA: Harvard University Press.
Juraska, Janice M. 1991. “Sex Differences in ‘Cognitive’ Regions of the Rat Brain.” Psychoneuroendocrinology 16, no. 1: 105–19.
Kaiser, A. 2012. “Re-conceptualizing ‘Sex’ and ‘Gender’ in the Human Brain.” Zeitschrift für Psychologie/Journal of Psychology 220, no. 2: 130–6.
Kaiser, A., et al. 2009. “On Sex/Gender Related Similarities and Differences in fMRI Language Research.” Brain Research Reviews 61, no. 2: 49–59.
Kendrick, Keith M., et al. 1998. “Mothers Determine Sexual Preferences.” Nature 395, no. 6699: 229–30.
Klein, Sabra L., et al. 2015. “Opinion: Sex Inclusion in Basic Research Drives Discovery.” Proceedings of the National Academy of Sciences 112, no. 17: 5257–8.
Lippa, R.A., M.L. Collaer, and M. Peters. 2010. “Sex Differences in Mental Rotation and Line Angle Judgments Are Positively Associated with Gender Equality and Economic Development across 53 Nations.” Archives of Sexual Behavior 39, no. 4: 990.
Maccoby, E.E., and C.N. Jacklin. 1974. The Psychology of Sex Differences. Stanford, CA: Stanford University Press.
Maney, Donna L. 2016. “Perils and Pitfalls of Reporting Sex Differences.” Philosophical Transactions of the Royal Society of London B: Biological Sciences 371, no. 1688.
May, A. 2011. “Experience-Dependent Structural Plasticity in the Adult Human Brain.” Trends in Cognitive Sciences 15: 475–82.
McCarthy, Margaret. 2016. “Multifaceted Origins of Sex Differences in the Brain.” Philosophical Transactions of the Royal Society of London Series B, Biological Sciences 371, 1688.
McCarthy, Margaret, et al. 2012. “Sex Differences in the Brain: The Not So Inconvenient Truth.” Journal of Neuroscience 32, no. 7: 2241–7.
McCullough, Louise D., Geert J. De Vries, Virginia M. Miller, Jill B. Becker, Kathryn Sandberg, and Margaret M. McCarthy. 2014. “NIH Initiative to Balance Sex of Animals in Preclinical Studies: Generative Questions to Guide Policy, Implementation, and Metrics.” Biology of Sex Differences 5, no. 1: 15.
Moore, C.L. 1992. “The Role of Maternal Stimulation in the Development of Sexual Behavior and its Neural Basis.” Annals of the New York Academy of Sciences 662, no. 1: 160–77.
–. 1995. “Maternal Contributions to Mammalian Reproductive Development and the Divergence of Males and Females.” Advances in the Study of Behavior 24: 47–118.
Muller, M.N., et al. 2009. “Testosterone and Paternal Care in East African Foragers and Pastoralists.” Proceedings of the Royal Society B 276: 347–54.
O’Dea, Rose R.E., et al. 2018. “Gender Differences in Individual Variation in Academic Grades Fail to Fit Expected Patterns for STEM.” Nature Communications 9: 3777.
Oliveira, Rui F. 2004. “Social Modulation of Androgens in Vertebrates: Mechanisms and Function.” Advances in the Study of Behavior 34: 165–239.
Open Science Collaboration. 2015. “Estimating the Reproducibility of Psychological Science.” Science 349, no. 6251.
Price, Cathy J., and Karl J. Friston. 2002. “Degeneracy and Cognitive Anatomy.” Trends in Cognitive Sciences 6: 416–21.
Rippon, G., et al. 2017. “Journal of Neuroscience Research Policy on Addressing Sex as a Biological Variable: Comments, Clarifications, and Elaborations.” Journal of Neuroscience Research 95: 1357–9.
Rippon, Gina, et al. 2014. “Recommendations for Sex/Gender Neuroimaging Research: Key Principles and Implications for Research Design, Analysis, and Interpretation.” Frontiers in Human Neuroscience 8: 650.
Rosenblatt, Jonathan D. 2016. “Multivariate Revisit to ‘Sex beyond the Genitalia.’” Proceedings of the National Academy of Sciences 113, no. 14: E1966–7.
Rozin, Paul, and April E. Fallon. 1987. “A Perspective on Disgust.” Psychological Review 94, no. 1: 23–41.
Schmitz, S. 2010. “Sex, Gender, and the Brain: Biological Determinism versus Socio-cultural Constructivism.” In I. Klinge and C. Wiesemann, eds., Sex and Gender in Biomedicine: Theories, Methodologies, Results, 57–76. Göttingen: Univ.-Verl. Göttingen.
Shors, Tracey J., Chadrick Chua, and Jacqueline Falduto. 2001. “Sex Differences and Opposite Effects of Stress on Dendritic Spine Density in the Male versus Female Hippocampus.” Journal of Neuroscience 21, no. 16: 6292–7.
Simmons, J.P., L.D. Nelson, and U. Simonsohn. 2011. “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.” Psychological Science 22, no. 11: 1359–66.
Spence, Janet T. 1993. “Gender-Related Traits and Gender Ideology: Evidence for a Multifactorial Theory.” Journal of Personality and Social Psychology 64, no. 4: 624–35.
Springer, K.W., J.M. Stellman, and R.M. Jordan-Young. 2012. “Beyond a Catalogue of Differences: A Theoretical Frame and Good Practice Guidelines for Researching Sex/Gender in Human Health.” Social Science & Medicine 74, no. 11: 1817–24.
Szucs, Denes, and John P.A. Ioannidis. 2017. “Empirical Assessment of Published Effect Sizes and Power in the Recent Cognitive Neuroscience and Psychology Literature.” PLoS Biology 15, no. 3: e2000797.
van Anders, S.M. 2013. “Beyond Masculinity: Testosterone, Gender/Sex, and Human Social Behavior in a Comparative Context.” Frontiers in Neuroendocrinology 34, no. 3: 198–210.
Wallen, Kim. 1982. “Influence of Female Hormonal State on Rhesus Sexual Behavior Varies with Space for Social Interaction.” Science 217, no. 4557: 375–7.
Wood, W., and A.H. Eagly. 2012. “Biosocial Construction of Sex Differences and Similarities in Behavior.” In J. Olson and M. Zanna, eds., Advances in Experimental Social Psychology, vol 46, 55–123. Burlington: Academic Press.
Zell, Ethan, Zlatan Krizan, and Sabrina R. Teeter. 2015. “Evaluating Gender Similarities and Differences Using Metasynthesis.” American Psychologist 70, no. 1: 10–20.
Zuk, Marlene. 2002. Sexual Selections: What We Can and Can’t Learn about Sex from Animals. Berkeley: University of California Press.
- The authors are listed alphabetically; they contributed equally to this article.
- Mental rotation refers to the ability to imagine how a three-dimensional object looks from different angles. In the standard version of a mental rotation task you are shown a two-dimensional image of an abstract three-dimensional object, often made up of small cubes. You are then asked to compare it with four similar shapes, and identify the two which match the original but have been rotated in three-dimensional space.