Biological Sex Differences and Chess Performance (Part 2 of 2)

Participation rates account for nearly all of the gender gap in performance

Scott Walsh

3/6/2026 · 9 min read

General Intelligence Is Not Sufficiently Correlated With Chess Performance To Factor Into The Gender Gap; Deliberate Practice Is The Single Strongest Measure Of Improvement

“Despite being an apparently obvious example of a purely intellectual activity, for more than a century researchers have largely failed to connect success at chess with any intellectual ability.” 1

The claim that men have higher general intelligence than women has long been debunked. Dr. Earl Hunt wrote in Human Intelligence (Cambridge University Press), “...even though men and women are essentially equal in general intelligence, they may differ in their performance on particular tests, because performance depends upon both general intelligence (for which there is no sex difference) and the residual abilities, where sex differences may be substantial.” 2 This aligns with the assertion that men and women differ in many ways but that their performance is generally the same (except, of course, on tasks requiring physical strength).

The broad literature also does not support a significant mean sex difference in general intelligence. As discussed above, where differences appear, they are typically small and heavily moderated by test design, measurement, and sampling. More importantly for chess, research on skill acquisition consistently shows that expert performance is not well explained by innate ability alone: it is strongly shaped by accumulated structured practice and domain-specific learning.

“We agree that expert performance is qualitatively different from normal performance and even that expert performers have characteristics and abilities that are qualitatively different from or at least outside the range of those of normal adults.

However, we deny that these differences are immutable, that is, due to innate talent. Only a few exceptions, most notably height, are genetically prescribed. Instead, we argue that the differences between expert performers and normal adults reflect a life-long period of deliberate effort to improve performance in a specific domain.” 3

Chess-specific research further weakens the “raw intelligence” narrative. A 2006 paper studied 57 children from four schools and formed an additional “elite” subsample of 23 children (all boys) who competed nationally/internationally. Researchers measured the students’ performance in chess, their intelligence, and the amount of time they spent practicing chess. 4

Chess performance was measured using a 55‑item test of chess rules and tactical problems, a recall task where the students were assigned to reconstruct positions, and a “Knight’s Row” task in which the students were to transfer, as fast as possible, the knight from one corner of the board to the other on the same horizontal (a1 to h1), visiting each square between the two corners in order (a1, b1, c1 and so on until they reach h1). 5

Intelligence was measured using WISC‑III subtests 6 - Vocabulary, Block Design, Symbol Search, and Digit Span - combined into a composite IQ and analyzed individually. Practice and experience were measured using parent‑verified diaries and interviews which showed hours of practice and years of experience. 7

When researchers tested the whole sample of children, some of whom had just recently started to play chess, they found a moderately positive correlation between intelligence and chess skill, which was in line with some previous studies. Boys outperformed girls on chess measures in the full sample (however, since the elite subsample consisted entirely of boys, gender effects could not be examined within that group), but there was no notable gender gap in intelligence. “The most surprising result [came from the elite subsample. In that group,] IQ negatively correlated with chess rating, indicating that the children with lower IQ scores were better players.” 8

Researchers determined that practice was “by far the best predictor of chess rating,” not intelligence. 9 And in the full sample, the girls’ chess performance matched that of the boys once practice was accounted for. Hours practiced explained almost all of the observed gap.

There were no girls present in the elite group because they had not yet practiced enough to reach that level. This follows from what we know about girls’ participation broadly: it is generally very low compared to boys’. However, this study also shows that participation and performance are linked. When the girls did participate (by putting in practice equal to the boys), their performance increased and matched the level of the boys, erasing the gender gap.

As for whether elite performance can be attained by anyone, however: “It does not follow from the rejection of innate limits on acquired performance that everyone can easily attain high levels of skill. Contemporary elite performers have overcome a number of constraints. They have obtained early access to instructors, maintained high levels of deliberate practice throughout development, received continued parental and environmental support, and avoided disease and injury.” 10

Accordingly, while we see that raw intelligence is a non-factor in the gender gap, and that practice does improve performance, reaching an elite level also requires women to overcome additional barriers we know they face. Increased participation (as measured by practice) could potentially erase the gender gap as it did in the Bilalić study. However, we know that female players, even if committed to high levels of deliberate practice, often have more difficulty than male players gaining early access to instructors and often do not receive continued parental and environmental support - factors included by Ericsson as necessary to reach an elite level. We discuss those issues below.

Studies of the Greater Male Variability Hypothesis Show That Participation, Not Biology, Explains Most of the Gender Gap

One of the most frequently discussed biological explanations for male dominance at the highest levels of chess is the “greater male variability hypothesis” (GMV). Rather than claiming that men are more intelligent on average, this hypothesis proposes that male traits tend to show greater statistical variance in cognitive domains, meaning men are more likely to appear at both the highest and lowest extremes of ability distributions. If correct, this could theoretically produce a disproportionate number of men among elite performers in domains such as mathematics, science, or chess, even when average ability is similar between the sexes. However, as discussed below, studies of chess ratings find no greater male variability in chess and indicate that participation rates alone account for most of the gender gap in performance.

Summary of the GMV Hypothesis

In statistical terms, the GMV hypothesis proposes that the variance of male performance distributions is larger than that of female distributions, even when the means are roughly similar. If the male distribution has “fatter tails” at either side of a bell curve, more men would appear at both the lowest and highest levels of performance. Evidence for greater male variability has been reported in some educational datasets. Large international assessments of mathematics, reading, and science sometimes show that males are overrepresented at both extremes of the distribution, even when the average scores of males and females are similar. 11
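To make the "fatter tails" idea concrete, here is a minimal sketch of how a variance difference alone would change representation above a cutoff. It assumes both groups are normally distributed with the same mean; the function name `tail_ratio` and the standard-deviation ratio of 1.1 are hypothetical choices for illustration, not figures taken from any study cited here.

```python
from statistics import NormalDist


def tail_ratio(sd_ratio: float, cutoff_sd: float) -> float:
    """Ratio of group A's share to group B's share above a cutoff.

    Assumption: both groups are normal with mean 0. Group B has a
    standard deviation of 1, group A has a standard deviation of
    `sd_ratio`. The cutoff is measured in group-B standard deviations.
    """
    group_a = NormalDist(mu=0.0, sigma=sd_ratio)
    group_b = NormalDist(mu=0.0, sigma=1.0)
    share_a = 1.0 - group_a.cdf(cutoff_sd)
    share_b = 1.0 - group_b.cdf(cutoff_sd)
    return share_a / share_b


# Hypothetical 10 percent difference in standard deviation.
for cutoff in (1.0, 2.0, 3.0):
    print(f"cutoff {cutoff} SD: ratio {tail_ratio(1.1, cutoff):.2f}")
```

Under these assumptions the ratio grows as the cutoff moves further from the mean, and by symmetry the same holds at the low end. That is the pattern the hypothesis predicts; whether real data show it is an empirical question.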

Some researchers have proposed biological mechanisms that might produce this pattern. One suggestion involves evolutionary selection pressures, where stronger competition among males could favor greater trait variability over time. Mathematical models have also been proposed in which variability evolves as a consequence of selective mating dynamics. 12 However, even within biology and psychology, the hypothesis remains controversial.

Meta-analyses of behavioral traits across animal species have found little consistent evidence for systematically greater variability in males, suggesting that the phenomenon may be less universal than sometimes assumed. 13

Importantly, the GMV hypothesis concerns statistical distributions. If the distributions overlap heavily, as is typical for most cognitive traits, many women will still outperform many men.

Applying the Hypothesis to Chess

Chess is a simpler test case for GMV than other disciplines because chess ratings provide a built-in measure of performance. When player strength is quantified through Elo ratings, those ratings create a large and detailed dataset that researchers can analyze. Although, as noted above, gender disparities in chess have sometimes been cited as evidence for biological differences in intellectual ability, detailed analyses of chess rating data contradict that interpretation.

One joint Harvard/Boston University study analyzed ratings from over 250,000 tournament players and found that men had slightly higher average ratings than women but did not show greater variability in ratings distributions. 14 The empirical data did not support the prediction that male performance in chess exhibits a wider spread of outcomes than female performance. Instead, the researchers concluded that the male dominance at the highest levels of chess could be largely explained by participation rates.

“Even if men and women have the same underlying ability distribution, a larger number of top-rated players will be men if the overall number of men competing is greater…” 15

Far more boys than girls enter competitive chess in the first place, which naturally produces more men at the top of the ranking ladder. “That is, if fewer women than men even begin to participate in organized competition, dropout rates (and cognitive endowments) could be equal, but women would still be relatively absent at the top.” 16

This finding illustrates an important statistical principle: even if two groups have identical performance distributions, a group that is numerically larger will produce more individuals in the extreme tail of the distribution. If ten times as many boys as girls enter chess tournaments, the highest-rated players will almost certainly be predominantly male, even without any difference in ability or variability.
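The principle can be checked with a small simulation. The sketch below draws every player's strength from the same standard normal distribution and counts how many of the top 100 belong to the larger group. The group sizes (10,000 and 1,000), the function name, and the fixed random seed are illustrative assumptions, not data from any federation.

```python
import random


def top_k_from_larger_group(n_large: int, n_small: int, top_k: int, seed: int = 0) -> int:
    """Count how many of the top_k simulated players come from the larger group.

    Assumption: every player's strength is an independent draw from the
    same standard normal distribution, regardless of group.
    """
    rng = random.Random(seed)
    draws = [(rng.gauss(0.0, 1.0), "large") for _ in range(n_large)]
    draws += [(rng.gauss(0.0, 1.0), "small") for _ in range(n_small)]
    best = sorted(draws, reverse=True)[:top_k]
    return sum(1 for _, group in best if group == "large")


count = top_k_from_larger_group(10_000, 1_000, top_k=100)
print(f"{count} of the top 100 simulated players come from the larger group")
```

Because the two groups are identical by construction, the larger group's expected share of the top 100 is simply its share of all players, 10 out of every 11. Any single run will vary around that figure.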

Participation Rates Alone Explain Almost All Variability in Top-End Performance

The interaction between participation rates and statistical extremes is crucial in understanding elite chess performance. Research examining chess populations has repeatedly found that the number of participants strongly predicts the number of elite performers. When the pool of competitors increases, the probability that someone from that pool will achieve exceptionally high ratings also increases. 17

Accordingly, a simple numerical imbalance can generate an apparent performance gap. If a national chess federation has 10,000 active male players but only 1,000 female players, the male pool will produce many more potential candidates for the highest rating categories.

“It’s a simple statistical fact that the best performers from a large group are probably going to be better than the best performers from a small one.” - Ed Yong, National Geographic Magazine 18

Some statistical models estimate that participation differences alone could explain the vast majority of the observed gender disparity at the highest levels of chess. For example, Merim Bilalić and others studied the German chess federation, attempting to quantify how participation rates influence the number of outstanding men and women in the field. 19 The large sample, 120,399 total players, was composed of 113,386 men and 7,013 women. In the German federation, as in many others internationally, men outnumber women by roughly 16 to 1.

Researchers found that “the great discrepancy in the top performance of male and female chess players can be largely attributed to a simple statistical fact - more extreme values are found in larger populations. Once participation rates of men and women are controlled for, there is little left for biological, environmental, cultural or other factors to explain.” 20

“Although the performance of the 100 best German male chess players is better than that of the 100 best German women, we show that 96 per cent of the observed difference would be expected given the much greater number of men who play chess.” 21
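The logic of comparing the best 100 players of each group can be sketched with expected order statistics. The example below is not the authors' calculation. It assumes both groups share one normal distribution and uses Blom's approximation for the expected value of the k-th best of n draws, expressed in standard-deviation units rather than rating points. Only the two group sizes come from the text; the function names are hypothetical.

```python
from statistics import NormalDist


def expected_rank_value(rank: int, n_players: int) -> float:
    """Approximate expected strength (in SD units) of the rank-th best of n players.

    Uses Blom's approximation for normal order statistics:
    E[X_(i:n)] ~ inv_cdf((i - 0.375) / (n + 0.25)), where i counts from
    the weakest player, so the rank-th best has i = n - rank + 1.
    """
    alpha = 0.375
    i = n_players - rank + 1
    p = (i - alpha) / (n_players - 2 * alpha + 1)
    return NormalDist().inv_cdf(p)


def expected_top_gap(n_large: int, n_small: int, top_k: int = 100) -> float:
    """Mean expected gap between equally ranked players of the two groups."""
    gaps = [
        expected_rank_value(rank, n_large) - expected_rank_value(rank, n_small)
        for rank in range(1, top_k + 1)
    ]
    return sum(gaps) / top_k


N_MEN, N_WOMEN = 113_386, 7_013  # group sizes reported in the text
gap_sd = expected_top_gap(N_MEN, N_WOMEN)
print(f"Expected mean gap across the top 100 pairs: {gap_sd:.2f} SD")
```

The gap printed here arises purely from the difference in group size, since both groups are drawn from the same distribution. A study can then compare an expected gap of this kind with the gap actually observed in the ratings.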

Accordingly, though there may be unresolved questions regarding the GMV hypothesis in society at large, it is not a factor to be considered in explaining chess’s gender performance gap.

Conclusion - Biological Differences Do Not Factor Into Chess’s Gender Gap

There is no difference in average intelligence between male and female players. Differences in spatial and rotational intelligence may be an artifact of flawed test design; even if they exist, they are revealed in only one of three diagnostic tests. Arguments based on greater male variability have no place in chess, where, in the German data, 96 percent of the performance gap between the top players can be explained by participation rates.

In the sections below, we will analyze the factors that explain the remaining four percent of the performance gap, as well as the factors that explain the much lower participation rates.

1 Merim Bilalić, Peter McLeod, and Fernand Gobet, Does chess need intelligence? - A study with young chess players, Intelligence 35 (2007), at 457–458.

2 Hunt, Earl B. (2010). Human Intelligence. Cambridge University Press. p. 389. ISBN 978-1-139-49511-0.

3 K. Anders Ericsson, Ralf Th Krampe, and Clemens Tesch-Romer, The Role of Deliberate Practice in the Acquisition of Expert Performance, Psychological Review 1993, Vol. 100. No. 3, at 400.

4 Bilalić, et al., at 461.

5 Id.

6 The Wechsler Intelligence Scale for Children (WISC) is an individually administered intelligence test for children between the ages of 6 and 16. Though the test is now in its fifth generation, the third generation test used in this study was also administered to generate a Full Scale IQ (formerly known as an intelligence quotient or IQ score) that represents a child’s general intellectual ability.

7 Bilalić, et al., at 462.

8 Id., at 465-466.

9 Id., at 467.

10 Ericsson, et al., at 400.

11 Helen Gray, Andrew Lyth, Catherine McKenna, Susan Stothard, Peter Tymms, and Lee Copping, Sex differences in variability across nations in reading, mathematics and science: a meta‑analytic extension of Baye and Monseur (2016), Large-scale Assessments in Education (2019) 7:2, https://doi.org/10.1186/s40536‑019‑0070‑9.

12 Theodore P. Hill, An Evolutionary Theory for the Variability Hypothesis (2024), arXiv:1703.04184, https://doi.org/10.48550/arXiv.1703.04184.

13 Harrison LM, Noble DWA, Jennions MD. A meta-analysis of sex differences in animal personality: no evidence for the greater male variability hypothesis. Biol Rev Camb Philos Soc. 2022 Apr;97(2):679-707. doi: 10.1111/brv.12818. Epub 2021 Dec 14. PMID: 34908228.

14 Chabris CF, Glickman ME. Sex differences in intellectual performance: analysis of a large cohort of competitive chess players. Psychol Sci. 2006 Dec;17(12):1040-6. doi: 10.1111/j.1467-9280.2006.01828.x. PMID: 17201785.

15 Id. at 1041.

16 Id.

17 Bilalić, Merim, and Peter McLeod, Participation Rates and the Difference in Performance of Women and Men in Chess (2007), Journal of Biosocial Science 39, no. 5, 789, https://doi.org/10.1017/S0021932007001861.

18 Yong, Ed, Why are there so few female chess grandmasters?, National Geographic Magazine, December 23, 2008, found at https://www.nationalgeographic.com/science/article/why-are-there-so-few-female-chess-grandmasters (last visited 8 March 2026).

19 Merim Bilalić, Kieran Smallbone, Peter McLeod, and Fernand Gobet, Why are (the best) women so good at chess? Participation rates and gender differences in intellectual domains, Proc. R. Soc. B (2009) 276, 1161–1165, doi:10.1098/rspb.2008.1576, 23 December 2008.

20 Id. at 1163.

21 Id. at 1161.