Women to run faster than men in 2156… and faster than light in 2636

Posted by Giang Son | Aug 25, 2025 | 3 min read

A classic example of statistics... done wrong.


Gender gaps have always been a thing, and we all wonder when and where they will finally close.

In sports, and particularly in the pinnacle of Olympic competition, a.k.a. the 100-meter sprint, the gap will disappear by the middle of the 22nd century, according to a 2004 paper in Nature [1].

The authors used a totally legitimate model, linear regression, and fitted it to 100 years’ worth of Olympic 100m sprint times (shown in the graph).

(Photo from Calling Bullshit's blog)

According to the data, the gap between the women’s and men’s times had been steadily narrowing. At this rate, the model predicted, the women’s 100m sprint at the 2156 Olympics will be won in 8.079 seconds, faster than the men’s 8.098 seconds (where the red line meets the blue line). For the first time in history, women will outsprint men at the Olympics.

A historic feat indeed!


In case you haven’t noticed, I was being sarcastic in all of the previous paragraphs. In fact, every time I have seen this study discussed (including around its original publication) [2, 3], it has been with satirical or humorous intent.

Jokes and gender talk aside, this model is a classic example of flawed thinking in statistics and data science. Even without statistical training, you can easily spot the model’s downfall: if we follow the extrapolation, the women’s 100m time will continue to fall to 7, 6, 5… and eventually 0 seconds in 2636 [4]. (*)
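To make the extrapolation concrete, here is a minimal sketch. It does not use the paper's actual regression coefficients. Instead it assumes the women's trend line passes exactly through the two points quoted in this post: 8.079 seconds in 2156 and 0 seconds in 2636. The function names and the 365.25-day year are my own choices for illustration.

```python
# Minimal sketch: a straight line through two points quoted in this post.
# Assumption: the women's trend line passes exactly through
# (2156, 8.079 s) and (2636, 0 s). The paper's real coefficients are NOT used.

SPEED_OF_LIGHT = 299_792_458.0  # metres per second
DISTANCE_M = 100.0

CROSS_YEAR, CROSS_TIME = 2156.0, 8.079  # predicted crossover, from the post
ZERO_YEAR = 2636.0                      # year the line reaches 0 s, from the post

# Seconds shaved off per year under the linear assumption
slope = CROSS_TIME / (ZERO_YEAR - CROSS_YEAR)


def predicted_time(year: float) -> float:
    """Linearly extrapolated winning time (seconds) for a given year."""
    return CROSS_TIME - slope * (year - CROSS_YEAR)


# Time needed to cover 100 m at the speed of light, and how long before
# the "0 seconds" moment the line reaches that time.
light_time = DISTANCE_M / SPEED_OF_LIGHT
years_before_zero = light_time / slope
minutes_before_zero = years_before_zero * 365.25 * 24 * 60

if __name__ == "__main__":
    for year in (2156, 2300, 2500, 2700):
        print(f"{year}: {predicted_time(year):7.3f} s")
    print(f"Implied improvement: {slope:.4f} s per year")
    print(f"Light speed is reached {minutes_before_zero:.1f} minutes before the line hits zero")
```

Under these assumptions the line drops below zero after 2636 (a negative sprint time), and it crosses the "speed of light" time only minutes before reaching zero, which is why rounding to 2636 in the title does no harm.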

You could further dissect many of the study’s limitations, for example the questionable dataset (a limited time range, only the winners’ times, etc.).

My own interpretation is this: the model relies on unrealistic assumptions and thus makes very naive predictions (at least in the long term). For the model’s predictions to hold, at least two conditions must be satisfied: (**)

  • there must be no natural limits (walls/ceilings) to the speed at which a human could run
  • and the improvement rate in speed must remain constant

But of course, there must be some limit to human sprint performance, even though we can’t pin down an exact number. Constant (or accelerating) improvement also very rarely persists over the long term, even when short-term predictions hold up (e.g., Moore’s law): limits exist, and chances are that progress will slow down and eventually plateau. Knowing this, we might opt for a non-linear model with some saturation point to predict future sprint times.

The assumptions sound obviously unreasonable when we say them out loud. Yet many predictions that implicitly rely on them are made every day. Stop me if this sounds familiar:
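As a rough illustration of what a model with a saturation point could look like, the sketch below compares a straight line with an exponential decay toward a floor. Every number in it (the starting year and time, the improvement rate, and especially the floor) is a made-up placeholder, not a fit to real sprint data, and exponential decay is just one of many possible saturating shapes.

```python
import math

# Hypothetical placeholders, chosen only for illustration. NOT real data.
START_YEAR = 2000
START_TIME_S = 11.0    # assumed winning time at START_YEAR (seconds)
RATE_PER_YEAR = 0.02   # assumed initial improvement (seconds per year)
FLOOR_S = 9.0          # assumed hard limit on human sprint time (seconds)


def linear_time(year: float) -> float:
    """Straight-line extrapolation: keeps improving forever."""
    return START_TIME_S - RATE_PER_YEAR * (year - START_YEAR)


def saturating_time(year: float) -> float:
    """Exponential decay toward FLOOR_S.

    The decay constant is chosen so that the initial slope matches the
    linear model, which makes the two curves agree in the short term.
    """
    gap = START_TIME_S - FLOOR_S
    k = RATE_PER_YEAR / gap
    return FLOOR_S + gap * math.exp(-k * (year - START_YEAR))


if __name__ == "__main__":
    for year in (2000, 2010, 2100, 2300, 2600):
        print(f"{year}: linear {linear_time(year):6.2f} s, "
              f"saturating {saturating_time(year):6.2f} s")
```

The two curves start out almost identical, which is why a short window of data cannot tell them apart. Only the saturating one stays above the floor, while the linear one eventually goes negative.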

Judging by [the speed of progress] in [some area] in recent years, [some big breakthrough] will happen by [some time a few years from now].

I’m not saying that the prediction will surely be wrong. But this example shows why we should stay critical whenever we come across such naive extrapolations.

Footnotes:

(*) Women will pass the speed of light before then; I just use the year 2636 in the title for convenience’s sake. The point still stands.

(**) This was more or less my answer in an in-class exam.

References:

[1] https://www.nature.com/articles/431525a

[2] https://www.callingbullshit.org/case_studies/case_study_gender_gap_running.html

[3] SD6101 Data Science Thinking by Assoc. Prof. Sourav Bhowmick

[4] https://www.nature.com/articles/432147b

[5] https://www.nature.com/articles/432147a



Thank you for reading. I've also written some other posts that you can check out.