Jeff Sonas

Ratings Are Broken

1 Aug 2023

According to statistician Jeff Sonas, lower rated players win more often than they're supposed to.

According to statistician Jeff Sonas, you’re not crazy: it really is getting harder to beat lower-rated players. Sonas looked at data from 2012 to the present and found a trend of lower-rated players performing better against their higher-rated opponents.

The Elo system predicts the outcome of a chess game based on the rating difference of the players. (For more on the inner workings of the Elo system, check out this post.) Even in 2012, lower rated players were scoring upsets more often than they were supposed to according to the Elo predictions. But the trend has only intensified in recent years. For example, according to Sonas, 1500 players now beat 2100 players nearly twice as often as they did in 2012, going from 8% to 15%.
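To make the prediction concrete, here is a minimal sketch of the textbook logistic Elo formula. This is an assumption on my part: FIDE uses its own lookup tables, which differ somewhat, and the function name `expected_score` is just illustrative. Note that the result is an expected score (a draw counts as half a point), not a pure win probability.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Textbook logistic Elo expectation: win = 1, draw = 0.5, loss = 0.

    Assumption: the common 400-point logistic curve, not FIDE's official tables.
    """
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


# The 1500-vs-2100 matchup from the example above (a 600-point gap).
print(round(expected_score(1500, 2100), 3))
# Equal ratings should give an even split.
print(expected_score(1800, 1800))
```

Under this formula, the expectation depends only on the rating difference, so a 1500 facing a 2100 gets the same prediction as a 1900 facing a 2500.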

The reason for this is the constant influx of new, underrated players into the rating pool. When new players – especially juniors – get their first rating, it is often below their real skill level. If they improve quickly, it takes a while for their rating to catch up. Since the Elo system is largely a zero-sum game (one player’s rating gain is another’s rating loss), the presence of underrated players in the pool drags everyone else’s rating down. If this were a one-time phenomenon, the rating system would eventually stabilize at a new, somewhat lower level. But it seems that underrated juniors are entering the pool continuously, creating constant downward pressure on everyone’s rating.
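The zero-sum point can be sketched in a few lines. This is a simplified model, not FIDE's actual procedure: it assumes the textbook logistic formula and the same K-factor (here 20, a hypothetical choice) for both players. In practice K-factors can differ between players, which is why the system is only "largely" zero-sum.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Textbook logistic Elo expectation (an assumption, not FIDE's tables)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return both players' new ratings after one game.

    score_a is 1 for a win by A, 0.5 for a draw, 0 for a loss.
    With a shared K-factor, whatever A gains, B loses.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# An underrated junior listed at 1400 beats an established 1700.
junior, veteran = update(1400, 1700, score_a=1.0)
print(round(junior, 1), round(veteran, 1))
# The pool's total rating is unchanged: the junior's gain came out of the veteran's rating.
print(round(junior + veteran, 1))
```

If the junior's true strength is really around 1700, the veteran has lost points in what was effectively an even game, and that is the downward pressure described above.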

Sonas proposes two changes to address this issue:

1. An immediate one-time rating boost to everyone under 2000 Elo. Players close to 1000 would get a nearly 400 point boost, whereas those closer to 2000 would get a smaller bump. This is designed to bring ratings in line with the observed results.
2. Changes in how ratings are calculated to prevent the problem from happening again. This includes changing the starting rating from 1000 to 1400, as well as other changes broadly aimed at preventing new players from being underrated.
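To get a feel for the first proposal, here is a rough sketch of a boost that shrinks linearly as you approach 2000. The function name and the 0.4 factor are my assumptions, chosen only so that a 1000 player gains about 400 points and a 2000 player gains nothing; check the report itself for the exact rule.

```python
def one_time_boost(rating: float, ceiling: float = 2000.0, factor: float = 0.4) -> float:
    """Hypothetical linear boost: close a fixed fraction of the gap to the ceiling.

    Assumption: factor=0.4 and ceiling=2000 are illustrative values that match
    the description (about +400 at 1000, tapering to 0 at 2000).
    """
    if rating >= ceiling:
        return rating  # players at or above the ceiling are left alone
    return rating + factor * (ceiling - rating)


for r in (1000, 1400, 1800, 2000, 2200):
    print(r, "->", round(one_time_boost(r)))
```

Under this sketch, the boost depends only on the current rating, which is exactly why it cannot tell a fast-improving junior from an adult who has been stable for years.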

Sonas builds a compelling case that the rating system is broken. He’s crunched the data from every FIDE game since 2012 and there doesn’t appear to be any way around the conclusion that lower rated players are performing much better than they should according to the rating system.

One curious omission from the report is any attempt to differentiate players by age, country, or other factors. The report deals exclusively in averages amongst rating groups, but of course, not all low rated players are equally underrated. Sonas even singles out juniors as the source of much of the rating deflation, but this observation doesn’t factor into his recommendation. Kids from India are famously underrated; grown-ups who have been hovering around 1600 for 20 years, not so much.

As such, the one-time rating increase for everyone under 2000 would bring some players closer to their “real” rating, while giving an undeserved windfall to others. Maybe Sonas just felt that any attempt to break up the rating adjustment based on factors like age would be too problematic to consider.

It is also worth pointing out that this report deals exclusively with FIDE ratings. The majority of readers are probably more concerned with Chess.com, Lichess, or USCF ratings (most tournaments in the United States are not FIDE rated).

Does the USCF rating system suffer from the same issues Sonas highlights for FIDE ratings? It seems probable that it would, since the underlying issue – an influx of underrated young players – is certainly also present in the US. But the USCF uses a different version of the Elo system than FIDE, with different calculations, so this would need to be tested. I’m not aware of any attempt to do this at present.

Anecdotally, adults who have returned to competitive chess after a break, like James Altucher and Ben Johnson, often describe the tournament scene as feeling much tougher. My experience was somewhat different, as I was able to increase my rating from the 2200s to 2400s after returning to chess as an adult, largely playing against juniors. But I think I might have been a much stronger player as an adult, for a variety of reasons unique to my own life trajectory.

What does all this mean for you as a chess player? One interesting conclusion is that, at least from a mathematical perspective, you are likely to gain points when playing up and lose points when playing down. This could be an argument for playing in higher sections, as far as the rules allow. I wouldn’t get too carried away with this, though. Deciding to play up can feel like a boss move, but beating players at your own or a lower level is also a challenge, just of a different kind. Additionally, if you play so far up that you’re not competitive with many of your opponents, that can be really demoralizing. Ideally, you want to play a lot of games against opponents who push your limits but whom you have a realistic shot at beating.
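Here is a back-of-the-envelope sketch of why playing up pays on average. It assumes the textbook logistic formula, a K-factor of 20, and treats the 15% figure quoted earlier as the lower-rated player's actual average score, all of which are simplifications of mine rather than anything taken from the report.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Textbook logistic Elo expectation (an assumption, not FIDE's tables)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def average_drift(own: float, opponent: float, actual_score: float, k: float = 20.0) -> float:
    """Average rating change per game when results beat (or miss) the prediction."""
    return k * (actual_score - expected_score(own, opponent))


# Illustrative inputs: a 1500 who actually scores 15% against 2100s.
lower = average_drift(1500, 2100, actual_score=0.15)
higher = average_drift(2100, 1500, actual_score=0.85)
print(f"1500 playing up:   {lower:+.2f} points per game")
print(f"2100 playing down: {higher:+.2f} points per game")
```

The two numbers are mirror images: whatever the lower-rated player gains on average by overperforming, the higher-rated player loses.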

It’s also tempting to read these results to mean that you, yourself, are underrated. Unfortunately, unless you are a rapidly improving junior, I don’t think this is all that likely to be true. Notwithstanding all the issues called out in Sonas’s report, the rating system as a whole does more or less work, in the sense that over time your rating will reflect your results. If you play in tournaments regularly, it’s hard to stay very underrated for very long. Many players perceive themselves to be underrated based on subjective factors: I know way more than I used to, my endgames are better, etc. But if there’s a discrepancy between your rating and your perception of your strength, it’s much more likely that your perception is wrong. The strengths you perceive may not in fact be as strong as you think they are, or they may be counterbalanced by weaknesses in other parts of your game that you’re not even aware of.

With that in mind, my biggest suggestion is this: if you’re underrated, prove it!

If you liked this, check out my newsletter, where I write weekly posts about chess, learning, and data: https://zwischenzug.substack.com/

