
Ratings Are Broken

It assumes that the rating estimate is a random variable that is normally distributed around the true strength. Or maybe the other way around. It makes no assumption about the distribution of players' strengths. Actual ratings are probably mostly normally distributed, but there is some deviation, as there are upper and lower limits set by the rules of the system and for practical reasons as well. For example, no human can be 3000, no matter how big the playing population is.
I seem to be far stronger playing blitz on Chess.com (better than 96.1% of players) than on Lichess (better than 85.4%). Why is that? There are a lot more strong blitz players on Lichess than Chess.com crammed around my level?
@Trumpelstiltskin said in #52:
> I seem to be far stronger playing blitz on Chess.com (better than 96.1% of players) than on Lichess (better than 85.4%). Why is that? There are a lot more strong blitz players on Lichess than Chess.com crammed around my level?
You'd better start another thread for that, as this thread is about FIDE ratings only.
"Kids from India are famously underrated; grown-ups who have been hovering around 1600 for 20 years, not so much."

Wrong assumption. Any player with 1600 is likely to play the majority of their games against opponents in the same ranking zone, including "underrated" youth. If the youngsters with 1600 are underestimated by about 160 points (according to Sonas), and their older opponent still manages to hold his 1600 against them, this just means he is also underestimated by the same 160 points.
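
A minimal sketch of that argument, assuming the standard Elo expected-score formula (the thread itself does not spell out a formula, and the 160-point shift is just the figure quoted above):

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

SHIFT = 160  # assumed underrating of the youngsters, in points

# Listed ratings: 1600 vs 1600.
listed = expected_score(1600, 1600)

# Both players underrated by the same amount: 1760 vs 1760.
both_shifted = expected_score(1600 + SHIFT, 1600 + SHIFT)

# Only the youngsters underrated: a true 1600 against true 1760 opponents.
only_kids_shifted = expected_score(1600, 1600 + SHIFT)

print(f"listed:            {listed:.3f}")
print(f"both shifted:      {both_shifted:.3f}")
print(f"only kids shifted: {only_kids_shifted:.3f}")
```

Only the rating difference enters the formula, so shifting both players by 160 changes nothing. A player who really was 1600 would be expected to score well below 50% against true 1760 opposition and would lose points over time instead of holding 1600.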
To avoid games with a rating difference of 350-400 or more, you have to choose the right pairing system, and that is where some of the problems come from. For example, an original Swiss pairs, in the 1st round, the top player of the upper half against the top player of the 2nd half, and so on in descending order.
So when this gap is 350 or more (suppose 350 is the 2 s.d. mark), you breed in uncertainty because of the larger deviations.
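
As a rough illustration of the first-round pairing described above, here is a small sketch. The field of ratings is made up, and `first_round_pairs` is a hypothetical helper, not any federation's actual pairing code:

```python
def first_round_pairs(ratings):
    """Pair the top half against the bottom half, in descending order.

    With an odd number of players the lowest-rated one is left unpaired
    (a bye), which this sketch simply ignores.
    """
    ordered = sorted(ratings, reverse=True)
    half = len(ordered) // 2
    return list(zip(ordered[:half], ordered[half:half * 2]))

# Hypothetical 8-player field, evenly spread over 700 points.
field = [2200, 2100, 2000, 1900, 1800, 1700, 1600, 1500]

pairs = first_round_pairs(field)
gaps = [higher - lower for higher, lower in pairs]

for (higher, lower), gap in zip(pairs, gaps):
    print(f"{higher} vs {lower}: gap {gap}")
```

In this made-up field every first-round game has a 400-point gap, which is exactly the range the post says you would want to avoid.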

Even more practical stuff: I have witnessed some youngsters who played a rated game against a rated player, but the results were not reported to the officials. That is fraud when the unrated player lost or drew! These results are used to estimate the starting Elo of unrated players.
It is also a waste of time for the rated player.

Combine just those two aberrations over, say, 10 Elo recalculation cycles, and you might have sampled more outliers than the parameters of the original (Gaussian) distribution assume.

And now the catharsis: look at the plot of the whole Elo distribution on Lichess. It is not a normal distribution; it looks more like a beta or gamma distribution.
I have been saying this for ages: my rating is around 1500. When I play against other players with a similar rating, I am unable to climb. However, when I play against opponents with ratings between 1800 and 2000, I easily gain +200 points.

Take a look at my rapid rating from June 2021. I started playing against higher-rated opponents and gained nearly +200 points. I did not become a better player overnight.
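
A rough way to put numbers on this, assuming plain Elo with K = 20. Both are assumptions made for illustration: Lichess actually uses Glicko-2, which also tracks a rating deviation, so the real point changes will differ.

```python
K = 20  # assumed K-factor, for illustration only

def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def rating_change(own: float, opponent: float, score: float) -> float:
    """Elo rating change for one game (score: 1 win, 0.5 draw, 0 loss)."""
    return K * (score - expected_score(own, opponent))

win_vs_equal = rating_change(1500, 1500, 1.0)
win_vs_higher = rating_change(1500, 1900, 1.0)
loss_vs_higher = rating_change(1500, 1900, 0.0)

print(f"win vs 1500:  {win_vs_equal:+.2f}")
print(f"win vs 1900:  {win_vs_higher:+.2f}")
print(f"loss vs 1900: {loss_vs_higher:+.2f}")
```

Under these assumptions a win against a 1900 is worth almost twice a win against a 1500, and a loss costs under two points. Scoring more than about 9% against 1900s is already enough to climb, which would suggest the 1500 was too low to begin with.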

In my opinion, Lichess’s rating system is also flawed.
@kamekura said in #50:
> if I misunderstand you, but does not Glicko assume ratings are normally distributed?

It assumes a normal distribution for the individual player's rating estimate. The population of players is not bound to any family with a limited number of parameters, such as the two-parameter normal distribution.

I suggest you look at the Lichess weekly rating distributions across time controls. I forgot where that is.

The individual distributions are about managing uncertainty. Yes, they are model assumptions, but not about the whole population. Each individual has its own trajectory in that two-parameter distribution family (I sometimes call that a space), but the ensemble result is only known for its average and, I think I remember, also the next moment order: the deviation from the average.

Best look at that weekly distribution to see what I mean.
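
A small simulation sketch of the distinction, with made-up numbers. The gamma-shaped population and the 60-point uncertainty are assumptions for illustration only, not Lichess data:

```python
import random
import statistics

def skewness(xs):
    """Sample skewness: roughly 0 for a symmetric (e.g. normal) sample."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed population of true strengths: skewed, NOT normal.
population = [800 + rng.gammavariate(2.0, 350.0) for _ in range(5000)]

# One player's rating estimate: normal around a true strength of 1500.
one_player = [rng.gauss(1500, 60) for _ in range(5000)]

print(f"population skewness: {skewness(population):.2f}")
print(f"one player skewness: {skewness(one_player):.2f}")
```

The per-player normal model sits happily alongside a clearly skewed population: the normality is about the uncertainty of each estimate, not about how strengths are spread across players.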
There is a browser extension that displays some user performance data.
It gets displayed on the upper left side of the screen.
See the picture that the author shows on the Chrome Web Store page for lichess-custom-stats:
chrome.google.com/webstore/detail/lichess-custom-stats/ppolcdjceepccgcemodacgafjcoaemlh

Not sure what the Quote % is about, but maybe it's win percentage.
But what I like best is the rating indicated by "Avg Opponent".

Seeing an Avg Opponent rating much higher than their own rating is an interesting performance stat.
In a way, the rating difference shows whether a rating is inflated or deflated.
If the Avg Opponent rating is similar to their own, then I would assume it is neither inflated nor deflated.
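
The comparison itself is simple arithmetic. A sketch with made-up numbers follows; the ratings are hypothetical, and I do not know how the extension actually computes its "Avg Opponent" figure:

```python
from statistics import fmean

# Hypothetical player and recent opponents, for illustration only.
own_rating = 1500
opponent_ratings = [1820, 1900, 1760, 1950, 1880]

avg_opponent = fmean(opponent_ratings)
gap = avg_opponent - own_rating

print(f"Avg Opponent: {avg_opponent:.0f}")
print(f"Gap to own rating: {gap:+.0f}")
```

Here the average opponent is several hundred points above the player's own rating, which is the kind of gap the stat makes visible at a glance.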

Ratings are serving their purpose well.