Elo rating

The Elo rating system is a method for quantifying the relative strength of players in zero-sum games such as Hex. It was originally developed by Arpad Elo on behalf of the United States Chess Federation around 1960.

Theory

Assumptions

The odds of a player X winning against a player Y are defined as the ratio of the number of games X wins to the number of games Y wins, taken over a large number of games between X and Y. For example, if X wins 75% of the time, X's odds of winning are 3 : 1 (or 3/1).

The Elo rating system is based on the assumption that odds are multiplicative, i.e., if the odds of X winning against Y are a : b and the odds of Y winning against Z are b : c, then the odds of X winning against Z are a : c.
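The following is a minimal sketch in Python (not from the article, purely illustrative): if odds a : b are represented as the single number a/b, the multiplicative assumption just says that odds compose by multiplication.

    # Represent odds a : b as the single number a/b (expected wins per loss).
    def compose_odds(odds_x_vs_y, odds_y_vs_z):
        # Under the multiplicative assumption, if X vs Y has odds a : b and
        # Y vs Z has odds b : c, then X vs Z has odds a : c = (a/b) * (b/c).
        return odds_x_vs_y * odds_y_vs_z

    # Example: X beats Y with odds 3 : 1 and Y beats Z with odds 2 : 1,
    # so X is assumed to beat Z with odds 6 : 1.
    print(compose_odds(3.0, 2.0))  # 6.0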

Of course this assumption is not strictly true in practice, for a number of reasons. For example, if player X has a particular weakness that Y knows how to exploit, and player Y has a weakness that Z knows how to exploit, and player Z has a weakness that X knows how to exploit, it is quite possible that in matchups between two players, X on average loses to Y, Y on average loses to Z, and Z on average loses to X. Nevertheless, the assumption that odds are multiplicative seems to hold at least approximately in many practical situations.

The Elo rating is a relative rating. Specifically, the odds of X winning against Y are meant to be reflected in the difference between X's and Y's Elo ratings. By design, Elo ratings do not have an absolute meaning, as the strengths of players are always calibrated in relation to those of other players they have played against. For example, there is no absolute definition of how good a player must be to have an Elo rating of 800; indeed, this can differ from game to game (e.g., Hex vs. chess), from organization to organization, and from website to website, and may also change over time. It would be very difficult to objectively compare the strength of a player from 1942 to that of a player from 2020, since it is not likely that they have ever played against each other or against a common opponent. Moreover, players' strengths change over time. However, the statement "player X's Elo rating is 100 points higher than that of player Y" theoretically has a well-defined meaning (given the assumption on odds, above).

Definition

By definition, the Elo rating is calibrated in such a way that a difference of 400 Elo points corresponds to odds of 10 : 1.

In formulas, this means that the difference between two players' Elo ratings is 400 times the base-10 logarithm of the odds. Thus, if the odds of player X winning against player Y are a : b, then player X's true Elo rating is supposed to exceed that of player Y by 400 log₁₀(a/b). Conversely, if player X's Elo rating exceeds that of player Y by N points, the odds of X winning against Y are 10^(N/400) : 1.
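As a concrete illustration, here is a minimal Python sketch of these conversions (the function names are my own, chosen for this example):

    import math

    def odds_from_diff(diff):
        # Odds of the higher-rated player winning, given a rating difference
        # in points: a difference of 400 points corresponds to odds of 10 : 1.
        return 10 ** (diff / 400)

    def diff_from_odds(odds):
        # Rating difference implied by odds a : b, passed as the number a/b.
        return 400 * math.log10(odds)

    def win_probability(diff):
        # Probability of winning, p = odds / (odds + 1).
        odds = odds_from_diff(diff)
        return odds / (odds + 1)

    # Example: a 200-point advantage gives odds of about 3.16 : 1,
    # i.e. roughly a 76% chance of winning (compare the table below).
    print(odds_from_diff(200))    # ~3.1623
    print(win_probability(200))   # ~0.7597
    print(diff_from_odds(3))      # ~190.8 (points corresponding to odds of 3 : 1)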

The following table translates various differences in Elo rating into the corresponding odds and probability of winning:

Relative Elo rating   Odds of winning     Probability of winning
  0                   1.0000 =  1 : 1     50%
 20                   1.1220 ≈  9 : 8     53%
 50                   1.3335 ≈  4 : 3     57%
 75                   1.5399 ≈  3 : 2     61%
100                   1.7783 ≈  7 : 4     64%
150                   2.3714 ≈  7 : 3     70%
200                   3.1623 ≈  3 : 1     76%
250                   4.2170 ≈  4 : 1     81%
300                   5.6234 ≈  6 : 1     85%
350                   7.4989 ≈  7 : 1     88%
400                   10.000 = 10 : 1     91%
500                   17.783 ≈ 18 : 1     95%
600                   31.623 ≈ 32 : 1     97%
700                   56.234 ≈ 56 : 1     98%
800                   100.00 = 100 : 1    99%

Calculating Elo ratings

To do...

Hex sites that use Elo ratings

To do...