Analysis Problems
I have described the conception of my rating system for Insurgency in an earlier post. Somebody suggested applying it to players in League of Legends pro play, and that’s what I will talk about in this post.
As with all rating systems, there are always inherent problems, and their severity depends on the actual use case of the rating system. In our case we have to note the following points:
Connectivity of the rating graph
Essentially a rating system can be imagined as a graph in which the players are the nodes, and games between them are edges. Obviously the more people play against one another, the higher the connectivity of the graph and the more accurate the rating system. Unfortunately for us our graph is gonna look more like this:
Strongly connected sets of nodes with only a few connections between them, created by international tournaments, mainly the world championship. This can lead to ratings being inaccurate, and some team in some shithole region like North America might be rated higher than a Korean team because they lost the first round in the world championship, but ran over everything in Europe.
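To make the connectivity problem concrete, here is a minimal sketch that finds the connected clusters in a game graph and counts the games that bridge regions. It uses teams instead of individual players as nodes to keep it short, and the team names, the `region_of` mapping and both helper functions are made up for illustration.

```python
from collections import defaultdict


def region_components(games):
    """Return the connected clusters of a graph given as (team_a, team_b) games."""
    adjacency = defaultdict(set)
    for a, b in games:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def bridge_games(games, region_of):
    """Return the games whose two teams come from different regions."""
    return [(a, b) for a, b in games if region_of[a] != region_of[b]]


# Hypothetical placeholder teams, not real results.
region_of = {"KR1": "KR", "KR2": "KR", "KR3": "KR",
             "EU1": "EU", "EU2": "EU", "EU3": "EU"}
domestic = [("KR1", "KR2"), ("KR2", "KR3"), ("KR1", "KR3"),
            ("EU1", "EU2"), ("EU2", "EU3"), ("EU1", "EU3")]
international = [("KR1", "EU1")]

print(len(region_components(domestic)))                   # two isolated clusters
print(len(region_components(domestic + international)))   # one cluster
print(bridge_games(domestic + international, region_of))  # the single bridging game
```

In this toy graph a single international game is all that ties the two regions together, so the relative level of both regions hangs on that one result.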
Old players
The increase of skill can be a problem. Just because HotshotGG won everything in 2012 doesn’t mean he’d still win everything now. This problem occurs in other games as well, and it is usually solved by “decaying” the rating, or more accurately by increasing the rating’s uncertainty (sigma) over time. Such decay is not yet implemented in the Insurgency rating system, since it has not become a problem there and all top players are quite active.
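A minimal sketch of what such a decay could look like, assuming sigma is inflated with the time a player has been inactive and capped at the initial sigma. This is not implemented in my system; the function name and the `growth_per_day` value are made up for illustration.

```python
import math


def decayed_sigma(sigma, days_inactive, sigma_max=833.0, growth_per_day=5.0):
    """Inflate sigma after a period of inactivity.

    Independent uncertainties add as variances, so the growth is applied
    in quadrature. `growth_per_day` is a hypothetical tuning knob.
    """
    inflated = math.sqrt(sigma ** 2 + growth_per_day ** 2 * days_inactive)
    # Never become more uncertain than a brand-new player.
    return min(inflated, sigma_max)


print(decayed_sigma(200.0, 0))       # unchanged for an active player
print(decayed_sigma(200.0, 365))     # somewhat wider after a year away
print(decayed_sigma(200.0, 10**6))   # capped at sigma_max
```

The mean stays where it is; only the confidence in it shrinks, so a returning player's rating moves quickly again once new results come in.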
Rosters do not change frequently
Apart from the fact that the graph will be divided into subsets that are more strongly connected, there is also the problem that players (obviously) change teams only rarely, so it might be difficult to tell individual skills apart in teams that have a low fluctuation in their roster.
Data acquisition
I sneaked the data from Game of Legends, a really cool site, check it out. I used this site for scraping because it has a robots.txt and is well structured. I was very conservative with querying, so I don’t think I bothered the owner too much.
Adapting my rating system
Adapting the rating system itself is very easy and takes only ~50 additional lines of Python; you can check it out in my repository. It’s very straightforward. I will use a higher tau value, which essentially tells the rating system that the curves have to converge slower, in our case because we will usually have the same team of players and therefore learn less about the individual player. To be precise, we will use a mean of 1500 and a sigma of 833 (like we did in Insurgency) and a tau of 40 (which is five times the tau we used in Insurgency).
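A toy sketch of why a higher tau slows convergence, assuming tau acts as an additive dynamics term that widens sigma before every game (as in TrueSkill-style systems). The constant `shrink` factor is a made-up stand-in for the information gained from a result; a real update depends on the opponents and the outcome.

```python
import math


def settle_sigma(sigma_start, tau, games, shrink=0.9):
    """Track sigma over many games in a toy model.

    Each game first adds tau**2 to the variance (the dynamics step), then
    multiplies the variance by `shrink`, a hypothetical constant standing
    in for what the result teaches us.
    """
    variance = sigma_start ** 2
    for _ in range(games):
        variance += tau ** 2
        variance *= shrink
    return math.sqrt(variance)


# tau = 8 is the Insurgency value implied by "five times", tau = 40 is used here.
for tau in (8, 40):
    print(tau, round(settle_sigma(833, tau, games=500), 1))
```

In this toy model sigma settles at 3 * tau, so the larger tau keeps every player's rating five times more uncertain in the long run, and no rating ever becomes completely rigid.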
Best players according to the system (1689 total players)
I know none of the top 20 players. But it turns out that most of the names I do know indeed appear in the top 10%.
Rank: 1 JackeyLove | 3556 | mean: 3890 | var: 185 | WinRatio: 75% | 89 Games |
Rank: 2 Aodi | 2936 | mean: 3381 | var: 247 | WinRatio: 23% | 30 Games |
Rank: 3 Ablazeolive | 2712 | mean: 3171 | var: 255 | WinRatio: 44% | 34 Games |
Rank: 4 West | 2602 | mean: 2938 | var: 186 | WinRatio: 51% | 88 Games |
Rank: 5 Mouse | 2553 | mean: 2913 | var: 199 | WinRatio: 68% | 242 Games |
… | … | … | … | … | … |
Rank: 20 Broxah | 2114 | mean: 2458 | var: 191 | WinRatio: 63% | 161 Games |
Rank: 33 Faker | 1967 | mean: 2326 | var: 199 | WinRatio: 68% | 506 Games |
Rank: 48 Perkz | 1832 | mean: 2159 | var: 181 | WinRatio: 63% | 258 Games |
Rank: 54 Caps | 1827 | mean: 2171 | var: 191 | WinRatio: 62% | 170 Games |
Rank: 67 Sneaky | 1773 | mean: 2132 | var: 199 | WinRatio: 62% | 434 Games |
Rank: 104 Biofrost | 1632 | mean: 1963 | var: 183 | WinRatio: 65% | 225 Games |
Rank: 109 Mithy | 1624 | mean: 1951 | var: 181 | WinRatio: 61% | 309 Games |
Rank: 112 PowerOfEvil | 1616 | mean: 1936 | var: 177 | WinRatio: 46% | 277 Games |
Rank: 120 Impact | 1583 | mean: 1898 | var: 174 | WinRatio: 63% | 362 Games |
Rank: 168 Bjergsen | 1481 | mean: 1807 | var: 181 | WinRatio: 63% | 434 Games |
Imaqtpie as usual wins fewer games than good old Dom and is somehow higher rated:
Rank: 420 Imaqtpie | 1157 | mean: 1596 | var: 243 | WinRatio: 36% | 95 Games |
Rank: 527 IWillDominate | 1041 | mean: 1395 | var: 196 | WinRatio: 49% | 138 Games |
And these two were a bit surprising to me:
Rank: 540 xPeke | 1029 | mean: 1498 | var: 260 | WinRatio: 50% | 216 Games |
Rank: 1219 Rekkles | 226 | mean: 568 | var: 189 | WinRatio: 57% | 393 Games |
You can check the full results here and the original data here to verify the results.
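A note on reading the rows above: the first number after the name looks like a conservative rating of the form mean - k * var, with k = 1500 / 833 (about 1.8), so that a fresh player starts at 0. This is inferred from the published numbers, not something stated anywhere, and `conservative_rating` is a name made up for this sketch. A quick check against four of the rows:

```python
K = 1500 / 833  # assumption: initial mean divided by initial sigma

# (name, published rating, mean, var) copied from the rows above.
rows = [
    ("JackeyLove", 3556, 3890, 185),
    ("Aodi", 2936, 3381, 247),
    ("Faker", 1967, 2326, 199),
    ("Rekkles", 226, 568, 189),
]


def conservative_rating(mean, spread, k=K):
    """Lower bound on skill: the mean minus k times the uncertainty column."""
    return mean - k * spread


for name, published, mean, spread in rows:
    estimate = conservative_rating(mean, spread)
    print(f"{name}: published {published}, reconstructed {estimate:.0f}")
```

The reconstructed values land within a few points of the published ones, which is about what you would expect when mean and var are only shown as rounded integers.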
Conclusion
The system rated some very unknown (at least unknown to me) players at the top, but just because they are unknown doesn’t mean they aren’t indeed the best. Rekkles and xPeke being rated so low is interesting. It could be an indicator that there is missing data, or that there is a systematic error in our analysis, or maybe they just aren’t that good. A rating system can be judged by the quality of its predictions, so expect an evaluation based on new data in the future.
Answering questions from readers
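A minimal sketch of how such an evaluation could be scored, assuming the rating system outputs a win probability for one side before each game. The `evaluate` helper and the three example predictions are made up for illustration; they are not results.

```python
import math


def evaluate(predictions):
    """Score a list of (p_team_a_wins, team_a_won) pairs.

    Returns (accuracy, log_loss). Accuracy only asks whether the favourite
    won; log loss also punishes being confidently wrong.
    """
    eps = 1e-12  # guards against log(0) for predictions of exactly 0 or 1
    hits = sum((p > 0.5) == won for p, won in predictions)
    total_loss = -sum(
        math.log(max(p if won else 1.0 - p, eps)) for p, won in predictions
    )
    return hits / len(predictions), total_loss / len(predictions)


# Hypothetical predictions, purely to show the call.
example = [(0.8, True), (0.3, False), (0.6, False)]
accuracy, log_loss = evaluate(example)
print(f"accuracy={accuracy:.2f} log_loss={log_loss:.3f}")
```

Scoring only on games played after the ratings were computed matters here, since scoring on the same games that produced the ratings would flatter the system.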
Question: How can somebody with 23% winrate be rank 2?
Aodi lost a lot of games early in his career, but then started winning against very good teams. A good rating system will value newer results over older ones. This means that even if you start out 0-100 in wins and losses, you could get to the very top in no time if you start winning against very good players.
Winrate is always a flawed indicator of how good a person is: just because you lost ten times against SKT doesn’t mean you are necessarily a bad team, it just means that you are worse than SKT.
Question: Minn is top 10 with only 2 games, how can that be?
Minn won two games against very highly rated compositions, with relatively low rated players on his side. The problem is that the other 9 players in the game already had a lot of games and stable ratings, so while their ratings did not change much, his did.
This isn’t an error per se, but it is a statistical artifact. We know why it occurs, and we should therefore ignore it until Minn has more games.
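A rough sketch of the size of this effect, assuming a TrueSkill-style update in which all players in a game share the same surprise term and each player's mean shifts in proportion to his own variance. `relative_step` is a name made up for this sketch, and 180 is simply a typical var among the established players listed above.

```python
def relative_step(sigma_player, sigma_reference):
    """How many times larger one player's mean shift is than another's,
    assuming the shift scales with sigma squared within the same game."""
    return (sigma_player / sigma_reference) ** 2


# A brand-new player (sigma 833) next to an established one (var around 180).
print(round(relative_step(833, 180), 1))
```

Under that assumption a newcomer's first game moves his mean roughly twenty times as far as it moves a veteran's, which is how two upset wins can be enough to reach the top 10.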
Question: Doesn’t the rating of Minn mean that you should use an even higher tau value?
Yes, perhaps. As soon as new data becomes available I will see if an even higher tau value will increase prediction quality.
Question: Would it be a good idea to rate the world finals first to get a baseline for the different regions and rate the local tournaments second?
At first sight: Yes. Since we have a higher variance of ratings at the beginning, rating the international tournaments first would likely create a good baseline for the individual regions. But I don’t think the trade-off in accuracy that would occur due to the combination of developing skills and rating competitions out of order is worth it.
Feel free to send me a mail to share your thoughts or ask a question!
The cover picture is CC-SA licensed by Clément Grandjean, taken from Wikimedia.