Analysis Problems 

I described the conception of my rating system for Insurgency in an earlier post. Somebody suggested applying it to players in League of Legends pro play, and that's what I will talk about in this post.

As with all rating systems, there are inherent problems, and their severity depends on the actual use case of the rating system. In our case we have to note the following points:


Connectivity of the rating graph 

Essentially, a rating system can be imagined as a graph in which the players are the nodes and the games between them are the edges. Obviously, the more people play against one another, the higher the connectivity of the graph and the more accurate the rating system. Unfortunately for us, our graph is gonna look more like this:

Connectivity graph

Strongly connected sets of nodes with only a few connections created by international tournaments, mainly the World Championship. This can lead to inaccurate ratings: some team in some shithole region like North America might be rated higher than a Korean team because, even though they lost in the first round of the World Championship, they ran over everything in their home region.
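To make the connectivity problem concrete, here is a minimal sketch (team names and matches are made up for illustration) that treats matches as edges of an undirected graph and counts connected components. A single international match is enough to merge two otherwise isolated regions:

```python
from collections import defaultdict

def components(edges):
    """Return the connected components of an undirected match graph."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# Hypothetical match graph: dense play inside each region, nothing between them.
regional = [("G2", "FNC"), ("FNC", "OG"), ("SKT", "GEN"), ("GEN", "KT")]
print(len(components(regional)))     # 2 -- two disconnected regions
worlds = regional + [("G2", "SKT")]  # one Worlds match bridges them
print(len(components(worlds)))       # 1
```

Within each component the ratings are internally consistent, but how the components compare to each other hinges entirely on those few bridging edges.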

Old players 

Skill changing over time can be a problem. Just because HotshotGG won everything in 2012 doesn't mean he'd still win everything now. This problem occurs in other games as well, and it is usually solved by "decaying" ratings, or more accurately, by increasing the variance (sigma) over time. Such decay is not yet implemented in the Insurgency rating system, since it has not become a problem there: all top players are quite active.
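If I were to implement decay, a Glicko-style sketch might look like the following. The constant c and the cap are assumptions for illustration, not values from my system:

```python
import math

def decayed_sigma(sigma, days_inactive, c=5.0, sigma_max=833.0):
    """Glicko-style uncertainty growth: sigma rises with inactivity,
    capped at the initial (prior) value. c is a hypothetical tuning
    constant controlling how fast confidence erodes."""
    return min(math.sqrt(sigma**2 + c**2 * days_inactive), sigma_max)

print(decayed_sigma(180, 0))      # 180.0: an active player keeps his sigma
print(decayed_sigma(180, 10**6))  # 833.0: long inactivity resets to the prior
```

The mean stays where it was; we just become less certain about it, so a returning veteran's rating can move quickly in either direction.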

Rosters do not change frequently 

Apart from the fact that the graph will be divided into subsets that are more strongly connected internally, there is also the problem that players (obviously) change teams only relatively rarely, so it can be difficult to tell individual skills apart in teams with low roster fluctuation.


Data acquisition 

I scraped the data from Game of Legends, a really cool site, check it out. I used this site for scraping because it has a robots.txt and is well structured. I was very conservative with my querying, so I don't think I bothered the owner too much.
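For the curious, the polite-scraping pattern boils down to checking robots.txt and sleeping between requests. This is a generic stdlib sketch, not my actual scraper; the user agent, delay, and example rules are all made up:

```python
import time
import urllib.robotparser
import urllib.request

def polite_fetch(url, rp, delay=2.0, agent="rating-scraper"):
    """Fetch url only if robots.txt allows it, waiting between requests.
    rp is a pre-loaded RobotFileParser; delay and agent are assumptions."""
    if not rp.can_fetch(agent, url):
        return None
    time.sleep(delay)  # conservative querying, as described above
    with urllib.request.urlopen(url) as resp:
        return resp.read()

# Parse some hypothetical rules instead of fetching a live robots.txt.
rp = urllib.robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /private/"])
print(rp.can_fetch("rating-scraper", "https://example.com/matches"))  # True
```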


Adapting my rating system 

Adapting the rating system itself is very easy and takes only ~50 additional lines of Python; you can check it out in my repository. It's very straightforward. I will use a higher tau value, which essentially tells the rating system that the curves have to converge more slowly, in our case because we will usually see the same team of players and therefore learn less about each individual player. To be precise, we will use a mean of 1500 and a sigma of 833 (like we did in Insurgency) and a tau of 40 (five times the tau we used in Insurgency).
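As a sketch, the adaptation boils down to a handful of constants. The container and names below are hypothetical; the real repository keeps these values inside the rating code itself:

```python
from dataclasses import dataclass

@dataclass
class RatingConfig:
    """Hypothetical container for the environment parameters."""
    mu: float = 1500    # prior mean, same as the Insurgency setup
    sigma: float = 833  # prior uncertainty
    tau: float = 40     # 5x the Insurgency tau: slower convergence, since
                        # fixed rosters reveal less about each player

lol = RatingConfig()
print(lol.tau / 8)  # 5.0: five times the Insurgency tau of 8
```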


Best players according to the system (1689 total players) 

I know none of the top 20 players. But it turns out that most of the names I do know indeed appear in the top 1%.

Rank: 1 JackeyLove 3556 mean: 3890 var: 185 WinRatio: 75% 89 Games
Rank: 2 Aodi 2936 mean: 3381 var: 247 WinRatio: 23% 30 Games
Rank: 3 Ablazeolive 2712 mean: 3171 var: 255 WinRatio: 44% 34 Games
Rank: 4 West 2602 mean: 2938 var: 186 WinRatio: 51% 88 Games
Rank: 5 Mouse 2553 mean: 2913 var: 199 WinRatio: 68% 242 Games
Rank: 20 Broxah 2114 mean: 2458 var: 191 WinRatio: 63% 161 Games
Rank: 33 Faker 1967 mean: 2326 var: 199 WinRatio: 68% 506 Games
Rank: 48 Perkz 1832 mean: 2159 var: 181 WinRatio: 63% 258 Games
Rank: 54 Caps 1827 mean: 2171 var: 191 WinRatio: 62% 170 Games
Rank: 67 Sneaky 1773 mean: 2132 var: 199 WinRatio: 62% 434 Games
Rank: 104 Biofrost 1632 mean: 1963 var: 183 WinRatio: 65% 225 Games
Rank: 109 Mithy 1624 mean: 1951 var: 181 WinRatio: 61% 309 Games
Rank: 112 PowerOfEvil 1616 mean: 1936 var: 177 WinRatio: 46% 277 Games
Rank: 120 Impact 1583 mean: 1898 var: 174 WinRatio: 63% 362 Games
Rank: 168 Bjergsen 1481 mean: 1807 var: 181 WinRatio: 63% 434 Games

Imaqtpie, as usual, wins fewer games than good old Dom and is somehow rated higher:

Rank: 420 Imaqtpie 1157 mean: 1596 var: 243 WinRatio: 36% 95 Games
Rank: 527 IWillDominate 1041 mean: 1395 var: 196 WinRatio: 49% 138 Games

And these two were a bit surprising to me:

Rank: 540 xPeke 1029 mean: 1498 var: 260 WinRatio: 50% 216 Games
Rank: 1219 Rekkles 226 mean: 568 var: 189 WinRatio: 57% 393 Games


You can check the full results here and the original data here to verify the results.


Conclusion 

The system rated some very unknown (at least to me) players at the top, but just because they are unknown doesn't mean they aren't indeed the best. Rekkles and xPeke being rated so low is interesting. It could be an indicator of missing data, or of a systematic error in our analysis, or maybe they just aren't that good. A rating system can be judged by the quality of its predictions, so expect an evaluation based on new data in the future.


Answering questions from readers 

Question: How can somebody with 23% winrate be rank 2?

Aodi lost a lot of games early in his career, but then started winning against very good teams. A good rating system will value newer results over older ones. This means that even if you start out with a 0-100 win-loss record, you can get to the very top in no time if you start winning against very good players.

Winrate is always a flawed indicator of how good a player is: just because you lost ten times against SKT doesn't mean you are necessarily a bad team, it just means you are worse than SKT.
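A toy Elo calculation illustrates both points (Elo here is a stand-in for the real system, which also tracks uncertainty): a player who starts 0-10 but then beats much stronger opponents ends up well above the starting rating, despite a 50% winrate:

```python
def elo_update(r_a, r_b, score_a, k=32):
    """One Elo update step; score_a is 1 for a win, 0 for a loss."""
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))
    return r_a + k * (score_a - expected_a)

r = 1500.0
for _ in range(10):  # lose 10 games to mid-tier (1500-rated) teams
    r = elo_update(r, 1500, 0)
for _ in range(10):  # then beat much stronger (2200-rated) teams
    r = elo_update(r, 2200, 1)
print(round(r))      # well above the 1500 start despite a 50% winrate
```

Losses to strong opponents cost almost nothing, and wins against them pay out almost the full k, which is exactly why winrate alone tells you so little.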



Question: Minn is top 10 with only 2 games, how can that be?

Minn won two games against very highly rated lineups, playing alongside relatively low-rated players. The problem is that the other nine players in those games already had a lot of games and stable ratings, so while their ratings did not change much, his did.

This isn't an error per se, but it is a statistical artifact. We know why it occurs, and we should therefore ignore it until Minn has more games.
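As an aside, the listed rating appears to be a conservative estimate that already penalizes uncertainty. Assuming a mean-minus-k-sigma convention (the factor k ≈ 1.8 is my guess, fitted to the table above, not a value taken from the code), the numbers line up:

```python
def conservative(mean, sigma, k=1.8):
    """Displayed rating as mean minus k standard deviations -- a guessed
    convention; k=1.8 roughly reproduces the table above."""
    return mean - k * sigma

print(round(conservative(3890, 185)))  # close to JackeyLove's listed 3556
print(round(conservative(2326, 199)))  # close to Faker's listed 1967
```

So a two-game player like Minn is already being docked for his large sigma; he ranks high despite the penalty, not because it is missing.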



Question: Doesn’t the rating of Minn mean that you should use an even higher tau value?

Yes, perhaps. As soon as new data becomes available, I will check whether an even higher tau value increases prediction quality.



Question: Would it be a good idea to rate the world finals first to get a baseline for the different regions and rate the local tournaments second?

At first sight: yes. Since ratings have a higher variance at the beginning, rating the international tournaments first would likely create a good baseline for the individual regions. But I don't think the trade-off in accuracy that would occur, due to the combination of developing skills and rating competitions out of order, is worth it.



Feel free to send me a mail to share your thoughts or ask a question!



The cover picture is CC-SA licensed by Clément Grandjean, taken from Wikimedia.