According to the latest entry in Arlington69’s series of posts on momentum and handicapping, prime icons create fewer goals than their lower rated versions. According to Arlington69, one possible explanation is that handicapping stops the highest rated icons from performing to their potential.
Is he finally on to something or is this another dud torpedo from his side?
The basis of Arlington69’s claim is a small analysis he made using FUTBIN’s Player Game Performance (PGP) section. For those who aren’t familiar with PGP, FUTBIN collects and displays performance data from the transfer market: it scans all player items up for sale and collects data about matches played, assists and goals scored.
Using PGP-data, Arlington69 made the statistic presented below. He compares the performance measured in “goals created” of prime, mid and base versions of a number of icons.
What we see above is that Base and Mid versions of an icon in some cases appear to perform better than the prime version. At a glance, this would seem to fit with the handicapping theory, which Arlington69 happens to be a strong proponent of:
“The highest rated icons often did not perform as well as the lower rated in fact only 3 of 11 prime icon strikers created more goals than their lower rated versions. One possible explanation for this is handicapping stopping the highest rating Icons performing to their potential.”
(– Arlington69’s summary post on Reddit)
So, is handicapping a possible explanation for the observations above, or should we perhaps look elsewhere for the truth?
A recurring problem in Arlington69’s posts is cherry picking, i.e. the practice of ignoring all the data that doesn’t fit with one’s preferred narrative. When the neutral observer looks at Arlington69’s data, a couple of notable details stand out:
- 3 in 11 prime icons perform better than both lower rated versions – Arlington69 makes no attempt to explain how this possibly could fit with the handicapping narrative.
- In 6 out of 15 cases, the prime edition outperformed the base version.
- In 8 out of 15 cases, the base version performs the worst.
- 11 of 15 prime icons outperform at least one of the lower rated versions.
- In only 1 case (Del Piero) does the base version perform better than the mid version whilst the mid version performs better than the prime version.
Although a small set of the observations fit with the handicapping theory, most of them don’t. What this first and foremost tells us is that the variation seen in the table above most likely has causes other than handicapping.
And we have a pretty good idea about what those reasons are.
First of all, the variations could be caused by differences in human skill. We have used PGP data on earlier occasions. One of the key learnings is that cards don’t score goals on their own. It takes a human player with sufficient skills and a card with sufficient stats to score goals. And statistically speaking, it can be quite difficult to separate performance caused by stats from performance caused by human skill. You cannot assume that the players using two different cards are evenly skilled. There could be cases where a base icon outperforms the prime version because the sampled players who used the base version were better than the sampled players who used the prime version.
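To make the skill argument concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the idea that observed output is skill times a card multiplier, the multipliers themselves, and the skill values. It is not PGP data and not a model of how the game works. It only shows that a card with better stats can still post the lower average if its users happen to be weaker.

```python
from statistics import mean

def observed_rate(card_multiplier, user_skills):
    """Average goals created per match across the sampled users.

    Hypothetical model: each user's output is skill * card_multiplier.
    """
    return mean(skill * card_multiplier for skill in user_skills)

# Assumed numbers: the prime card is 10 % better than the base card,
# but the sampled base users are more skilled on average.
prime_rate = observed_rate(1.10, [0.9, 1.0, 1.1])  # average skill 1.0
base_rate = observed_rate(1.00, [1.1, 1.2, 1.3])   # average skill 1.2

print(f"prime: {prime_rate:.2f}  base: {base_rate:.2f}")
```

In this toy setup the base card comes out ahead even though the prime card is better by construction, so a raw comparison of averages says nothing about the cards unless user skill is controlled for.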
Secondly, the variations could be caused by statistical inaccuracy. From a statistical perspective, Arlington69’s sample sizes are far too small, and probably also much smaller than he realizes. He uses the number of matches (“games”) to illustrate the volume of his study, but this is a clear mistake:
A sample covering 76,865 matches played with Ronaldo doesn’t give you 76,865 independent observations. Those 76,865 matches weren’t played by 76,865 different human players but, to a large extent, by the same players, so many of the observations aren’t independent.
The sample size is the number of independent observations, which is unknown but obviously a lot lower than 76,865.
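A rough way to see how quickly the number of independent observations shrinks is the standard design-effect approximation for clustered samples, n_eff = n / (1 + (m - 1) * rho). Here m is the average number of matches per human player and rho is the intraclass correlation, i.e. how similar matches by the same player are. The sketch below is illustrative only: neither m nor rho is known for the PGP data, so the values tried are pure assumptions.

```python
def effective_sample_size(total_matches, matches_per_player, icc):
    """Approximate number of independent observations in a clustered sample.

    Uses the design effect 1 + (m - 1) * icc for equally sized clusters.
    """
    design_effect = 1 + (matches_per_player - 1) * icc
    return total_matches / design_effect

total_matches = 76_865  # match count quoted in the text

# Hypothetical (matches per player, intraclass correlation) scenarios.
for m, icc in [(1, 0.0), (50, 0.1), (200, 0.3)]:
    n_eff = effective_sample_size(total_matches, m, icc)
    print(f"m={m:>3}, icc={icc:.1f} -> about {n_eff:,.0f} independent observations")
```

Only in the first scenario, where every match is played by a different person, does the match count equal the sample size. As soon as players contribute many correlated matches each, the effective sample is a small fraction of the headline figure.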
And to demonstrate that sample accuracy is a big problem, we repeated Arlington69’s experiment for the three players where he found the prime version to be the worst performing version – on a different date. This is what we found:
| Card | Arlington69’s figure | Our figure on a different date (change) |
|---|---|---|
| Prime Del Piero | 1.12 | 1.18 (+5 %) |
| Mid Owen | 1.75 | 1.43 (-18 %) |
| Base Shevchenko | 1.35 | 1.42 (+5 %) |

So, by simply looking up the same data on a different date, we got variations of between 5 and 18 %, which shows that the performance variations found by Arlington69 could be random. Thus, Arlington69’s study doesn’t suggest, let alone confirm, that prime icons in general perform worse than lower rated versions. In fact, the observations tell us absolutely nothing about how prime icons perform vis-a-vis lower rated icons.
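For anyone who wants to repeat the check with their own lookups, the comparison boils down to a relative change per card. The sketch below uses the three pairs of figures from the table and measures the change relative to the original figure, which is an assumption about how such a comparison is best expressed; the function and variable names are made up for this example.

```python
def relative_change(original, repeated):
    """Percentage change of the repeated figure relative to the original."""
    return (repeated - original) / original * 100

# (original figure, figure looked up on a different date), from the table.
lookups = {
    "Prime Del Piero": (1.12, 1.18),
    "Mid Owen": (1.75, 1.43),
    "Base Shevchenko": (1.35, 1.42),
}

for card, (original, repeated) in lookups.items():
    print(f"{card}: {relative_change(original, repeated):+.1f} %")
```

If the swing between two lookups of the same card is as large as the gap between two versions of that card, the gap cannot be told apart from noise.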
Arlington69 cautiously claims that handicapping is a possible explanation for the variation he has found. This weak statement might be true, just as his observations don’t rule out the Yeti, that 9/11 was an inside job or that chemtrails are used to control our minds.
But the mere fact that your data doesn’t rule out your conspiracy theory is in all essence a trivial observation.
And if Arlington69 had bothered looking around, he would have realized two things which with all due respect have some real bearing on the validity of the handicapping theory:
First and foremost, we have already carried out a number of studies on player performance based on PGP data. Our studies consistently confirm that better stats mean better performance. And unlike Arlington69, we put a great deal of effort into avoiding the statistical perils discussed in the previous sections.
Second, handicapping does not make sense! Arlington69 asserts that EA has reversed the effect of stats, thereby making it an advantage to use cards with low stats. Why on earth would EA do that? It won’t make matches more even. All it does is turn the advantage upside down, thereby removing any incentive to buy packs. No sane game designer would do this because it defies all logic.
Common sense – and a lot of solid facts – lead to a very clear conclusion about handicapping: It doesn’t exist, never did and never will!