Peer reviewed: Johnny Cupcakes’ experiment on Reddit


A redditor named JohnnyCupcakes did a simple experiment: He scored an own goal in each of 80 matches. He then compared his win ratio during those matches to his win ratio in previous matches. Based on the fact that his win ratio grew by 30 %, he concluded that scripting / momentum exists. Sounds like evidence? Don’t worry, it isn’t.

In a small series of articles, we scrutinize home-brewed experiments aimed at testing or proving scripting, handicapping and momentum.

In this article, we take a look at JohnnyCupcakes’ post, which was published on Reddit in January 2018. JohnnyCupcakes didn’t exactly write a full thesis on his method. His full post, which is pretty much all the information he has made available, is quoted below:

“80 games deep into this experiment and I can actually say the momentum turns your way later on in the game. I ended up winning about 30% more games when I did this. And that’s even with starting down a goal. The games an absolute joke. I even noticed sometimes If I score 2 own goals about 40 mins into the game I can probably score 3 within 5 minutes at some point. I don’t think FIFA is about being better then your opponent but about knowing the scripting more then your opponent. And it actually shows.”
(–Post on Reddit)

Attempts to get further insight into the basis of the conclusion unfortunately haven’t produced any results.

The experiment is an example of how easy it is to convince people who already agree with you that you have evidence proving their point, even when you in fact have absolutely nothing.

So, what generated that 30 % increase?

Observer-expectancy Bias

The first explanation that springs to mind is bias. JohnnyCupcakes studies his own match results, meaning that there is a risk that his results are influenced by his own preconceived beliefs.

Even though JohnnyCupcakes perhaps doesn’t acknowledge that his preconceived beliefs influenced his results, numerous studies have confirmed the existence of an observer-expectancy bias in humans. Observer-expectancy bias is a psychological mechanism which causes researchers to unconsciously influence their observations in accordance with their own beliefs. This problem is of course particularly prevalent when you are studying your own match results, because you directly control the very thing you are observing.

When I raised the issue of observer-expectancy bias with JohnnyCupcakes, he claimed that he came “from a very non bias stand point hoping [momentum] wouldn’t be true”. However, his later comments would seem to suggest the opposite:

“FIFA has come to the point that scripting isn’t even up for discussion whether it exists or not. I personally believe anyone who thinks it doesn’t exist is a fucking idiot.”
(– JohnnyCupcakes comments to my post)

It is definitely possible that JohnnyCupcakes won more matches after scoring an own goal because he expected to do so. The failure to address and remove that risk is basically all that is needed to reject this study completely.

But this is not the only problem with his experiment.

The alleged 30 % increase in win rate

JohnnyCupcakes might have won 30 % more matches during his 80-match trial. But that information alone does not allow us to conclude that he in general wins more matches when he has scored on himself. And by “in general”, I mean outside the narrow scope of a sample.

In statistics, we talk about significance: An increase from 3 wins in 4 to 4 wins in 4 constitutes a relative increase of roughly 33 %, but it’s not necessarily a statistically significant increase, because with so few matches it could be a random fluctuation, i.e. sampling error.

Sampling error wouldn’t be much of an issue if we were talking about 1,000-match samples, but with an 80-match sample, sampling error does become a problem, not least due to the information we don’t have about how this experiment was conducted.

If JohnnyCupcakes’ “normal” win ratio was 20 %, we would expect him to win 16 of 80 matches under “normal” circumstances, and a 30 % growth would imply that he won roughly 5 matches more than expected (16 × 1.3 ≈ 21). Even to the naked, untrained eye it is fairly clear that +5 in 80 matches isn’t a statistically significant increase.
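For readers who prefer numbers to the naked eye, here is a minimal sketch of that check. It assumes the purely hypothetical 20 % baseline used above and treats that baseline as known exactly, which is generous to the experiment. The function name `binomial_upper_tail` is ours, and the calculation is a plain one-sided binomial tail, not anything JohnnyCupcakes reported.

```python
from math import comb


def binomial_upper_tail(wins: int, matches: int, win_rate: float) -> float:
    """Probability of at least `wins` wins in `matches` matches
    when every match is won independently with probability `win_rate`."""
    return sum(
        comb(matches, k) * win_rate**k * (1 - win_rate) ** (matches - k)
        for k in range(wins, matches + 1)
    )


# Hypothetical baseline from the text: 20 % win ratio, 16 expected wins in 80.
BASELINE = 0.20
MATCHES = 80

# 20 wins = 16 + 4; 21 wins = 16 * 1.3 rounded to a whole match.
for observed_wins in (20, 21):
    p_value = binomial_upper_tail(observed_wins, MATCHES, BASELINE)
    print(f"P(at least {observed_wins} wins by pure chance) = {p_value:.3f}")
```

Both probabilities come out above the conventional 5 % threshold, so a result like this would not be unusual for a player whose true win ratio never moved.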

On top of that, the fact that we don’t know the number of matches played prior to the 80-match trial is an issue in its own right.

Was there a 30 % increase?

JohnnyCupcakes might not realize this, but his study involves two samples: (1) A sample covering his matches up until the point where he started scoring on himself and (2) a sample covering the 80 matches where he scored on himself.

As already mentioned, the problem with samples is that they aren’t 100 % accurate. This means that JohnnyCupcakes knows his exact win ratio neither before nor after he started scoring on himself. The only thing that he knows for sure is that his win ratio was 30 % larger in a sample of 80 matches than in the previous sample of X matches.

The fact that samples can be skewed is the reason why you care about their size, which in turn is why it’s a critical problem that we don’t know the size of one of the samples.

Without knowing the sample sizes and before/after win ratios, we basically can’t conclude that there was an increase in win ratio.

Why the absent information is decisive

“But how likely is it that a sample showing a 30 % increase in fact covers a decrease?”, you might ask.

The only correct answer is that we can’t rule that out with any reasonable degree of certainty.

If you take a sample of 80 matches played by a random player, you may find that his win rate so far has been 38 %. What that tells a statistician is that his general win ratio, i.e. the chance of him winning a random match, is between 27 and 49 % with 95 % certainty. Just because he won 38 % of his latest 80 matches, there is no guarantee that he will win 38 % of his next 1,000 matches as well.
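The interval quoted above can be reproduced approximately with a few lines of code. This is a sketch using the simple normal-approximation (Wald) interval, which is our choice for illustration; an exact method can differ by a percentage point or so. The helper name `wald_interval` is ours.

```python
from math import sqrt


def wald_interval(win_rate: float, matches: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95 % confidence interval for a win ratio observed
    in a sample of `matches` matches (normal approximation)."""
    margin = z * sqrt(win_rate * (1 - win_rate) / matches)
    return max(0.0, win_rate - margin), min(1.0, win_rate + margin)


# The same observed 38 % win rate, in samples of different sizes.
for matches in (80, 1000):
    low, high = wald_interval(0.38, matches)
    print(f"{matches} matches: {low:.0%} to {high:.0%}")
# 80 matches: 27% to 49%
# 1000 matches: 35% to 41%
```

The same observed win rate pins the true win ratio down far more tightly in a 1,000-match sample than in an 80-match sample, which is why sample size matters so much here.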

His track record so far could be skewed. And this problem is of course also present to some extent when you look at a sample of a couple of hundred matches, which likely was what JohnnyCupcakes did here.

The same uncertainty that affects the “before” sample will of course be present when we take another sample where we let our random player score an own goal in every match. If the observed win ratio in that sample has grown by 30 % to a total of 49 %, all that tells us is that his general win ratio after scoring on himself lies between 37 % and 60 %, with 95 % certainty.

In other words, our random player’s win ratio went from between 27-49 % to between 37-60%. And what that means is that we have overlapping confidence intervals, meaning that it’s impossible to conclude that the win rate in fact is larger when you score on yourself than when you don’t.
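Overlapping intervals are a quick visual check. A more direct way to compare two samples is a two-proportion z-test, sketched below. This is our illustration, not something from the Reddit post. It assumes both samples contain 80 matches and rounds 38 % and 49 % to 30 and 39 wins, and the function name `two_proportion_z_test` is ours.

```python
from math import erfc, sqrt


def two_proportion_z_test(wins_a: int, n_a: int, wins_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference between two win ratios,
    using the pooled normal approximation."""
    rate_a = wins_a / n_a
    rate_b = wins_b / n_b
    pooled = (wins_a + wins_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Assumed counts: about 38 % of 80 "before", about 49 % of 80 "after".
z, p_value = two_proportion_z_test(wins_a=30, n_a=80, wins_b=39, n_b=80)
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

With these assumed counts the p-value lands well above 0.05, which agrees with the overlapping-intervals reading: the difference could easily be chance.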

We don’t know how many matches JohnnyCupcakes played prior to his 80-match trial and we don’t know the absolute increase in win ratio. Without that information, it’s impossible for us and for him to conclude that the 30 % increase was statistically significant, thereby ruling out that it could be a pure coincidence.
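To get a feel for how often pure coincidence produces such a result, one can simulate a player whose true win ratio never changes. The sketch below is a toy model under loudly stated assumptions: a true win ratio of 38 %, independent matches, and hypothetical “before” sample sizes of 80, 200 and 1,000 matches. None of these figures come from JohnnyCupcakes; they are placeholders for the information we lack.

```python
import random


def chance_of_spurious_increase(
    true_win_rate: float,
    before_matches: int,
    after_matches: int,
    relative_increase: float = 0.30,
    trials: int = 2000,
    seed: int = 1,
) -> float:
    """Fraction of simulated experiments in which the 'after' win ratio is at
    least `relative_increase` higher than the 'before' win ratio, even though
    the true win rate is identical in both samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        before_wins = sum(rng.random() < true_win_rate for _ in range(before_matches))
        after_wins = sum(rng.random() < true_win_rate for _ in range(after_matches))
        before_rate = before_wins / before_matches
        after_rate = after_wins / after_matches
        if before_rate > 0 and after_rate >= (1 + relative_increase) * before_rate:
            hits += 1
    return hits / trials


# Hypothetical "before" sample sizes; the real one is unknown.
for before_matches in (80, 200, 1000):
    share = chance_of_spurious_increase(0.38, before_matches, after_matches=80)
    print(f"before sample of {before_matches}: {share:.1%} of runs show a 30 % increase")
```

The share of false alarms depends on the size of the “before” sample and on the baseline win ratio, which are exactly the two pieces of information missing from the post.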

Sampling issues

Given the somewhat brief (well, non-existent) method section in JohnnyCupcakes’ post, it’s impossible to check whether his two samples were obtained on similar terms. He has not responded in detail to my questions regarding his sampling method, so it remains unknown whether, for example, the two samples had the same distribution of matches across game modes.

Wrapping it up

Doing a study on your own matches is – bluntly put – a waste of time because it will be inconclusive almost no matter what you are trying to research. While it might receive a lot of applause from the community and hence boost your confidence, it remains a fact that you can’t produce knowledge using this method.

We previously studied the exact same claim that JohnnyCupcakes is studying here: that it’s an advantage to be trailing because matches are being leveled by big, bad EA. Unlike JohnnyCupcakes, we (a) didn’t base the experiment on our own matches, and (b) used samples of several thousand matches.

We ended up concluding that there was no basis for the claim that matches are being made even.
