Fact check: Does bronze benching work?


In a recent Reddit post, Arlington69 claims to have evidence confirming that bronze benching works. Allegedly, using a bronze bench means that you will get easier opposing teams in game modes like seasons and weekend league. But does Arlington69’s data really prove that bronze benching works? I decided to investigate.

A brief introduction to bronze benching

Bronze benching is the not-so-noble art of using a bench full of bronze players to pretend that your team is worse than it really is. Due to the way FIFA calculates the squad overall rating (OVR), adding low rated bronze players as subs will lower the overall team rating considerably. In the example below, I demonstrate how it is possible to lower the OVR rating for my full TOTS starting 11 to a mere 84 by using the lowest rated bronze players as subs, thereby bringing my team’s OVR on par with the 84-rated team on the right.

In some game modes of earlier renditions of FUT, people could see each other’s OVR ratings during matchmaking. And obviously, people were more likely to accept the match if the opposing team was 84-rated than if it was 91-rated. Hence, bronze benching not only meant that it was easier to find opponents, but also that you oftentimes would get a competitive edge.

Nowadays, some people use bronze benches because they believe that FUT’s matchmaking algorithm matches them based on the OVR rating. If this assumption is correct, my 84-rated TOTS/bronze bench will be more likely to get matched up against the 84-rated squad on the right than the much scarier 91-rated TOTS squad on the left.

There are, however, a couple of very good reasons to be skeptical about the belief that bronze benching leads to easier opponents in game modes like FUT seasons and Weekend League.

First and foremost, it contradicts official information released by EA sports officials. On earlier occasions, EA has stated explicitly that FUT seasons is about bringing your best team to the pitch.

Second, OVR based matchmaking would seem to undermine EA’s pack selling business. People buy packs because they hope to find rare, high rated players which either can be sold at high value or improve their teams directly. But if you are matched based on OVR rating, it won’t matter which players you are using, because you will get matched with similarly rated opponents. So, in that case there will be no reason to buy players and even less so packs.

Bronze-benching believers like Arlington69 believe that OVR-based matchmaking could be a way of making matches more even. But let’s keep in mind that EA sets the stats and determines how the stats affect player performances. Clearly, the evenness – or lack thereof – is intended.

But having said that, Arlington69’s evidence appears convincing so I decided to give it a closer look.

Arlington69’s bronze benching study

Arlington69 recorded his and his opponent’s squad rating over the duration of 670 matches. His sample included FUT seasons, Weekend League and Daily Knockout Tournament matches.

His conclusions are largely based on the charts below, which are supported by a bit of explanatory text in the original Reddit post.

The charts show the relationship between Arlington69’s own squad rating and the opponents’ squad ratings. The most interesting chart is the one on the right titled “Distribution of opponents based on my rating”. The chart appears to confirm that Arlington69 got lower rated opponents when he used his 82 rated squads (light blue) than when he used his 86 rated squads (dark blue). In academic terms, this would suggest that there is a correlation between your own squad rating and the opponent’s squad rating, which is essentially the picture we would expect to see if the game uses OVR based matchmaking.

There is however another possible explanation, which is far more likely.

The results fit with the opposite conclusion as well

In the complete population of FIFA players, few are stupid/masochistic enough to take a 65-rated squad into action while few will be wealthy enough to run a 95+ squad. In fact, the huge majority of players use average squads. If we were to conduct a survey of all squads currently in use in online matches, we most likely would see that squads are normally distributed around a certain average.

Under the fair assumption that squads are normally distributed, a matchmaking algorithm that doesn’t consider OVR would mean that a player using an average squad sees opposing squads normally distributed around his own squad rating. And that is in essence exactly what we see in the chart on the left, provided that Arlington69 – like most of us – uses an average squad most of the time.

In other words, the chart on the left looks exactly as we would expect it to look if we assume that the game picks random opponents, i.e. doesn’t consider OVR in matchmaking.
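This reasoning can be tried out in a small simulation. The sketch below does not use Arlington69’s data: the population mean of 83 and the standard deviation of 1.5 are made-up assumptions, and `random_opponents` is a hypothetical helper. It draws 670 opponents at random from a normally distributed population without ever looking at the player’s own rating.

```python
import random
import statistics

# Assumed population of squad ratings (illustrative numbers, not measured).
POPULATION_MEAN = 83.0
POPULATION_SD = 1.5


def random_opponents(n_matches, rng):
    """Draw opponent ratings without looking at the player's own rating."""
    return [round(rng.gauss(POPULATION_MEAN, POPULATION_SD)) for _ in range(n_matches)]


rng = random.Random(42)
opponents = random_opponents(670, rng)

# The opponents cluster around the population mean, so a player who uses
# an average squad sees them clustered around his own rating as well.
avg_opponent = statistics.mean(opponents)
print(f"Average opponent rating: {avg_opponent:.1f}")

# Crude text histogram: one '#' per five matches at each rating.
for rating in sorted(set(opponents)):
    print(rating, "#" * (opponents.count(rating) // 5))
```

The histogram comes out bell shaped around the assumed population mean, even though the own rating never enters the draw.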

Time matters

But how does the chart on the right fit with this narrative? Surely, that chart shows that Arlington69 got lower rated opponents when using a lower rated squad and vice versa.

There is however a tiny detail that we have to consider here – namely time. Arlington69’s sample consists of 670 matches, which must have been played over several months. And during that time span, the average player improved his squad gradually as he collected more coins and as the introduction of new special cards caused the prices of other items to drop.

Thus, a likely explanation for the chart on the right is that it looks the way it does because Arlington69’s squad improved in roughly the same tempo as everyone else’s, meaning that when he used an 82 rated squad, the market average was lower than when he used his 86-rated squad.
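A minimal sketch of how time alone can produce such a pattern, under assumptions of my own choosing: the market average drifts linearly from 82 to 85 over 670 matches, the player’s squad follows the market, and each opponent is drawn independently of the player’s rating. None of these numbers come from the sample, and `simulate` and `pearson` are hypothetical helpers.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_matches, rng):
    """Random matchmaking while everyone's squads improve over time."""
    own, opp, market = [], [], []
    for i in range(n_matches):
        market_avg = 82.0 + 3.0 * i / n_matches  # assumed drift from 82 to 85
        market.append(market_avg)
        own.append(rng.gauss(market_avg, 1.0))   # own squad follows the market
        opp.append(rng.gauss(market_avg, 1.5))   # opponent ignores own rating
    return own, opp, market


rng = random.Random(7)
own, opp, market = simulate(670, rng)

# Pooled over the whole period, own and opponent ratings look correlated.
pooled = pearson(own, opp)

# After subtracting the market average at the time of each match,
# the apparent relationship should largely disappear.
detrended = pearson([o - m for o, m in zip(own, market)],
                    [p - m for p, m in zip(opp, market)])

print(f"Pooled correlation:    {pooled:.2f}")
print(f"Detrended correlation: {detrended:.2f}")
```

In this setup the matchmaking never considers the own rating, yet the pooled numbers show a positive correlation simply because both ratings rise with the market.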

A look at the raw data

The explanation I just gave is however not purely hypothetical. I am of course able to back it up with facts.

Arlington69 kindly provided access to his raw data, and although the sample doesn’t contain match dates, it happens to be divided into a pre patch and a post patch section. The patch in question was title update 6 released January 24th 2018. Therefore, we know for certain that all pre patch matches were played before January 24th whereas post patch matches were played after that date.

And when I calculate the average ratings of squads used by Arlington69 and his opponents pre and post patch, I see exactly what I expected above: Between pre patch and post patch, Arlington69’s own average squad rating grew from 83.0 to 84.5, while the opposing squad average grew from 82.8 to 84.7. In other words, Arlington69’s squad improved in roughly the same tempo as everyone else’s.

                  Pre patch   Post patch
Own rating        83.0        84.5
Opponent rating   82.8        84.7

Of course this observation doesn’t rule out that Arlington69’s squad rating is causally connected with the average opponent squad rating. But it does fit very well with the hypothesis that both variables grew because of a third variable, namely the general growth in squad ratings over time as improved items are released. And in addition to that, one has to keep in mind that the same factors that allowed Arlington69 to improve his squad were present for all his opponents.

Hence, I’m very confident when saying that Arlington69’s data doesn’t prove that bronze benching works. There is another possible explanation for the results, and it is not only possible but also far more likely than the one Arlington69 presents.

Does bronze benching work?

Can we say anything qualified about whether bronze benching works based on Arlington69’s data? As a matter of fact, yes.

In the table below, I have calculated average opponent squad ratings for each rating level used by Arlington69, but I have divided the calculation into a pre patch and a post patch section. I have also included 95 % confidence intervals.

             Pre patch                    Post patch
Own squad    Avg. opp. squad   Matches    Avg. opp. squad   Matches
81           82.4 +/- 0.8      47         N/A               1
82           82.9 +/- 0.4      110        82.9 +/- 0.5      41
83           82.9 +/- 0.3      61         84.7 +/- 0.9      23
84           82.9 +/- 0.3      148        84.7 +/- 0.5      99
85           83.2 +/- 0.7      32         84.7 +/- 0.5      53
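For readers who want to reproduce this kind of interval, here is a minimal sketch using the common normal approximation (mean plus or minus 1.96 standard errors). That formula is an assumption for illustration and may differ from the exact method behind the table. `mean_with_ci` is a hypothetical helper, and the sample ratings are invented.

```python
import math
import statistics


def mean_with_ci(ratings, z=1.96):
    """Return (mean, half_width) of an approximate 95 % confidence interval."""
    n = len(ratings)
    mean = statistics.mean(ratings)
    # Standard error of the mean, based on the sample standard deviation.
    half_width = z * statistics.stdev(ratings) / math.sqrt(n)
    return mean, half_width


# Hypothetical opponent ratings for one own-squad level (not the real data).
sample = [82, 83, 84, 82, 83, 85, 81, 83, 84, 83]
mean, half_width = mean_with_ci(sample)
print(f"{mean:.1f} +/- {half_width:.1f}")  # prints "83.0 +/- 0.7"
```

The half-width shrinks with the square root of the number of matches, which is why the rows with few matches have the widest intervals.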

What we see above is that, no matter what squad Arlington69 used pre patch, the average opposing squads were rated approximately 82.9. And no matter which squad he used post patch, the average opposing squads were rated around 84.7.

In other words, Arlington69’s own squad rating isn’t correlated with the opposing squad rating. The only factor that does influence the rating of the opposing squads is whether the matches are played pre patch or post patch.



82-rated squads appear to stand out as an exception, but that has a natural explanation: A closer look at the sample sheets indicates that the 82-rated matches were played in close succession, although on either side of the patch. Therefore, the average opposing squad ratings didn’t differ between those two measurements.
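The table can also be checked mechanically. The sketch below copies the averages and interval half-widths from the table and tests whether each interval contains the approximate period average quoted in the text (82.9 pre patch, 84.7 post patch). `rows_outside` is a hypothetical helper, and the single 81-rated post patch match is left out because it has no interval.

```python
# (own rating, avg opponent rating, CI half-width), copied from the table above.
PRE_PATCH = [(81, 82.4, 0.8), (82, 82.9, 0.4), (83, 82.9, 0.3),
             (84, 82.9, 0.3), (85, 83.2, 0.7)]
POST_PATCH = [(82, 82.9, 0.5), (83, 84.7, 0.9), (84, 84.7, 0.5),
              (85, 84.7, 0.5)]


def rows_outside(rows, reference):
    """Return own ratings whose interval does not contain the reference value."""
    return [own for own, avg, half in rows if abs(avg - reference) > half]


pre_outliers = rows_outside(PRE_PATCH, 82.9)
post_outliers = rows_outside(POST_PATCH, 84.7)
print("Pre patch outliers:", pre_outliers)    # prints []
print("Post patch outliers:", post_outliers)  # prints [82]
```

Only the 82-rated post patch row falls outside, which is the exception discussed above.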

Conclusion

Although Arlington69 ends up concluding that bronze benching works, a more thorough analysis of his data leads to the exact opposite conclusion: Bronze benching has no impact on what opponents you are matched up against in the game modes included in his sample.

The graphic representation of the entire sample below is perhaps the simplest way to illustrate why I arrive at that conclusion.

Especially when we look at the pre patch section, we see that Arlington69’s use of different squads had no impact on the ratings of the opposing squads. We also see that later in the year (after the patch on Jan 24th), there is a gradual improvement in the squads, but this applies both to Arlington69 and his opponents. And the most likely reason is the release of TOTS and other special items. This is perhaps what (mis)led him to his conclusion, but aside from the fact that both variables grow, they clearly aren’t causally connected.

On a last note, I need to state that the sample mixes different game modes. Different game modes potentially mean different matchmaking methods. However, it is likely that we would have seen an effect if for example Weekend League used squad rating based matchmaking, even though Seasons most likely doesn’t. Thus, I consider it most likely that bronze benching doesn’t work in any of the game modes included here.
