In Defence of Analytics

The football analytics community has taken its fair share of flak on social media in recent days, following articles from the fount of all wisdom, Neil Ashton, and from Alex Netherton, to which Paul Tomkins wrote a brilliant response. Most people are aware by now that the purpose of these writers’ existence is simply to rile up as many people as possible: that generates more clicks as said people share the articles with 140-character tweets of rage, and it protects the opinionated journalist’s job against the rise of these number devils. But the articles themselves reflect alarmingly prevalent beliefs about data and analytics in modern-day English football, the pair of which I’m going to try to defend.

There are two main factors which the average football fan almost always overlooks when evaluating anything football-related, yet both are absolutely critical to judging anything to do with the beautiful game. The first is given the uninspiring name of the availability heuristic. Soccernomics, a fantastic book I’d highly recommend to anyone wishing to get into football and stats, defines the availability heuristic as ‘the more available a piece of information is to the memory, the more likely it is to influence your decision, even when the information is irrelevant.’ Basically, this means that more recent and/or memorable events tend to stand out in your mind, and therefore influence your opinion about a player or event, even if the information is useless. Eyes are therefore hugely fallible when it comes to making judgements, so it makes sense to at least consider applying something less biased when making them.
The other factor is confirmation bias, which is defined as ‘a tendency to search for or interpret information in a way that confirms one’s preconceptions, leading to statistical errors.’ A suitable example: when watching a game, a player you don’t ‘rate’ makes a poor piece of play. Confirmation bias strengthens your belief in his lack of quality, even though he might otherwise have had a solid performance. A failure to account for this pair in decision-making is a sure-fire way to make misinformed decisions and judgements, and is part of the reason why decisions based on gut feelings go wrong more often than those based on data: data can’t lie, it can only be misinterpreted.

The (deliberate) misinterpretation of data has served to give all stats a bad name. Sites such as Squawka, with its infamous comparison matrix, are commonly used to demonstrate ridiculous things, such as Andros Townsend getting a greater percentage of his shots on target than Cristiano Ronaldo over the first three games of a season. This leads people to take the view that if stats show this, stats must be wrong, because Ronaldo is by far the superior player (no offence Andros). What is needed with all stats is context. How many minutes has a player played? Is that enough for a reliable sample size? Against which teams has he played? How relevant is this stat in determining a player’s quality? All of these questions, and similar ones, need to be answered in order to understand why the stats have given such results.
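To put a rough number on the sample-size question, here is a minimal Python sketch. The shot counts are invented purely for illustration (they are not real figures for Townsend or Ronaldo), and the Wilson score interval is just one common way of expressing how uncertain a percentage is; it is my assumption that it suits this kind of comparison.

```python
import math


def wilson_interval(successes, attempts, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    successes: e.g. shots on target; attempts: e.g. total shots.
    z=1.96 corresponds to roughly 95% confidence.
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p = successes / attempts
    denom = 1 + z ** 2 / attempts
    centre = (p + z ** 2 / (2 * attempts)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / attempts + z ** 2 / (4 * attempts ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# Hypothetical numbers: a player three games into a season versus
# a player with a full season of shots behind him.
small_sample = wilson_interval(successes=6, attempts=8)
large_sample = wilson_interval(successes=60, attempts=150)

if __name__ == "__main__":
    print("6 of 8 on target:    %.2f to %.2f" % small_sample)
    print("60 of 150 on target: %.2f to %.2f" % large_sample)
```

The headline percentage for the small sample is higher, but its plausible range is far wider, which is exactly why a three-game comparison says very little on its own.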

On the topic of Andros Townsend, it’s players like him who represent another area in which England lags behind much of the more enlightened footballing world. English fans (perhaps others too, I haven’t checked) tend to prefer players whose styles benefit from the availability heuristic and a misplaced romanticism about the game. Along with Townsend, I’d include Oxlade-Chamberlain, Coutinho and Ross Barkley in a similar bracket. All four can be hugely exciting to watch, with the latter three racking up massive successful dribbles per 90 figures, combined with frequent long shots. This endears them to fans, because when they get the ball, ‘something’ appears to happen, something exciting, which gives fans an adrenaline shot of hope to the heart, not to the head. The problem is, these four are some of the most inefficient players in the entire league. Whilst their skills and dribbles should be applauded, they should not be regarded as the end point, but rather as a means to the ultimate end, which is, in football, the creation and scoring of goals (for offensive players at least).

In order to achieve the maximum output of goals, thereby giving yourself the best chance of winning the game, efficiency must be prioritised on an individual level. Now, I’m not saying, and will never say, that footballers should become like robots in their decision-making. I’m a believer that the way in which you win is more important than the fact you win (a blog for another time perhaps). But I believe that it’s the individual’s responsibility to be as efficient as he can be in all actions, and the team as a whole will take care of the aesthetics, as simple, efficient football is arguably football at its most beautiful.

Below is a map of all of Coutinho’s shots on target this season (credit to Paul Riley for this).
Coutinho dashboard
There’s not one chance with a greater than 40% probability of being scored (if you subscribe, as I do, to the holy grail of football analytics at the moment, expected goals). That’s not necessarily the sign of a great offensive player; a great offensive player maximises the time his side have in possession to create the best quality chances, which occur centrally and up to 12 yards out, in the so-called danger zone. And look at the sheer volume of shots he’s already taken this season from outside the area, bearing in mind that these are just his shots on target. The average conversion rate for shots from a distance of greater than 18 yards is 3%. Even the greatest of them all, Lionel Messi, only converts shots from that area at around 6%. This isn’t a character assassination of Coutinho either. I think he’s a terrific player who really needs to change his decision-making mentality in order to maximise his, and his team’s, potential. It’s a mentality change that needs to be looked at across England in order to produce players of the efficiency calibre of Mesut Özil, whose spell has been much maligned despite being successful in both trophy and statistical terms. Now, in order that I don’t go full White House address on you, time to move on.
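The arithmetic behind that argument can be sketched in a few lines of Python. The 3% figure is the long-range conversion rate quoted above and the 40% figure is the ceiling from the shot map; the choice of ten shots is hypothetical, and treating each shot as independent with a fixed scoring probability is a simplifying assumption of mine rather than a full expected goals model.

```python
import math

LONG_RANGE_CONVERSION = 0.03   # quoted above for shots from beyond 18 yards
BEST_CHANCE_ON_MAP = 0.40      # ceiling mentioned for the shot map


def expected_goals(shot_probabilities):
    """Sum of per-shot scoring probabilities (a bare-bones xG total)."""
    return sum(shot_probabilities)


def prob_at_least_one_goal(shot_probabilities):
    """Chance of scoring at least once, assuming independent shots."""
    return 1 - math.prod(1 - p for p in shot_probabilities)


# Hypothetical: ten efforts from range versus one good chance.
ten_long_shots = [LONG_RANGE_CONVERSION] * 10
one_good_chance = [BEST_CHANCE_ON_MAP]

if __name__ == "__main__":
    print("xG, ten long shots: %.2f" % expected_goals(ten_long_shots))
    print("xG, one good chance: %.2f" % expected_goals(one_good_chance))
    print("P(at least one goal), ten long shots: %.3f"
          % prob_at_least_one_goal(ten_long_shots))
```

On these assumptions, ten shots from range are worth less in expected goals than a single chance at the top end of that shot map, which is the whole case for patience in possession.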

There’s also a dangerous tendency to rate players based on outcome rather than process, an extremely unreliable method. For example, if player X scores two goals from two inefficient long-range shots, the conventional view will be that he should always shoot from range, simply because he did it before and it worked. In reality, given the poor nature of the process, the probability of that outcome being repeated is very low indeed. Persisting with it leads to inefficient attacks, which reduce the chance of player X’s team scoring goals, and therefore of winning the game. It’s a similar story with assists. Assists rely heavily on the ability of the player on the end of them to finish the chance, yet they’re used as a measure of creative prowess, which means that certain players’ creative abilities will not be accurately represented due to the ineptitude of those in front of them. Instead, it’s more accurate to look at the process of their chance creation, and therefore at the quality of the chances created, regardless of their outcome. Some work has been done on expected goals per key pass, and there’s a similarly themed presentation by Dan Altman here.
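To show how unrepresentative player X’s two-from-two is, here is a small Python sketch. It borrows the 3% long-range conversion rate quoted earlier in this post, and models each shot as an independent coin flip with that probability (a binomial model), which is my simplification and ignores everything that makes one long shot different from another.

```python
import math


def binomial_pmf(goals, shots, p):
    """Probability of exactly `goals` goals from `shots` independent shots,
    each scored with probability `p`."""
    if not 0 <= goals <= shots:
        return 0.0
    return math.comb(shots, goals) * p ** goals * (1 - p) ** (shots - goals)


LONG_RANGE_CONVERSION = 0.03  # figure quoted earlier in this post

# Player X's outcome: two goals from two long-range shots.
two_from_two = binomial_pmf(goals=2, shots=2, p=LONG_RANGE_CONVERSION)

# The far more typical outcome of the same process: nothing.
none_from_two = binomial_pmf(goals=0, shots=2, p=LONG_RANGE_CONVERSION)

if __name__ == "__main__":
    print("P(2 goals from 2 long shots): %.4f" % two_from_two)
    print("P(0 goals from 2 long shots): %.4f" % none_from_two)
```

Judging the player on the process means judging him on the second number, not on the one occasion the first came up.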

On the subject of the English, stats and evaluation, there’s another gripe I have with them. It’s over that most adored of intangible qualities: passion. Fans love to see a player try his hardest, sprinting all over the pitch like a prime 14/15 Ryan Mason, regardless of whether that’s actually beneficial to the team. Steven Gerrard is a prime example of the negative effects of passion. Having slipped against Chelsea, he proceeded to bombard the Chelsea goal with inefficient long shots, often ignoring team-mates in better positions, simply because he was so desperate to drag his beloved club back into this most crucial of games. He might have done so had he directed his passion from his heart into better, more efficient decision-making in his head. Those who ridicule stats, such as the intellectual powerhouse that is Andy Townsend, claim there’s no number that you can put on ‘heart’. Does this matter? Almost every single player is a competitive animal, whose rise to the peak of world football is driven by a deep-rooted desire to succeed. It’s unlikely that one player at the top (Mario Balotelli excepted; there are times when he really doesn’t care) is significantly more motivated to win than another. They just might express it differently, in a more visible fashion that endears them to the fans more, because the fans want to feel valued, to feel as if their irrational emotions are poured forth on the pitch before them. But as discussed, passion is no substitute for intelligence. The Germans and Spanish didn’t win the last two World Cups because their players loved their country more; they won because they were more intelligent than their opponents, which widened the gap between their players and their opponents’.

Most fans place too high a priority on intangibles which, although they exist, have little effect on the actual process and outcome of events. Confidence is another. Did a player miss a one-on-one because he was lacking in confidence, and the lack of confidence caused his legs to slow him down as he raced clear, freezing his brain? Almost certainly not to the extent propagated by the media. In reality, it’s more likely that his failure to finish was due to the cyclical nature of finishing and the difficulty of repeating it consistently.

The stress placed on these relatively insignificant intangibles leads to the wide use of cliches. These offer pundits an easy way out, as they can fall back on now widely accepted, yet totally unacceptable, reasons for events panning out the way they did. The result is lazy analysis, which becomes entirely superficial, and therefore ineffective.

The power of analytics in other sports has already been proven, as anyone who’s seen the excellent film Moneyball will know, and it’s surely a matter of time before a mainstream statistical revolution sweeps through the footballing world. Signs are already beginning to emerge, with expected goals referenced in an article in the Telegraph, and Match Of The Day (mis)interpreting player tracking data. However, although this blog has praised stats and analytics, often at the expense of watching the game with your eyes, it should be stressed that analytics is not incompatible with watching football. Rather, the two should be allied so that each of us can understand the game to a greater extent. Stats will never be able to give you the definite right answer to a question, but they can increase the chance of you being right, and in a game of fine margins, the smallest things can make a difference.

Many thanks for reading. Any and all questions can be directed to us @OneShortCorner.

