Stats, Lies and Videotape; Now What?

I am about a week into learning about something that’s quite hot in football analytics right now: the expected goals method. I give a massive nod of appreciation and thanks to Joel Salamon (@MessiSeconds on Twitter), who is aiding my learning and is a big inspiration for me wanting to explore further. There are some excellent articles and videos on the method and I’ll be sharing some in the process of writing this post.

Before I get into that, though, I’ll explain a little about where I am coming from in approaching stats. From the ages of 16 to 21 I was a professional Rugby League player. I wasn’t good enough to be given a full first team contract at 22, so my journey ended there. That is my athletic background. I didn’t and cannot play football to any sort of standard, although I did play all through school and a little Sunday League as an adult. In Rugby League, since the dawn of the Super League era there has been a radical adoption of statistics to measure and predict games, evaluate performance and determine the key factors in a victory or a defeat. Combined with the video reviews that would form the start of any training week, this saw Rugby League surge ahead of many sports in the UK in this field. Upon leaving the Armed Forces in 2009 I took up a role as a Quantity Surveyor in construction. It’s every bit as boring as it sounds. It’s spreadsheet overload, with cost analysis, work output efficiencies and the like. I also studied mathematics at A-Level, so this sort of stat-based reasoning has always held my interest. I think American sports probably rely too much on stats, but football probably relies on them too little.

But there are still some quality places to find that information. Currently, sites like Squawka, OPTA, ProZone, WyScout and WhoScored are fantastic, not only for finding out the basics such as goals and games played, but also for the position each player played, the formation a team employed and a multitude of other things, like key passes, shots on target and pass completion. But these are all reflective: they describe what has already happened. If you are into gambling, like tons of you are, they won’t help you figure out who is going to win any more than pure gut feeling from looking at previous performances. How can you interpret the data at hand to begin to predict what will happen?

Now we are talking Expected Goals. The expected goals method at least TRIES to predict a game and, even further out, a season, by assigning a value to each goalscoring opportunity that is created or conceded by a football team. This is not to be confused with merely counting ‘shots’. Chances are ranked by breaking down where shots are taken from and giving each one a percentage based on how often goals are scored from that particular area of the pitch.
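To make that a bit more concrete, here is a minimal sketch in Python of the zone-based idea as I currently understand it. The zone names and the shot records are invented purely for illustration (they are not real data), and I am assuming the simplest possible version: a zone’s value is just goals divided by shots from that zone. Proper models will be more detailed than this.

```python
from collections import defaultdict

# Hypothetical historical shots: (zone the shot was taken from, was it a goal?).
# These records are made up for illustration only, not real match data.
HISTORICAL_SHOTS = (
    [("six_yard_box", True)] * 2 + [("six_yard_box", False)] * 2
    + [("penalty_area", True)] * 1 + [("penalty_area", False)] * 4
    + [("outside_box", True)] * 1 + [("outside_box", False)] * 9
)


def zone_conversion_rates(shots):
    """Return the share of shots scored from each zone."""
    taken = defaultdict(int)
    scored = defaultdict(int)
    for zone, is_goal in shots:
        taken[zone] += 1
        if is_goal:
            scored[zone] += 1
    return {zone: scored[zone] / taken[zone] for zone in taken}


def expected_goals(shot_zones, rates):
    """Add up the zone value of every shot a team took."""
    return sum(rates[zone] for zone in shot_zones)


RATES = zone_conversion_rates(HISTORICAL_SHOTS)

# A hypothetical match: four shots, listed by the zone they came from.
MATCH_SHOTS = ["six_yard_box", "penalty_area", "penalty_area", "outside_box"]
MATCH_XG = expected_goals(MATCH_SHOTS, RATES)

print(RATES)
print(round(MATCH_XG, 2))
```

With these invented numbers, four shots add up to roughly one expected goal, which is the sense in which a chance is worth more than a bare ‘shot’.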

Now, as I mentioned at the start, I am only a week into this, but I think I have spotted one massive flaw in this system. Let’s assume a striker has been played through in a fast counter attack. He’s gone beyond the defence, and the substitute 16-year-old keeper in net hasn’t realised, so hasn’t bothered coming off his line. As that striker enters, let’s say, the ‘D’ area of the box, he shoots. Pulling a figure out of my arse, let’s say he scores that 75% of the time. Now, say there’s been a corner and the ball falls to a striker on the edge of the box, in the same area of the ‘D’, with 15 players, including a goalkeeper, directly between him and the goal. What percentage chance do we assign to that shot now? Probably very low. But both shots are taken from the same place. Can’t be right, can it?
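One way to picture the problem: if historical shots are grouped by location alone, the two situations get blended into a single number, but if they are grouped by location and situation they separate. Here is a rough sketch in Python. The records are invented (I have reused my made-up 75% figure), and the `situation` label is a hypothetical extra piece of information; I am not claiming any real model is built exactly like this.

```python
# Hypothetical shots from the same spot: (zone, situation, was it a goal?).
# All figures are invented for illustration only.
SHOTS = (
    [("edge_of_d", "one_on_one", True)] * 3
    + [("edge_of_d", "one_on_one", False)] * 1
    + [("edge_of_d", "crowded_box", True)] * 1
    + [("edge_of_d", "crowded_box", False)] * 19
)


def conversion_rate(shots, key):
    """Group shots with `key` and return goals divided by shots per group."""
    taken = {}
    scored = {}
    for shot in shots:
        group = key(shot)
        taken[group] = taken.get(group, 0) + 1
        scored[group] = scored.get(group, 0) + (1 if shot[2] else 0)
    return {group: scored[group] / taken[group] for group in taken}


# Location only: every shot from the edge of the 'D' gets the same value.
BY_ZONE = conversion_rate(SHOTS, key=lambda shot: shot[0])

# Location plus situation: the two scenarios get their own values.
BY_ZONE_AND_SITUATION = conversion_rate(SHOTS, key=lambda shot: (shot[0], shot[1]))

print(BY_ZONE)
print(BY_ZONE_AND_SITUATION)
```

In this toy data the location-only value works out at about 17%, which fits neither the one-on-one (75%) nor the crowded box (5%).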

Does that mean that this expected goals method is trash? Not at all. There’s evidence to suggest the opposite, in fact. We as United fans in particular need to pay attention to what is going on at FC Midtjylland (see the very interesting video below), one of the clubs who have taken the expected goals method to heart and are proving to be quite successful in its deployment. As are Brentford, who sit only a handful of points outside of a promotion spot in the Championship. The key here is that there’s a massive correlation between expected goals and actual goals. Sport in general, and football specifically, is far too ‘open’ in several factors to ever be predicted with pin-point accuracy, so any correlation that is ‘more or less’ correct is probably about as good as it gets for prediction methods. The Expected Goals Method makes no allowance for a star player injury, a fixture pile-up, a dressing room fall-out, poor refereeing or whatever! But the fact teams are using this and proving to be successful means there’s at least something in it, something you are likely to find also relies on good coaching, scouting, and the multitude of other things that go into football before 3pm on a Saturday afternoon. It hasn’t, for example, picked up on Leicester this season, who are performing way above their expected numbers across all metrics. Or even Manchester United, who are performing incredibly defensively and horrifically when attacking. But, if you’ve seen us this season, you didn’t need me to tell you that!
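On the correlation point, this is roughly how you could check it yourself with a few lines of Python. The function is the standard Pearson correlation coefficient (1.0 means the two sets of numbers move together perfectly, 0 means no linear relationship). The season totals are placeholders I have made up for five imaginary teams, not any club’s real figures.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list never varies")
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up season totals for five imaginary teams (illustration only).
EXPECTED_GOALS = [45.0, 52.5, 38.0, 60.0, 41.5]
ACTUAL_GOALS = [47, 50, 35, 66, 40]

R = pearson_correlation(EXPECTED_GOALS, ACTUAL_GOALS)
print(round(R, 2))
```

A high value here only says the two lists rise and fall together across teams. It says nothing about why one particular team, like Leicester, ends up well above its number.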

 

What I think I am learning with the expected goals method is that this is a tool for analysis, the same as any other tool. I am enjoying learning about it, and I do think I am only scratching the surface here. There might be a lot I am simply not seeing or understanding. Or even misunderstanding. I am aware of models and methods for individual players that I’ve not even looked into yet, and there’s probably months more work ahead for me to understand this. Maybe more! It is enjoyable, though, and it does add an extra layer when watching games.

So I can’t give you a prediction for tomorrow. But I think you’ll be in safe hands betting on a 0-0 or a 0-1.

Thanks for reading.

Stephen

Stephen: First game was Southampton 1990, favourite team was 1999, almost perfect footballing team. Proud Mancunian. The word legend is used far too much. MUFC only have one and his name is Sir Bobby Charlton. MUFCLatest founder, Full Time Devils staff member and MUST supporter.