In 2011, football (soccer) analytics took the stage at the MIT Sloan Sports Analytics Conference for the first time. ESPN NBA analyst Marc Stein led a panel of representatives from several English Premier League clubs, as well as Prozone (a data collection company), Decision Technology (an analytics company), and Microsoft. While it didn’t produce any big insights or revelations, I think for a lot of people in the football analytics community it was a thrilling moment of validation.
On that first panel, Blake Wooster from Prozone drew a comparison between football and basketball. He found it interesting that the NBA was just starting to get positional tracking data (through the STATS company SportVU), while the Premiership clubs had had access to this sort of data for nearly a decade. Yet, while the football positional data had been available, he said, the level of analytics, meaning the development of a coherent game model or framework for the data, lagged. I want to return to Wooster’s observation, because I think how it played out over the intervening two years speaks volumes about the problem of football analytics at SSAC in 2013.
In 2012, football analytics returned to the main stage at SSAC, but this time the thrill of the new had worn off and expectations were higher. Again, Marc Stein moderated a panel of mostly club representatives and media figures. In my opinion, it was extremely disappointing.
If that panel was all you knew about football analytics, as was likely the case for the vast majority of people who sat in that audience, you would have been perplexed. The most popular and richest sport in the world seemed to be approaching analytics with the mindset of 19th-century vicars approaching Darwin. The unstated implications of the panel were dire: bright club analysts were walled up in silos, stunted by secrecy, and largely under-utilized (and also underpaid). The independent analysts, the potential Bill Jameses of football, were starved by lack of data. Everybody seemed overwhelmed by the challenges of modeling the flowing complexity and lack of tangible statistical events inherent in the sport. I left that panel feeling like football analytics was the proverbial group of blind men trying to describe the elephant.
Fast-forward to SSAC 2013: the impressive Cornell professor Chris Anderson is announced for the football analytics panel, along with Jeff Agoos from the MLS league office, Blake Wooster from Prozone, and ESPN football analyst Albert Larcada. The inclusion of Anderson, co-author of the great Soccer by the Numbers blog, seemed to be an acknowledgment of the failings of 2012; he has a much-anticipated book on football analytics coming out in a couple of months, and is considered to be one of the most serious independent thinkers in the game. And yes, this year’s panel was better than the 2012 one, but that was a pretty low bar. When the panel had ended and the dust had settled, it again felt disappointing to me.
And particularly so, given the progress being made in basketball. The head start in spatial data that Blake Wooster described in 2011 seemed to have been largely erased in the intervening two years, at least judging by what was on display at SSAC. The wonderfully droll Kirk Goldsberry, visiting Harvard professor and Grantland contributor, delivered an amazing research paper on quantifying interior defenders in basketball. Goldsberry’s paper was not only an impressive work of data analytics and visualization; it was also immediate conversation fodder for NBA fans, media analysts, and league pros. It took a little-understood but frequently observed phenomenon in the sport and made it instantly comprehensible through analysis, producing some surprises along the way (e.g., Larry Sanders vs. David Lee). Goldsberry had made a similar impact at SSAC ’12, with a paper about NBA scoring efficiency. This is the kind of work that could propel esoteric NBA analytics into the mainstream.
The best that the football analysts could muster in response was pretty pathetic. Albert Larcada showed some ESPN “heat maps” from the Real Madrid-Barcelona match that basically revealed that Messi played in the center and Ronaldo on the wing. Not that big of a revelation. When asked about fertile areas of investigation, Chris Anderson mentioned “defense” without offering much more. Others mentioned the financial fair play rules and MLS salary cap, again without much to offer in terms of approach. Geir Jordet presented a paper that proved that really good midfielders complete more passes because they look around at the position of players on the field a lot. Interesting in its quantification of the phenomenon, but not really a major paradigm-changer. If you watched one of my son’s U13 club matches, you’d see that kids who look around before receiving the ball are better at football than kids who don’t.
Back in 2011, the brilliant Sarah Rudd (now at football analytics firm StatDNA) published a research paper about the use of Markov Chains to analyze collective contributions to goal scoring in football. For me, this was one of the best approaches I’d ever seen, and it made me think about the situational/probabilistic concepts of space and decision-making — concepts which everyone can agree are at the very heart of football — in a fresh new way. (A variation on her methodology showed up this year in some pre-Super Bowl NFL analytics of expected points from offensive drives.) For the non-professional football fan looking for data-driven enrichment of the match viewing experience, Rudd’s work was astonishing — and would be even better if re-presented in a more visual, fan-friendly way.
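To make the Markov chain idea a little more concrete, here is a minimal sketch of the general technique (an absorbing Markov chain over possession states). It is not Rudd’s actual model: the three pitch zones, the transition probabilities, and the function names are all invented for illustration. Every possession eventually ends in either a goal or a turnover, and the code estimates the chance of a goal from each zone by repeatedly applying the transition probabilities until the numbers stop changing.

```python
# Hypothetical possession states and transition probabilities,
# invented for illustration only (not taken from any real data set).
TRANSITIONS = {
    "own_half":    {"own_half": 0.50, "midfield": 0.35, "turnover": 0.15},
    "midfield":    {"own_half": 0.20, "midfield": 0.40,
                    "final_third": 0.25, "turnover": 0.15},
    "final_third": {"midfield": 0.20, "final_third": 0.40,
                    "goal": 0.05, "turnover": 0.35},
}

# Absorbing states end the possession; their "goal value" is fixed.
ABSORBING = {"goal": 1.0, "turnover": 0.0}


def goal_probabilities(transitions, tol=1e-12, max_iter=10_000):
    """Probability that a possession in each state ends in a goal.

    Solves p = Q p + r by fixed-point iteration, where Q holds the
    moves between non-absorbing states and r the direct moves to "goal".
    """
    values = {state: 0.0 for state in transitions}
    for _ in range(max_iter):
        new_values = {}
        for state, row in transitions.items():
            new_values[state] = sum(
                prob * (ABSORBING[nxt] if nxt in ABSORBING else values[nxt])
                for nxt, prob in row.items()
            )
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            break
    return values


def action_value(values, start, end):
    """Credit for moving the ball from one state to another:
    the change in the probability that the possession ends in a goal."""
    return values[end] - values[start]


if __name__ == "__main__":
    probs = goal_probabilities(TRANSITIONS)
    for state, p in probs.items():
        print(f"{state:12s} P(goal) = {p:.4f}")
    print("midfield -> final_third is worth",
          round(action_value(probs, "midfield", "final_third"), 4))
```

The appeal of this kind of framing is that a pass from midfield into the final third can be credited with the change in goal probability it produces, even when the player who made it never registers a shot or an assist. A real model would need far more states and transition probabilities estimated from event data.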
There are some really bright people making meaningful contributions to the public understanding of football analytics. Just among the ones I’ve met personally, Howard Hamilton at Soccermetrics, Simon Gleave at Infostrada Sports in Holland, and Zach Slaton, a contributor to Forbes (his review of football analytics at this year’s SSAC was excellent, and maps closely to my own experience), have all been doing interesting things. I’d highly recommend their writing, as well as the other articles on the StatDNA blog and the aggregator site Soccer Analysts. They all seem to be doing good work.
But the lack of really important contributions on display at SSAC ’13, particularly compared to the great leaps forward being made in basketball, makes me wonder. The club reps and professional analysts would like us to believe that we are only seeing the tip of a Titanic-sized iceberg, and that all the really cool stuff is happening behind closed doors inside clubs or as work for hire. But I have worked in technology businesses long enough to smell bullshit when I am asked to trust the experts and not my own eyes. Often that iceberg is really more like an ice cube floating in a soft drink. I think it is entirely possible that there may be less going on in football analytics than meets the eye.
If the SSAC organizers are just going to trot out a group of football insiders in 2014 and let them talk in vague generalities about how hard analyzing football is and how secret their secret sauce is, they might as well not put football analytics on the main stage again. In my opinion, if they have to fill the panel with club representatives and ESPN guys, they should at least pick one or two key on-pitch or transfer market questions and go really deep. Force panelists to have opinions. Don’t let them off the hook. Better, let’s have some balance between the club reps and the independent analysts, and really take the gloves off. Let’s have an argument about what the best current game model is. It’s about time.