About three months ago, I started gorging myself on the work coming out of the soccer analytics community. I'd been following casually for a couple of years at that point, but something must have caught my eye on Twitter, and suddenly I was clicking link after link from various blogs and following a ton of new smart people on my feed. I'm not sure why it took me quite so long to get involved - I can only claim that I was "busy" and "doing other things." Once I started, I quickly figured out that the rabbit hole is deep, not only with ongoing work, but also with a quintillion potential questions and applications waiting to be examined.
Any cursory examination of soccer analytics will lead straight to James Grayson's Total Shot Ratio, which has an excellent signal-to-noise ratio at sorting the good teams from the relegation candidates. What it lacks, however, is a nod toward the quality of chances teams create. A six-yard tap-in is significantly more likely to result in a goal than a shot from thirty yards away, but lacking shot distance data, is there a way to boil down which teams create more "good" chances on goal versus those who don't?
Hrm... what about Shots on Target?
Why look at Shots on Target?
For starters, because Shots on Target (henceforth abbreviated SoT) are almost three times as likely to result in a goal as a basic "shot". Instead of being converted at a general rate of 10-14%, SoT convert at the complement of save percentage (whatever isn't saved goes in), which works out to a goal closer to 30% of the time. I really like the work people like James Grayson and 11Tegen11 have done on Total Shot Ratio and Relative Shot Ratio respectively, but while those metrics are valuable, they also overlook some things.
Additionally, as Bitter and Blue's own shuddertothink explained at length here, Shots on Target really seem to matter. Teams who win the battle for Shots on Target show a significantly increased probability of winning compared to those who just create more shots than their opponents.
Finally, I wanted to examine SoT because Shots on Target are far less squishy to compare than things like Clear Cut Chances (CCC). SoT is easily defined and easily recorded; CCC... not so much. There are also too few CCC in games to make comparison valuable.
Speaking of which, there is an open question of whether there are enough SoT in particular games to make for good data crunching. After all, teams only average about 26 shots combined per game, and only about 9 of those will be on target. My view is that SoT happen just often enough to make examining the data interesting (hence the article), and provide the potential to draw extra conclusions not just about how many shots teams take, but also about how likely those shots are to score.
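As a quick sanity check on those rates, here is a minimal Python sketch. It uses only the rough figures quoted above (about 26 shots and 9 SoT per match, with SoT scoring roughly 30% of the time). The function name is my own, and the 30% input is the approximate figure from the text, not a fitted value.

```python
def implied_goals_per_shot(shots, shots_on_target, sot_conversion):
    """Goals per shot implied by the SoT share and the SoT conversion rate."""
    sot_share = shots_on_target / shots
    return sot_share * sot_conversion

# Rough per-match figures quoted in the text (both teams combined).
sot_share = 9 / 26
goals_per_shot = implied_goals_per_shot(26, 9, 0.30)

print(f"SoT share of all shots: {sot_share:.1%}")
print(f"Implied goals per shot: {goals_per_shot:.1%}")
```

With these inputs the implied conversion comes out at a little over 10%, which sits at the low end of the 10-14% range for all shots, so the rough numbers at least hang together.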
What Drives Shots on Target?
Two things: talent and systems.
Teams with more talented players tend to produce more Shots on Target.
Example: Take all of the players from Barcelona, and transport them to Stoke. Make them work with Tony Pulis for the summer and then set them loose on the Premiership, playing "The Stoke Way." I guarantee you they will end up producing more shots and SoT than the current Stoke lineup. They might also fall into a deep, dark depression and want to quit football entirely, but that's neither here nor there.
(Note to the good people of Stoke: that was a comment on Tony Pulis and not a judgment about living in Stoke. If you disagree, please send all hate mail to @iainmacintosh on Twitter, as he's handling my PR for this issue.)
As for systems, at this point most analysts take it for granted that certain systems are better at producing shots than others. Systems that use intricate passing and movement to create chances inside the 18-yard box are more likely to generate Shots on Target than those that regularly have players taking pot shots from 25 yards out.
Example: Take the above thought experiment and flip it on its head. Regardless of who he signs for, kidnap Pep Guardiola in June and give him the summer to implement the Barcelona system with all of the current Stoke players. This system would almost certainly produce far more shots and SoT than what Pulis has implemented at Stoke. They probably won't win the league, but there's a very strong chance they will score more goals than they currently do.
Methodology
With these two things in mind, I looked at all the data from the English Premier League, La Liga, Bundesliga, and Serie A from 2009-10 through the end of the 2011-2012 season. One of the first things I wanted to examine is: what percentage of shots result in a shot on target? The reason for this is that I wanted to build a kind of baseline or "shots on target par" (SoTPar) to compare how teams are doing at creating shots on target in current seasons versus historical norms. [Note: this is all the complete season data I had access to at the time].
Shots on Target Par
Average across leagues: 34.2%
Average for EPL: 32%
Average for La Liga: 36%
Average for Bundesliga: 36%
Average for SerieA: 33%
I did not expect the EPL, the league that spends the most money on player salaries and transfers by far, to have the lowest percentage of shots on target. Stoke were below 30% in all of the completed seasons I looked at, but that isn't enough to drag the entire league average down. Thinking that perhaps Spain's average was inflated by exceptional Barcelona and Real Madrid teams, I took them out of the average.
Spain minus Barcelona and Real: 35.1%
Well then. Despite smaller budgets, teams all the way up and down the table in Spain and Germany simply do a better job at putting shots on target than their counterparts in England. Interesting. Is there something particular to England and Italy that makes it harder to get Shots on Target (better defense? crappy weather?), or are the players and systems in Spain and Germany simply more adept at producing them? Or is this just random statistical noise? I don't know, but I do like asking questions!
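For anyone who wants to repeat the "take the giants out" check, here is a minimal Python sketch. The rows are a handful of La Liga team-seasons copied from the tables in this article, not the full league data, so the output will not reproduce the 36% and 35.1% figures. The helper name `sot_par` and the choice of a plain unweighted mean of team-season percentages are my assumptions; pooling all shots before dividing would be a reasonable alternative.

```python
from statistics import mean

# (season_end, team, shots_per_game, sot_per_game) for a few La Liga rows
# taken from the tables in this article. This is NOT the full league data set.
LA_LIGA_SAMPLE = [
    (2010, "Barcelona", 15.6, 6.7),
    (2011, "Barcelona", 15.8, 7.3),
    (2012, "Barcelona", 16.5, 7.6),
    (2011, "Real Madrid", 19.1, 8.0),
    (2012, "Real Madrid", 19.3, 7.9),
    (2010, "Villarreal", 12.7, 5.7),
    (2010, "Espanyol", 12.2, 3.5),
]

def sot_par(rows, exclude=()):
    """Unweighted mean of team-season SoT%, skipping any team in `exclude`."""
    pcts = [sot / shots for _, team, shots, sot in rows if team not in exclude]
    return mean(pcts)

with_giants = sot_par(LA_LIGA_SAMPLE)
without_giants = sot_par(LA_LIGA_SAMPLE, exclude={"Barcelona", "Real Madrid"})

print(f"Sample par, all rows: {with_giants:.1%}")
print(f"Sample par, minus Barcelona and Real: {without_giants:.1%}")
```

Because the sample is skewed toward the extremes of the league, the drop it shows is much larger than the one in the full data; the point is only the mechanics of recomputing the average with selected teams removed.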
While we're here, let's look at the league averages of Shots Per Game across the countries as well.
Average SPG Across Countries: 13.4
Average SPG for EPL: 14.4
Average SPG for La Liga: 13
Average SPG for Bundesliga: 12.9
Average SPG for SerieA: 13.5
Not much variation there. English teams seem to play at a slightly faster pace overall, but the other three leagues are locked together in a tight little bundle.
The next thing I wanted to look at is the difference between good teams and bad teams. Intuitively, you would think teams qualifying for the Champions League would have a better SoT% than teams who were relegated, but obviously the point of all of this is to ignore what intuition might suggest and examine the data.
Average SoT% for Champions League teams: 37.2%
Average SoT% for relegated teams: 32.3%
That's a fairly large difference and one that lines up with what your intuition tells you. Create shots, put lots of them on target, and you can qualify for the Champions League. Fail to do so, and you may end up facing relegation. It's also interesting to note that the average for relegated teams across all the leagues is higher than the average for all teams in the EPL. I don't even have a guess for why this is the case, so I am currently willing to listen to any and all possible explanations.
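The grouping behind that comparison is simple enough to sketch in Python. The rows below are copied from the Top 10 and Bottom 10 tables in this article, so they are the extremes rather than a fair sample, and the gap they produce is far wider than the 37.2% versus 32.3% from the full data. The function name and row layout are my own assumptions.

```python
from collections import defaultdict
from statistics import mean

# (team, season_end, finish, sot_pct) rows copied from this article's tables.
# "CL" = qualified for the Champions League, "R" = relegated.
ROWS = [
    ("Barcelona", 2011, "CL", 0.462025),
    ("Bayern Munich", 2010, "CL", 0.431507),
    ("Real Madrid", 2012, "CL", 0.409326),
    ("Schalke 04", 2012, "CL", 0.408759),
    ("Cesena", 2012, "R", 0.275),
    ("Portsmouth", 2010, "R", 0.285714),
]

def mean_sot_by_finish(rows):
    """Average SoT% for each finish label found in `rows`."""
    groups = defaultdict(list)
    for _team, _season, finish, sot_pct in rows:
        groups[finish].append(sot_pct)
    return {finish: mean(values) for finish, values in groups.items()}

averages = mean_sot_by_finish(ROWS)
gap = averages["CL"] - averages["R"]

for finish, value in averages.items():
    print(f"{finish}: {value:.1%}")
print(f"Gap: {gap:.1%}")
```

Run against every team-season rather than this cherry-picked handful, the same function would give the two averages quoted above.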
Obviously the attacking stats are only one half of the story, which is why this type of analysis is always going to have some holes (and is inherently more biased than TSR). Looking at the data, there were four teams who produced Champions League levels of SoT% and were still relegated or, in Borussia M.Gladbach's case, dragged into a relegation playoff:
2012 Cologne (only 8.9 SPG, 37% SoT%)
2011 Hercules (10.5 SPG, 39% SoT%)
2012 Villarreal (12 SPG, 38.3% SoT%)
2011 Borussia M.Gladbach (12.8 SPG, 39% SoT%)
All of them presumably shipped a lot of goals. Teams aren't playing solitaire - they also have to stop the opposition from scoring. Presumably the best way to do this is by stopping the opposition from getting many Shots on Target, but that type of data isn't publicly available at the moment.
How Consistent Is This Stat?
A more rigorous analysis should be done in this area, but eyeballing the data, I would say surprisingly so. Teams like Barcelona, Bayern Munich, and Arsenal have been able to sustain a high SoT% year after year (Bayern doing so despite managerial changes).
SeasonEnd | Team | Shots pg | Shots OT pg | SoT%
2010 | Barcelona | 15.6 | 6.7 | 0.429487
2011 | Barcelona | 15.8 | 7.3 | 0.462025
2012 | Barcelona | 16.5 | 7.6 | 0.460606
2010 | Bayern Munich | 14.6 | 6.3 | 0.431507
2011 | Bayern Munich | 15 | 6 | 0.4
2012 | Bayern Munich | 15.7 | 6.3 | 0.401274
2010 | Arsenal | 17.4 | 6.1 | 0.350575
2011 | Arsenal | 17.2 | 6.3 | 0.366279
2012 | Arsenal | 16.8 | 6.2 | 0.369048
But obviously there are teams that show significant variation as well, so I don't know what to think. Let's just say there's more work to be done:
SeasonEnd | Team | Shots pg | Shots OT pg | SoT%
2010 | Hoffenheim | 13.5 | 3.9 | 0.288889
2011 | Hoffenheim | 14.1 | 5.1 | 0.361702
2012 | Hoffenheim | 13.4 | 4.3 | 0.320896
2010 | Juventus | 15.3 | 5.1 | 0.333333
2011 | Juventus | 15.1 | 4.8 | 0.317881
2012 | Juventus | 19.1 | 6.9 | 0.361257
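To put a rough number on "consistent" versus "varied", here is a small Python sketch that measures each team's season-to-season spread using only the SoT% values from the two tables above. The helper name `season_spread` and the choice of max-minus-min (with a population standard deviation printed alongside) are my assumptions, and three seasons per team is far too few for anything conclusive.

```python
from statistics import pstdev

# SoT% by season (2010, 2011, 2012), copied from the two tables above.
SOT_PCT = {
    "Barcelona": [0.429487, 0.462025, 0.460606],
    "Bayern Munich": [0.431507, 0.4, 0.401274],
    "Arsenal": [0.350575, 0.366279, 0.369048],
    "Hoffenheim": [0.288889, 0.361702, 0.320896],
    "Juventus": [0.333333, 0.317881, 0.361257],
}

def season_spread(values):
    """Gap between a team's best and worst SoT% season."""
    return max(values) - min(values)

spreads = {team: season_spread(values) for team, values in SOT_PCT.items()}
ranked = sorted(spreads, key=spreads.get)  # steadiest team first

for team in ranked:
    sd = pstdev(SOT_PCT[team])
    print(f"{team:15s} range={spreads[team]:.3f} sd={sd:.3f}")
```

On these five teams the best-to-worst gap runs from just under 2 percentage points (Arsenal) to just over 7 (Hoffenheim), which fits the eyeball impression while also showing that the "steady" teams are not perfectly flat either.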
Top 10
SeasonEnd | Team | Shots pg | Shots OT pg | SoT% | Finish | League
2011 | Barcelona | 15.8 | 7.3 | 0.462025 | CL | LaLiga
2012 | Barcelona | 16.5 | 7.6 | 0.460606 | CL | LaLiga
2010 | Villarreal | 12.7 | 5.7 | 0.448819 | | LaLiga
2010 | Bayern Munich | 14.6 | 6.3 | 0.431507 | CL | Bundesliga
2010 | Barcelona | 15.6 | 6.7 | 0.429487 | CL | LaLiga
2012 | Bayer Leverkusen | 12.9 | 5.5 | 0.426357 | | Bundesliga
2011 | Real Madrid | 19.1 | 8 | 0.418848 | CL | LaLiga
2012 | VfB Stuttgart | 13.9 | 5.7 | 0.410072 | | Bundesliga
2012 | Real Madrid | 19.3 | 7.9 | 0.409326 | CL | LaLiga
2012 | Schalke 04 | 13.7 | 5.6 | 0.408759 | CL | Bundesliga
Bottom 10
SeasonEnd | Team | Shots pg | Shots OT pg | SoT% | Finish | League
2012 | Stoke | 9.9 | 2.5 | 0.252525 | | EPL
2010 | Wolverhampton Wanderers | 11.5 | 3.1 | 0.269565 | | EPL
2012 | | 14.2 | 3.9 | 0.274648 | | EPL
2012 | Cesena | 12 | 3.3 | 0.275 | R | SerieA
2011 | Catania | 13.7 | 3.8 | 0.277372 | | SerieA
2010 | Stoke | 10.6 | 3 | 0.283019 | | EPL
2012 | Parma | 13.7 | 3.9 | 0.284672 | | SerieA
2010 | Portsmouth | 14 | 4 | 0.285714 | R | EPL
2010 | Espanyol | 12.2 | 3.5 | 0.286885 | | LaLiga
2011 | Genoa | 14.2 | 4.1 | 0.288732 | | SerieA
Conclusion
I hope you've enjoyed this different way of looking at the publicly available shot stats teams produce. It certainly doesn't replace TSR (not least because the data for Shots on Target Conceded isn't available), but it adds an extra wrinkle to Total Shot Ratio analysis and provides a lot of potential avenues for further exploration. Despite lagging considerably in the TSR metric, both Barcelona and Manchester United are running away with their leagues by putting a high percentage of the shots they create on target (and, it should be said, past the goalie).
Follow-Up Questions
How does SoTPar change as the quality of the league changes?
To look at this, one would need access to solid data sources from lower leagues, which aren't publicly available at this time.
Can you use this type of metric to help scout players?
Given my belief that talent is one of the keys to creating more shots on target, can you use this metric to find talent in places you didn't expect?
Example: Borussia M.Gladbach had to fight a relegation playoff in 2011 despite having a shots on target percent of 39! The next season they qualified for the Champions League (posting a SoT% of 38.4%). Standouts from that club include excellently-coifed Dante (currently starting for Bayern Munich), Roman Neustadter (now at Schalke), and the amazing Marco Reus (who plays for Borussia Dortmund).
Mid- and lower-table teams that heavily beat the league average SoT% might be a good place to look for undiscovered talent.
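A first pass at that scouting filter could look like the Python sketch below. The league pars and team rows are taken from figures quoted in this article; the function name, the row layout, and the 2-percentage-point `margin` are hypothetical choices of mine, not a tested threshold.

```python
# League SoT pars from the text, as fractions.
LEAGUE_PAR = {"EPL": 0.32, "LaLiga": 0.36, "Bundesliga": 0.36, "SerieA": 0.33}

# (season_end, team, league, finish, sot_pct), using figures quoted in the article.
# finish: "CL" = Champions League, "R" = relegated, "" = anything else.
TEAM_SEASONS = [
    (2011, "Borussia M.Gladbach", "Bundesliga", "", 0.39),  # survived via playoff
    (2011, "Hercules", "LaLiga", "R", 0.39),
    (2012, "Cologne", "Bundesliga", "R", 0.37),
    (2012, "Stoke", "EPL", "", 0.252525),
    (2011, "Barcelona", "LaLiga", "CL", 0.462025),
    (2012, "Villarreal", "LaLiga", "R", 0.383),
]

def scouting_candidates(rows, pars, margin=0.02):
    """Non-CL team-seasons whose SoT% beats their league par by at least `margin`."""
    picks = []
    for season, team, league, finish, sot_pct in rows:
        if finish != "CL" and sot_pct - pars[league] >= margin:
            picks.append((season, team))
    return picks

candidates = scouting_candidates(TEAM_SEASONS, LEAGUE_PAR)
for season, team in candidates:
    print(season, team)
```

On this small sample the filter flags Gladbach, Hercules, and Villarreal, and skips Cologne, who beat the Bundesliga par by only about a point. Whether the flagged squads actually contain bargains is exactly the open question.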
If you were to follow a particular manager from place to place, how varied would the SoT% their teams produce be from season to season?
Obviously I'm curious about what stats Guardiola's next team produce, but this type of analysis could be done for any number of highly-travelled managers, especially if the pool of historical data were deeper. As I said above, talent levels also have a big impact on SoT% production, but this would be an interesting way to look at presumably similar systems implemented with entirely different sets of talent.
If the data were available, I'd love to look at how far away shots and shots on target actually come from and their effect on goal probabilities across leagues.
11Tegen11 did some of this work for one EPL season, but it deserves a larger data set and much deeper examination.