It's been far too long since I took a real shot at useful statistical analysis, so I dedicated my afternoon and evening to looking at Chelsea's passing in 2010/11. This isn't a superficial analysis - I basically dug as deeply as I could without going into player-by-player comparisons. What I wanted to do was see whether passing mattered in terms of goals scored, goals conceded, and differential (i.e. points), and whether it mattered more from certain positions. Turns out yes, it matters, but the amount it matters varies wildly by position. If graphs are your thing, dive in. I think you'll find this post... illustrative.
Yeah, I went there.
Anyway, methodology. I went through all of Chelsea's chalkboards from the Guardian this season and looked at passing for Chelsea only, then split the passing values (i.e. passes attempted, completed, and assists) up by position. My familiarity with the position each player filled in every match this year allowed me to do this reasonably quickly with minimal cross-referencing, but it also made working on other teams and other seasons impractical. Players were split into goalkeepers, defenders, midfielders, and forwards, and then I divided the number of passes attempted and completed in each category by the number of players who fit into it. This of course keeps the numbers comparable when we switch from 4-3-3 to a 4-4-2 diamond and back, as in recent weeks.
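For anyone who'd like to try the same split on their own numbers, here's a rough sketch of the per-player averaging described above. To be clear, this isn't the spreadsheet work I actually did: the function name `per_player_passing`, the tuple layout, and every number in `example_match` are invented purely for illustration.

```python
from collections import defaultdict


def per_player_passing(players):
    """Average one match's passing by position.

    `players` is a list of (position, passes_attempted, passes_completed)
    tuples, one per player in that match (a hypothetical layout).
    """
    totals = defaultdict(lambda: {"players": 0, "attempted": 0, "completed": 0})
    for position, attempted, completed in players:
        group = totals[position]
        group["players"] += 1
        group["attempted"] += attempted
        group["completed"] += completed

    summary = {}
    for position, group in totals.items():
        count = group["players"]
        summary[position] = {
            # Dividing by the head count keeps formations comparable.
            "attempted_per_player": group["attempted"] / count,
            "completed_per_player": group["completed"] / count,
            # Completion rate is undefined if nobody attempted a pass.
            "completion_rate": (
                group["completed"] / group["attempted"]
                if group["attempted"] else None
            ),
        }
    return summary


# Invented numbers for a 4-3-3, purely for illustration (not real match data).
example_match = [
    ("goalkeeper", 20, 10),
    ("defender", 40, 30), ("defender", 60, 50),
    ("defender", 50, 45), ("defender", 50, 35),
    ("midfielder", 50, 45), ("midfielder", 30, 27), ("midfielder", 40, 36),
    ("forward", 20, 15), ("forward", 30, 21), ("forward", 10, 6),
]

for position, values in per_player_passing(example_match).items():
    print(position, values)
```

Because everything is divided by the number of players in the category, a match with four midfielders and one with three end up on the same per-player scale.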
Just a quick note before I go on... while I'd have liked to have made this an exhaustive study of all Premier League teams back to 2006, I simply didn't have the time to gather the data manually while figuring out each team's formation at all times. As a result, this study suffers from a relatively small sample size - we have data from Chelsea's current season and that's it. All conclusions should therefore be taken with a grain of salt.
Let's take a look at what we find:
Figure 1: Chelsea's 2010/11 pass completion rate by date (click for full size).
Not much of note there, right? I can see some patterns in there, but I think that's only because I've already gone through with the data analysis. I also can't spot any clear trend in passes attempted or passes completed per player, so we'll have to do some tinkering to make this all work out. Let's try looking at some different graphs:
Figure 2: Chelsea's 2010/11 goals scored vs. pass completion rate (click for full size).
Figure 3: Chelsea's 2010/11 goals conceded vs. pass completion rate (click for full size).
Note that goalkeeper pass completion rate seems to have a negative relationship with goals scored, while midfield completion rate has a positive relationship. We see a similar but reversed trend in goals conceded, with higher goalkeeper completion percentage correlating with more goals conceded by the team and a higher midfield completion rate going hand in hand with fewer. Obviously, one of these findings is easy to explain, and one is not. We'd expect the goal differential to show a slightly more pronounced effect, and, yep, it does:
Figure 4: Chelsea's 2010/11 goal differential per game vs. pass completion rate (click for full size).
Ok, all of that is nice. We can see that there's a relationship in here somewhere, and it's pretty clear that the most interesting area is the midfield. Let's turn to a mathematical rather than visual approach to extract the rest of the information now. We can do this by finding the correlation coefficient, which basically measures how related two variables are.
If the coefficient is close to zero, there's no relationship; above zero, there's a positive relationship (more of one is more of the other); and below zero, there's a negative relationship (more of one is less of the other). Remember, of course, that correlation does not mean causation, but it does hint at it. Let's take a look at what we get when we run the requisite algorithms...
Figure 5: Correlation coefficients between goals and passing values for Chelsea 2010/11.
The correlations are colour coded, red for negative and green for positive, and the shade is dependent on the strength of the relationship (lighter is weaker). Hopefully the chart above is clear enough.
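If you want to check the arithmetic behind a chart like Figure 5, here's a minimal sketch of a correlation coefficient calculation. I'm assuming the standard Pearson coefficient here, which runs from -1 to 1. The function name `pearson_r` and the example lists are made up for illustration and are not the real Chelsea data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations, and sums of squared deviations.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_dev_x = sum((x - mean_x) ** 2 for x in xs)
    sq_dev_y = sum((y - mean_y) ** 2 for y in ys)

    if sq_dev_x == 0 or sq_dev_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")

    return co_dev / math.sqrt(sq_dev_x * sq_dev_y)


# Invented per-match values, purely for illustration (not real match data).
example_midfield_pc = [0.78, 0.82, 0.85, 0.88, 0.91]
example_goals_scored = [0, 1, 1, 2, 4]

print(pearson_r(example_midfield_pc, example_goals_scored))
```

One value per match goes into each list, so a pairing like midfield completion rate against goals scored gives a single coefficient, and each cell of the chart would be one such pairing.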
Some points of interest:
- The five strongest relationships all come down to midfield passing. This strongly implies to me that a midfield capable of moving the ball around is the single most important aspect of passing. Obviously, further work must be done here to verify what I've come up with, but if the midfield's passing well, the team scores goals and doesn't concede them. This is probably to do with avoiding being hit on the transition - if you can keep the ball in the midfield you can press home the attack, but lose it and the opposition can break at speed. Show those numbers to Jose Mourinho and he would be about as surprised as a camel seeing sand, I'd imagine.
- The more passes the forwards take and complete, the fewer goals conceded. This, I think, is pretty obvious. The more time the forwards are on the ball, the less time the other team is attacking. What's more curious is how weak the correlation between forward PC% and goalscoring is - turns out your forwards don't need to be passing particularly well to score goals.
- Defenders don't help you score, but they help you ... defend. This should be fairly obvious too. Defender pass completion rate has a pretty strong negative correlation with goals conceded, which makes a lot of sense considering the areas defenders pass in. Make a mistake there and you immediately initiate an opposition attack.
- Goalkeeper passing has very little to do with scoring or conceding. Huh. I'd guess this is one of those 'randomness' things. Good to know, though!
Bear in mind, of course, my earlier caveats about sample size, but this is intriguing data, at least to me. It strongly implies that we should be ignoring Petr Cech's distribution but paying significantly more attention to how well our midfield is passing, and also helps defend John Obi Mikel and the like from accusations of uselessness. I think it illustrates just where the most important zones for ball control are, too. There's a good reason that the centre of the pitch is the area in which the most decisive passing happens - it's because it's the single most vital area of the football field.
So, there you go. For Chelsea, at least, midfielder passing matters a lot on both sides of the ball, goalkeeper passing matters not at all, strong defensive passing stops you from conceding, and forward passing doesn't really matter as long as the ball is far, far away from your goal. I think that's enough on passing for the day, but I strongly suspect that the data presented above is repeatable across multiple seasons and teams, and that central midfield passing is always always always of critical import to a team's success. Good to know, eh?