Monday, April 27, 2009

2008-09 Shot Quality

Behind the Net has posted shot quality data for goalies in 2008-09. I am particularly interested in these numbers, as they are based on the same ESPN data I am collecting for the playoffs. As I have been discussing lately, there is still room for improvement with these kinds of metrics, but they are better than the raw data and at least provide some evidence of the actual level of performance of NHL goalies.

With the Vezina finalists due to be announced today, I had more or less settled on Tim Thomas, Tomas Vokoun and Roberto Luongo as my top 3, and this is pretty good evidence for that viewpoint. All three of them were terrific at even-strength this year. My pick for the most underrated goalie in the league, Kari Lehtonen, also does very well. Henrik Lundqvist ranks highly, although I'm not sure whether we can trust his numbers or not as Madison Square Garden's reporting biases are well-known.

Evgeni Nabokov, Miikka Kiprusoff, and Niklas Backstrom don't look all that good after adjusting for shot quality. Both Nabokov and Kiprusoff have about average shot quality against, so they have little excuse for their relatively unimpressive save percentages, and Backstrom faced the easiest shots against in the league.

Cam Ward and Steve Mason are two others who could get some voting support. They wouldn't be bad choices, and would both be preferable to any of the above three, but I don't think they would be the best picks either.

Regardless, the voting results for this year's Vezina Trophy and First Team All-Star goalie should be very interesting and will show us what the GMs and the media value in evaluating goalies.

Saturday, April 25, 2009

Shooting High

I am tracking shots based on ESPN's shot charts for the playoffs, but I have also been regularly checking out CBS Sportsline's shot charts because they add another interesting piece of information: Where the shot was targeted on net. For each shot they record one of five target zones: High glove, high blocker, low glove, low blocker, or 5-hole.

I haven't compiled any numbers other than for this year's playoffs, but Hockey Numbers did a post a while back that gives the CBS data broken down by shot location for the 2006-07 season. There seemed to be a lot of shots missing in that sample as the average save percentage was much lower than the official NHL numbers, but the results still conclusively prove what all hockey players know: That shooters are much more likely to score if they shoot high.

It doesn't seem to matter much where a shot is targeted on net from right to left (e.g. high glove and high blocker are about the same); the key difference is whether the shot is high or low. There are of course some goalies who are better on the blocker side or on the glove side, and all goalies probably have slightly less success on five-hole shots than shots that are low to either side, but those differences are small compared to the difference between top shelf and along the ice.

The 2006-07 data has a save percentage difference of .054 between high shots and low shots. This year in the playoffs, there has been a .942 save percentage on low shots and an .872 save percentage on high shots, for an even more extreme gap of .070. Those differences make a pretty good case that shot height should be included in measurements of shot quality.

As is unfortunately often the case with real-time stats, however, the CBS reporting system is pretty suspect. They seem to have fixed the earlier problem of missing shots, but there are large variances in high shot frequency from rink to rink. Here are the stats by series:

BOS vs. MTL: .917 low, .857 high, 27% high shots
WSH vs. NYR: .946 low, .868 high, 19% high shots
NJD vs. CAR: .961 low, .907 high, 41% high shots
PIT vs. PHI: .961 low, .800 high, 14% high shots
SJS vs. ANA: .926 low, .933 high, 19% high shots
DET vs. CBJ: .919 low, .840 high, 20% high shots
VAN vs. STL: .961 low, .875 high, 16% high shots
CHI vs. CGY: .928 low, .845 high, 35% high shots
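To make the rink-to-rink variation easier to see, here is a minimal Python sketch that computes the low-minus-high save percentage gap for each series. The figures are copied from the list above; the names `SERIES` and `high_low_gap` are just illustrative labels. Since the shot counts per series are not given here, the sketch does not attempt to pool the series into an overall average.

```python
# Per-series figures copied from the list above:
# (save pct on low shots, save pct on high shots, share of shots recorded as high)
SERIES = {
    "BOS vs. MTL": (0.917, 0.857, 0.27),
    "WSH vs. NYR": (0.946, 0.868, 0.19),
    "NJD vs. CAR": (0.961, 0.907, 0.41),
    "PIT vs. PHI": (0.961, 0.800, 0.14),
    "SJS vs. ANA": (0.926, 0.933, 0.19),
    "DET vs. CBJ": (0.919, 0.840, 0.20),
    "VAN vs. STL": (0.961, 0.875, 0.16),
    "CHI vs. CGY": (0.928, 0.845, 0.35),
}


def high_low_gap(low_sv, high_sv):
    """Save percentage on low shots minus save percentage on high shots."""
    return round(low_sv - high_sv, 3)


gaps = {name: high_low_gap(low, high) for name, (low, high, _) in SERIES.items()}

# Series with the largest and smallest recorded share of high shots.
most_high = max(SERIES, key=lambda name: SERIES[name][2])
fewest_high = min(SERIES, key=lambda name: SERIES[name][2])

for name, (low, high, share) in SERIES.items():
    print(f"{name}: gap {gaps[name]:+.3f}, {share:.0%} of shots recorded high")
print(f"Most high shots recorded: {most_high}; fewest: {fewest_high}")
```

A negative gap, as in the San Jose/Anaheim series, means high shots were stopped more often than low shots in that sample.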

We would expect high shots to be correlated with space on the ice. The more time and space a shooter has, the more likely he is going to be able to shoot high. I wouldn't be surprised if a tight-checking series like Vancouver/St. Louis had a below-average number of high shots. However, even in that particular series the number of actual high shots is almost certainly understated. In game 4 in St. Louis every single one of Roberto Luongo's 49 shots against was recorded as a low shot, which is extraordinarily unlikely and seems to be merely a case of an indifferent scorekeeper. I observed several other similar games where all or nearly all of the shots were booked as low shots.

The scorers in New Jersey and Carolina appear to be the opposite, booking too many shots as high. The high shot rate in that series is almost double that of all the other series combined, and the save percentage against high shots is much higher than average. The Calgary series also has an abnormal ratio, but the save percentage on high shots is just .845. Either the shooters are really managing to go upstairs that often, or else Kiprusoff and Khabibulin are doing a very poor job of handling high shots.

In most series both goalies have faced a pretty similar number of high shots. One series stood out by having a large gap between the teams, Detroit's 4 game sweep of Columbus. Based on shot distances and shot locations, it looks like Columbus allowed easier shot quality against than Detroit. Over half (54%) of the shots against Steve Mason came from the point or were perimeter shots, while the same areas accounted for just 37% of the shots against Chris Osgood. As a result, Detroit's outshooting advantage is counterbalanced by a longer than average shot distance. I have Detroit with an expected goals figure just 0.3 ahead of Columbus for the series. That is similar to the shot quality results at Hockey Numbers (Detroit +0.5).

When we consider where the shots were targeted, it becomes a different story. The rate of high shots against Steve Mason was twice as high as the rate against Chris Osgood (26% to 13%). If we adjust only based on the average save percentages for low and high shots, that means we would expect Osgood to have the easier job with a .933 expected save percentage compared to .924 for Mason. If we recalculate the expected goals based on those save percentages, Detroit would have been expected to score 3.4 more goals than Columbus over the 4 game series.
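The height-only adjustment is a weighted average of the two playoff save percentages quoted earlier (.942 on low shots, .872 on high shots). A minimal Python sketch of that arithmetic follows; the function name `expected_save_pct` is illustrative. The expected goals step is not reproduced because it needs each goalie's shot totals, which are not listed in this post.

```python
# Playoff-wide averages quoted earlier in the post.
LOW_SV = 0.942   # save percentage on low shots
HIGH_SV = 0.872  # save percentage on high shots


def expected_save_pct(high_share, low_sv=LOW_SV, high_sv=HIGH_SV):
    """Expected save percentage given the share of shots aimed high.

    Assumes shot height is the only thing that matters, i.e. every goalie
    stops low and high shots at the league-average rates.
    """
    if not 0.0 <= high_share <= 1.0:
        raise ValueError("high_share must be between 0 and 1")
    return (1.0 - high_share) * low_sv + high_share * high_sv


osgood = expected_save_pct(0.13)  # 13% of shots against recorded as high
mason = expected_save_pct(0.26)   # 26% of shots against recorded as high
print(f"Osgood: {osgood:.3f}  Mason: {mason:.3f}")
```

Rounded to three decimals this gives .933 for Osgood and .924 for Mason, matching the figures above.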

What seems possible is that even though Columbus' shooters were getting into good shooting locations, they were mostly shooting under pressure from defenders. In contrast, Detroit was setting up more open shots, which allowed their shooters to snipe up high against Mason. I must confess I wasn't able to catch any of the games of that series, however, so if you did follow that series then feel free to comment on whether you believe the shot quality figures (both based on ice location and target location) seem correct.

If we adjust for both where the shots were coming from and where they were targeted, based on playoff averages so far, I estimate that Mason had a .933 expected save percentage while Osgood was at .922.

We can use those figures to conclude that the much-maligned Chris Osgood did surprisingly well in round 1, but it wasn't a very good playoff debut for Steve Mason. The stats suggest that the likely Calder Trophy winner had the worst overall performance of any goalie in the playoffs, although at least Jose Theodore ranks below him on a per-game basis.

I think it is pretty evident that shot quality would be improved if the target location of the shot was accurately tracked. By combining that information with where the shot was coming from on the ice, it should be possible to get a more accurate scoring probability. Unfortunately CBS Sportsline's tracking system seems to be too untrustworthy to be useful at the moment. To evaluate all goalies on a level playing field it is necessary to standardize the reporting to remove or at least drastically reduce rink reporting bias.

Friday, April 24, 2009

Football, Hockey, and Player Development

Football Outsiders' Mike Tanier wrote an excellent article on drafting vs. player development (scroll down past the mock draft satirizing to get to the real meat of the post). He echoes my thoughts on the topic by pointing out that there are many post-draft variables that sculpt a player's future career success.

If that's true for 22 year old men with college degrees, then it is pretty likely to be even more true for 18 year old high school kids aspiring to play pro hockey.

And while we are comparing football to hockey, I thought of an interesting comparison: Brett Favre and Martin Brodeur. If career wins and shutouts trump all in the greatest of all-time debate, then doesn't holding the career wins and passing yards records make Brett Favre the greatest QB of all time?

Favre and Brodeur have more in common than holding their respective sports' career wins record. Both are distinguished first and foremost by durability, and have reputations for consistency. Neither has had any individual seasons that rank among the few greatest seasons ever - according to Aaron Schatz of Football Outsiders, Favre doesn't have a single season in the top 50 QB seasons ever, and he has just one in the top 80. Brodeur also has only a couple of seasons that rank among the best in save percentage compared to league average.

Despite this, both of them have won some major awards, with 3 MVPs for Favre and 4 Vezinas for Brodeur. They also have experienced team success, with 3 Stanley Cup rings and 1 Super Bowl ring between them.

As the groundswell of support for Brodeur as a best-ever candidate continues to grow, it might be wise to keep in mind how Brett Favre is ranked. A collaborative ESPN effort to rank the all-time top 10 QBs from a year ago ranked Favre 8th, and even after his retirement there is still debate about whether Favre is a top-5 guy.

It should be said that your individual evaluative criteria come into play when evaluating athletes who were distinguished more by durability and longevity than by peak play, like Brodeur and Favre were. However, it's just something to think about before automatically declaring the goalie with the most wins and shutouts to be the greatest ever.

Tuesday, April 21, 2009

The Value of Rebound Control

Dirk Hoag, who runs the excellent stats-focused blog On the Forecheck, has been measuring rebound shots from the NHL play-by-play logs over the last few years and was kind enough to provide me with a summary of rebound shots against by team for this season to date. I was particularly interested in this year's numbers since Martin Brodeur's absence should give us an idea of the impact of a goalie on rebounds against. You can't go very far on a New Jersey Devils blog or message board without running across someone either praising Brodeur's rebound control or trashing Scott Clemmensen's, so I think it is fair to say that one of the best at the skill was replaced for 40 games by one of the worst. Not to mention, Brodeur himself seems to think that rebound control is a pretty important skill. What was the difference for New Jersey in the mostly Marty-less 2008-09?

First, we need some context. The Forechecker has posted data online from each of the past several seasons (for example, here is a post from last March for the 2007-08 season). These data sets aren't usually complete, but they give us a pretty good sample to work with. Over the last three seasons, with Brodeur nearly always in net, The Forechecker has estimated the Devils as averaging 1.5, 1.4, and 1.4 rebound shots against per game. That is compared to a league average of just below 1.6, and is good enough to rank 7th best in the league over that time period.

New Jersey Devils, 2008-09: 1.22 rebound shots against per game, 4th fewest in the NHL

According to the NHL play-by-play data, New Jersey took out one of the best rebound controlling goalies in the league and put in one of the worst, and not only improved their numbers from previous years but ended up among the league leaders in fewest rebound shots against.

How is this possible? Based on this result and other evidence, I think rebound control is an overrated skill. That is not to say it is unimportant, just that it gets overly emphasized in terms of its impact on goals against. This is for the same reason that most non-save goalie skills are overrated: Because the rest of the team can compensate for it. Clemmensen's teammates knew there were going to be rebounds when he was in net, and they made sure the other team didn't get them. This shift happened without any apparent effect on the overall play - the Devils' rates of goals for, goals against, and shots against are very similar with Clemmensen and Brodeur in the net this season.

Rebound control is especially prone to being overemphasized because it is an obvious skill. You don't have to know much about goaltending technique to know whether a goalie is controlling his rebounds, you just need to watch where the puck ends up after it hits him. Lots of people who aren't goalies can speak knowledgeably about a goalie's rebound control. The problem comes when they focus too much on where the puck goes after a shot and not enough on whether or not it went in the net to begin with. A goalie who stops a high percentage of shots but has a tendency to allow awkward rebounds (someone like Tim Thomas, for example) will be consistently underrated by people who rely on subjective evaluation.

Despite the focus on rebounds, there really aren't that many rebound shots per game. This season there have been just 1.43 rebound shots against per team per game. Not only are the totals low, but there is not a whole lot of difference between teams in rebounds allowed. Over the last 4 seasons from The Forechecker's numbers, Detroit allowed the fewest rebound shots (1.25 per game) while Florida allowed the most (2.00 per game). That is not particularly surprising, since Detroit allowed the fewest total shots (25.6) while Florida allowed the most (33.1). Note that the difference in overall shots (7.5) is 10 times as high as the difference in rebound shots (0.75). Clearly it would be a big mistake to attribute a difference in shots against between teams primarily to rebounds.

A better measure of rebound prevention is the percentage of shots against that are rebound shots. The best team in the league this year, Buffalo, has faced a rebound shot on just 3.6% of their shots against. The worst team in the league, Carolina, has seen a second chance opportunity on 6.1% of their shots. That is a gap of 2.5%, which is a typical gap between the best and the worst in any given season. Even if we assume that the entire difference is a result of goalie skill, for a team with 30 shots against that accounts for a difference of about 0.75 rebounds per game, which at a typical rebound scoring rate is somewhere around 0.18 goals per game. A difference of 0.18 goals per game is equivalent to a save percentage difference of .006. That is just for this season, where we would expect some more randomness in the results. Over the 4 year sample, the difference between the best and worst is just 1.7%.
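The chain from rebound rate to save percentage can be written out as a short Python sketch. The 3.6% and 6.1% rebound rates and the 30 shots per game come from the paragraph above. The rebound scoring rate of 0.24 is an assumption backed out from the 0.75 rebounds and 0.18 goals quoted there, not a separately measured figure, and the function name is illustrative.

```python
def rebound_gap_effect(best_rate, worst_rate, shots_per_game, rebound_goal_rate):
    """Translate a gap in rebound-shot rates into goals and save percentage.

    best_rate / worst_rate: share of shots against that are rebound shots.
    rebound_goal_rate: assumed share of rebound shots that go in.
    Returns (extra rebound shots per game, extra goals per game, save pct gap).
    """
    extra_rebounds = (worst_rate - best_rate) * shots_per_game
    extra_goals = extra_rebounds * rebound_goal_rate
    save_pct_gap = extra_goals / shots_per_game
    return extra_rebounds, extra_goals, save_pct_gap


# Buffalo (3.6%) vs. Carolina (6.1%), a 30-shot game, and an assumed
# rebound scoring rate of 0.24 (implied by 0.18 goals on 0.75 rebounds).
rebounds, goals, sv_gap = rebound_gap_effect(0.036, 0.061, 30, 0.24)
print(f"{rebounds:.2f} rebounds/game, {goals:.2f} goals/game, {sv_gap:.3f} save pct")
```

Note that shots per game cancels out of the last step: the save percentage gap is just the rebound-rate gap multiplied by the rebound scoring rate, so the result is only as good as that assumed rate.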

Attributing the entire gap to goalie skill would be a poor assumption, however, because it doesn't take into account team defence. A better defence will clear more rebounds, and a team that allows fewer dangerous scoring chances will make it easier for the goalie to control his rebounds. Here are the correlation coefficients between team shot quality against and rebound shots percentage:

2005-06: 0.39
2006-07: 0.09
2007-08: 0.42
2008-09: 0.30

That looks to be pretty good evidence of an underlying relationship. If we take that into account, then that .006 gap might shrink to .003 or .004, and that is between the best and the worst. That's not even taking into account how good the defence is at clearing rebounds. Once you include that in the equation, there likely isn't a whole lot of margin left.

Is it possible that there is some other effect of rebound control that I am missing? I brought up the possibility some time ago that goalies with good rebound control might be deterring shots, since opponents will be less likely to shoot from bad angles if they don't think they will be rewarded with either a goal or a rebound. That effect, if it exists, would be difficult to quantify.

These numbers are also subject to the limitations of NHL play-by-play data. If there is good reason to believe that more rebound shots and goals are taking place than are being counted, then we might have to revise the strength of some of these conclusions.

One thing that may be possible is that poor rebound control may be an indicator of a goalie who is not on his game. It would be interesting to see if goalies are more likely to allow goals in games where they allow multiple rebound shots against.

I am certainly not saying that goalies should ignore rebound control or not try to develop their skills in that area. Excellent rebound control is of course preferable to poor rebound control, and will help prevent goals against. One of the reasons that we don't see a lot of difference in things like rebound control at the NHL level is that goalies with particularly bad skills in that area would never make it there in the first place. However, we still need to be particularly careful to avoid making the mistake of letting the obvious nature of rebound control overly influence our evaluation of a goalie.

Wednesday, April 15, 2009

Shot Quality

ESPN's GameCast feature shows shot charts for every NHL game, which is quite useful to indicate the territorial play and to help evaluate goalie performance. I am a big fan of shot quality metrics like the one developed by Alan Ryder, but one of the limitations is that it reduces a team's shots against distribution to a single number. Even if we know that a team's shot quality against is 0.95, we don't know whether that is because they allow a lot of perimeter shots, or whether they are good at preventing in-close chances. I think this might be one of the reasons why some fans seem hesitant to trust the shot quality numbers.

As a result, I have developed my own system, using my typical brute-force approach. The objective is to achieve the dual goal of approximating scoring probabilities while also being able to qualitatively describe the type of chances a team gives up. Based on the scoring probability information given in this blog post from Behind the Net, I divided the rink up into 5 zones: Crease Area, Slot, Mid-Range, Point, and Perimeter. I used rink markings to divide the separate zones, which isn't exactly correct based on scoring percentages but becomes somewhat necessary for classifying chances. For the sake of time, everything is eyeballed except for chances from the crease area, which are defined as anything within 15 feet of the net inside the edges of the crease.

Here is the rink diagram:


The primary drawback of this technique is that it does not differentiate between power play and even-strength chances. I could split them out, but obviously it takes a bit of time to go through and count the chances so for now I'm lumping them all together. There are also certain rinks that seem to measure shot distances differently than everyone else (see this article by Alan Ryder), which means that it is not really fair to compare results directly between, say, goalies playing for the New York Rangers and goalies playing for the Tampa Bay Lightning. I'll have to work on some way to adjust for these discrepancies.

Just to be clear, I am not presenting this as an improvement on existing shot quality metrics. I hope the results will be similar, but I would certainly defer to other methods if there is a disagreement because they are much more precise in terms of identifying exact shot distance, shot type, etc. The primary reason for doing this is to get a better sense of the type of shots each goalie is facing. Is he facing a lot of shots from the perimeter? How well does his team cover the point shot? How many close-in chances does he face? And so on.

I don't have a large enough sample size to get high-confidence estimates of scoring likelihood from each area, but based on a sample from last year's playoffs the approximate average save percentages are:

Crease Area: .800
Slot: .850
Mid-Range: .925
Point: .960
Perimeter: .980
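For readers who want to apply these zone averages to a shot chart, here is a minimal Python sketch of the calculation. The zone save percentages are the approximate figures listed above. The shot counts in `sample_game` are hypothetical numbers made up for illustration, not real data from any game, and the function names are illustrative as well.

```python
# Approximate zone save percentages from last year's playoff sample (listed above).
ZONE_SAVE_PCT = {
    "Crease Area": 0.800,
    "Slot": 0.850,
    "Mid-Range": 0.925,
    "Point": 0.960,
    "Perimeter": 0.980,
}


def expected_goals(shots_by_zone):
    """Expected goals against: each shot counts for (1 - zone save pct)."""
    return sum(count * (1.0 - ZONE_SAVE_PCT[zone])
               for zone, count in shots_by_zone.items())


def expected_save_pct(shots_by_zone):
    """Expected save percentage for a given distribution of shots by zone."""
    total_shots = sum(shots_by_zone.values())
    if total_shots == 0:
        raise ValueError("need at least one shot")
    return 1.0 - expected_goals(shots_by_zone) / total_shots


# HYPOTHETICAL shot counts for one game, invented purely to show the usage.
sample_game = {"Crease Area": 2, "Slot": 6, "Mid-Range": 8, "Point": 7, "Perimeter": 7}

print(f"Expected goals against: {expected_goals(sample_game):.2f}")
print(f"Expected save percentage: {expected_save_pct(sample_game):.3f}")
```

Keeping the counts by zone, rather than collapsing them to one number, preserves the qualitative picture: two teams with the same expected save percentage can still differ in whether they give up crease chances or perimeter shots.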

I'm planning to use this method to break down this year's playoff results, as well as to take a closer look at Martin Brodeur vs. his teammates in my continuing look at how goalies contribute to shots against. Criticisms and suggestions are welcome in the comments or via email.

Tuesday, April 14, 2009

The Replacements, Final Comparison

At season's end, the best backup comparison sample for Brodeur since 1993-94 ends with the following breakdown:

Brodeur: 19-9-3, 2.42, .916, 28.8 SA/60
Backups: 32-18-1, 2.40, .918, 29.3 SA/60

I'll just let those numbers speak for themselves. It's not a huge sample size, but there was no major effect on either GAA or shots against without Brodeur in the lineup. Nor, for that matter, was there much difference in New Jersey's goal scoring or winning percentage with or without the "best goalie in the NHL."

I think the "Brodeur saves 5+ shots per game" crowd has been permanently and decisively disproven by this season's results. For Marty to play that kind of sample size and to face just 0.5 shots per game fewer than his teammates is pretty good evidence that the range of shots against is fairly narrow. I do think that the "true" shots against differential between Brodeur and Clemmensen is likely more than 0.5 shots per game, as Brodeur happened to be the man in net during New Jersey's late season swoon, but the results certainly are in line with my estimate of +/- 1 shot per game on average as the goaltender's effect.

I think there is a substantial difference between Brodeur and Clemmensen in skills like rebound control and puckhandling. However, there is not much of a net effect on the team as a whole because the rest of the skaters adjust their play at least somewhat to the particular goaltender. That explains why we don't usually see much of an impact in the shots against and GAA data from goalie to goalie.

I'm not going to go much farther than that in terms of assigning significance to these results. Brodeur is a better goalie than Scott Clemmensen or Kevin Weekes. The numbers may have been similar, but Brodeur faced more difficult shot quality against and likely was affected to some degree by his long injury layoff. I suppose this season supports my overall position on Brodeur, but it is by no means the main piece of evidence in the argument. The results do show fairly plainly that New Jersey is an easy place to play goal, but that should have been obvious to begin with.

Thursday, April 9, 2009

How Not To Pick Vezina Winners

I sincerely hope that the general managers and sportswriters who vote on the end-of-season awards do not think like Rocky Bonanno.

His picks for Vezina nominees: Niklas Backstrom, Evgeni Nabokov, and Miikka Kiprusoff.

In my book that's going 0 for 3, as those three rank 15th, 19th, and 33rd respectively in the latest shot quality neutral save percentage rankings at Hockey Numbers. If you are going to just rank goalies entirely based on wins, at least put somebody like Cam Ward in there who actually has been having a good season.

And then there is this example of impeccable logic:

Honorable mention also goes to Boston's Tim Thomas, but 52 games played doesn't bring him up to snuff.

That's right, Thomas, you should have demanded your team play you more games. Don't you know that there is some magical point at around the 55 game mark that makes you instantly become worthy of Vezina consideration?

Take Evgeni Nabokov, for instance, who has played 59 games this season. It shouldn't even be necessary to explain why 59 games played is so much more valuable than 52, but I'll give it a try for all of you non-NHL.com analysts out there who clearly don't understand goaltending.

Let's compare them through their first 52 games this season:

Nabokov: 36-8-7, 2.41, .911
Thomas: 34-11-7, 2.07, .933

Even though it looks like Thomas was by far the superior goaltender through the first 52 games, by playing more games Nabokov had a much harder workload. I mean, just look at the difference in the number of shots faced this year:

Shots against in 2008-09:
Evgeni Nabokov: 1,622
Tim Thomas: 1,621

Maybe Thomas would have let in his next shot, and then have his teammates score 33 own goals on him, dropping him behind Nabokov in save percentage. We just don't know, and that's why anything a goalie does in only 52 games is completely worthless.

Similarly, in his first 52 games this season Kiprusoff had fewer wins than Thomas did over the same span, as well as a 2.80 GAA and a .906 save percentage. In his extra games played over Thomas, Kipper has gone 12-8-1, 2.91, .895. To match Kiprusoff's season stats, Thomas would have only needed to win 11 out of his last 23 decisions, and put up a 4.78 GAA and an .805 save percentage.

Some of you might think that is pretty good evidence that Thomas has been a lot better than Kiprusoff. You would be wrong. Of course I do concede that the goalie currently leading the league in save percentage might not turn into by far the worst goalie in the history of the NHL over his next 23 games. That is possible, some might even say probable. But it could have happened. All we know is that Kiprusoff did play those extra games and Thomas didn't. Seventy-five is more than 52, and therefore Kipper has been a better goalie this season. End of story.