Wednesday, December 30, 2009

Book Review: HockeyNomics

Stats analysis has slowly but steadily crept its way into the world of professional sports over the last two decades, and hockey has been no exception. I'm not someone who needs to be convinced of the value of statistical analysis (obviously), but the numbers work done in the hockey mainstream continues to be pretty simplistic.

HockeyNomics is Darcy Norman's attempt to popularize and spread the idea of "study[ing] NHL stats based on science, not just opinion". I found the book to be a useful introduction to the world of statistical analysis in hockey. When I say "statistical analysis", I really mean "proper statistical analysis", because there are lots of fantasy hockey GMs out there who can quote lots of numbers that really say very little at all.

Norman quotes hockey analyst Alan Ryder in the book as saying, "Hockey is awash with meaningless and, even worse, misleading statistics" (p. 34). As a goalie analyst, I've run into lots of sportswriters who write screeds against judging goalies based on stats, and then fill out their Vezina ballot based on the league leaders in wins and shutouts. Similarly, points make up such a huge part of a skater's rating, even though the player's usage (situational ice time, defensive vs. offensive zone faceoffs) and situation (strength of linemates and opposition) can have a big impact on those numbers. And that's not even starting to discuss the player's defensive play. Many people seem to evaluate a player's penalty killing skill, for example, by the number of shorthanded goals they score, which is at best an awfully incomplete analysis.

Nobody watches every game, so even the most hard-core anti-numbers crowd is probably going to be influenced by at least some numerical evidence. The trick is convincing fans to focus on the good numbers (e.g. possession metrics, save %, rate stats) instead of the bad (scoring numbers, wins, career stats).

The book does not get into any really heavy mathematical work (Norman says a few times that doing so would be outside of his scope). The limited scope means that some parts had to be simplified. For example, the Ovechkin-Crosby debate was limited to a discussion of which player was likely to be more high-scoring over the course of their careers. A full analysis could have tried to factor in additional variables like linemates, ice time, or defensive play. Those who follow the likes of Vic Ferrari, Tyler Dellow, Gabe Desjardins or Puck Prospectus on a regular basis might want to see some deeper analysis or think that the book didn't go far enough in a few spots.

Norman discusses a number of interesting topics. I thought the Poisson modeling used to identify the best goalscoring season was a clever method. The typical way to adjust for era is to use the average goals scored per game to create an adjusted goals figure, but that doesn't seem to work in all cases since the scoring totals of the top forwards and the league average goals scored have not always followed each other in perfect lockstep. In some seasons in the 1980s league scoring was very high, yet other than Gretzky the top forwards in the league were only scoring around 110 points. In the early 1990s the league scoring level had dropped, but for a couple of seasons, 1992-93 and 1993-94, many of the top forwards put up terrific numbers.

The book also gets into various measures of drafting success. I personally feel that player development is more important than drafting, and that how teams develop their 18, 19 and 20 year old players should be considered as well. There is a high degree of consensus on high picks, so it seems very unlikely that we would see the observed divergence in terms of player performances based simply on what names get called out on draft day. Part of it may just be a matter of overall team-building strategy: Some teams rely heavily on drafted talent and are content to develop players at the NHL level, while others either have much less turnover or fill many of their roster spots with players that were developed elsewhere.

The last chapter is of particular interest, since it deals with the question of whether Martin Brodeur is overrated. I don't want to give too much away, but Norman quotes me a few times and that should tell you which side he ends up on. To be honest, a regular reader of this blog would be familiar with all the arguments presented in the chapter. It is important once again to point out that "overrated" simply means that a player is rated more highly than he deserves to be. Martin Brodeur is a unique athlete in terms of his durability, his playing style and his ability to contribute to his team in ways other than simply making saves. Even granting all of that, he is rated too highly: many fans consider him to be the best goalie ever, even though Brodeur is most likely the third best goalie of the last 20 years.

HockeyNomics gives a quick summary of the most relevant arguments, but I think there are a couple of issues that still need to be resolved to properly evaluate Brodeur. They include a more nuanced analysis of his overall non-save impact, properly assessing any scorer bias effects from playing in New Jersey, and improving shot quality metrics. My confidence in shot quality measures has been shaken somewhat over the last few months, but I'm still pretty sure that there is some signal in the noise. It may be only useful for analyzing outlier teams, but if anybody was an outlier in terms of defensive skill it surely was the New Jersey Devils.

HockeyNomics is a decent read for anyone interested in hockey, but if you want deep analysis or heavy number-crunching you might find it a bit light in spots. It might be a good gift idea for that friend or relative who either buys into just a bit too much of the conventional wisdom in hockey, or perhaps knows the numbers but tends to focus on the wrong ones.

Saturday, December 26, 2009

Why Do So Many Dumb Things Get Written About Chris Osgood?

The official website of the National Hockey League named Chris Osgood as the Second Team All-Star goalie on the NHL's All-Decade Team (hat tip to an anonymous poster on this blog for passing along the link).

That is so mind-boggling that it really shouldn't require much further comment for any rational individual. But if you want to see some numbers just to drill the point home, here are the GAAs of Chris Osgood and Roberto Luongo in both the regular season and the playoffs from 1999-00 to 2009-10:

Regular Season: Osgood 2.56, Luongo 2.56
Playoffs: Osgood 2.05, Luongo 2.09

Even if one was so crazy as to use goal prevention as the sole criterion, it's still barely possible to make the argument for Osgood over Luongo, when you consider that Luongo played in a lot more games. Think for more than a nanosecond about their respective teammates and team situations and it's completely obvious who is far better.

If you want to argue about the value of save percentages for someone like Martin Brodeur, who brings more to the table than just stopping pucks, that's fine, but this is Chris Osgood. Over the last decade Osgood's numbers read .912 at EV and .871 on the PK; Luongo's read .929 at EV and .887 on the PK. And we're talking about a sample size of over 13,000 shots for Osgood and nearly 18,000 shots for Luongo, i.e. there is no doubt at all that there is a massive gulf between the two goalies.

Multiply the save percentage differential over Luongo's workload, and you get a difference of 275 goals, or nearly 30 goals per season. That's the equivalent of something like 50-55 wins. A team would be 10 points better in the standings every year with Luongo in net than with Osgood. On a per-game basis, that's a difference of about half a goal per game. Yet somehow a difference in playoff winning percentage of .073 between the two goalies is enough to make up for all of that.
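The arithmetic behind those figures can be sketched roughly as follows. This is an illustration rather than the exact calculation used for this post: the 275-goal gap and the roughly ten-season span come from the text above, while the goals-per-win exchange rates (5.0 and 5.5) are assumed rules of thumb, picked because they reproduce the quoted range.

```python
# Rough sketch of the goals-to-wins arithmetic.
# GOAL_GAP and SEASONS come from the post; the goals-per-win values are
# ASSUMPTIONS (a rule-of-thumb exchange rate), not figures from the source.

GOAL_GAP = 275   # goals saved over Luongo's workload, per the post
SEASONS = 10     # roughly ten seasons of play, 1999-00 to 2009-10


def goals_to_wins(goals: float, goals_per_win: float) -> float:
    """Convert a goal differential into wins at an assumed exchange rate."""
    return goals / goals_per_win


def wins_to_points(wins: float) -> float:
    """Each win is worth two points in the standings."""
    return 2 * wins


goals_per_season = GOAL_GAP / SEASONS

for gpw in (5.0, 5.5):  # assumed goals-per-win range
    total_wins = goals_to_wins(GOAL_GAP, gpw)
    points_per_season = wins_to_points(goals_to_wins(goals_per_season, gpw))
    print(f"{gpw} goals/win: {total_wins:.0f} wins in total, "
          f"{points_per_season:.0f} points per season")
```

A different exchange rate would move the totals, but not by enough to change the conclusion.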

Basically, this guy is saying that the scoring proficiency of a goalie's team means an awful lot more than saving an extra half a goal per game, based on the time-worn journalistic principle of assigning all team results to the account of their goaltender. Yet I'm sure nobody would dare rank one skater ahead of another if the first guy had averaged 30 fewer points per season than the other guy for an entire decade, regardless of any difference between them in terms of Cups or team success.

For illustrative purposes, here are the players who averaged 0.5 PPG less over the past 10 years than All-Decade First Team All-Star Joe Sakic:

Pierre-Marc Bouchard
Jonathan Cheechoo
Ulf Dahlen
Joe Pavelski
Gary Roberts
Todd White

And the same thing for fellow First Team All-Star Jarome Iginla:

Chris Higgins
Jan Hlavac
Bobby Holik
Sami Kapanen
Steve Konowalchuk
Sergei Zholtok

Now obviously scoring totals are not complete assessments of player values, but I think the point is pretty obvious. I'm not sure Osgood would make my Fifteenth All-Decade Team, as I'd take any of Luongo, Brodeur, Giguere, Kiprusoff, Hasek, Roy, Vokoun, Kolzig, Belfour, Nabokov, Lundqvist, Turco, Roloson, Khabibulin, Thomas or Theodore ahead of him. Maybe a couple of others as well.

Monday, December 14, 2009

Goalies and Faceoffs

Some of the most interesting new statistics introduced by stats guys in the blogosphere have been numbers that track how many defensive and offensive zone faceoffs players are put on the ice for. For skaters, this helps adjust for the way they are being used by their coach, since for example it is an advantage for an offensive player to start their shift in the other team's end rather than having to move the puck 180 feet down the ice to get it into a scoring position. Faceoff numbers can also be used to measure which players are driving possession, by seeing whether a player ends their shift more often in the offensive or defensive zone.

I haven't seen anyone apply these numbers yet to goalies. Since goalies don't change lines, the starting and ending shift numbers are of course meaningless. However, on many shots goalies have the opportunity to either freeze or play the puck, and that choice affects the number of defensive zone draws their teams face. Also, if certain goalies are able to contribute to their teams in other ways, for example through their puckhandling skills, then it might show up in their team taking more draws at the other end of the rink.

Vic Ferrari at timeonice has the faceoff zone start numbers for every player in the league last year, including goalies. I'll stick to the convention used by Gabe Desjardins at Behind the Net, who lists an offensive faceoff percentage for all players: ignore the neutral zone faceoffs and divide the number of offensive zone faceoffs by the total number of draws in the offensive and defensive zones combined.
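As a minimal sketch of that calculation (the function name is mine, not anything from timeonice or Behind the Net), here it is applied to the New Jersey faceoff counts listed later in this post:

```python
def offensive_faceoff_pct(off_zone: int, def_zone: int) -> float:
    """Share of non-neutral-zone faceoffs that were taken in the offensive zone."""
    total = off_zone + def_zone
    if total == 0:
        raise ValueError("no offensive or defensive zone faceoffs recorded")
    return off_zone / total


# Faceoff counts taken from the New Jersey numbers in this post.
brodeur = offensive_faceoff_pct(off_zone=268, def_zone=194)
clemmensen = offensive_faceoff_pct(off_zone=353, def_zone=434)

print(f"Brodeur {brodeur:.0%}, Clemmensen {clemmensen:.0%}")
```

Neutral zone draws never enter the function, which is the whole convention: only the split between the two ends of the rink matters.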

The correlation between the offensive faceoff percentage for starters and their backups was 0.51, which suggests that the rest of the team has a big impact on puck possession and faceoffs. That should be fairly uncontroversial. I'd expect with a bigger sample size for most of the backup goalies that number would be higher.

The correlation between offensive faceoff percentage and shots faced per 60 minutes was -0.57. I was expecting that relationship to be stronger, since all the Corsi evidence shows a big advantage to starting in the other team's zone, but again EV only for one season is a fairly small sample so there is likely a reasonable degree of luck in the numbers.

There were 9 teams that had an offensive faceoff percentage difference of 5 percentage points or more between their starting goalie and his backups. Here are the faceoff numbers for each of those teams, along with the offensive faceoff percentages and shots faced per 60 minutes:

Buffalo:
Ryan Miller: 480 def, 433 off, 47%, 30.9 SA/60
Backups: 254 def, 282 off, 53%, 31.2 SA/60

Edmonton:
Dwayne Roloson: 553 def, 513 off, 48%, 32.6 SA/60
Backups: 210 def, 240 off, 53%, 31.4 SA/60

Montreal:
Carey Price: 423 def, 409 off, 49%, 29.9 SA/60
Jaroslav Halak: 383 def, 269 off, 41%, 33.5 SA/60

New Jersey:
Martin Brodeur: 194 def, 268 off, 58%, 28.8 SA/60
Scott Clemmensen: 434 def, 353 off, 45%, 29.0 SA/60
Kevin Weekes: 154 def, 136 off, 47%, 30.1 SA/60

New York Rangers:
Henrik Lundqvist: 459 def, 665 off, 59%, 29.0 SA/60
Steve Valiquette: 90 def, 102 off, 53%, 30.7 SA/60

Ottawa:
Alex Auld: 322 def, 373 off, 54%, 28.0 SA/60
Backups: 390 def, 332 off, 46%, 28.3 SA/60

Philadelphia:
Martin Biron: 584 def, 484 off, 45%, 32.4 SA/60
Antero Niittymaki: 226 def, 230 off, 50%, 31.5 SA/60

San Jose:
Evgeni Nabokov: 510 def, 521 off, 51%, 27.1 SA/60
Brian Boucher: 150 def, 202 off, 57%, 26.2 SA/60

Toronto:
Vesa Toskala: 342 def, 366 off, 52%, 29.8 SA/60
Backups: 278 def, 370 off, 57%, 30.0 SA/60

The Price/Halak gap is interesting and appears to account for some of the shot differential between them, but I'm not sure how much it had to do with the goalies. I'd bet the split would be in the other direction if we were looking at this season's numbers, based on how Montreal has played in front of each of them in 2009-10. Most of the others either involve backups who didn't play very much or goalies who I wouldn't expect to have much of an effect on faceoffs, although there is one notable exception.

The biggest gap between any starter and backup, by far, is in New Jersey. Knowing what we know about those goalies, I'd say that these numbers suggest a real effect. In most starter/backup scenarios, we have to at least consider the possibility of strength of schedule being a factor, but that wasn't the case here as Clemmensen was an injury replacement for Brodeur. The Devils weren't performing exactly the same all the way through the season, but all three of their goalies had similar GAAs so it is likely that they played in fairly similar environments.

The numbers indicate that Brodeur is either helping shift the play to the other end of the ice or freezing the puck less often than the other goalies. Here are the faceoff numbers for New Jersey's goalies broken down per 60 minutes of EV play:

Brodeur: 8.6 def zone, 11.9 off zone
Clemmensen: 14.2 def zone, 11.6 off zone
Weekes: 15.4 def zone, 13.6 off zone

Brodeur keeps the puck moving a lot more than the other two (and indeed, he probably keeps it moving more than any other goalie in the league). However, the team did not take substantially more faceoffs at the other end of the rink, which makes it uncertain whether Brodeur's impact translates to the offensive side of the rink (although I would certainly like to see more data on this one).

Consistently giving the puck to teammates instead of allowing the opposing team to win control of it through a faceoff should help a team, and this may account for some of the observed shot differentials between Brodeur and his backup goalies. There are other possible benefits, such as creating more changes on the fly, which could be to the benefit of a smart bench coach who wants to get his matchups. I'd expect some tradeoff in terms of increased turnovers, but Brodeur is likely pretty efficient.

I think these results also shed a bit more light on the rebound numbers discussed earlier here that showed Clemmensen allowing a lower rate of rebound shots against than Brodeur. Given that Brodeur would have been playing the puck much more often and attempting to direct his rebounds rather than simply freezing the puck, that means he would have been facing many more opportunities to turn over the puck or for the other team to steal it and get a quick shot on goal. If Brodeur was aggressive in terms of directing rebounds and playing the puck while Clemmensen was conservative (and probably helped by a defence that gave extra attention to clearing the crease), then that would explain why Brodeur's numbers don't seem as good despite his superior skill. More opportunities usually mean more errors, no matter how good you are.

This is just a cursory look from one year's worth of data, but looking at a few more seasons' worth of data could help us better identify Brodeur's effect here and see whether any other goalies seem to have a tendency towards freezing or moving the puck.

Monday, December 7, 2009

Why Offence Rules the New NHL

"Defence wins championships" is a familiar cliche that is thrown around as a truth not only in hockey circles, but by fans of virtually every team sport. At some points in hockey history it may have been true, but I believe the game has changed. In the new NHL, the evidence suggests that offence wins championships.

First of all, I'll give the small sample size warning to everything that I'm about to post. Past results do not guarantee future performance and so on. I tend to believe the numbers show a real effect since regular season results have generally fallen in line, but I'm going to be focusing on playoff results only which means that the sample is limited to 60 playoff series over the last 4 seasons.

Secondly, I'm ignoring all shootouts here. Goals for and goals against mean actual goals, not the goal awarded in the standings to the team that won the shootout. I counted all games that went into a shootout as ties for both teams, so in a few cases I considered a team that finished lower in the standings to have had the better record of the two. This approach makes sense to me since shootouts don't happen in the playoffs.

It is important to realize that in the playoffs nothing happens with a high degree of certainty. Upsets regularly happen in short series. Just to establish a baseline, over the period from 1955-2009 the team with the better regular season winning percentage has won 65% of the playoff series. Since the lockout the percentage has fallen to 61%, which likely reflects the higher level of parity in today's game.

If the better overall team wins 61% of the time, how does the better offensive team do? Answer: The team with more goals scored has actually done even better, winning 37 of the last 60 series (62%). That's a much better success rate than the team with fewer goals allowed, which has won just 27 series (45%).

Often the same team will be better in both categories. For example, the 2008 Detroit Red Wings had better regular season offensive and defensive numbers than all four of their playoff opponents. Let's look only at series when a team with more goals scored plays against a team with fewer goals against, the classic offensive team vs. defensive team scenario. In those matchups, the stronger offensive team has won 24 out of 38 times (63%).

The numbers also show the value of looking at a team's win threshold. The team with the better win threshold won 38 out of 60 series (63%), which makes win threshold a better predictor of success since 2006 than a team's overall record or number of goals scored. As you can infer from that result, the team with the better win threshold was still more likely to win even when its opponent had the better win/loss record, although that advantage was slight (11/21, 52%).

When we focus on goaltending, we also see that the post-lockout playoffs have been determined primarily by the play of skaters, rather than by the play of the masked men. The team with the better regular season save percentage has won just 25 out of 60 series (42%). When a team with a better save percentage has played against a team with a better record, the team with the better goaltending won just 7 out of 25 series (28%).

If you look at each position individually then goalie is the most important position in hockey, but the 18 skaters as a group are collectively much more important than the goalie. A good measure of the effectiveness of the skaters is a team's win threshold. Since the lockout, when a team with a better win threshold went up against a team with a better save percentage, the team with the better skaters won 26 out of 39 times (67%). Given the uncertainty of playoff results in hockey, that is a very high probability.
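To put the small sample size warning in concrete terms, here is a rough sketch that attaches a 95% Wilson score interval to each of the series counts quoted above. The interval method is my choice for illustration, not something used in the original tallies, and it treats each series as an independent trial, which is a simplification.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = wins / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# (series won, series played) for each split quoted in the post.
SERIES = {
    "more goals scored": (37, 60),
    "fewer goals allowed": (27, 60),
    "better win threshold": (38, 60),
    "better save percentage": (25, 60),
    "win threshold vs. save percentage": (26, 39),
}

for label, (wins, n) in SERIES.items():
    low, high = wilson_interval(wins, n)
    print(f"{label}: {wins}/{n} = {wins / n:.0%} "
          f"(95% interval {low:.0%} to {high:.0%})")
```

With only 60 series the intervals are wide. The one for 37 of 60 runs from roughly 49% to 73%, which is why these results are better read as suggestive than conclusive.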

There is another cliche that goes something like, "In the playoffs you don't need a great goalie, you just need a hot goalie". If you have a dominant team then you might not even need that, but for most teams that is probably not far from the truth. The abundance of good goalies in today's NHL means that a lot of teams have a goalie that is capable of excellent play for a month or two. The best goalies are not able to be the difference makers that they perhaps once were, and that means that goaltending ranks well behind a team's offensive ability in terms of predicting their success.

Wednesday, December 2, 2009

Clutch Play

"TCG's other big problem is that he completely ignores the human element of sport, and approaches all statistical problems from the assumption that players perform the same regardless of game situation, again without ever really grappling with the problem. It is essentially a hockey version of Moneyball, Billy Beane's now comical baseball analysis which included the theory that there is no such thing as clutch hitting...I find attempts to erase the human element of sport under a pile of statistics not only patently false, but also vaguely disgusting." (HFBoards)

I just wanted to post a few words about my outlook on clutch play. The poster above is essentially correct about my base assumption regarding clutch performance in hockey players. However, I don't think it's fair to say I've never grappled with the problem. I simply think that the high degree of uncertainty means that it does not make sense to use clutch performance as an evaluative tool for goalies. Here are the six primary reasons:

1. Most of hockey is played with the score close. Allowing a goal against in a close game results in a significant downgrade in a team's win probability, and therefore most of a goalie's workload should be considered to be a clutch situation. From that, it follows that a goalie's overall performance should closely approximate his clutch performance, since most of the sample comes from situations with a high penalty for allowing a goal against.

2. If we define clutch situations more narrowly, we run into small sample size issues. For example, if we were to evaluate goalies based entirely on their performance in third periods in the playoffs, then we only have several hundred shots to work with even for experienced netminders. A more common split is to simply discuss regular season performance and playoff performance separately, but for the most part over a larger sample size the two results converge, or are generally within a typical margin of error for the size of the sample. Playoff results are also fraught with additional perils, including extreme opposition effects, a different style of play, and greater playing-to-the-score effects.

3. The best way to evaluate a method or statistic is to see how well it predicts the future. If we want to include clutch play in our predictions for active netminders, then sample size is even more of a concern. It is a simple fact of random chance and the variability of athletic endeavours that some goalies are going to start their playoff careers hot while others start their playoff careers cold, regardless of their level of talent, preparation or mental fortitude. Without a large sample size to work with, we're essentially guessing at this point whether someone like Cam Ward has a true ability to raise his game in important situations or whether he had a couple of well-timed hot streaks. If he continues to play on a marginally talented team that frequently misses the playoffs, we may never know with a high degree of confidence which viewpoint is correct.

4. Players and teams have the option of changing their style of play, their matchups, their shoot/pass tendencies, and their offensive/defensive bias to match the game situation. Goalies do not have those same strategic options. This suggests that changes in results or percentages in response to the game situation are primarily driven by players, not goalies.

5. Nearly every goalie who has been identified as clutch by subjective evaluators played on a dominant team. That correlation certainly suggests that many observers are conflating team effects with goaltender performance. It is possible that they are correct, but it does not seem very probable, given that on the whole team strength is a much better predictor of playoff success than goalie strength. If there were goalies who played on weak teams who did not have significant team success yet were universally praised as clutch, then I would have more confidence in the ability of observers to rate "clutchness".

6. Subjective evaluations of goalies often strongly emphasize a goalie's performance in important situations, his team success, and whether or not someone believes he is a "winner". It is conventional hockey wisdom that you need a clutch goalie to win and that the best goalies win the most games. Therefore, it seems extremely likely that there should be a selection bias against goalies who can't perform under pressure, that scouts will pick out the most clutch goalies to advance to higher levels of play. If a goalie who is in truth a "choker" or a "loser" does make it all the way to the best league in the world, then that would reflect somewhat poorly on the ability of observers to subjectively evaluate clutch play.
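The sample size problem in point 2 can be illustrated with a quick sketch of the margin of error on an observed save percentage. The .910 figure and the shot totals are hypothetical, chosen only for illustration, and the formula treats every shot as an independent trial of equal difficulty, which real shots are not.

```python
import math


def save_pct_margin(save_pct: float, shots: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an observed save percentage,
    using the normal approximation to the binomial distribution."""
    return z * math.sqrt(save_pct * (1 - save_pct) / shots)


# Hypothetical workloads: a narrow "clutch" split, a long playoff career,
# and a decade of regular season play.
for shots in (500, 2000, 15000):
    margin = save_pct_margin(0.910, shots)
    print(f"{shots:>6} shots: .910 +/- {margin:.3f}")
```

At several hundred shots the margin is large enough to cover the gap between a poor goalie and an excellent one, so a narrow clutch split cannot separate them with any confidence.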

In summary, I'm not a clutch play disbeliever, merely a clutch play skeptic. I have done studies of clutch performance because I think it is a worthwhile topic to investigate, but I don't think the evidence suggests it is particularly significant or that it is accurately estimated by observers. I'm certainly not saying that sports psychologists are quacks, or that all athletes perform at exactly the same level in every situation. Players are indeed human, and there are too many top-level athletes who failed under pressure to discount the human element entirely. However, I think we need to focus primarily on the most significant data, and I see NHL goaltending as an area where major clutch differences are structurally unlikely (see points 1, 4 and 6 above). We also have to be very careful about poor logic when switching back and forth from the general case to the specific case, e.g. some athletes choke, Goalie A choked in a big game, therefore Goalie A is a choker who will always choke.

Let's assume there is some small variance in clutch skill among NHL goalies. Precisely measuring that skill will be very difficult, whether you are evaluating players subjectively, objectively, or using both methods. Either way you are going to make mistakes, because chance happens and the future is unknown. If you want to be like a television broadcaster and subjectively praise players for their mental toughness and because "all they do is win", then you are going to hype some guys who simply went on a hot streak and have nothing but regression to the mean in their future. You're also going to dump on some players for their lack of fortitude who, unbeknownst to you, are going to tear up the league in future playoff seasons.

On the other hand, if you view the world the way I do, you run the risk of failing to correctly praise a player as clutch, or at least you won't do so until their careers are mostly over and they have proven that they have that ability. You will also continue to predict great things for players who have good overall records but have poor clutch performances in their early careers. Some of these players might continue to perform poorly under pressure, but many of them will see their future pressure performances improve to match their overall ability.

If someone thinks that this perspective is promoting an agenda or in some way ignoring evidence, I'd like to point out that I took the same approach on shot prevention effects. I always thought there was a small effect, but I thought it was likely fairly insignificant and I wasn't going to commit to anything at all until I had evidence of what it was.

Was I wrong to state that goalies have no effect on shots against? Absolutely, as I think there is some very good evidence that goalies can affect the number of shots against, and the observed variance among NHL goalies seems to be about 1 shot above or below average. However, there are tons of people (including Martin Brodeur himself) who are demonstrably wrong on the other side of the equation because they overestimate the effect. Guessing too high is just as wrong as guessing too low. And by doing some in-depth research on the issue from my devil's advocate position and debating others who disagreed with me, I think we've learned more than we otherwise would have if I'd merely accepted the consensus opinion or come to some quick subjective estimate and left it there.

Similarly, it's highly probable that I would be wrong to state that there is absolutely no difference in clutch skill among NHL goalies. However, in the presence of uncertainty that's still the position I am generally going to take, because I don't know exactly where to draw the line and I think that most people are drawing theirs too far on the opposite side of the true marker. That makes us both wrong, but my bet is that I'm closer to being right.

It's doubtful we'll ever develop perfect metrics or track them perfectly, and even if we do there are still going to be some limitations like being constrained by sample size. In the great clutch debate, that means it is likely always going to come down to picking which error you want to make. I'd rather assume a player is not clutch and wait for proof that they are, than assume that they are and wait for proof that they aren't. I think the general correlation between regular season and playoff performance, and the observed regression to the mean of many players who at one point or another were considered playoff over- or underachievers, makes my position the one that is less likely to make mistakes. I am quite aware that there will probably still be mistakes, but I'm willing to accept the trade-off. And if we can ever prove the magnitude of clutch skill for NHL goalies, then I will update my position accordingly.