Wednesday, February 29, 2012

Why the Counting Trophies Method of Goalie Evaluation Is Flawed

One of the biggest problems with evaluating goalies by career value is that there aren't any good commonly-used counting statistics. A goalie's career line merely shows games played, wins, and shutouts, plus the rate stats of GAA and save percentage. Wins and shutouts are very heavily influenced by the rest of the team, with shutouts also varying widely depending on the level of league scoring. Games played is important in determining overall value, but it says nothing about level of performance beyond reflecting how long the netminder was able to convince an NHL team to keep giving him starts.

There has been a shift towards a greater focus on save percentage, but as long as save percentage remains a rate stat it is difficult to understand intuitively whether a goalie with a higher save percentage but a lower workload is contributing more value than one of his peers with the opposite. There are a number of good ways to turn save percentage from a rate stat into a cumulative stat (typically by comparing to league average or to replacement level and then multiplying by the shots against). Hockey Prospectus' Goals Versus Threshold, which is slightly more complex but based on a similar foundation, is a number that is recognized at least within the online hockey stats community, but there is not a widely accepted standard.
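As a rough sketch of the rate-to-cumulative conversion described above, the snippet below multiplies the gap between a goalie's save percentage and a baseline by his shots against. The function name, the .910 baseline and both goalie lines are hypothetical values chosen only for illustration; none of them come from real players or from GVT itself.

```python
def goals_saved_above_baseline(save_pct, shots_against, baseline_save_pct):
    """Goals prevented relative to a baseline goalie facing the same shots."""
    return (save_pct - baseline_save_pct) * shots_against


# Assumed baseline (league average or replacement level); not a figure from the post.
BASELINE = 0.910

# Two made-up goalies: a higher save percentage on a lighter workload,
# and a lower save percentage on a heavier one.
part_timer = goals_saved_above_baseline(0.925, 1000, BASELINE)
workhorse = goals_saved_above_baseline(0.918, 2000, BASELINE)

print(f"Part-timer: {part_timer:+.1f} goals vs. baseline")
print(f"Workhorse:  {workhorse:+.1f} goals vs. baseline")
```

With these made-up inputs the heavier workload slightly outweighs the better rate, which is exactly the kind of comparison that is hard to make by eyeballing save percentage alone.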

As a shortcut, therefore, many people look at a goalie's trophy case, and use that to determine which netminder had the better career. Intuitively that makes sense, as when we evaluate athletes we want to know things like how many times they were considered the best in the league. However, I do not believe this is the best approach for evaluating NHL goaltenders.

If somebody just sweeps the awards year after year, like Dominik Hasek did in the 1990s, then that is definitely showing something meaningful. Or when Glenn Hall kept getting voted ahead of other Hall of Famers while repeatedly overcoming the heavy bias towards the GAA leader, that's also probably revealing something important about how his play was viewed by his contemporaries. But goalies with a handful of awards are exceedingly rare. When you are comparing two veteran goalies where one has a Vezina or a Smythe while the other one doesn't, I don't think that trophy really adds much information at all. Some trophy-focused individuals might, for example, try to argue that Miikka Kiprusoff's Vezina Trophy means he has had a better career than Roberto Luongo, even though the performance gap between them is likely at least 100 goals in Luongo's favour (Lou's career GVT is more than double Kipper's number).

The problem with goalies is that one season is not a large enough sample to properly rate anybody, because results are not accurately tied to performance. Given that awards are handed out primarily based on results, this means that luck and team factors play a disproportionate role in winning awards for goalies.

For a .914 goalie who faces 1800 shots in a season, the 95% confidence interval of his save percentage based on binomial probability would put his performance anywhere between .903 and .925. To prove that this is not just a theoretical exercise, we can just look at two goalies with .914 career save percentages: Ilya Bryzgalov and Ryan Miller. Both have a low mark of .906 as a starting goalie, and despite their strong career track records both have struggled for the majority of this season. Bryzgalov looks likely to set a new career low mark this year, although Miller looks to have turned things around as of late.
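For anyone who wants to check the range, here is a minimal sketch using the normal approximation to the binomial (an assumption on my part; an exact binomial method would give slightly different endpoints). Under this approximation a band of roughly .903 to .925 corresponds to a two-sided 90% interval, while the two-sided 95% band comes out a little wider at about .901 to .927.

```python
import math


def save_pct_interval(save_pct, shots, z):
    """Normal-approximation interval for a save percentage over a given shot count."""
    std_err = math.sqrt(save_pct * (1 - save_pct) / shots)
    return save_pct - z * std_err, save_pct + z * std_err


# z-scores for two-sided intervals: 1.645 for 90%, 1.960 for 95%
for label, z in (("90%", 1.645), ("95%", 1.960)):
    low, high = save_pct_interval(0.914, 1800, z)
    print(f"{label} interval: {low:.3f} to {high:.3f}")
```

Either way the point stands: a single season of 1800 shots leaves a spread of more than twenty points of save percentage around a goalie's true level.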

Their peaks, on the other hand, are much higher, with Bryzgalov hitting .921 last season and Miller topping out at .929 during his '09-10 Vezina season. Those ranges are in fact almost exactly what would be expected if their numbers varied by random chance alone. Between the two of them, their average high is .925 and if Bryzgalov ends up at his current .899 while Miller stays above .906 then their average low would be .903, which again would exactly match the predicted range given above.

In addition to simple performance variation, a goalie's teammates could raise or lower his save percentage by up to about .005 or so depending on how many penalties they take and whether they are good at preventing shots against on the penalty kill. The official scorer in the goalie's home rink could also assist in cutting or boosting shot totals by a shot or so per game, which again could have an impact in the neighborhood of .005 compared to a goalie on the other end of the spectrum. And maybe a goalie plays for Ken Hitchcock or Jacques Lemaire and his team has a good year in front of him defensively in terms of reducing scoring chances, which could easily add another few thousandths to the final save percentage number.
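The shot-counting effect is easy to sanity-check with a small sketch. The 30 true shots per game and the .914 true save percentage below are assumed inputs, and `recorded_save_pct` is a hypothetical helper; the idea is simply that goals against stay fixed while the scorer adds or drops a shot per game.

```python
def recorded_save_pct(true_shots, true_save_pct, scorer_bias):
    """Save percentage as recorded when the scorer adds (+) or drops (-) shots per game."""
    goals_against = true_shots * (1 - true_save_pct)  # goals do not depend on the scorer
    return 1 - goals_against / (true_shots + scorer_bias)


# Assumed workload and talent level, for illustration only.
generous = recorded_save_pct(30, 0.914, +1)  # rink that credits one extra shot per game
stingy = recorded_save_pct(30, 0.914, -1)    # rink that misses one shot per game

print(f"Generous scorer: {generous:.4f}")
print(f"Stingy scorer:   {stingy:.4f}")
print(f"Gap:             {generous - stingy:.4f}")
```

Under these assumptions the gap between the two extremes lands in the neighborhood of .005 to .006.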

Over a career these effects often mostly wash out, as a goalie will benefit from them in some seasons and suffer because of them in others. But when looking at a single season, as is the case when considering awards nominations, these factors can further accentuate the already heavy effect of random chance.

A goalie with at least a .920 save percentage over 1800 shots has about a 50% chance of being a Vezina nominee, based on seasonal results since the lockout. Bump that save percentage up to .925, and you probably have about a 50% chance of winning the trophy, given that half the goalies who met both cutoffs won the Vezina. Consider that binomial probabilities suggest that a goalie with league average talent will put up a .920 or better over 1800 shots 1 time out of every 5 just by chance, and you can see how it is entirely possible that a goalie who carves out a decent NHL career will probably luck into at least one good season somewhere along the line even before considering the other factors that could help his statistics. Fortunately there are selection effects that limit the number of flukes, since most average goalies won't be given that many starts by their teams, but it does still happen with some regularity.
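The one-in-five figure can be reproduced with an exact binomial tail. This is only a sketch: the .914 league-average talent level is my assumption (the paragraph does not state the value used), and the helper names are made up for the example. Probabilities are summed in log space to avoid overflow in the binomial coefficients.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of exactly k saves on n shots."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1 - p)


def prob_save_pct_at_least(target, shots, true_save_pct):
    """Chance that a goalie with the given true talent posts at least the target save percentage."""
    min_saves = math.ceil(target * shots - 1e-9)  # small epsilon guards against float round-up
    return sum(
        math.exp(log_binom_pmf(k, shots, true_save_pct))
        for k in range(min_saves, shots + 1)
    )


LEAGUE_AVERAGE = 0.914  # assumed true talent of an average goalie

chance = prob_save_pct_at_least(0.920, 1800, LEAGUE_AVERAGE)
print(f"Chance of .920 or better over 1800 shots: {chance:.1%}")
```

With that assumed talent level the result lands close to one in five; a lower assumed league average would shrink it.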

Whether a goaltender wins the Vezina or not depends not just on his own play, but also what other goalies are doing around the league. To return to the Luongo/Kiprusoff comparison above, it's easy to see the impact of external factors when you consider that if Martin Brodeur had his 2006-07 season in 2005-06 and vice versa, Luongo would be the guy with a trophy while Kiprusoff would have been shut out.

It's not hard to find examples of goalies doing more or less the same thing year after year even as their voting numbers vary widely. Take Patrick Roy over the last five seasons of his career, when he was the very picture of elite consistency, rattling off 61-63 starts per season with an overall save percentage typically in the .915-.920 range, an EV SV% in the .925-.930 range and a GAA usually between 2.20 and 2.30. In 2001-02, his numbers all improved a bit, particularly his GAA and shutouts, although his EV SV% was just .005 better than his five-year average. His team's improved shot prevention and penalty discipline helped as well. There's certainly a chance that Roy played better than normal in 2001-02, but there is also a pretty strong possibility that he was more or less the same goalie all the way through and the breaks went his way in '01-02.

Over that period Roy typically got a few Vezina votes per season, until 2001-02 when he almost won the award. That year the high-minute goalies (Brodeur, Kolzig, et al) had down seasons, while Hasek was playing at a lower, post-injury level. It turned out that only one starting goalie other than Roy posted a save percentage above .921.

Unfortunately for Roy's trophy case, that goalie happened to be Jose Theodore, who put up a .931 save percentage on his way to the Vezina/Hart combo. If you look at Theodore's 2001-02 numbers in context, they are major outliers. He also had a bunch of the indicators of one-year flukes, including very high special teams numbers and much better numbers at home than on the road.

Knowing what we know now, it is very likely that Patrick Roy was a better goalie in '01-02 than Jose Theodore. That just wasn't immediately apparent from that 82 game sample, and that's why Theodore won the award. Using awards as the primary evaluation criterion, Theodore's first five seasons as a starter rank ahead of Roy's last five seasons as a starter (unless you're one of those guys who thinks only the playoffs matter and you really love Roy's 2001 Cup/Smythe combo, and even in that case you'd probably give Roy just a slight edge), even though Roy's .929 EV SV% on 6165 SA is quite a bit better than Theodore's .921 on 6522 (about 50 goals better, or nearly two wins per season).
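The arithmetic behind the 50-goal figure can be sketched as follows. The save percentages and shot totals are the ones quoted above; the conversion rate of six goals per win is a common rule of thumb that I am assuming here, not a number from the comparison itself.

```python
# Even-strength numbers quoted in the text.
ROY_SV, ROY_SHOTS = 0.929, 6165
THEODORE_SV, THEODORE_SHOTS = 0.921, 6522

GOALS_PER_WIN = 6.0  # assumed rule-of-thumb conversion, not from the post
SEASONS = 5

sv_gap = ROY_SV - THEODORE_SV

# Apply the save percentage gap to each goalie's workload to bracket the estimate.
gap_on_roy_shots = sv_gap * ROY_SHOTS
gap_on_theodore_shots = sv_gap * THEODORE_SHOTS

for label, goals in (("Roy's workload", gap_on_roy_shots),
                     ("Theodore's workload", gap_on_theodore_shots)):
    wins_per_season = goals / GOALS_PER_WIN / SEASONS
    print(f"{label}: {goals:.0f} goals, about {wins_per_season:.1f} wins per season")
```

Whichever workload is used, the gap works out to roughly 50 goals over the five seasons.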

In contrast, you won't find a skater winning a scoring title primarily because of luck. Assume a typical first liner with 15 minutes per game at even strength and 3.5 minutes per game on the power play, who either doesn't play much on the PK or doesn't score any points when he does get an occasional shorthanded shift. Let's say his team takes shots at an average rate while he is on the ice (27 shots per 60 at 5 on 5, 45 shots per 60 on the power play). That player and his linemates would need to put up a ridiculous shooting percentage, and he would have to be involved in an unusually high number of plays to get on the scoresheet enough to be in Art Ross contention.

For example, if the player's team shot 13% at evens and 20% on the power play, both numbers above what any regular player managed last year, and got a point on 90% of his team's even strength goals and 75% of his team's power play goals, both extremely high and unusual participation percentages, the player would still end up with 97 points. That's a pretty high total and it would end up being near the very top of the league, especially this year, but it still isn't high enough for a top-3 finish in any of the post-lockout seasons. And again, this hypothetical player would need to get all the breaks just to get that close. If any of the luck factors drop off, then he's not in contention, and there are many ways that could happen (e.g. he misses a few games, his ice time decreases, his teammates don't score at a high rate, he gets unlucky with second assists or the power play runs more often through a teammate which reduces his scoring opportunities).
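Here is the calculation behind that total, written out as a small sketch. The ice time, shot rates, shooting percentages and participation rates are the ones given above; the 82-game season and the function name are my own assumptions for the example.

```python
GAMES = 82  # assumed full season with no games missed


def points_per_game(minutes, shots_per_60, shooting_pct, point_share):
    """Expected points per game in one game state (even strength or power play)."""
    on_ice_shots = minutes / 60 * shots_per_60
    on_ice_goals = on_ice_shots * shooting_pct
    return on_ice_goals * point_share


even_strength = points_per_game(15, 27, 0.13, 0.90)
power_play = points_per_game(3.5, 45, 0.20, 0.75)

season_points = GAMES * (even_strength + power_play)
print(f"Even strength: {GAMES * even_strength:.1f} points")
print(f"Power play:    {GAMES * power_play:.1f} points")
print(f"Total:         {season_points:.0f} points")
```

Dialing any single input back toward a normal level (say, 10% on-ice shooting at even strength) quickly pulls the total well below the scoring race.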

Jordan Eberle is pretty much that guy this year (at even strength his team is shooting 13% with him on the ice and he has points on 88% of his team's goals), and he's still tied for 10th in scoring. Joffrey Lupul is another guy that has flirted with the top of the scoring charts this season based on unsustainable percentages, and while he still sits just above Eberle in 9th it was always just a matter of time before he was going to be left behind by the likes of Malkin, Stamkos and Giroux.

Very high points finishes are meaningful because they are unlikely to be flukes. Pure trophy counting would still undervalue someone like Mats Sundin who was consistently productive although never near the very top of the league, but in general a player with a couple of high finishes can be considered to have had a better peak than a similar player who never climbed the table to the same degree (although of course context and team factors need to be taken into account).

That doesn't happen to goalies. Flashes in the pan have won the Vezina, and average goalies have found themselves in trophy contention simply because fortune ended up favouring them over a 50-60 game stretch. Of course observers can sometimes identify when other factors or luck are at play; they don't always just follow the numbers to the exclusion of anything else. However, it is particularly difficult when rating a young goalie without much of a track record who has a big season. Is he breaking out, or is he getting lucky? Is he Pekka Rinne or Steve Mason? Only time will tell.

It is my belief that goalies can only be properly evaluated in a multiple season context. With that in mind, single season awards should be considered relatively insignificant in terms of career evaluation.

Friday, February 24, 2012

Score One For the Kings

Los Angeles defencemen with at least 100 games played from 2007-08 to present, ranked by plus/minus rating per 82 games played:

1. Sean O'Donnell, +8
2. Rob Scuderi, +6
3. Drew Doughty, +4
4. Willie Mitchell, +4
5. Matt Greene, +2
6. Davis Drewiske, -3
7. Peter Harrold, -5
8. Jack Johnson, -21

Obviously, Jack Johnson is not very good, yet apparently he and a first round pick are worth Jeff Carter, a guy who scored more goals in the three seasons prior to this one than every player in the league save Ovechkin, Crosby, Stamkos and Marleau.

Just as I prefer to look at multiple seasons' worth of data for goalies, I think there is at least some value in multiple years' worth of plus/minus given how the percentages tend to work themselves out over the larger sample, although obviously matchups and usage context are still important, especially for players utilized in a specific role.

However, when you're supposed to be one of the best defencemen on your team and you're that much of an outlier in terms of getting outscored over a 338 game sample despite not playing tough minutes, it's pretty glaring. It's also very tough to make a linemates excuse given that Johnson had four different defencemen rank as his most frequent defence partner by TOI over that span, not to mention also having the benefit of playing more minutes with Anze Kopitar than any other Kings forward in four of the five seasons (it probably would have been all five if Kopitar had played 82 games in 2010-11). At some point there's really nobody else left to blame. And the bad news for Blue Jackets fans is that their team is on the hook for paying Johnson $4.357 million per year through 2017-18.

Friday, February 10, 2012

The Hall of Fame Committee

I've often been critical of the decisions of NHL decision-makers in seasons past. But I have found myself being generally less and less critical of their moves and choices in recent seasons (and as my last post shows, it is clear that their perspectives have certainly changed over time). I think NHL teams collectively have a much better sense of the value of goaltending and what makes a good goalie than ever before.

I don't think the general managers from two decades ago were stupid. I do think quite a few of them were uninformed, unaware of the importance of objective analysis of goaltenders and focusing instead on the things that tradition dictated were important (like winning). I'm making that claim based on award voting and based on the number of bad goalies and washed-up veterans that kept getting NHL jobs long after they were deserving of them.

The same logic applies to Hall of Fame voters. I've gotten into a few discussions about the Hall of Fame, both here and in other places, and when people ask me if I think certain goalies will end up being elected, I generally say that I really don't know. I do know what kind of voting patterns have existed in the past, but I don't think that those historical rules remain valid given how much perceptions have changed over the past two decades.

Just because the voters may have overvalued Cups and wins in the past does not mean they are guaranteed to do so until the end of time. Some may have that kind of negative, defeatist approach when discussing the really borderline Hall of Famers, but I think that's unfair to the voting group because it implies they haven't learned anything over the last 20 years, and the progression of Vezina voting makes it absolutely clear that hockey insiders have in fact learned some lessons.

Every goalie now up for voting has an entire career's worth of official save percentage numbers and an entire career's worth of Vezina voting done by the league's general managers. Even though it's almost been three decades since those changes were introduced, Ed Belfour was just the second Hall of Fame goalie who played his entire career in the official save percentage and modern Vezina eras, joining Patrick Roy. Both Belfour and Roy would have been slam-dunk picks in any year, and therefore are hardly litmus tests for how the Hall factors in these new developments. We simply have not seen how much those two things have changed the game in terms of rating goalie careers. As such, any attempt to use prior voting results to predict how the voters rate future HOF goalie candidates is probably little more than guesswork.

I know for sure that there is at least one member on the Hall of Fame committee who doesn't care about goalie wins. A long time ago I linked an article by HOF member Michael Farber in which he quoted fellow HOF voter Serge Savard as saying that goalie was not the most important position, since in almost all cases teams make the goalie, rather than the other way around.

In the past a guy like Scotty Bowman may well have been a traditionalist and rated goalies based on wins, I don't know. But he was there when the Red Wings waived Chris Osgood, and he remains a special advisor to the Chicago Blackhawks, who turfed out their Stanley-Cup-winning netminder to save a few bucks. Seems pretty clear he's either been overruled multiple times by his team's management, or else Bowman is using evidence other than merely wins and Cups to rate goalies. If he's able to do that while helping to run a team, there's no reason he shouldn't apply the same logic when debating Hall of Fame candidates.

Any of the currently active writers or broadcasters on the committee have to be aware that teams are spending less money on goaltending, that mediocre goalies are winning Cups, and that save percentage is being used more than ever to rank and rate goalies. With less of a tradition of rating goalies based on winning in Europe, it's also possible that the two European voters would be more open to rating goalies who were good but not "winners" ahead of average goalies with terrific teammates.

All this is pure guesswork, of course, and the secretive nature of the Hall's election process means that we really have little evidence to go on. Maybe Mike Vernon fell one vote short last year, maybe nobody even mentioned his name, I don't know, but it is at least a positive sign that nobody has voted him in yet, and there doesn't seem to be a huge groundswell of support to get that to happen any time soon. I do think over the next few years we will have a lot better idea of how the Hall rates modern goalies. Perhaps surprisingly to some I am actually fairly optimistic that they are going to get the choices more right than wrong, although only time will tell on that one.