Monday, March 22, 2010

The One Stat Argument

From time to time I get feedback that I focus too much on save percentage. The main criticism is usually that it is folly to rely on only one stat, and that a well-rounded analysis should take all the available numbers into account (e.g. GAA, wins, shutouts, games played, etc.).

The one stat criticism sounds pretty reasonable. It seems intuitive that adding more information is a good thing. It is also undeniably true that save percentage does not tell us everything. It is important to know how many shots the goalie faced to be able to assess the level of randomness in his results. We should also take into account how many of those shots came on the penalty kill, and whether there is evidence of any other team factors like shot quality effects. It also seems clear that goalies have some small impact on shots against, and that should be taken into account.

What is usually suggested in place of a save percentage analysis is to take all the different stats, apply a weighting to each of them, and then calculate the final rankings based on all the different inputs. The problem with this approach is that it simply doesn't deliver on its promise. It does not actually measure a lot of different things that a goalie does. What it does is count save percentage 4 or 5 times, and then add in a bunch of irrelevant team factors. This is because every commonly available goalie stat is essentially a different way of restating save percentage and shots against.

Let's go down the list. GAA, as I've pointed out many times, is equal to (1 - save percentage) x shots against per game. It's handy as a shortcut to compare goalies on the same team, but if you're looking at rivals it's much better to try to make sense of the two underlying components than to try to assess their combination.
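To make the decomposition concrete, here is a minimal sketch of that identity. The function name and the two example goalies are hypothetical, made up purely for illustration, and the formula follows the per-game simplification used above.

```python
def goals_against_average(save_pct: float, shots_against_per_game: float) -> float:
    """GAA as the product of goals allowed per shot and shots faced per game."""
    if not 0.0 <= save_pct <= 1.0:
        raise ValueError("save_pct must be between 0 and 1")
    return (1.0 - save_pct) * shots_against_per_game


# Two hypothetical goalies with the same save percentage but different workloads.
goalie_a = goals_against_average(0.915, 26.0)
goalie_b = goals_against_average(0.915, 32.0)
print(f"A: {goalie_a:.2f}  B: {goalie_b:.2f}")  # A: 2.21  B: 2.72
```

Same puck-stopping, half a goal per game apart in GAA, and the entire gap comes from shots against.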

Shutouts are very obviously driven by a combination of save percentage and shots against, only we are adding in an arbitrary cutoff (why is 0 goals against worth so much more than 1 goal against when teams win 90% of the time when they allow a single goal?) and a healthy dose of small number randomness (shutouts are infrequent, which means luck is often the difference between something like an ordinary 5 shutout season and a great 8 shutout season).
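As a rough illustration of how those two factors drive shutouts, the sketch below assumes every shot is an independent event with the same chance of being stopped. That is a simplifying assumption of mine, not a claim about how real games work, and the function name is hypothetical.

```python
def shutout_probability(save_pct: float, shots_against: int) -> float:
    """Chance of stopping every shot, assuming independent, identical shots."""
    if not 0.0 <= save_pct <= 1.0:
        raise ValueError("save_pct must be between 0 and 1")
    if shots_against < 0:
        raise ValueError("shots_against must be non-negative")
    return save_pct ** shots_against


# Same hypothetical .920 goalie behind a stingy team and behind a leaky one.
light_night = shutout_probability(0.920, 25)
heavy_night = shutout_probability(0.920, 32)
print(f"25 shots: {light_night:.1%}  32 shots: {heavy_night:.1%}")
```

Under that assumption the same goalie's shutout odds drop from roughly 12% to roughly 7% just because his team allows seven more shots.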

Wins are determined by goals for and goals against. The goalie has almost nothing at all to do with goals for, and goals against are determined by the same two familiar factors of save percentage and shots against.

Goalie's World magazine has a fairly typical "computer ranking" that puts a weighting on each of the traditional goalie stats. On the surface, it looks like it is measuring a lot of different things. However, break down all the stats into their components and it is essentially only tracking three things: Save percentage, shots against per game, and how good the rest of the team is at scoring goals.

Given that only one of those three variables is mainly determined by the goalie, it makes a lot of sense to me to focus on save percentage. That doesn't mean raw save percentage is a perfect stat, because it is not. Step #2 is to make any necessary corrections (e.g. shot quality, special teams adjustments, shot bias) or try to incorporate any other results (e.g. shot prevention). What does the goalie do to win games other than stop pucks? If we know what those things are, then we should focus on them directly.

Maybe a goalie helps his team with his puckhandling and prevents a shot against per game. Preventing one shot is the same as preventing about 0.1 goals, which based on a typical game with around 6 total goals is about 1.7% of the total team contribution. How are we going to see that small effect show up in the overall win total? We might not see it at all through the noise of how good the team is at territorial play, how good they are at shooting, how disciplined they are at taking penalties, how good they are at faceoffs, how good their coach is at matching lines, injury luck, shootout performance, etc. However, if I increase the goalie's shots against totals by the estimated number of shots he prevented, he will get credit for them and that will boost his save percentage.
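Here is a minimal sketch of that adjustment. The season totals are invented for illustration, and the function name is hypothetical; the only idea taken from the paragraph above is that prevented shots get added to shots against while goals against stay the same.

```python
def adjusted_save_pct(shots_against: int, goals_against: int,
                      shots_prevented: float = 0.0) -> float:
    """Save percentage after crediting the goalie with shots he prevented."""
    adjusted_shots = shots_against + shots_prevented
    if adjusted_shots <= 0:
        raise ValueError("need at least one shot to compute a save percentage")
    return 1.0 - goals_against / adjusted_shots


# Hypothetical season: 60 games, 1800 shots, 153 goals against,
# and an estimated one shot prevented per game through puckhandling.
raw = adjusted_save_pct(1800, 153)
adjusted = adjusted_save_pct(1800, 153, shots_prevented=60)
print(f"raw {raw:.4f} -> adjusted {adjusted:.4f}")

# Share of a roughly 6-goal game represented by 0.1 goals prevented.
print(f"{0.1 / 6:.1%}")  # 1.7%
```

The boost is only a few thousandths, but it lands directly on the goalie's own number instead of getting lost in the team's win total.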

One other commonly cited justification for relying on wins is that certain goalies help their teams win by performing better in important situations. The problem is that most of the time we can predict quite accurately what the standings will be based on goal differential. According to Alan Ryder, goal differential explains 94% of wins. This suggests that any "clutch win" effect is likely to be either small or influenced by luck.

Maybe over the course of a season a team wins 2 more games than expected based on their goals scored and allowed. Even in the unlikely event that this is entirely because of the clutch skill of the starting goaltender, that's going to be a net result in a similar range as shot prevention (about 0.1 goals per game, based on the standard goal differential that is usually required to produce 2 extra wins). This again would make up a percentage point or two of the total team effort, and just like the shot prevention effect is likely to be hidden from sight among the hundreds of player interactions, coaching decisions and random bounces that determine the outcome of a hockey game.

For goalies that we suspect have an additional effect on team play, I think we are better off determining the effect and using that to adjust the save percentage numbers. It is of course tricky to estimate things like shot prevention, but it is much more tricky and uncertain to try to somehow estimate a goalie's shot prevention effect directly from his GAA, or to assess the contribution of a goalie based on his career win total. Both of those are complete guesswork, and far worse alternatives than some of the methods already proposed for dealing with some of the unknowns (e.g. comparing to backup goalies, or estimating the effect by combining our subjective evaluation with our knowledge of general performance ranges for a particular skill).

I focus heavily on save percentage not because save percentage is perfect and tells us absolutely everything, but because I think the other stats just don't bring an awful lot to the table. They are merely different ways of telling you how good a goalie was at stopping the puck and how good his team was at preventing shots.


Bettman's Nightmare said...

My thinking is the reason you're seeing this critique is because people are confusing fantasy hockey value with real value. I agree completely with your resolve to stick by the SV%.

Anonymous said...

Hi CG,

Your argument is so self-explanatory that I won't go into it more. The only downside of save % is that it is highly affected by outliers, i.e. games in which a goalie gave up a lot of goals, which may just be a "bad outing". I'm not sure if this level of consistency or inconsistency can vary at the NHL level, but if so that would argue for including something like Rob Vollman's "Quality Starts", or an average of per-game save % where any score below 0.800 counts as 0.800. I'm not saying these would necessarily be better, I'm just saying that's the only way in which you MAY be able to add information.

- Tom Awad

The Contrarian Goaltender said...

Interesting suggestion Tom, I've thought of doing the exact same thing, taking an average and setting a floor at .800. I think at some point I'll run the numbers on it and see how it comes out.

There is no question that sometimes one or two starts can really skew stats, the kind of thing where a goalie lets in say 6 goals on 16 shots in the second half of a back-to-back where his team doesn't even show up, and it ends up costing him .003 on his seasonal numbers.
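For anyone who wants to try it, here is a rough sketch of the floored average next to the usual pooled number. The game log is made up, and the function names are hypothetical; this is just one plausible reading of the idea, not a tested method.

```python
def pooled_save_pct(games: list[tuple[int, int]]) -> float:
    """Usual save percentage: total saves over total shots."""
    shots = sum(s for s, _ in games)
    goals = sum(g for _, g in games)
    if shots == 0:
        raise ValueError("no shots faced")
    return 1.0 - goals / shots


def floored_game_average(games: list[tuple[int, int]], floor: float = 0.800) -> float:
    """Average of per-game save percentages, with bad games capped at the floor."""
    per_game = [max(1.0 - goals / shots, floor) for shots, goals in games if shots > 0]
    if not per_game:
        raise ValueError("no games with shots faced")
    return sum(per_game) / len(per_game)


# Hypothetical (shots, goals) log including one 6-goals-on-16-shots disaster.
games = [(30, 2), (28, 3), (16, 6), (33, 2)]
print(f"pooled:  {pooled_save_pct(games):.3f}")
print(f"floored: {floored_game_average(games):.3f}")
```

Note that the floored version weights every game equally regardless of shot count, which is a design choice in its own right and changes the number even when no game falls below the floor.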

Bettman's Nightmare said...

The problem there would be determining the difference between a bad start because of defense and a bad start because of goaltending. Are a high percentage of these bad starts due to the former or the latter?

The Contrarian Goaltender said...

The truly awful ones are generally both. Even on nights when goalies are fighting the puck, it's still mostly going to just hit them unless the opposition is getting the time and space to pick corners. That's when it can get ugly.

Corey Pronman said...

I would argue that SQNSV% is the one stat people should look to.

To the debate above, yes SV% can be skewed over small samples, but rarely does Philip ever quote such a sample and that would probably be a talking-head thing to do regardless.

Passive Voice said...

It's almost unbelievable to me that you would even need to write this entry.

Scott Reynolds said...

Just as a heads up, Jonathan Willis wrote a couple of posts on the league's most consistent and inconsistent goalies. Although the concept isn't exactly the same, it may still be useful and he almost certainly has some raw data tabulated that may be helpful:

Anonymous said...

How would you explain the worst "high-stakes" blowout in recent memory, the Wings' 7-0 stomping of the Avalanche in the 2002 playoffs?

Was this St. Patrick having a bad night, a bad night for the Colorado D, both, or something else entirely?

Anonymous said...

Just stumbled here. Good points, however I would suggest that not every goal has the same VALUE in hockey (see for instance: In effect, the point in time at which a goalie gives up his first goal of the game seems very relevant to the "clutch" skill set you mentioned, as well as very pertinent to an overall understanding of a goalie's contribution to a win. This would seem to militate, maybe, for a (re)inclusion of shutouts, as they are a testimony to a goalie's ability to delay the first goal against (a very valuable goal). As far as puck handling goes, defensive zone giveaways and/or puck possession time (compare with backup?) could give some pointers. These are only suggestions/comments by a passing newbie, and I don't pretend to know the full extent of this discussion about the viability of your variables. Take care.

Anonymous said...

At anon above

I suggest reading TCG's post which shows that the vast majority of game play happens at a point where the next goal is "a big one". It really is such hogwash. When a team is down 2-0 and comes back to win 3-2, the announcers praise the winning goalie for making "clutch saves to keep his team in it until they could get their legs going". Yet if the team never scored but the goalie still did not let in another goal, someone would criticize him for "making saves when it doesn't matter". The goalie played the same in both cases; what his team does in front of him is not indicative of his own play. You say that it is very important not to give up the first goal, so goalies should be commended for delaying that. It would be interesting to see if there was any real variance in that particular stat between goalies of similar calibre; I suspect not. At the same time, however, suppose you had a goalie who never let any team score on him during the first two periods, but let in enough goals in the 3rd alone to bring his GAA and save percentage to league average. What would everyone say about this goalie? The word "choker" comes to mind immediately, as no doubt everyone would get on his case for his "3rd period collapses".

Lawrence said...

For the most part I agree with you, CG. Taken at face value, SV% and evsv% tell us more clearly than other statistics the puck-stopping skills of a goalie.

However, the reasoning behind using the multi-stat argument is clear to me as well.

Ultimately I think it comes down to 'weight' of the statistic. It can be troubling to put too much 'weight' on the sv% metric, when all teams and shots are not created equal.

For example, take evsv% over 400 games.

This is certainly a statistic you can have faith in because of its large sample size. We agree.

But when comparing goalie A, who possesses a .923 evsv% over this time, with goalie B, who has a .921 evsv% over this time, the tendency is to state, irrefutably and with extreme confidence, that goalie A is a better goalie than B.

Yet on 12000 shots during this period, that accounts for a difference of only 24 goals, or 1 goal every ~17 games. Let's call this methodology the "Stats first approach of comparison".
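The arithmetic behind that comparison can be sketched as follows. The function name is hypothetical, and the inputs are simply the numbers quoted above.

```python
def goal_gap(sv_a: float, sv_b: float, shots: int, games: int) -> tuple[float, float]:
    """Extra goals allowed by the lower-sv% goalie, and games per extra goal."""
    extra_goals = (sv_a - sv_b) * shots
    if extra_goals <= 0:
        raise ValueError("sv_a must be higher than sv_b")
    return extra_goals, games / extra_goals


extra_goals, games_per_goal = goal_gap(0.923, 0.921, shots=12000, games=400)
print(f"{extra_goals:.0f} goals, one every {games_per_goal:.1f} games")
# 24 goals, one every 16.7 games
```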

The other way, likely promoted by a more subjective fan, could be called the "Subjective first approach of comparison" (which certainly has numerous flaws). This person may argue that because of external factors such as luck, coaching decisions to not pull your star goalie even when he gets shelled for 8 against the Kings and IT SEEMS SO OBVIOUS to help the dude out, shot quality, playing-to-the-score effects, etc., a difference of 1 goal between two goalies in every 5 games could be a point of differential of skill, for example.

This difference over 12000 shots and 400 games appears massive! Goalie A now has an evsv% of .923 and goalie B a lowly .916.

That's the difference between Hasek and Turco... or Vokoun and I dunno...Thibault. And this is one goal in five games.
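Going the other direction, from "one goal every N games" back to a save percentage gap, can be sketched like this. The function name is hypothetical, and the 400 games and 12000 shots are the figures from the example above.

```python
def sv_gap_from_goal_rate(games_per_goal: float, games: int, shots: int) -> float:
    """Save percentage gap implied by allowing one extra goal every N games."""
    if games_per_goal <= 0 or shots <= 0:
        raise ValueError("games_per_goal and shots must be positive")
    extra_goals = games / games_per_goal
    return extra_goals / shots


gap = sv_gap_from_goal_rate(games_per_goal=5, games=400, shots=12000)
print(f"gap {gap:.4f}, goalie B at {0.923 - gap:.3f}")
# gap 0.0067, goalie B at 0.916
```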

My point is that, because people subjectively see a massive difference between teams and team effects watching single games, the tendency is to not have faith in single statistics even when sample size should give us this confidence.

Therefore, the desire to collect numerous data points, with equal levels of confidence and 'weigh' them differently in a sort of hierarchy of effects from goalie-influenced at the top to team-influenced at the bottom makes intuitive sense.

To this second "Subjective first" person, assessing the sv% stat is like quality control. An acceptable precision has been determined to be the cut-off; let's say for argument's sake that's .005 (1 goal every 6.7 games). Here they 'feel' the difference between a .923 goalie and a .928 goalie over 12000 shots doesn't really exist. At this point they look to other achievements or subjective reasoning to make that distinction, whether it's quality of shots against, density of games played, scheduling, travel, luck, coaching, quality of teammates, etc.

Here, the team affected stats DO have importance, because that's what we DESIRE to measure. If both goalies A and B play 400 games and face 12000 shots +/- 500 of one another, and have sv%'s of .923 and .919, we want to know their "team" make-up, their shot prevention, the conference they played in etc.

In this way we can start making determinations further to sv% if Goalie A has a 220-150-30 2.17 40so record in 5.5 seasons on a dynasty team.

Whereas Goalie B has a 150-220-30 2.78 5so record in 10 seasons on a series of bottom-of-the-barrel teams.

I guess what's difficult is how much 'weight' you assign to those other stats and whether it's a + or - beside the 'evaluation'.

Anonymous said...

Here's a stat for you ... 600/100

Passive Voice said...

This latest anonymous has totally flipped my worldview on its head.

Anonymous said...

I apologize for my typo.. Here's a stat for you ... 600\110.....and counting.