Wednesday, December 2, 2009

Clutch Play

"TCG's other big problem is that he completely ignores the human element of sport, and approaches all statistical problems from the assumption that players perform the same regardless of game situation, again without ever really grappling with the problem. It is essentially a hockey version of Moneyball, Billy Beane's now comical baseball analysis which included the theory that there is no such thing as clutch hitting...I find attempts to erase the human element of sport under a pile of statistics not only patently false, but also vaguely disgusting." (HFBoards)

I just wanted to post a few words about my outlook on clutch play. The poster above is essentially correct about my base assumption regarding clutch performance in hockey players. However, I don't think it's fair to say I've never grappled with the problem. I simply think that the high degree of uncertainty means that it does not make sense to use clutch performance as an evaluative tool for goalies. Here are the six primary reasons:

1. Most of hockey is played with the score close. Allowing a goal against in a close game results in a significant downgrade in a team's win probability, and therefore most of a goalie's workload should be considered to be a clutch situation. From that, it follows that a goalie's overall performance should closely approximate his clutch performance, since most of the sample comes from situations with a high penalty for allowing a goal against.

2. If we define clutch situations more narrowly, we run into small sample size issues. For example, if we were to evaluate goalies based entirely on their performance in third periods in the playoffs, then we only have several hundred shots to work with even for experienced netminders. A more common split is to simply discuss regular season performance and playoff performance separately, but for the most part over a larger sample size the two results converge, or are generally within a typical margin of error for the size of the sample. Playoff results are also fraught with additional perils, including extreme opposition effects, a different style of play, and greater playing-to-the-score effects.

3. The best way to evaluate a method or statistic is to see how well it predicts the future. If we want to include clutch play in our predictions for active netminders, then sample size is even more of a concern. It is a simple fact of random chance and the variability of athletic endeavours that some goalies are going to start their playoff careers hot while others start their playoff careers cold, regardless of their level of talent, preparation or mental fortitude. Without a large sample size to work with, we're essentially guessing at this point whether someone like Cam Ward has a true ability to raise his game in important situations or whether he had a couple of well-timed hot streaks. If he continues to play on a marginally talented team that frequently misses the playoffs, we may never know with a high degree of confidence which viewpoint is correct.

4. Players and teams have the option of changing their style of play, their matchups, their shoot/pass tendencies, and their offensive/defensive bias to match the game situation. Goalies do not have those same strategic options. This suggests that changes in results or percentages in response to the game situation are primarily driven by players, not goalies.

5. Nearly every goalie who has been identified as clutch by subjective evaluators played on a dominant team. That correlation certainly suggests that many observers are conflating team effects with goaltender performance. It is possible that they are correct, but it does not seem very probable, given that on the whole team strength is a much better predictor of playoff success than goalie strength. If there were goalies who played on weak teams who did not have significant team success yet were universally praised as clutch, then I would have more confidence in the ability of observers to rate "clutchness".

6. Subjective evaluations of goalies often strongly emphasize a goalie's performance in important situations, his team success, and whether or not someone believes he is a "winner". It is conventional hockey wisdom that you need a clutch goalie to win and that the best goalies win the most games. Therefore, it seems extremely likely that there should be a selection bias against goalies who can't perform under pressure, that scouts will pick out the most clutch goalies to advance to higher levels of play. If a goalie who is in truth a "choker" or a "loser" does make it all the way to the best league in the world, then that would reflect somewhat poorly on the ability of observers to subjectively evaluate clutch play.
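To put rough numbers on the sample size concern in points 2 and 3, here is a minimal sketch of the margin of error on an observed save percentage. It assumes every shot is an independent trial with the same chance of being stopped (a simplification that ignores shot quality), and the .910 talent level and the shot counts are hypothetical values chosen only for illustration.

```python
import math

def save_pct_margin(true_sv_pct: float, shots: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error on an observed save percentage.

    Uses the normal approximation to the binomial distribution and
    assumes each shot is an independent trial (an illustrative assumption).
    """
    return z * math.sqrt(true_sv_pct * (1 - true_sv_pct) / shots)

# Hypothetical shot counts: a narrow "clutch" split versus a long career.
for shots in (300, 1000, 5000, 20000):
    margin = save_pct_margin(0.910, shots)
    print(f"{shots:>6} shots: +/- {margin:.3f}")
```

Under these assumptions, a few hundred shots leaves a margin of roughly thirty points of save percentage in either direction, which is wider than the gap usually thought to separate good goalies from average ones.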

In summary, I'm not a clutch play disbeliever, merely a clutch play skeptic. I have done studies of clutch performance because I think it is a worthwhile topic to investigate, but I don't think the evidence suggests it is particularly significant or that it is accurately estimated by observers. I'm certainly not saying that sports psychologists are quacks, or that all athletes perform at exactly the same level in every situation. Players are indeed human, and there are too many top-level athletes who failed under pressure to discount the human element entirely. However, I think we need to focus primarily on the most significant data, and I see NHL goaltending as an area where major clutch differences are structurally unlikely (see points 1, 4 and 6 above). We also have to be very careful about poor logic when switching back and forth from the general case to the specific case, e.g. some athletes choke, Goalie A choked in a big game, therefore Goalie A is a choker who will always choke.

Let's assume there is some small variance in clutch skill among NHL goalies. Precisely measuring that skill will be very difficult, whether you are evaluating players subjectively, objectively, or using both methods. Either way you are going to make mistakes, because chance happens and the future is unknown. If you want to be like a television broadcaster and subjectively praise players for their mental toughness and because "all they do is win", then you are going to hype some guys who simply went on a hot streak and have nothing but regression to the mean in their future. You're also going to dump on some players for their lack of fortitude who, unbeknownst to you, are going to tear up the league in future playoff seasons.
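The hot streak problem can be illustrated with a small simulation. This is only a sketch under stated assumptions: every simulated goalie has exactly the same true save percentage (a hypothetical .910), each faces a hypothetical 300 shots in an early sample and 300 more in a later one, and shots are independent. None of these figures come from real data.

```python
import random

def simulate_save_pct(true_sv_pct, shots, rng):
    """Observed save percentage over `shots` independent shots."""
    saves = sum(1 for _ in range(shots) if rng.random() < true_sv_pct)
    return saves / shots

def hot_start_regression(n_goalies=1000, shots=300, true_sv_pct=0.910, seed=0):
    """Compare the hottest early starters with their own later results.

    All goalies share the same true talent, so any early gap is pure chance.
    """
    rng = random.Random(seed)
    early = [simulate_save_pct(true_sv_pct, shots, rng) for _ in range(n_goalies)]
    later = [simulate_save_pct(true_sv_pct, shots, rng) for _ in range(n_goalies)]

    # Pick the top 10% by early results: the goalies who "look clutch".
    ranked = sorted(range(n_goalies), key=lambda i: early[i], reverse=True)
    hot = ranked[: n_goalies // 10]

    early_mean = sum(early[i] for i in hot) / len(hot)
    later_mean = sum(later[i] for i in hot) / len(hot)
    return early_mean, later_mean

early_mean, later_mean = hot_start_regression()
print(f"hot starters, first sample: {early_mean:.3f}")
print(f"same goalies, next sample:  {later_mean:.3f}")
```

Because the talent is identical by construction, the hot starters' later results fall back toward .910. An observer who praised them after the first sample would have been rewarding luck.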

On the other hand, if you view the world the way I do, you run the risk of failing to correctly praise a player as clutch, or at least you won't do so until their careers are mostly over and they have proven that they have that ability. You will also continue to predict great things for players who have good overall records but have poor clutch performances in their early careers. Some of these players might continue to perform poorly under pressure, but many of them will see their future pressure performances improve to match their overall ability.

If someone thinks that this perspective is promoting an agenda or in some way ignoring evidence, I'd like to point out that I took the same approach on shot prevention effects. I always thought there was a small effect, but I thought it was likely fairly insignificant and I wasn't going to commit to anything at all until I had evidence of what it was.

Was I wrong to state that goalies have no effect on shots against? Absolutely, as I think there is some very good evidence that goalies can affect the number of shots against, and the observed variance among NHL goalies seems to be about 1 shot above or below average. However, there are tons of people (including Martin Brodeur himself) who are demonstrably wrong on the other side of the equation because they overestimate the effect. Guessing too high is just as wrong as guessing too low. And by doing some in-depth research on the issue from my devil's advocate position and debating others who disagreed with me, I think we've learned more than we would have if I'd merely accepted the consensus opinion or come to some quick subjective estimate and left it there.

Similarly, it's highly probable that I would be wrong to state that there is absolutely no difference in clutch skill among NHL goalies. However, in the presence of uncertainty that's still the position I am generally going to take, because I don't know exactly where to draw the line and I think that most people are drawing theirs too far on the opposite side of the true marker. That makes us both wrong, but my bet is that I'm closer to being right.

It's doubtful we'll ever develop perfect metrics or track them perfectly, and even if we do there are still going to be some limitations like being constrained by sample size. In the great clutch debate, that means it is likely always going to come down to picking which error you want to make. I'd rather assume a player is not clutch and wait for proof that they are than assume that they are and wait for proof that they aren't. I think the general correlation between regular season and playoff performance, and the observed regression to the mean of many players who at one point or another were considered playoff over- or underachievers, makes my position the one that is less likely to make mistakes. I am quite aware that there will probably still be mistakes, but I'm willing to accept the trade-off. And if we can ever prove the magnitude of clutch skill for NHL goalies, then I will update my position accordingly.

8 comments:

Lawrence said...

I still believe the rebuttals to these effects you have highlighted are:

1. 72% of the time is not surprising because the game starts tied at 0-0. It may take a team 10-20 minutes to get to 2-0 (where it isn't within one goal) regardless if the final score is 3-2 or 8-0. That time to move from 0 will have a distortional effect on this stat.

2. Extreme opposition effects yes, but even less so than in the regular season due to tiering. This is why the QualComp for a goalie should be higher in playoffs. We should see either a wider spread in stats, (a finer zoom on distinction) or lower stats or both...which is what happens.

3. The problem of looking at such large sample sizes is variability can be averaged out, no?

Chris Osgood seems to 'raise' his play from super-suck, to decent, on a regular basis.

4. Certainly goalies do, it's just that the strategic differences are more subtle. Freezing the puck vs actively playing the puck is only one example. Crease management would be another.

5. Chicken and the egg. Win threshold is an interesting stat to combat this idea, but it is muddled by PlayingToTheScore effects. I can agree with this point more than others though.

6. This is simply a difference of scale. Certainly there have been great goalies who go nowhere. We see this always. You're building a strawman to argue the difference between 50th percentile and 95th percentile. Of course the NHL goalies are human, the margins are there but smaller. We're talking 95th percentile vs 91st percentile of "choker-ability"

As is the case with sport, we are measuring such minuscule variables that the human effects can seem to get lost out of scale, but they are certainly there. Perhaps impossible to measure confidently, though.

Moneypuck said...

CG, or should I say Philip (that's what happens when you get in touch with Puck Prospectus haha).

Also due to studies in baseball, it seems that clutch even with a great sample size doesn't have THAT drastic an effect, maybe a 5-10% boost or decrease at the very extreme.

I mean you could always regress to the mean too, but again the sample may make the results even more insignificant.

Triumph said...

yeah what i seized on first was the 'now laughable claim' that clutch doesn't exist in baseball - well, clutch really for the most part doesn't exist in any meaningful fashion - according to tom tango's (and others) 'the book', it is an incredibly tiny and barely measurable thing; probably about the 30th thing one should look at when evaluating a player.

if you believe in clutch, you'd think that osgood could will himself to playing well now that his team is struggling to make the playoffs, but that doesn't seem to be the case so far this year.

Anonymous said...

Not a goaltender, but one player who seems to have a consistent track record of bumping up his performance come postseason, regardless of how poorly he did in the first 82 games, is Jeff Friesen.

The Contrarian Goaltender said...

Lawrence: It is not my intention to build a strawman. Other than that, I think your last two paragraphs are spot-on.

I completely agree that we are likely to be comparing athletes who are on the high end of the "not-choking" scale, which means that the differences are probably slight. Again, my theory is not no difference at all, but merely that any differences are uncertain and not particularly significant (as we see in the baseball results mentioned by Triumph). I don't think the 6 points mentioned completely remove the existence of clutch play among goalies, but they very likely combine to reduce the magnitude of any effect, and that is the point I am trying to make.

In the presence of uncertainty, everyone has to make their guesses and lay their bets on one side or the other. If you want to err on the side of clutch, then fine. I'll err on the side of not-clutch.

Either way, I don't see it being a significant variable, and in any case it appears to be something that contains too much uncertainty to be a useful predictor of future events for active goalies.

Taylor said...

To add to your skeptical position on clutch play, there is the well-known set of cognitive biases that all humans are subject to, which cause us to overemphasize recent events or high-profile events. There are a number of popular books reviewing all the research that shows how we tend toward these errors. Just search 'cognitive biases' on Amazon.

Frank said...

You're frustrating for me.

Your measured, rational approach to evaluating goaltenders is against everything I stand for in sports, but represents very well what I stand for in general. Grr. Well done.

Doogie2K said...

Isn't playing to the score a very human sort of response? Rather than playing the same way because rationally, the chances of the next shot going in are the same as always?