Hockey ain’t Baseball, stats-lovers

Sabermetrics and Moneyball have worked in baseball, but would the same approach work in hockey?

To me, the sports are just too different to allow statistical data to overtake live scouting. Willis mentioned Sabermetrics as a key to the success of the Red Sox, and while I agree it had something to do with their success, I think JW missed an obvious statistical equation with Epstein and the Red Sox: money.

Look at their salaries compared to other teams. To say it was Sabermetrics that won them the World Series, without at least mentioning their gross salary advantage, is a bit misleading.

Moneyball makes a much better argument because Oakland made the playoffs four out of the last nine years with a payroll that was always in the bottom ten of team salaries, and three of those years in the bottom five. They had half, and sometimes a third, of the salary of the Yanks and Red Sox.

Billy Beane’s theory worked to get them to the playoffs, but only once did they win a series. Epstein made the playoffs, but the Red Sox were always in the top five in salary, so I think their success is based as much, if not more, on money as on Sabermetrics. If anything, the Red Sox championships prove why baseball needs a true salary cap. Sure, the Rays, after years of being the laughingstock of the league, made the playoffs last year, but how long before they lose all of their young players to free agency? It’s a joke when you have the Yanks at $209 million, which is $70 million more than any other team, and almost FIVE times more than the Rays’ $43 million. I think Moneyball and Sabermetrics were more a product of survival than anything else.

The other reason why stats work better in baseball is that every play starts the exact same way. The pitcher pitches, and the batter tries to hit it. Sure there are lefties and righties to take into the equation, but the play always starts in the exact same spot.

That is not the case in hockey, nor will it ever be. Players have to react much more quickly in many different areas of the game. That doesn’t mean stats aren’t valuable in hockey, and they are becoming more common, but the variables in hockey differ from play to play and situation to situation much more than they do in baseball.

Baseball junkies, don’t pipe up and say that hitting a fastball or changeup is harder. I agree it is super tough, but that is more a matter of fast-twitch muscle reaction and great hand-eye coordination. Baseball is not as fast a game, so split-second decisions happen more frequently in hockey. An average hockey player has to make 500-600 decisions a game (based on 13 minutes of ice time), and most of them have to be made instantly.

In baseball it’s different, so the stats can better show the attributes of a player. In hockey, for the same type of stats to work, you would have to break every play down, with every variable. Teams forecheck differently; sometimes a D-man makes a pass with no pressure, while other times he is pressured by the opposition. Are some D-men better at passing across their body, or straight up the ice?

Face-off stats, for example, could be more accurate if you tracked whether the draw was taken on the forehand or backhand, and whether the opponent was on his forehand or backhand. Most guys are naturally better drawing it back on their backhand. I’m sure those stats are coming, but in a game that relies a lot on pure instincts and split-second decisions, I think it’s harder to find accurate statistical data that will back up whether a player is contributing to his team.

In baseball, a great defensive infielder has amazing reaction time, and the error stat backs it up. In hockey, I don’t see giveaways as an accurate enough stat, especially because guys who handle the puck more often will obviously have more giveaways. Hemsky has led the Oilers in giveaways for many years, but no one thinks he can’t handle the puck.

Also, in Sabermetrics they feel that drafting a college ball player has a much higher rate of success than drafting a high school player. We will need to see at least a 15-20 year study to see if this is indeed true. That doesn’t seem to be the case in hockey. Almost all of the top young players have come from Major Junior or the European leagues over the last ten years and beyond: Crosby, Kane, Ovechkin, Phaneuf, Getzlaf, Carter, Richards and Malkin, to name a few.

Currently, the only two of the top 30 scorers to have played college hockey are Zach Parise, who left after two seasons, and Todd White. (**Side note: Todd White is top 30 in scoring, and I’d love the statistical breakdown on how that is possible. Todd freaking White! Nothing explains why he’s there.)

But I digress. I’m never one to automatically shun something just because it is different, and I think that some statistical analysis can show sides of a hockey player we never looked at before, but at the same time I think too much of it could make the game robotic.

The best part about hockey is the raw emotion and excitement. The end-to-end rush, a tic-tac-toe goal, a bone-crushing hit, the elation of the crowd, the willingness of a player to try and block a Souray or Chara shot (I’ll take Souray in an upset in the hardest shot later today: 105 for Souray, 104 for Chara), a spirited scrap and, most of all, the mistakes that lead to odd-man rushes or, even better, goals.

The speed of the game is increasing all of the time, so the players have to react even more quickly, and that will lead to great plays but also to mistakes, which the game needs.

I appreciate baseball for the nuances it has, but I also realize it is a much slower game and for me personally not nearly as exciting. Hockey is too fast to break it down in the same fashion as baseball, and that is what makes it great.

Of course hockey could use an upgrade in certain statistical areas, but to break down every aspect of the game would be too difficult, and I can’t see how the data would be accurate. The guys making the grades are just as likely to make mistakes as the players on the ice. Like I said, I can see areas where it could help, but until they come up with stats for heart, determination, and a willingness to compete and battle, I just can’t see how a contact sport can rely on statistical data as much as a non-contact sport like baseball.


  • That's an excellent article, Jason. In all sincerity, a very good job, with a lot of good points, including the Red Sox caveat.

    I do have three points, if you don't mind, and I'm not trying to be smarmy or argumentative:

    1. On Todd White: White's been on the ice for 10 goals with Kovalchuk in 163 minutes of EV ice-time, and only 22 goals for in 455 minutes apart. So Kovalchuk's a big help. When he isn't playing with Kovalchuk at evens, it looks like he's playing butter-soft minutes (ranked 9th among forwards by Behind the Net, with Kovalchuk ranked 2nd); so while Kovalchuk's line takes the heavies, he can score some points. Finally, fully half of his points come from playing on Atlanta's first-unit PP; while they don't have a great powerplay overall, the first unit, with Kovalchuk on it, is dynamite. So when you factor in that he kills with Kovalchuk in limited ice-time at evens, and with a lot of time on the powerplay, and then that he's playing butter-soft minutes when he isn't with Kovalchuk, his offensive output makes some sense.

    2. You always need to know the character of players, and a number cannot and will not tell you that. Avery, as I've mentioned previously, is an excellent player – strictly by the numbers. Still, his effect on a team's cohesion, and the kind of antics he engages in are huge. Human emotion really can't be assigned a number, and there's always going to be a need to know how willing a player is to go into the dirty areas and so on.

    3. I agree that baseball's a game much better suited to statistical analysis than hockey. But what about weather patterns? There are more variables, the situation is far less regulated, and on top of that the data is not only much more difficult to gather but fluctuates with greater frequency. Surely hockey, a five-players-a-side sport played with strict rules, in an enclosed environment, and with the capability of complete observation, lends itself to such analysis in a better way? Or, if you don't like the weather analogy, what about economics, or market research, or virtually any other scientific field? Observation's more difficult, and there are generally more variables, and yet mathematical analysis is the best way to make predictions. Why wouldn't that apply to hockey as well?
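
A rough sketch of the arithmetic behind point 1, for anyone who wants to check it: the snippet converts the on-ice goal totals quoted above into a per-60-minutes rate so the two samples can be compared directly. The function name `goals_per_60` is made up for illustration, and the only inputs are the figures quoted in the comment (10 goals in 163 EV minutes with Kovalchuk, 22 in 455 minutes apart). It says nothing about quality of competition or power-play time.

```python
def goals_per_60(goals_for: int, minutes: float) -> float:
    """On-ice goals for, scaled to a rate per 60 minutes of ice time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return goals_for / minutes * 60.0


# Figures quoted in point 1 above (even-strength ice time only).
with_kovalchuk = goals_per_60(10, 163)
without_kovalchuk = goals_per_60(22, 455)

print(f"With Kovalchuk:    {with_kovalchuk:.2f} GF/60")     # 3.68
print(f"Without Kovalchuk: {without_kovalchuk:.2f} GF/60")  # 2.90
```

Both samples are small, so the gap is best read as a hint rather than proof.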

  • Helicopter Guy

    Awesome article and response by Willis. Very well thought out and thought provoking. All I'm clear about now is that statistical analysis of the game must incorporate many weighted variables in order for the result to be contextually relevant. It seems to me that the human side of hockey doesn't plug into the spreadsheet so good and often represents a significant part of the game.

    Long story short – This has just reminded me of part of why I love the sport so much. A multi-faceted game that gets deeper as you experience more of it. Oh yeah – it also reminds me how much watching golf and baseball bores me.

  • MattN

    The problem at this point is that hockey stats are still in their infancy. Stat geeks still don't have an agreed-upon way to measure the game, much less a way to use those measurements to make meaningful observations. Gabriel Desjardins is the leader in this area almost by default. His website has given us the numbers that JW, Lowetide and others use in their posts.

    I think that there will be more movement towards this area as time goes on. As more and more young people who are comfortable with statistical models move into the business of running hockey teams, it will become as important as the older "saw him good" way of looking at players.

    BTW Jason,

    Any talk of contact sports not being able to use advanced statistical data is blown out of the water by doing a quick google search on "football advanced stats".

  • MattN wrote:

    Any talk of contact sports not being able to use advanced statistical data is blown out of the water by doing a quick google search on “football advanced stats”.

    The difference again in football is that each play starts essentially the same way… That is what makes hockey unique, and I suspect it is a reason why there is more statistical analysis for football and baseball than for hockey to date.

  • Cam

    I really think it takes a combination of stats and observations to really get a good idea of what is happening. One without the other leads to mistakes. I hope that the Oilers have people on staff picking apart stats before these big trades are made. I also hope they don't move to a system where intangibles like Moreau's heart aren't considered when making a decision.

  • Chris

    Great debate. I think it is easy to put too much stock in a theory or conclusion derived from statistical analysis. Not because a correct breakdown of good data leads to false conclusions, but because false conclusions are derived from poor data. When the entire NHL and hockey community at large embrace advanced statistical analysis, I believe the numbers, or data inputs, will improve. I rarely agree with the hit count, shot clock, or TOI totals if I pay specific attention… poor data = poor conclusions. More importantly, a statistician often subconsciously uses numbers that tend to support a preconceived notion. I was involved in a school project where three groups of people unknown to each other, from different demographics, were given the same question and access to the same data pool. Is anyone surprised that all three groups of people arrived at completely different conclusions?
    Willis, when you run numbers to support a theory like Hemsky is a superior hockey player to Lecavalier… Are you running a pure study or subconsciously supporting a preconceived notion? Instead of saying Lecavalier is overrated, try saying, "These numbers that I pulled from various websites suggest Lecavalier might not be as defensively responsible as Hemsky. Perhaps we should take a closer look at the game tape."
    The numbers should not be the conclusion but the impetus to take another look. That's it.

  • Dennis

    If you truly don't want to believe, then you'll convince yourself that there's no need to.

    Enough of what Desjardins generates passes enough sniff tests as to warrant serious consideration.

    I think Beane's point about college players was that here was a way of figuring out the best way to utilize a pick. College guys are more mature and get to the bigs needing less seasoning time, so it makes more sense to draft guys who are closer.

  • David S

    Helicopter Guy wrote:

    …statistical analysis of the game must incorporate many weighted variables in order for the result to be contextually relevant.

    Excellent. The word of the day my friends is contextual. Many of the "weighted variables" in hockey are qualitative in nature due to the speed of the game and resultant non-linear decision-making process (as Jason outlined above). To my mind, that is the sort of data only achievable by a combination of personal experience at the pro level and ongoing, first-hand interaction with the players. Without which, the analysis is based heavily on interpretation of quantitative data with little valid inferential capability.

    I've often seen this in marketing. Stats will give you the basic knowledge. They are in fact a starting point. But without the contextual experience to interpret that knowledge, you have just enough information to be dangerous. Yet you'd be surprised how often people hang onto those stats like they came from the lord himself, despite the fact that they have no real understanding of what the numbers they see in front of them actually mean.

    I don't have a problem with hockey fans having fun with stats, but it crosses the line when those fans make assumptions about the data in such a way as to make it sound like they actually know what's going on with the players or the team. Truth is, they don't. So in fact, more often than not their analysis is flawed at best and just plain wrong at worst.

    *Strangely enough, a guy like Gregor who has the access advantages and a good understanding of the pro game would be able to do a proper statistical analysis if he were so inclined.

    At the end of the day, the stats-based arguments are fun for me to read. But it gets tiring when those arguments are taken as serious stuff and not to be trifled with. Heck, how many times have I seen interweb arguments where the conversation comes to the point of an online bar room brawl because others don't take their work as "serious stuff"? It's really quite laughable without knowing the things teams tend to keep secret as long as possible, like Horcoff having a nagging back injury or Gagner having a suspect ankle injury.

  • RBK

    RobinB wrote:

    I know this is a mathematics website, but how about young Andrew Cogliano winning the fastest skater at the all-star game?

    He did? I didn't realize that young star players got to compete in the skills competition.

    Math site LOL

  • Chris wrote:

    Willis, when you run numbers to support a theory like Hemsky is a superior hockey player to Lecavalier… Are you running a pure study or subconsciously supporting a preconceived notion? Instead of saying Lecavalier is overrated, try saying, “These numbers that I pulled from various websites suggest Lecavalier might not be as defensively responsible as Hemsky. Perhaps we should take a closer look at the game tape.”
    The numbers should not be the conclusion but the impetus to take another look. That’s it.

    David S wrote:
    At the end of the day, the stats-based arguments are fun for me to read. But it gets tiring when those arguments are taken as serious stuff and not to be trifled with. Heck, how many times have I seen interweb arguments where the conversation comes to the point of an online bar room brawl because others don’t take their work as “serious stuff”.

    Well said, both of you. Now, to put that in context, I'm not saying this about Jonathan, since he is pretty good about that. That said, a lot of the interweb arguments Staples is talking about likely involve me not taking the stats as "serious stuff." Like I said, in the other thread, I'm fine with the pursuit, and I've offered to host a wiki to that end (so that people could work together and actually progress these data sets into something that might be useful – if not for fans, then for the coaches/GMs) but outside of Jonathan who gave a nervous yes, the idea was ignored.

    It's become less a matter of "why don't people care about stats" or "why are idiots the ones who ignore stats the most" but much more a matter of "why do people feel the need to have so much control over the truth value of contextless data?"

  • namflashback

    Umm, why does this conversation always turn into ONE OF observation vs statistics? The answer is BOTH.

    No statistic tells the entire story of the ebb and flow of a game or a series of games nor does it take into account the shoddy game of a player with the flu (Souray vs San Jose, yet they won the game).

    However, as a mechanism to make sure that there is substance to what we see on the ice, it can and should be damn useful. Why would we, or certainly why would the management of the team, ignore a useful tool that can help "smooth out" the subjective opinions of those who watch? It rarely works on a micro level of observation (a play or single game), but is really descriptive over the longer term (10 games in a stretch).

    For example, although Sam Gagner struggled during his first 25-30 games on the scoresheet, when I watched many of those games, I thought to myself that he was still getting and creating a significant number of scoring chances. Some of the statistical analysis I read backed that up. It helped affirm something I watched.

    Why choose one or the other? Both make for a rich and interesting perspective on the sport.

  • @ Chris:

    I'll tell you this – you've got a great point, and the simple fact of the matter is that I think we're all learning on this stuff (especially me).

    Take the Lecavalier instance below. I went in thinking that Lecavalier's always been a bit overrated (strangely enough, that wasn't based on the numbers so much as on what I've seen of his play – hence my anger at the "watch the game" comments). In any case, when the numbers looked really bad, I should have realized that something's not right – I mean, I still think there's a drop-off between the Thorntons and the Lecavaliers of the world, but it shouldn't have been so pronounced. That should have made me go back and account for the Lightning's weakness in the shot-clock department anyways, and adjust the numbers accordingly, but I didn't clue in because I had my point that I felt was validated.

    I do strongly feel that the stats have good value, so I don't want to drive folks away from thinking about the game differently because I messed one particular area up, which is unfortunately going to happen from time to time because I'm a long ways from perfect and I'm still finding my sea legs in this particular field.

    Oh, and about real-time stats: the hit counts are bizarrely off and biased from arena to arena, but the shots are generally pretty close and the time on ice is probably the most accurate, at least from what I've seen. Since you've mentioned it, I'm going to try and keep an eye on those latter two, which I've always thought were reliable.

  • namflashback wrote:

    Umm, why does this conversation always turn into ONE OF observation vs statistics? The answer is BOTH.

    That's a remarkably sane and well-grounded viewpoint for an internet forum.

    Although I've got to say, I've been very impressed with the discourse in the last two threads; especially since I don't think anyone's called me an idiot yet 😉

  • Matt N

    Jonathan Willis wrote:

    namflashback wrote:

    That’s a remarkably sane and well-grounded viewpoint for an internet forum.
    Although I’ve got to say, I’ve been very impressed with the discourse in the last two threads;

    If you are looking for a little crazy, I dare you to do some kind of statistical analysis of The Hockey Jesus or Mac-T's coaching.

  • Matt N

    Jason Gregor wrote:

    MattN wrote:

    Any talk of contact sports not being able to use advanced statistical data is blown out of the water by doing a quick google search on “football advanced stats”.
    The difference again in football is that each play starts essentially the same way… That is what makes hockey unique, and I suspect it is a reason why there is more statistical analysis for football and baseball than for hockey to date.

    Um, no. The reason that football, baseball or basketball have a wider and deeper set of data and analysis is that they have more smart people doing it and they have been doing it longer. Hockey hasn't found its Bill James yet.

  • Jason Gregor wrote:

    The difference again in football is that each play starts essentially the same way… That is what makes hockey unique, and I suspect it is a reason why there is more statistical analysis for football and baseball than for hockey to date.

    I suspect there are more stats done in football and baseball because of the betting that goes on. If hockey had the same amount of cash flowing through Vegas and Italian delis, I am sure there would be better advanced stats on hockey. Why do the NFL and MLB have far more open injury policies? Same reason. I am sure if there were billions to lose each week, there would be more reliable stats for hockey too, because some would have a vested interest in producing them.

    PS: the deli comment comes from my new obsession with the Sopranos; sorry if I offended anyone.