Why?

Statistics and mathematical modeling have changed the way society functions.

Meteorologists use new methods of gathering information, especially from the atmosphere, to create a far grander database than any that human society has used for that purpose before. They feed these data points into supercomputers, and create complex models that involve a multitude of variables — often variables that are in a constant state of flux or that are difficult to get firm information on.

Edward Lorenz, who served as an army weather forecaster during World War II, and then studied and taught meteorology, coined the “butterfly effect” as shorthand for how tiny variations could affect weather models: “Does the flap of a butterfly’s wings in Brazil set off a tornado in Texas?” He pioneered the field of chaos theory largely in an effort to better predict meteorological changes. Scientists continue to use long-reaching models to predict damage caused to the Earth by global warming and human impact, while similar (albeit less complex) models are used all over the world for long-range forecasting.

The world economy, which comprises the financial dealings of seven billion-plus people across a dizzying array of legal jurisdictions, tax laws and the like, is tracked through a series of macroeconomic statistics; the ones produced at Berkeley use GDP, CPI, unemployment rate, corporate profits, change in business inventories, housing starts, interest rates, exports, and personal savings rate, among others. It’s on the basis of similar economic models that men like the Governor of the Bank of Canada or the Chairman of the Federal Reserve issue projections and make their decisions, decisions that affect millions.

Even the advertising industry has made extensive use of mathematical modeling. Taking statistics compiled by government and the private sector, they break the population into demographics, and target their ads to different sectors of the population. By doing this they hope to better market their products. In 2001, The Coca-Cola Company alone spent nearly $2 billion on advertising — money that was largely distributed on the basis of demographic information compiled by market research companies.

Even sports teams have turned more to mathematical models in recent years. The book Moneyball, based on the success of Oakland Athletics’ GM Billy Beane, is generally considered to be the work that brought statistical work into the mainstream in baseball. Theo Epstein, originally a PR man with the San Diego Padres, worked his way up the ladder and in 2002 was hired by the Boston Red Sox as General Manager. He was 28, the youngest GM in the history of Major League Baseball, and a man who’d never played the sport at even the high school level. Using sabermetric principles (math pushed by the Society for American Baseball Research), his team has won two World Series championships in his six years at the helm.

These are just a few examples; in virtually every scientific field, whether “hard” science or the social sciences, mathematical modeling is a primary tool for extrapolating all kinds of factors. It’s used in other sports, it’s used in scenarios that are far less predictable than what occurs in the carefully regulated confines of the arena, and often it’s used with data far less complete than we have available for free from the stats page and play-by-play reports at NHL.com.

In most of these other fields, suggesting that your powers of observation and gut feeling were a better method for prediction than mathematical models using a vast database would get you laughed out of a job.

In hockey, it’s the only commonly accepted practice — and I have only one question. Why?


  • Hippy

    I don't think observation has ever been argued as the only acceptable practice. Where the debate gets heated is in determining the weight of advanced statistics VS observation and experience.

    Good article all in all.

    One criticism would be that using data as a predictor of weather misses as a pro argument. Les Nessman's Eye Witness Weather was way more reliable than the modern weatherman, and that harkens all the way back to the '70s.

  • Hippy

    I think a big part of the problem is that so many of the people who are running teams/the league are not actually qualified. Their big selling point is that they played for a while and maybe won a few Cups. Sure, that experience can be valuable, but there does seem to be a mentality in the NHL that having played there automatically means you know what you're doing. Can you imagine if the Jets named Brett Favre as their head coach? They would immediately become the laughingstock of the NFL.

  • Hippy

    Rick wrote:

    Where the debate gets heated is in determining the weight of advanced statistics VS observation and experience.

    Among reasonable folks (like Brownlee, for instance) that's where the debate rests. Among less reasonable folk, the value of statistics at all is pretty debatable.

    Les Nessman’s Eye Witness Weather was way more reliable than the modern weatherman, and that harkens all the way back to the ’70s.

    I watched the odd WKRP in Cincinnati show on Prime, I think it was (because obviously I didn't see it when it first aired). Les was always pretty funny, but there aren't too many videos of him hanging around the internet.

  • Hippy

    Rick wrote:

    One criticism would be that using data as a predictor of weather misses as a pro argument. Les Nessman’s Eye Witness Weather was way more reliable than the modern weatherman, and that harkens all the way back to the ’70s.

    I was out to dinner with some friends last night, one of whom works at the air-traffic control tower at the local airport. Part of her job is to update weather observations.

    Anyways, she was saying last night that the weather report will often switch from rain to sunshine based on the fact that she wrote down "clear" when she reported the conditions.

    I don't know how much modelling local weather forecasts use; my hunch would be not much. I was thinking more along the lines of the folks who track hurricanes, tropical storms and the like.

  • Hippy

    Jonathan Willis wrote:

    Among reasonable folks (like Brownlee, for instance) that’s where the debate rests. Among less reasonable folk, the value of statistics at all is pretty debatable.

    I don't think that is true. "New" statistics that they weren't brought up on, sure, those may be met with resistance. But those that prefer to rely only on their eyes aren't generally afraid to use goals or assists to bolster their argument.

  • Hippy

    The Towel Boy wrote:

    This site is becoming Jonathan Willis Nation…seriously dude…do you sleep at night? You’re a blogging machine.

    I write as a way to relieve stress and such, so I've got hundreds of text files on my computer on all sorts of topics that I can pick at random and use for ideas. Plus, it usually only takes between 15 minutes and half an hour to get a rough draft, and the whole thing's wrapped up inside of an hour (usually).

    Still, my wife's a patient woman.

  • Hippy

    speeds wrote:

    I don’t think that is true. “New” statistics that they weren’t brought up on, sure, those may be met with resistance. But those that prefer to rely only on their eyes aren’t generally afraid to use goals or assists to bolster their argument.

    You're right, of course, but goals and assists are pretty straightforward; you just need to count. It's when the mathematics get more complex (for example, counting events against players and valuing those players based on ice-time, points-per-game, etc.) that they don't make the jump.

  • Hippy

    As a huge fan of the Oakland A's and of the methods of sabermetrics (although I haven't made a good enough effort to understand them like the back of my hand), I love reading your stat breakdowns and I hope that this becomes a bigger part of the hockey world.

    I'm sure the time will come; baseball too was once a very old-school system, but it is in the process of changing, and hockey is perhaps just 5-10 years behind in its thinking.

  • Hippy

    @ Jonathan Willis:
    I enjoy the intricate nature of stats, which I suppose is why I got 100% in University Statistics. I love reading well thought out analyses of players backed up with stats.

    More and more I have found that the people that seem to be talking out of their butt crack don't seem to look at a single stat. Or if they do, they pick the one they like the most and use that to make universal declarations that are asinine.

    One thing I appreciate about a statistical machine like yourself is that you are always willing to hear the intangible arguments surrounding some stats. People are more random than weather at times and there are often things that affect the game that can't be told by statistics (like team support and the mood in the room and leadership etc). Once you couple statistics with insider information (thank you Robin and Gregor) it really starts to make a lot more sense.

  • Hippy

    Jonathan Willis wrote:

    You’re right, of course, but goals and assists are pretty straightforward; you just need to count. It’s when the mathematics get more complex (for example, counting events against players and valuing those players based on ice-time, points-per-game, etc.) that they don’t make the jump.

    Tom Benjamin wrote about the danger of being able to count stats a while ago, with regard to faceoffs. I can't find the article (I think it may be lost forever from when his site remodelled), but here is another blog talking about it (maybe 1/4 to 1/3 of the way down the webpage).

    I don't want to misremember or misstate his argument (so if anyone remembers better, or has a working link, it would be appreciated), but what I took from his article as his opinion was that faceoffs were overrated by conventional wisdom because they were now counted.

  • Hippy

    @ Jonathan Willis:
    My statistical analysis has revealed that Sheldon Souray is statistically the most manly Oiler in existence and that Jonathan Willis really likes math.

    Jonathan, I suggest a book for you called "Blink" by Malcolm Gladwell. Its subject matter will interest you as it confirms what you're saying, to a point, but it also points out that the split-second intuitive decision you first make is most often the correct one. It points out that your brain is making thousands of unconscious decisions every second. It further points out that the more you have to explain your immediate intuitive decision, the more likely you are to change your mind.

    It also points out the failures of relying on only statistical data and number crunching. While those numbers will tell you what you need to know, they cannot predict with absolute certainty what the future will be.

    If your statistical analysis leaves out one variable, it could be the difference between drafting Marc Pouliot and Zach Parise.

  • Hippy

    Dropping Deuces wrote:

    It sounds to me that Willis is trying out for the part of John Nash in A Beautiful Mind. Russell Crowe be damned!

    The funny thing is, I'm a good math guy, but I don't have a ton of experience in statistics. For example, in the Lecavalier article a little ways down, it didn't even occur to me to adjust his Corsi for the team until I started taking heavy flak, and it's something I should have done in the first place.

    I'm learning, in other words.

  • Hippy

    @ speeds: Tyler has some good stuff on that at his site, and Staples picked up on it too. I really wonder how important faceoffs are in the grand scheme of things; in all likelihood we do tend to place too much importance on them, but that's a guess, not a hard statement.

  • Hippy

    Fiveandagame wrote:

    My statistical analysis has revealed that Sheldon Souray is statistically the most manly Oiler in existence and that Jonathan Willis really likes math.

    I was just on the Oilers website and the picture of Souray makes him look like a hypnotist. I believe he is using his mind powers to receive man crushes from all of Oiler Nation.

  • Hippy

    @Fiveandagame: Well, of course you can't rely on stats 100% of the time, because even if this is a deterministic universe, we simply don't have the tools to unlock our destiny: the model is far and away too complex, and may always be. But as a tool for checking what your eyes are seeing, they're perfectly valid, and you can make reasonable guesses from them just as surely (and sometimes more so) as you could from your eyes. And yet, because they don't understand them, people have an allergic reaction to them.

  • Hippy

    Cam wrote:

    One thing I appreciate about a statistical machine like yourself is that you are always willing to hear the intangible arguments surrounding some stats. People are more random than weather at times and there are often things that affect the game that can’t be told by statistics (like team support and the mood in the room and leadership etc). Once you couple statistics with insider information (thank you Robin and Gregor) it really starts to make a lot more sense.

    Take Sean Avery, for example. His underlying numbers (QualComp, Corsi, etc.) show a heck of a hockey player, and if those existed in a vacuum, he'd be a guy to pursue.

    Of course, knowing what we know about him now, I wouldn't want him within 1000 miles of my hockey team. There are definitely personality/human factors that need to get taken into consideration, and when it comes to coaching you absolutely need to look at the process (i.e. physical scouting) as opposed to results.

    But I've yet to see an explanation for why the eyes of one biased observer are a better tool for determining results than in-depth statistical analysis.

  • Hippy

    I don’t know how much modelling local weather forecasts use; my hunch would be not much. I was thinking more along the lines of the folks who track hurricanes, tropical storms and the like.

    Numerical models drive every single weather forecast, from the local forecast at your friend's airport to broad scale climatology. Human input happens a lot less frequently than people might suspect, likely to the overall detriment of the product.

    That aside, the best thing that advanced stats offer us is the opportunity to assess players beyond the "saw him good" approach. Context, in other words. Why aren't more people talking about the year Mike Smith is having in T. Bay? Look at his numbers versus, say, Kiprusoff's. He has a lousy team in front of him, so the win total is lower, but he's had a significantly better year by any other worthwhile measure. He's on a poor team in the South, so he doesn't get much media attention, but looking at the numbers can help us go beyond that bias. The math behind this stuff isn't that complex. Why be scared of it?

  • Hippy

    […] me the sports are just too different to allow statistical data to overtake live scouting. Willis mentioned Sabremetrics as a key to the success of the Red Sox, and while I agree it had something to do with […]

  • Hippy

    Contrary to public belief and practice, some things in life just aren't quantifiable. How do you put a statistic on an important facet like desire and heart? Everyone knows that desire and sacrifice are key components to any successful team. Probably just as important as scoring. Did the '06 Oilers have a 4.6 desire/sacrifice rating? Was that rating 1.3 higher than the rest of the Western Conference? Statistics are a vital asset for evaluation and prediction; however, we would be sticking our heads in the sand if we were to believe they told the whole story.

  • Hippy

    Deans wrote:

    Contrary to public belief and practice, some things in life just aren’t quantifiable. How do you put a statistic on an important facet like desire and heart? Everyone knows that desire and sacrifice are key components to any successful team.

    You can't, and I wouldn't argue that you could. I would argue that on-ice results (offensive and defensive ability) can be measured better with statistics and math than with scouting, because of the limitations of an individual scout (no matter how capable).

    A player's character, off-ice behaviour, willingness and desire; these are all things that need to be assessed through a combination of on-ice observation and off-ice discussion. I'd argue that those attributes can't be effectively gauged solely by on-ice observation, either.

  • Hippy

    Deans wrote:

    Contrary to public belief and practice, some things in life just aren’t quantifiable. How do you put a statistic on an important facet like desire and heart? Everyone knows that desire and sacrifice are key components to any successful team. Probably just as important as scoring.

    As important as scoring? Two things matter in hockey: goals scored and goals allowed. Anything else is only important insofar as it leads to goals scored and goals allowed.

  • Hippy

    Why?

    For the same reason Vic (one of the staunchest supporters of hockey stats) chirped off about how goalies shouldn't get as tired as other players because they don't really do anything outside of spurts.

    Nobody wants to let go of their own biases. The old stats are hugely inadequate for looking at much of anything. The new stats have not been held up to review (in the peer review sense, since you're talking about science). That's the thing. The hockey stats are not actual stats. There have been no actual studies. They are data sets that people take, put together with other observational things, and try to draw conclusions from. There is no rigor. There is no repeatable experiment. There's a series of data sets that people take to mean "something" when realistically we don't have a good way of mathematically modeling anything this fast or fluid.

    Let me put it this way. You can model a game as simple and algorithmic as baseball. You can model the effects of X given a control group. However, physicists *still* cannot exactly solve n-body gravity problems where n > 2. It's an issue with static vs dynamic.

    That's not to say the whole field is lost. It's not. But instead of trying to say "this data set implies X" how about everyone steps back for a minute and says "We need to try to use these methods to build a model which should lend itself to prediction."

    Comparing hockey stats to anything in the hard or soft sciences is apples and oranges.

  • Hippy

    Oh, and business and hockey stats are also apples and oranges. In the business world, you make a product, send it out into the world, and see how it's received. You can replace the word "product" in that sentence with "ad" or "hype machine" if you like, but the point stands. The only comparable things between business stats and hockey are a)How many tickets did the team sell, b)How many more tickets will the team sell if it's doing well/making the playoffs and c)How much money should we charge for those tickets due to demand.

    Of course then your hockey stats *are* business stats, so they can both be apples.

  • Hippy

    @Doogie:

    If you're using stats to confirm or deny your eyes, aren't you running into the old correlation fallacy that if two things are correlated it "means" something? I mean, realistically there's enough data out there to confirm just about anything to your eyes, depending on how rigorous (predictive) you want to be.

  • Hippy

    @ Ender:

    Such work hasn't necessarily been done by the folks in the Oilersphere, since it's a developing field, and people working on it for teams are obviously not going to just share the information, but what about the work done by Alan Ryder and others at his site? That would seem to fit into your point.

    Besides, the point was less about a direct comparison between the stuff being done right now and a paper submitted for peer review, and more about the fact that systems far more complex than hockey are studied via math and statistics. So why should hockey be immune to that sort of study?