Why the “Scoring Chance” is a flawed metric, and how we can improve on it

Last week, Columbus Blue Jackets head coach John Tortorella sent hockey Twitter into another round of arguing about which statistics are and aren’t meaningful when he denounced both corsi and fenwick.

Hearing an NHL coach say that shots and unblocked shots are not meaningful in measuring team performance is patently absurd, but Tortorella long ago passed into self-parody so any discussion on that topic is a waste of time for all involved. 

What is more interesting is CBJ beat writer Aaron Portzline’s tweet that said while Tortorella doesn’t care about shots and unblocked shots, he does care about scoring chances. In fact, he cares about them so much that he displays the charts for the players to view. Tortorella definitely isn’t the only coach to value scoring chances. Tampa Bay Lightning head coach Jon Cooper has also said publicly that the organization tracks scoring chances and views them as important. And earlier this week, Penguins coach Mike Sullivan discussed how his team uses scoring chances.

What would make a traditional coach like Tortorella so willing to embrace a statistic like scoring chances, which is newer, less proven, and less valid than shots, when he is clearly adamant about rejecting shots and unblocked shots, which are both better established and more valid measures of team performance?

It obviously isn’t math. No publicly available statistic predicts future goals better than shots. I would argue that the appeal of scoring chances as a statistic is largely due to its intuitive significance and its name. Obviously, creating chances to score is important to scoring goals and therefore winning games. And while corsi and fenwick have obscure names that do not suggest their meaning, scoring chances is named exactly what it purports to measure. So it makes sense that people would gravitate towards the term scoring chances.

The Problem with Scoring Chances

The unfortunate part of the widespread acceptance of scoring chances is that the statistic has several flaws that are rarely discussed. The first, and most significant, is that the term does not have an actual definition, meaning that the Blue Jackets’ scoring chances are different from the Lightning’s scoring chances, which are different from the Penguins’ scoring chances, which are different from the War-On-Ice scoring chances, which are different from the Corsica scoring chances. All of them are based on counting shots that meet certain characteristics such as location and shot type to determine how dangerous a given shot is. But determining those characteristics is left to each organization to define.

The second problem is one that Garret Hohl has been vocal in criticizing. Scoring chances represent “binning” of data. Binning is almost always a bad idea because it draws arbitrary boundaries around an otherwise continuous set of data. To use our scoring chance example, all shots have a given danger level. The most logical way to measure that is in expected shooting percentage. 

To rephrase, if a player has a clean look from the slot, what is the chance that shot gets past the goalie? Does that shot go in 5% of the time? 10% of the time? 15% of the time? What if the pass preceding the shot came from behind the net meaning that the goalie will have a harder time finding the puck? What if the shooter one-times the shot? What if they catch it and wrist it? All of these things contribute to the likelihood that the puck will end up in the net. Using Ryan Stimson’s passing project data, I established that shooting percentage is impacted by the passing sequence that precedes it.

And thus, shots of various types could be expected to go in the net anywhere from nearly 0% of the time (a shot from the opposite end of the ice) to nearly 100% of the time (an empty net shot where the shooter is standing in the blue paint). So what makes something a scoring chance? A shot with an 8% chance of being a goal? A shot with a 10% chance? 12%? I’m guessing you get the point.
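To make the threshold problem concrete, here is a minimal Python sketch. The per-shot goal probabilities are invented for illustration and the function name is hypothetical. The only point is that the same set of shots produces a different scoring chance count depending on where the cutoff is drawn.

```python
# Made-up goal probabilities for nine shots by one team in one game.
# These numbers are assumptions for illustration, not real NHL data.
shot_probs = [0.02, 0.03, 0.05, 0.07, 0.09, 0.11, 0.13, 0.20, 0.35]


def count_scoring_chances(probs, threshold):
    """Count shots whose goal probability meets or exceeds the cutoff."""
    return sum(1 for p in probs if p >= threshold)


# The same nine shots, binned three different ways.
for cutoff in (0.08, 0.10, 0.12):
    chances = count_scoring_chances(shot_probs, cutoff)
    print(f"cutoff {cutoff:.0%}: {chances} scoring chances")
```

With these invented numbers the count drops from 5 to 4 to 3 as the cutoff moves from 8% to 10% to 12%, even though nothing about the shots themselves has changed.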

The Solution

Fortunately, we have a solution to this problem. Unfortunately, the solution is terrible in terms of getting more traditionally minded hockey people to embrace it. The solution is to start thinking in terms of expected goals. Emmanuel Perry has already introduced an expected goals calculation and made it available on demand at corsica.hockey. DTMAboutHeart has his own expected goals model and he shares his methodology and results regularly but hasn’t yet created or partnered with a site that would allow us to access it on demand. Both models follow the same overall concept. They calculate the likelihood that each shot taken will lead to a goal and use that to calculate the expected goals for a team. So, if a team generates 20 shots that each have a 5% chance of being a goal, they would accumulate one expected goal.
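The summing step can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Perry's or DTMAboutHeart's actual model. The per-shot probabilities would come from whatever model is in use and are simply hard-coded here, and the function name is hypothetical.

```python
def expected_goals(shot_probs):
    """Expected goals is the sum of each shot's probability of scoring."""
    return sum(shot_probs)


# The example from the text: 20 shots, each assumed to have a 5% chance.
twenty_shots = [0.05] * 20
print(round(expected_goals(twenty_shots), 2))  # 1.0
```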

This line of thinking removes the need for a term like scoring chance because we can define the danger level of shots based on their expected shooting percentage instead of a collection of characteristics. That shot from the slot we discussed earlier might be a 12% shot and a shot from the point might be a 2% shot. Talking in those terms allows for assessing the danger of each chance without resorting to binning. We could even look at a given game where one team had 30 shots accumulating 3 expected goals and the other had 24 shots accumulating 2 expected goals and know that not only did the first team generate more shots, they also generated more dangerous shots with an expected shooting percentage of 10% (3/30) compared to roughly 8% (2/24).
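The two-team comparison works out as follows in a short Python sketch. The function name is made up for illustration and the inputs are the hypothetical game totals from the paragraph above.

```python
def expected_shooting_pct(expected_goals, shots):
    """Average danger per shot: expected goals divided by shots taken."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    return expected_goals / shots


team_a = expected_shooting_pct(3, 30)  # 3 expected goals on 30 shots
team_b = expected_shooting_pct(2, 24)  # 2 expected goals on 24 shots

print(f"Team A: {team_a:.1%}")
print(f"Team B: {team_b:.1%}")
```

Team A comes out at 10.0% and Team B at about 8.3%, so the first team generated both more shots and, on average, more dangerous ones.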

The Problem with the Solution

So that’s all great and makes perfect sense and seems like a great solution to using questionably binned data like scoring chances. Unfortunately, as is often the case, the NHL has botched our ability to introduce this concept smoothly. The source of all of the data that you see on sites like Corsica is various types of files that the NHL publishes providing information about what happened during the game. The NHL tracks every shot attempt and its location as well as the shot type, which is what both Perry and DTMAboutHeart use to calculate expected goals. They have significant differences in their methods but that’s a topic for another time so for now, the only point is to say they share the same basic sources of data. 

The problem is that the location that the NHL provides for blocked shots is the location of the block and not the location of the shot. Therefore, blocked shots can’t be included in the expected goals calculations. And thus, we can’t simply call the expected shooting percentage that we discussed above “expected shooting percentage” because that would be inaccurate. We have to call it either “expected unblocked shot shooting percentage” or “expected fenwick shooting percentage,” which becomes xFSh%. And that is a genuinely terrible name for a stat if the goal is to get the public and traditional hockey people to be willing to consider a new idea.
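A rough Python sketch of why the stat ends up fenwick-based: blocked attempts have no usable shot location, so they have to be dropped before anything is summed. The `ShotAttempt` record and its field names are assumptions made for this example and do not reflect the layout of the NHL's files or either model's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShotAttempt:
    blocked: bool
    # Model-assigned chance of scoring. None for blocked attempts, since
    # the recorded location is where the block happened, not the shot.
    goal_prob: Optional[float]


def xfsh_pct(attempts):
    """Expected fenwick shooting percentage: xG per unblocked attempt."""
    unblocked = [a for a in attempts if not a.blocked]
    if not unblocked:
        return 0.0
    return sum(a.goal_prob for a in unblocked) / len(unblocked)


# Four attempts, one of them blocked. Probabilities are invented.
attempts = [
    ShotAttempt(blocked=False, goal_prob=0.12),
    ShotAttempt(blocked=False, goal_prob=0.02),
    ShotAttempt(blocked=True, goal_prob=None),
    ShotAttempt(blocked=False, goal_prob=0.10),
]
print(f"xFSh%: {xfsh_pct(attempts):.1%}")
```

The blocked attempt contributes to neither the numerator nor the denominator, which is exactly why the result cannot honestly be called a plain expected shooting percentage.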

Final Notes

I have now written 1200 words calling attention to a problem in nomenclature for which I have no solution. Thinking of each shot in terms of the likelihood that it becomes a goal is the correct way to assess the danger of the shot. And due to the limitations of the data we currently have, getting the public and teams to think in those terms will be a challenge. But that doesn’t mean we shouldn’t try. And so going forward, I’m going to try to shift to discussing shots in terms of xFSh% instead of relying on scoring chances whenever possible.

As a final note, please don’t walk away from this post with the message that using scoring chances is bad. It isn’t. My only point is to say that if the goal is to understand which team is creating more dangerous chances, xFSh% is a better approach. But it also presents real barriers in naming and definition that make it difficult to use in casual discussion of the game. So until we get better data that includes shot locations for all shots, I would expect to see scoring chances remain as a statistic. But hopefully, we can start working towards using calculations of expected shooting percentage as the most sound approach to measuring how dangerous a given shot is.

For more information on expected goals and some of the limitations of counting up each shot’s xG as described above, read Danny Page’s excellent article here.

Graphs

In case you’re wondering which teams generate and allow the most dangerous shots based on xFSh%, here are some charts. The first shows teams’ xFSh% for and against. But just looking at xFSh% ignores the number of shots a team generates or allows. The second chart shows xFSh% for and fenwick (unblocked shots) for. The final chart shows xFSh% against and fenwick against.

Dashboard 1

Dashboard 2

Dashboard 3

  • albertabeef

    How is Corsi relevant at all? The last 6 games the Oilers out Corsied their opponent, but lost the game. You people diss +/-, but worship Corsi. Goals win the game, and that is what +/- reflects.

      • vetinari

        So what you are saying Jonathan is that Corsi isn’t a perfect predictor of Cup wins or last place finishes?*

        *Yes. I know there is a correlation but not a perfect causation. I just want to see people’s heads explode and it’s funny to watch intelligent fans with wildly differing opinions on a developing field of metrics go at it with calculators drawn and pencils unsheathed. And it’s a slow news day after an Oilers loss so…

      • chuckcouples

        If that’s your argument, maybe we should just use goals for to determine the best teams.

        From 2010-16, teams ranked by Goals for: 1: Pittsburgh, 2: Chicago, 3: Tampa Bay 4: Boston…28: Buffalo, 29: Edmonton, 30: New Jersey

        5 Stanley Cups of the last 7 to the first 4 teams plus two more Conference Finalists.

        4 of Seven last place finishes to the bottom 3 teams on the list.

        • Bittersomfan

          I think this is a fair point. My guess is that corsi is meant to be a way of explaining why the teams in the top scored more goals than the ones in the bottom.

          I don’t think it does that though because while there is a strong correlation between goals scored and corsi, there is no causality. If there were you could score more goals just by shooting a few extra shots per game, no matter the quality of the shots. So, in my mind, it ends up just being another way of saying what we already know. If you score more goals you are more likely to win.

          I still think so-called ‘advanced stats’ are very interesting though, but some of them have limited usefulness.

    • Peachy

      +/- tells you who won the game, but it is next to useless in telling you who will win the next game.

      Corsi and other shot metrics are next to useless in telling you who won the game, but are pretty good at telling you who will win the next game.

      What are you more interested in knowing?

  • theoil

    The thing that bothers me is that Corsi events are viewed equally no matter where they are taken on the ice. How can a shot from the point or the boards be the same quality as a shot from the high slot or in front of the crease? That’s where scoring chances seem to be more accurate. Even though each team might have a different calculation as to where they might be taken from, generally it should be near the net.

    • Sheldon "Oilers Fan for Life!!!"

      With there being a well-published shot map every game, including those which scored, it should be a simple bit of computer wizardry to generate a statistical probability for each area of the ice. This would allow a value to be applied to areas of the ice and the chance that a shot could score from there. This should allow a far better shot vs. value metric that makes some sense. I suppose you could then sort by kind of shot but that would be too much work IMO. The reverse of such a graph would be a metric which could show the value of blocking a shot from a given area based on its chance of going into the net. If a defensive player knew where the worst shots were likely to come from and was more likely to block those shots it might increase the value of the block. Logic states that the areas in front of the net would be the best, but sometimes statistics reveal lines of data that can surprise. This is what the player would want to know.

  • Derzie

    Corsi has value in certain circumstances but is woefully misused. It is a secondary piece of information that is often trotted out as a primary. It is somewhat useful when comparing teams that are established as great or elite. It helps differentiate which of them may win a playoff series for example. But in order to break out Corsi, you already need to look at basic stats like win/loss, pts/game, goal differential, scoring chance differential, save pct, etc.

    If you look at the bottom dwellers (i.e here in Alberta), Corsi is next to useless. The teams have too many flaws and poor simple stats that Corsi is not even worth looking at. Goal differential of -12? Look no further.

    When it comes to players it can be a useful stat to help identify solid two-way players but only over a large period of time. As much as the faithful rail against it, plus/minus is valuable in this area as well. Again, over a long period of time.

    In the game examples where the team wins but has lousy corsi, and vice versa, remember that dumping the puck in is a Corsi event. Giving up the puck is counted as positive possession. Once we get trackable pucks and players we can focus on true possession which will have much greater value than Corsi.

    In the meantime, simple stats will do when measuring our Alberta teams.

  • Who is the author? They start with some cut and dry statements, but we have no way of knowing who this person is or why they should be listened to. Is this an anonymous article b/c they are scared to stand behind it?

  • camdog

    After each and every hockey game most coaches go over the video over and over again and leap to a logical conclusion. Sometimes the numbers go into a spreadsheet, sometimes they don’t. Thing is coaches don’t give the information to the media, so most bloggers never really truly understand what a coach like Torts, Carlyle or even McLellan is getting at.

    • NewPants

      I don’t understand any of all of this.

      I do know that only 1/5th of the season is too little to tell how the Oilers are. I still like this group more than I’ve liked the Oilers of the past 20 years.

      Three less goals at the right time could be 3 more wins? The luck part that will be hard to graph.

  • S cottV

    I doubt that we ever achieve the theory of everything, for universal advanced stats.

    Coaches will want a combination of advanced stats in conjunction with internal stats, designed specifically for feedback toward what it is that they are trying to accomplish with team systems of play.

    Some Coaches will be less concerned about corsi and or fenwick than others. In fact, some coaches would be concerned that a focus on corsi or fenwick might influence players to do things that the coach doesn’t want done, for sure – in certain situations. Sorry Coach – I was just trying to make my fenwick look better?

    Example – there are times (should be way more times in my opinion), when a Coach would want forward o zone possession, for the sake of forward o zone possession. We’re up 3 to 1 with 7 minutes to go, with o zone possession 20 seconds into a shift. Unless the Red Sea parts and we get a gaping corridor to the net, don’t shoot the damn puck. Work the perimeter and even get a partial or full change in – still working the puck, while the opposition gets more and more tired. You don’t want a bonehead player ruining the benefit of o zone possession because his agent tells him – he needs a better corsi, so I better at least get a shot in.

    Another example – some coaches believe in passive good positioning d zone coverage and some believe in more forced d zone coverage. If passive with good positioning – you’re gonna give up some shots and do more blocking, usually of a low grade variety. If forced – you’re probably going to give up fewer shots and the need to block, but – run the risk of way more glaring scoring chances against.

    I do believe that corsi and fenwick are over rated but useful in conjunction with other stats.

    I like what Staples does with his scoring chances but agree that subjectivity is an issue. As a coach, I would want the Staples approach tailored to my subjectivity beliefs and filtered for a number of things like fluky things, that might eliminate the track on a certain play all together.

    I like plus – minus, but when I coached I wanted to know the total pluses and the total minuses and the net. To the point above – I would adjust the results, depending on the play at hand, for the group or certain individuals within the group.

    Example – a first liner, might have a plus minus like +27 / -33 / -6. A fourth liner +6 / -7 / -1. I wanted the tallies because there comes a point where that -33 for example becomes too accepted and lost in the meager -6.

  • Chainsawz

    Hilarious comment about binning. The standings are based on wins and losses from games. Shots on goal are a binned stat from those games, the very stat that at the top is expressed as the end all of stats.

    If teams and their stat departments can out think their opposition by determining which bin of shots on goal produces the best likelihood of goals, all the power to them. There is no flaw in that, only innovation.

  • @Hallsy4

    Scoring more goals than the other team per game matters, and win/loss. I’d be interested to see how someone like Ryan Smyth in his primes fancy stats were. Didn’t have any wowing skills, but seemed to understand what it took to score, somehow, someway, consistently.

  • Sheldon "Oilers Fan for Life!!!"

    It would also be interesting to compare how the LTI players change the results of their teams winning when they are gone. Is it really worth it to block shots? If so how much value does it bring? What happens if you let the goalie actually see more shots vs trying to block them. I know everyone states that blocking is best but what if?

  • Bittersomfan

    I have no doubt that Corsi is very useful for predicting the results over time (typically a season). I do wonder though how useful this information actually is. I mean, if someone tells the Oilers management that according to their corsi they don’t shoot enough they won’t go “Holy crap, we had no idea!” “Ok lads, three more shots per game and it’s Lord Stanley next.”

    I vaguely recall Eberle say something like (during the Eakins era) that sometimes on the pp it felt like they took (bad) shots only to increase the corsi. That is of course no way to use corsi, but how do you use corsi to actually achieve something? In that respect I think that scoring chances, or expected goals, have more value.

  • RJ

    The obvious issue to me is the proliferation of new stats.

    If you’re talking about examining the topic from the perspective of shots, then you can pick from Corsi, Corsi ON, Corsi Rel, Corsi Off, Corsi For, Corsi For 60, Corsi Against 60, Corsi For %, Fenwick, Fenwick For, Fenwick Against, Fenwick For%, Opposing Fenwick For % (OppFF%), Opposing Fenwick Against %. On the Fenwick side, they also calculate Team Fenwick For %, Team Fenwick %, and just to be different, they also calculate Team Fenwick For 20, Team Fenwick Against 20, Opposing Team Fenwick For 20 and Opposing Team Fenwick 20, instead of calculating it over 60, like many other advanced hockey statistics.

    These do not include shot statistics like high danger scoring chances or Dangerous Fenwick.

    If you read any articles re: advanced goalie or save statistics, you’ll note that they look at many of these same factors, only instead of looking at them from the perspective of the shooter, they look at the shot from the perspective of the goalie.

    I would close by noting the following; Steve Valiquette’s statistics regarding green shots/red shots and green goals/golden goals is one of the only statistics that takes into account the pass before the shot. A pass across the Royal Road (an imaginary line down the middle of the ice) leading to the shot is much more likely to score than a simple shot from the same point.

    I’d also note that Valiquette’s stats explains perfectly why Devan Dubnyk was a marginal NHL goalie with the Oilers and a Vezina nominee for the Wild. So these stats can be very useful, but there are so many it’s easy to see why regular fans get overwhelmed. You shouldn’t need a PhD in statistics to be able to watch a game.

  • Gadgets

    As an analytics fan, blogs like this are part of the problem.

    The blogger states that scoring chances are an unreliable metric because they are too new and unproven. To him maybe. I know lots of coaches who have been tracking scoring chances and other stats for their entire coaching careers. Just because they don’t put it in a spreadsheet and share it with the world doesn’t mean it isn’t reliable and valid to that coach or team.

    He also is using a strawman argument first by insulting Tortorella and questioning why he tracks scoring chances and from the headline on, discusses why scoring chances is a flawed metric. Then he states, in bold, that “My only point is to say that if the goal is to understand which team is creating more dangerous chances, xFSh% is a better approach.” But he never made that point at all because that was never a claim made by Tortorella!

    Finally he claims that scoring chance is a bad metric but here is a metric that’s much better and shows the chances of a goal being scored! The biggest flaw in analytics is not the numbers themselves, it is the inability of people to properly convey concepts and marry them with real hockey knowledge.

    • RJ

      I like your points and I would add one thing further.

      Part of the issue IMO is that there are now salaries involved in the equation.

      Not that long ago Wanye wrote an article about the owner of General Fanager getting a job with an NHL team. So he will be making a pretty healthy income (I’d guess). He turned down a pretty healthy offer to sell his site, as well so advanced hockey stats are big business.

      So of course these people have to sell that their latest stat they just created is the best tool ever to decipher and understand the game of hockey, so they’ll get more hits and more hits means more money. And if they’re lucky, they can get a call-up to work for a pro team and watch and analyze hockey for a living.

  • IRONman

    Stats are good at assessment of an individual

    Problem with a team is there are too many variables

    You put 3 30 goal scorers together does not guarantee 90 goals

    Chemistry like 99 and 17 hard to get

    • @Hallsy4

      I agree, I can’t even read this sh!t, or any of Willis’ articles anymore. That’s on me, not meant as a shot at JW or this author. I just don’t put as much value in advanced stats as some. In baseball they work better because much of the game is 1 v 1, more controlled variables, which tend to give fairly accurate results over 162 games…. but even in Moneyball, the ending determined that there’s more to sports than just stats, which is what makes it sport. I’m glad that Arizona is in last this year after hiring their advanced stat guru as GM, after such a promising year. I’m also glad that things aren’t working out in Nashville, and Weber is lighting it up in MTL. The final straw against advanced stats was the Russell injury. I thought we were saved, that the opposition could no longer enter our zone on his side…. However, it seems to me like we were winning a lot more with him in the lineup than we have been since he’s been hurt.

  • fasteddy

    I’m too lazy/don’t care enough to dig into all the metrics and which are good or bad. I do have a couple of thoughts though; a good scoring chance can also be a situation that doesn’t actually generate a shot or shot attempt (maybe this is accounted for, I have no idea). The other thing that crosses my mind is shooting percentage; you would think a guy like Lucic would have a high shooting percentage based simply on his style….Smytty same thing. Their shots are typically banging at a rebound, not a wrister from the slot. Not even sure what I’m trying to say, other than I guess I’m not a big fan of the fancy stats!

  • OmJo

    The only thing that Torts’ comments prove is why Alain Vigneault is a much better coach than he is.

    Somebody tell the Rangers that blocked shots aren’t meaningful.

      • OmJo

        So you’re telling me, if you had to choose between Vigneault and Tortorella, you would choose Torts?

        Vigneault has a .529 winning percentage in over 1000 coached games.
        Torts? .475 winning percentage.

        My math says AV > Torts.

        You can hate AV for his time in Vancouver, and I do because he was a pain in the rear for half a decade there, but you can’t possibly argue he isn’t a good coach.

  • camdog

    And Randy Carlyle is a good coach that Ducks team is going to win a lot of hockey games if they play like they did against the Oilers the other night. I know the blogging community have been trying to tell everybody that he’s a bad coach, but I didn’t see a team being coached poorly the other night. Don’t be surprised if they make a cup final appearance, they are that good.

    • Sheldon "Oilers Fan for Life!!!"

      I did not watch the game but from the 5:00 min highlight package it sure seemed as if the Oilers were far more snake bit than out coached. Two defensive breakdowns resulted in two goals. This team is improving but they win a few games and their level of attention to detail starts to break down. Suddenly they lose heart and attention to detail as well. Then we have a losing streak. They will change so long as they are listening to coach.

      • camdog

        Both coaches are good coaches. The author of this blog most likely views them both as bad coaches because they have both been dismissive of advanced stats. Randy Carlyle got ripped for his work in Toronto, but the thing is the team he had just wasn’t very good. If you have a bad team, often the results aren’t going to be good. If you have All-Star hockey players at every position, odds are you are going to win a few Stanley Cups, oh, and lead the league in advanced stats.

        As to the game the other night the Oilers didn’t play a bad game, made a few really bad mistakes that a veteran team was able to capitalize on. The Ducks are a really good team right now.

  • Slappyshot

    Great article! Expected goals has its flaws but it’s the best we got. There’s also a new expected goals mode on moneypuck.com that’s worth checking out. Factors in some additional things like the change in angle on rebounds.

  • All of them are based on counting shots that meet certain characteristics such as location and shot type to determine how dangerous a given shot is. But determining those characteristics is left to each organization to define.

    I’m not sure I accept this particular issue as a “problem” with tracking scoring chances. Why would teams care if other teams track them the same way? As long as a team’s methodology is consistent I don’t see the issue.

    In fact, if I spent time and money determining what should best be deemed a scoring chance and I believe I’ve gotten it right I don’t even want other teams using my methodology.

    This is not an inherent problem with scoring chance tracking, and would only ever be a negative if teams failed to put manpower to use tracking the rest of the league.

    I also should note that it’s interesting how the writer went out of his way to tell us how he established shooting percentage is impacted by the preceding passing sequence and then links two expected shooting percentage methodologies that don’t take passing sequence into account. Is this not an even more obvious problem with “the solution” than “shot location data isn’t great”?

  • Bills Bills

    Does a dump in on the goalie count as a shot? Of course but not as a scoring chance. So when a goalie misses it and it goes in the net is it not now a scoring chance? Stats in hockey are flawed dramatically. The only stats that truly do matter are wins vs. losses. A team can out chance, out shoot, out corsi the opposition on any particular night but if they are playing against Carey Price the result will likely be very different than if they are playing against Eddie Lack.

    So much happens so fast with a variety of different combinations of people on the ice at any given time. 6 different combinations of forward and 6 different combinations of defense depending on who’s stuck on during line changes, icings etc…. How long they have been on the ice and who of the 6 and 6 different permutations that the other team could have on the ice are all going to alter the results of what is considered a scoring chance or quality of.

    I know stats lovers are going to beat me down hard about this post and I really don’t care. My point is based on the fact that hockey is too fast, too fluid and too unpredictable for stats to be able to correctly predict the outcome more than 50% of the time making them about as useful as a coin flip. But those are also the reasons that I love hockey.

  • Hrkac Circus

    Anyone else notice the Oil are tied with Van for allowing the most dangerous shot attempts against?

    If that’s where we are in 2016, I can’t imagine where we would’ve scored the previous 9 seasons.

  • @Hallsy4

    Is there any stat similar to CORSI or FENWICK that includes individual shooting percentage on shot attempts, as well as shooting percentage against while Player X is on the ice. This would be much more accurate, and surely there could be some equation to accurately reflect the stat.