Exploring @DTMAboutHeart’s WAR Model as a Rate Statistic

@DTMAboutHeart’s hockey WAR (wins above replacement) model was released last October, and one of the things that excited me about this model (as well as WOI’s WAR model) was my hope for a legitimate “counting stat” – a statistic that doesn’t need to be “adjusted” for time on ice (TOI). @DTMAboutHeart’s 5-part explanation for WAR can be found here for anyone unfamiliar with this statistic. Rate stats in more traditional hockey statistics provide value as they help to account for TOI variance (for more information please read this great article). Ostensibly, @DTMAboutHeart’s WAR model was developed in a similar manner to baseball WAR or basketball VORP, where playing time is, more or less, “built in” (each of these sports has completely different issues when looking at playing time, so it is hard to compare the three). One of the first ideas I had was to look at “raw” WAR vs. “TOI adjusted” WAR (WAR per 60). Should WAR be used as a rate stat? Or is it more useful as a counting stat? Maybe both are useful?

One of the issues with “WAR per 60” is that each component is tied to a different TOI amount – EV offense/defense with even-strength TOI, PP offense with power play TOI, and drawing penalties, taking penalties and faceoffs with total TOI. Turning each of these into a “per 60” version starts to eat away at the beauty of this statistic – WAR is made of 6 individual components that are summed to arrive at an Overall WAR figure. When the WAR components are “per 60’d” (using the traditional method) weird things start to happen – most notably, the sum of the 6 components does not equal Overall WAR per 60. Below, you can see the “top 30” NHL forwards from the 2015-2016 season ranked by Overall WAR per 60:

war-top-30-f-per-60

Obviously, something weird is happening when we look at the “WAR/60” data provided by @DTM – Power Play Offense causes some serious problems. Again, all the components are not equal in a “per 60” measure. So how can we adjust for TOI to get a proper Overall WAR rate stat version while accounting for the weird things that happen because of the different TOI figures used? If our goal is to attempt to keep the “6 components = Overall” aspect (which I think it should be), then we need to adjust each component separately depending on the TOI figure that it is tied to. What might that look like? Well, the below chart attempts to show my initial approach with this adjustment:

war-top-30-f-avg-toi-normalised

Above, you can see that I turned the EV components into “per 800 minutes” numbers, the PP component into “per 100 minutes” numbers, and the taking penalties, drawing penalties and faceoff components into “per 1000 minutes” numbers. These are roughly the average TOI for each category among all skaters included in the WAR data (average EV TOI is 850 minutes, average PP TOI is 105 minutes, and average total TOI is 1050 minutes) – I rounded them off to nice big whole numbers. You could argue that it should be exact averages, but I’m not really sure it matters that much – “per 60” isn’t really an exact measure anyway, so I’m not too concerned. The TOI figure we use in rate stats (per 60 minutes) probably warrants further exploration already, but I’ll save that for another time.
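To make the two adjustments concrete, here is a minimal Python sketch. It assumes the “traditional” per-60 conversion divides each component by the TOI it is tied to, and it uses the rounded baselines from above (800 EV minutes, 100 PP minutes, 1000 total minutes). The player, the component values, and the function names are all invented for illustration; none of this comes from @DTMAboutHeart’s data or code.

```python
# Rounded average minutes per TOI category, as described in the text.
BASELINE_MINUTES = {"ev": 800.0, "pp": 100.0, "total": 1000.0}

# The TOI category each of the six WAR components is tied to.
COMPONENT_BASIS = {
    "ev_offense": "ev",
    "ev_defense": "ev",
    "pp_offense": "pp",
    "drawing_penalties": "total",
    "taking_penalties": "total",
    "faceoffs": "total",
}


def per_60(war, toi):
    """Naive per-60: each component divided by its own TOI basis."""
    return {
        name: value * 60.0 / toi[COMPONENT_BASIS[name]]
        for name, value in war.items()
    }


def normalise(war, toi):
    """Scale each component to the baseline minutes of its TOI category."""
    return {
        name: value * BASELINE_MINUTES[COMPONENT_BASIS[name]] / toi[COMPONENT_BASIS[name]]
        for name, value in war.items()
    }


# Hypothetical season: every number here is made up.
toi = {"ev": 1100.0, "pp": 200.0, "total": 1400.0}
war = {
    "ev_offense": 1.2,
    "ev_defense": 0.4,
    "pp_offense": 0.6,
    "drawing_penalties": 0.2,
    "taking_penalties": -0.1,
    "faceoffs": 0.1,
}

overall_war = sum(war.values())
overall_per_60 = overall_war * 60.0 / toi["total"]

naive = per_60(war, toi)
normalised = normalise(war, toi)

print(f"Overall WAR per 60:          {overall_per_60:.4f}")
print(f"Sum of per-60 components:    {sum(naive.values()):.4f}")
print(f"Sum of normalised components: {sum(normalised.values()):.4f}")
```

In this made-up case the per-60 components sum to well over double the Overall WAR per 60, almost entirely because the PP component is divided by a much smaller TOI. The normalised version defines Overall as the sum of its adjusted components, so the “6 components = Overall” property holds by construction.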

Okay, back to the chart. Obviously, we got it right because Connor McDavid is the best player in the world (I kid, but really… ). One of the problems with “ranking” players using rate stats is how TOI dependent everything is – that sounds pretty obvious, right? This is especially true with WAR. The biggest issue is the lack of a “qualified player” definition in the NHL. Where do we draw the line when comparing players using rate stats? Well, I don’t know. I have some theories, but again, I’ll save that for another article. Back to the main topic. When we look at defensemen, we start to see even stranger results. Here is what the top 30 defensemen look like “ranked” by Overall WAR per 60:

war-top-30-d-per-60

And here is what we get when we use the same TOI adjustment used above:

war-top-30-d-avg-toi-normalised

The forward group “looks” fine, but the defensemen charts look really strange (when compared with the raw WAR leaders for this season). This could be a result of not using position-specific TOI averages, but honestly it seems there is something else at play. The problem could also be that we are working with very small numbers (and a lot of rounding in the Overall WAR per 60 data provided), so players ranked 31-100 could end up being within 0.05 of the “leaders” in Overall WAR per 60. At the heart of the issue is TOI itself. We’re comparing OEL (1700+ total mins) to Mark Barberio, Zidlicky, and Klefbom (400 – 600 total mins). Is it fair to expect Barberio to play at that rate for 1700 minutes? Maybe… but probably not. Since hockey doesn’t have a true qualified player definition, it seems that one would have to rely heavily on conditional statements to use WAR this way for analysis (similar to how conditional statements should probably be used with other rate statistics).
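As a rough illustration of what one of those conditional statements might look like, here is a small Python sketch that applies a minimum-TOI cutoff before ranking by WAR per 60. The cutoff of 1000 minutes, the player labels, and all of the numbers are assumptions picked for the example; I am not proposing this as the qualified player definition.

```python
from dataclasses import dataclass


@dataclass
class SkaterSeason:
    name: str
    overall_war: float
    total_toi: float  # total minutes played

    @property
    def war_per_60(self) -> float:
        return self.overall_war * 60.0 / self.total_toi


def rank_by_rate(seasons, min_toi=0.0):
    """Rank by WAR per 60, keeping only seasons with at least min_toi minutes."""
    qualified = [s for s in seasons if s.total_toi >= min_toi]
    return sorted(qualified, key=lambda s: s.war_per_60, reverse=True)


# Hypothetical defensemen; all values are invented.
seasons = [
    SkaterSeason("Workhorse D", overall_war=2.5, total_toi=1700.0),
    SkaterSeason("Sheltered D", overall_war=0.9, total_toi=450.0),
    SkaterSeason("Depth D", overall_war=0.3, total_toi=600.0),
]

unfiltered = [s.name for s in rank_by_rate(seasons)]
filtered = [s.name for s in rank_by_rate(seasons, min_toi=1000.0)]
print(unfiltered)
print(filtered)
```

Without the cutoff, the low-minute player tops the list; with it, only the high-minute player qualifies. The ranking is only as meaningful as the threshold you pick, which is exactly the problem.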

With all that said, let’s try this out at a player level. One of the benefits of a counting stat is the ability to find “value in consistency” – a player who consistently provides above replacement level play without being out of the lineup. One of the questions I’ve always had with rate stats is: why are we ignoring missed time? How do we account for players who miss more time due to injuries? Rate stats adjust production for time played (which is good for 3rd liners who produce at a very high level, etc.), but this approach potentially inflates players who miss time and deflates players who never miss time. Wouldn’t it stand to reason that a player who isn’t in the lineup provides no value at all? Is there not value in a player who consistently plays 82 games a season above replacement level? The following is not an attempt to come to a definite conclusion, rather, it’s a starting point for looking at this question using WAR. I’ll look at two separate comparisons here for the sake of brevity, but there are many more comparisons that could be made. Hopefully I’ll explore this more in the future.

Let’s start with a very intriguing player: Keith Yandle. He is among the highest performing defensemen in WAR since the data was collected (start of the 2008 season), but he’s often missing from the best-defenseman-in-the-league conversation. Yandle’s WAR was one of the reasons I wanted to explore this idea. So, let’s compare Yandle to a player who is sometimes considered one of the best defensemen in the game, but who also has a reputation for having injury issues (aka he misses time): Kris Letang. Below you can see their respective seasons since 2008 represented in Overall WAR:

yandle-letang-chart

yandle-letang-numbers

One thing that might stick out is how much better Yandle looks from an Overall WAR perspective. If you look at the total TOI each has played per season (or even-strength TOI), however, you might see why that is. Yandle has averaged almost 200 total minutes and almost 250 even-strength minutes more than Letang.

Now let’s see what these players look like from a rate stat perspective. Below I used the same method from earlier to turn WAR into a rate stat:

yandle-letang-avg-toi-chart

yandle-letang-avg-toi-numbers

From this perspective, Yandle and Letang look pretty comparable. Letang takes more penalties and Yandle is worse in EV defense, but overall they’re very similar players. The adjustment seems to have “corrected” for Letang’s injury history – he appears to have improved using this approach. It also seems to have “corrected” for the consistent amount of minutes Yandle has played each season. Is that correct?

Let’s look at another example to further explore this question. Here’s a high-end forward who has a reputation for being injury-prone vs. a high-end forward who does not have this reputation:

stamkos-kopitar-chart

stamkos-kopitar-numbers

Stamkos’ TOI figures are not actually as bad as you would think based on the reputation. However, Kopitar has rarely been injured throughout his career, and you can see that in the TOI figures. Stamkos and Kopitar are both top-line centers that contribute similar value in different ways. From a WAR perspective, this is mostly seen in Kopitar’s strong EV defense component and Stammer’s very solid PP numbers. But Stamkos has missed a decent amount of time due to injury, whereas Kopitar has missed a total of 12 games since the 2008-2009 season. Let’s look at the TOI “correction” that we used in the Letang/Yandle example and see what that looks like:

stamkos-kopitar-avg-toi-chart

stamkos-kopitar-avg-toi-numbers

As you can see, Stamkos’ numbers even out – his PP numbers “decrease” significantly, but his EV numbers “increase” quite a bit. Kopitar still looks like an elite player. Kopitar averaged almost 100 more total minutes and almost 180 more even-strength minutes than Stamkos, while Stamkos averaged 30 more minutes on the PP.

Stamkos clearly gets a bump when we convert WAR into a rate stat, but doesn’t it stand to reason that Kopitar’s value should still be “higher” than Stamkos’ assuming we’re looking at Overall value? If two players who never miss time contribute similar value when in the lineup, would we not expect one of those players’ value to decrease if that respective player misses time (for whatever reason)? This same question applies to Yandle and Letang. When we convert WAR into a rate stat, Yandle and Letang appear to be comparable. But wouldn’t it stand to reason that in actuality, Yandle is more valuable because he is never injured?
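A toy Python example shows the question in its simplest form. It assumes two players who produce identical value per minute and play identical minutes per game, and who differ only in games played. The per-minute value, the minutes per game, and the game counts are invented; they are not Stamkos’, Kopitar’s, Yandle’s or Letang’s actual numbers.

```python
def season_totals(games_played, minutes_per_game, war_per_minute):
    """Return TOI, raw WAR and WAR per 60 for a season at a constant per-minute value."""
    toi = games_played * minutes_per_game
    raw_war = toi * war_per_minute
    return {
        "toi": toi,
        "raw_war": raw_war,
        "war_per_60": raw_war * 60.0 / toi,
    }


# Hypothetical inputs: same per-minute value, same usage, different availability.
MINUTES_PER_GAME = 20.0
WAR_PER_MINUTE = 0.002

healthy = season_totals(82, MINUTES_PER_GAME, WAR_PER_MINUTE)
injured = season_totals(60, MINUTES_PER_GAME, WAR_PER_MINUTE)

for label, season in (("82 games", healthy), ("60 games", injured)):
    print(
        f"{label}: TOI={season['toi']:.0f}  "
        f"raw WAR={season['raw_war']:.2f}  "
        f"WAR/60={season['war_per_60']:.3f}"
    )
```

The rate stat treats the two seasons as identical, while the raw total credits the player who was actually in the lineup for the extra 22 games. Which of those answers is “right” depends on whether you are asking about value per minute or value delivered.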

Is a rate stat version of WAR that much better than just using the “raw” totals? Is it maybe even misleading to look at WAR in hockey from a TOI adjusted perspective? I think a lot more research needs to be done in this area, but seeing as the data is very new, my initial impression is that “raw” WAR appears to do a very good job “correcting” for TOI variance. WAR per 60 (or the component-specific version I used above), in my opinion, is problematic at this point in time and needs to be handled with care. From a valuation standpoint, “raw” WAR seems to account for TOI in a way that has not been previously accessible or available in other hockey “counting” statistics, and converting raw WAR into a rate stat potentially detracts from the value a complex counting stat could provide for analysis.
