Introducing True Season Score

Last year, I introduced Season Score, a relatively simple metric to evaluate pitchers over a full season by aggregating their Game Scores and adjusting for replacement level. Part of the impetus behind the creation of Season Score was to bridge the two versions of WAR, which use different methodologies to reach the same conclusion: a pitcher’s total value.

I saw the two keepers of WAR as opposite ends of the political spectrum. On the far right was rWAR, which debited a pitcher for runs allowed (if it worked for Charlemagne and Henry Chadwick, it’ll work for us), while fWAR lived on the far left, eschewing runs for fielding-independent outcomes (the world is changing, and we know more now than we used to, so adapt or die). I wanted Season Score to be the moderate answer to the debate between the two sides.

As the season progressed, I realized that there were two problems with this thinking. First, Season Score was not moderate, but rather conservative. By counting hits allowed more heavily than walks issued and treating earned runs as twice as damaging as unearned runs, Game Score, and by extension Season Score, was something of a relic. It correlated much more closely with rWAR than with fWAR and drifted too far to the right side of the spectrum, reflecting team outcomes rather than a pitcher’s own success.

The second problem is that I’m aggressively liberal, both in my politics and in my sabermetrics. I don’t see much difference between earned and unearned runs, since pitchers and fielders share the fault for most runs scored, regardless of an official scorer’s verdict. I also don’t see any reason why we should punish a pitcher too much for results when a ball is put in play, since those results are more often influenced by defense and luck than by anything the pitcher put on the ball. Sure, there are exceptions to this rule, and it does make sense to dock a pitcher who’s giving up frozen ropes all over the field, but not as much as a pitcher who can’t get the ball over the plate.

With these problems in mind, I set out to amend Season Score. Initially, the basis for Season Score was the Game Score formula, which goes like this:

start with 50 points
add 1 point for each out recorded
add 2 more points for each inning completed after the fourth
add 1 point for each strikeout
subtract 2 points for each hit allowed
subtract 1 point for each walk issued
subtract 4 points for each earned run allowed
subtract 2 points for each unearned run allowed
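
For anyone who wants to play along at home, here is a minimal Python sketch of the Game Score list above. It assumes the standard weights of 4 points per earned run and 2 per unearned run (the "twice as damaging" split mentioned earlier), and the function and argument names are my own shorthand rather than anything official. The sample line is made up for illustration.

```python
def game_score(outs, strikeouts, hits, walks, earned_runs, unearned_runs):
    """Game Score for one start, following the list above.

    Assumes 4 points per earned run and 2 per unearned run.
    """
    innings_completed = outs // 3            # full innings only
    score = 50                               # every start begins at 50
    score += outs                            # 1 point per out recorded
    score += 2 * max(0, innings_completed - 4)  # bonus after the fourth
    score += strikeouts                      # 1 point per strikeout
    score -= 2 * hits                        # 2 points per hit
    score -= walks                           # 1 point per walk
    score -= 4 * earned_runs                 # 4 points per earned run
    score -= 2 * unearned_runs               # 2 points per unearned run
    return score


# Hypothetical start: 7 IP, 8 K, 5 H, 2 BB, 2 ER, 0 unearned runs
print(game_score(outs=21, strikeouts=8, hits=5, walks=2,
                 earned_runs=2, unearned_runs=0))  # 65
```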

To adjust for factors beyond a pitcher’s control, I created True Game Score, which looks like this:

start with 50 points
add 1 point for each out recorded
add 2 more points for each inning completed after the fourth
add 1 point for each strikeout
subtract 1 point for each hit allowed
subtract 2 points for each walk issued
subtract 2 points for each run allowed, earned or unearned
subtract 2 points for each home run allowed
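
The True Game Score list above translates into code just as directly. This is a rough sketch rather than a reference implementation; the function and argument names are mine, and the sample line is hypothetical.

```python
def true_game_score(outs, strikeouts, hits, walks, runs, home_runs):
    """True Game Score for one start, following the list above."""
    innings_completed = outs // 3            # full innings only
    score = 50                               # every start begins at 50
    score += outs                            # 1 point per out recorded
    score += 2 * max(0, innings_completed - 4)  # bonus after the fourth
    score += strikeouts                      # 1 point per strikeout
    score -= hits                            # hits now cost only 1 point
    score -= 2 * walks                       # walks now cost 2 points
    score -= 2 * runs                        # earned or unearned, 2 each
    score -= 2 * home_runs                   # 2 more for each home run
    return score


# Hypothetical start: 7 IP, 8 K, 5 H (one a solo homer), 2 BB, 2 R
print(true_game_score(outs=21, strikeouts=8, hits=5, walks=2,
                      runs=2, home_runs=1))  # 70
```

Note that a home run shows up in three arguments at once: it is a hit, a home run, and at least one run.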

This correlates very loosely with the core of the FIP formula, (13*HR + 3*BB – 2*K)/IP. Home runs are counted at least 5 times (1 for the hit, 2 more for the homer, at least 2 for the run) and as many as 11 for a grand slam (2 for each of the four runs), walks are counted at least twice (and could lead to a run), and strikeouts are counted at least twice (1 for the out, 1 for the K, and potentially part of a late-inning bonus).
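
The 5-to-11 range for a home run is easy to check by hand, but here is the arithmetic as a small sketch. The constant and function names are my own labels for the True Game Score penalties listed above.

```python
HIT_PENALTY = 1       # every hit costs 1 point
HOME_RUN_PENALTY = 2  # a home run costs 2 more
RUN_PENALTY = 2       # every run that scores costs 2


def home_run_cost(runners_on):
    """Total points a single home run costs under True Game Score."""
    runs_scored = runners_on + 1  # the batter scores too
    return HIT_PENALTY + HOME_RUN_PENALTY + RUN_PENALTY * runs_scored


print(home_run_cost(0))  # solo shot: 5
print(home_run_cost(3))  # grand slam: 11
```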

It’s not completely fielding- or context-independent, as hits cost a pitcher a point and runs cost 2 more. I think this is a good thing, since some runs are more excusable. As much as I don’t believe that “good pitchers pitch to the score”, it’s true that grooving a pitch to Miguel Cabrera with a 4-run lead and the bases empty in the 8th is not the same as throwing the same pitch with the bases loaded.

Let’s take a look at some 2011 results through the TSS prism:

Original Season Score leaders at year-end:
1. Verlander, 711
2. Kershaw, 653
3. Lee, 611
4. Shields, 587
5. Weaver, 581
6. Halladay, 579
7. Hamels, 530
8. Kennedy, 490
9. Lincecum, 482
10. Haren/Cain, 459

WAR leaders, per Baseball-Reference:
1. Verlander, 8.6
2. Halladay, 7.4
3. Kershaw, 7.0
4. Lee, 6.9
4. Sabathia, 6.9
6. Weaver, 6.6
7. Beckett, 6.2
8. Shields, 6.1
9. Romero, 5.9
10. Fister, 5.7

The two agree about Justin Verlander, who pitched an MLB-high 251 innings with an AL-low 2.40 ERA. Season Score gives James Shields a lot of late-inning bonuses for his 11 complete games, while rWAR knocks him down some for pitching at Tropicana Field. Season Score docks Doc Halladay for giving up 208 hits, despite his outstanding 2.20 FIP. CC Sabathia, who was 12th in Season Score, shows up here, due primarily to park adjustments. Looks like rWAR is not as conservative as I thought it was.

WAR leaders, per FanGraphs:
1. Halladay, 8.2
2. Sabathia, 7.1
3. Verlander, 7.0
4. Kershaw, 6.8
5. Lee, 6.7
6. Haren, 6.4
7. Wilson, 5.9
8. Weaver, 5.6
8. Fister, 5.6
10. Hernandez/Bumgarner, 5.5

Lots of change here. Halladay gets a lot of credit for his mind-blowing strikeout rate (8.47/9 IP), walk rate (1.35/9), and home run rate (0.39/9). Sabathia, who struck out even more than Halladay and pitched more innings, outpaces AL Cy Young winner Verlander. This seems odd, since Verlander struck out more batters and walked fewer. But Verlander also gave up a few more homers, and the things that gave him a far better ERA (2.40 to 3.00), namely much more luck with BABiP and a higher strand rate (80% to 77%), are exactly what fWAR leaves out. I believe that it was much closer than Cy Young voting indicated, but that Verlander was the better pitcher. Let’s see what True Season Score thinks:

True Season Score leaders:
1. Verlander, 506
2. Halladay, 470
2. Lee, 470
4. Kershaw, 465
5. Shields, 425
6. Weaver, 397
7. Hamels, 379
8. Haren, 357
9. Sabathia, 352
10. Kennedy, 331

True Season Score agrees with both WARs that Halladay should have won the NL Cy Young, which I think is true, although Kershaw wasn’t as far behind as fWAR might lead you to believe.

Only fWAR has Sabathia ahead of Verlander. TSS has Sabathia in the top ten, as it isn’t as angry about the BABiP as SS was. I like him down here, as every pitcher ahead of him except Haren had a better ERA and all but Hamels threw a similar number or more innings. Both versions of WAR give Sabathia a boost for pitching in an extreme hitter’s park, and rightfully so, but this seems like an appropriate mix of on-field results and true talent.

This is not an effort to disparage either keeper of WAR, both of whom do fantastic work and are revolutionizing the world of baseball statistics. Certainly, a stat that does not adjust for park effects is not meant to compete with more sophisticated metrics. Rather, this is an attempt to evaluate a pitcher’s cumulative success at any point in a season in a simple, understandable way that reflects a combination of team outcomes for which he was partially responsible and true outcomes for which he bears the entire load. Throughout the season, I’ll post TSS leaderboards and make observations about the league’s best pitchers and rotations using this new metric.

2 Responses to Introducing True Season Score

William Miller says:

April 25, 2012 at 3:07 am

Ever notice how, regardless of the metrics used, the best players always turn out to be pretty much the same? To a certain extent, I think virtually all stats seek to codify what we already pretty much know to be true. For example, if a particular stat-system showed Mickey Mantle to be wildly overrated, we’d probably have to conclude that the stat-system itself was flawed rather than to simply, blindly, accept that result.
Still, I enjoy the constant process of refining these stats to help us better understand the inner-workings of what appears to be a pretty simple game.
Bill
