The 2015 Season Preview in Projected League Leaders
Editor’s Note: With Opening Day nearly here, it’s time to preview the season. Welcome to Season Preview Week!
The Steamer projection system gives Felix Hernandez the best ERA projection in the American League and yet it doesn’t expect him to win the ERA title. No player is expected to hit more than 38 home runs and yet we fully expect that baseball’s home run leader will have more than 40. Wait, what?
The solution to this riddle is that Steamer hands out mean projections (also known as expected values), and means can obscure the uncertainty in what might happen. In our case, this uncertainty comes largely in two forms. First, we simply don’t know how good each player is, both because we have a limited number of observations of their past abilities and because those abilities are subject to change. Second, even if we did know every player’s true ability, we still couldn’t say precisely what’s about to take place. In other words, if God came down and, in what some would consider a lesser miracle, revealed every player’s true Strat-O-Matic card or Diamond Mind event table, we’d still have to roll the dice, run the simulations or maybe even play the games.
Adam Dorhauer recently went in-depth on this concept here at The Hardball Times in regard to projecting the standings. Today, I thought it’d be fun to take a look at some projected league leaders.
NL ERA (Qualified Starters)
Shown below are the projected ERA distributions for the 43 National League starters with 162 or more projected innings pitched. Clayton Kershaw is in blue, everyone else is in black, and the red shaded region shows the distribution of ERA for the league leader.
Our projected leader, whoever he may be, has a mean ERA of 1.88 with a 90 percent chance to fall between 1.40 and 2.29. When Kershaw leads the league, he does so with a mean ERA of 1.75 and when someone else ends up on top he leads the league with a mean ERA of 1.94. If Kershaw walks away from the game tomorrow, the mean winning ERA hops up to 1.99. Note that these numbers are all based on the assumption that pitcher outcomes are independent, so if the run environment changes markedly, all bets are off.
As much as Kershaw stands out from his peers (well, almost peers), you’re still better off picking the field. According to these projections, Kershaw has only a one in three chance of leading the league in ERA. Next up is Matt Harvey, who despite having only the fifth best ERA projection in the league benefits from a lower projected innings total and a little more variance than the pitchers with lower means. The reliability numbers shown here (an imitation of Marcel’s reliability scores) represent how much we know about a player’s true talent. Harvey’s 60 percent reliability, considerably lower than his peers’, implies that we know less about his true ability than the others. The most surprising name on this list is surely Brandon McCarthy, who added two mph to his fastball and had the ninth best xFIP- in baseball last year. In case you’re wondering, I haven’t yet found any prop bets involving his ERA or Cy Young chances.
Potential NL ERA Leaders, 2015

Player | Chance to Lead League | Reliability | IP | Mean ERA | 10th% | 50th% | 90th% |
---|---|---|---|---|---|---|---|
Clayton Kershaw | 33.4% | 79% | 218.1 | 2.33 | 1.60 | 2.25 | 3.16 |
Matt Harvey | 7.9% | 60% | 176.7 | 3.02 | 2.04 | 2.91 | 4.14 |
Madison Bumgarner | 6.9% | 79% | 206.3 | 2.95 | 2.09 | 2.86 | 3.92 |
Max Scherzer | 6.7% | 79% | 206.7 | 2.95 | 2.09 | 2.87 | 3.93 |
Zack Greinke | 6.1% | 78% | 199.2 | 3.02 | 2.13 | 2.92 | 4.02 |
Stephen Strasburg | 5.0% | 78% | 189.3 | 3.10 | 2.18 | 3.00 | 4.14 |
Johnny Cueto | 2.5% | 77% | 203.5 | 3.30 | 2.37 | 3.21 | 4.35 |
Jordan Zimmermann | 2.5% | 79% | 189.6 | 3.31 | 2.36 | 3.22 | 4.39 |
Adam Wainwright | 2.3% | 80% | 195.3 | 3.33 | 2.39 | 3.23 | 4.38 |
Brandon McCarthy | 2.2% | 76% | 177.0 | 3.41 | 2.41 | 3.31 | 4.54 |
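For readers who want to see how a leader can beat every individual mean, here is a minimal sketch in Python. It is not Steamer's actual code. It assumes each pitcher's ERA is lognormal (as the relief pitcher section describes), fits that lognormal to the 10th and 90th percentiles in the table above, and draws every pitcher independently. It uses only the ten listed pitchers rather than all 43 qualifiers, so each pitcher's share will run higher than the table's. The names `lognormal_params` and `simulate_leaders` are made up for illustration.

```python
import math
import random

Z90 = 1.2815515655446004  # 90th percentile of the standard normal

# (name, 10th percentile ERA, 90th percentile ERA), copied from the table above
PITCHERS = [
    ("Clayton Kershaw", 1.60, 3.16),
    ("Matt Harvey", 2.04, 4.14),
    ("Madison Bumgarner", 2.09, 3.92),
    ("Max Scherzer", 2.09, 3.93),
    ("Zack Greinke", 2.13, 4.02),
    ("Stephen Strasburg", 2.18, 4.14),
    ("Johnny Cueto", 2.37, 4.35),
    ("Jordan Zimmermann", 2.36, 4.39),
    ("Adam Wainwright", 2.39, 4.38),
    ("Brandon McCarthy", 2.41, 4.54),
]


def lognormal_params(p10, p90):
    """Return (mu, sigma) of the lognormal matching the given 10th/90th percentiles."""
    mu = (math.log(p10) + math.log(p90)) / 2
    sigma = (math.log(p90) - math.log(p10)) / (2 * Z90)
    return mu, sigma


def simulate_leaders(pitchers, n_seasons=20000, seed=2015):
    """Simulate seasons; return each pitcher's share of ERA titles and the mean winning ERA."""
    rng = random.Random(seed)
    params = [(name, *lognormal_params(p10, p90)) for name, p10, p90 in pitchers]
    wins = {name: 0 for name, _, _ in pitchers}
    leader_total = 0.0
    for _ in range(n_seasons):
        # Independent draws, mirroring the independence assumption in the text
        season = [(rng.lognormvariate(mu, sigma), name) for name, mu, sigma in params]
        best_era, best_name = min(season)
        wins[best_name] += 1
        leader_total += best_era
    shares = {name: count / n_seasons for name, count in wins.items()}
    return shares, leader_total / n_seasons


shares, mean_leader_era = simulate_leaders(PITCHERS)
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:20s} {share:6.1%}")
print(f"Mean ERA of the leader: {mean_leader_era:.2f}")
```

Even in this stripped-down version, the mean winning ERA comes in below Kershaw's own 2.33 mean, because the leader is whoever happens to land in his own left tail that year.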
AL ERA (Qualified Starters)
The American League brings us more parity. Felix Hernandez, the frontrunner, is the dark green line below.
The league leader in ERA has a mean ERA of 2.14 with a 90 percent chance of falling between 1.67 and 2.56. King Felix is given a one in six chance of being that guy. Michael Pineda, with a reliability of only 48 percent, has the 10th highest chance of having the league’s best ERA despite only the 15th best mean ERA projection. On the other hand, he has a 36 percent chance of having an ERA over four and an 11 percent chance of having an ERA over five.
Potential AL ERA Leaders, 2015

Player | Chance to Lead League | Reliability | IP | Mean ERA | 10th% | 50th% | 90th% |
---|---|---|---|---|---|---|---|
Felix Hernandez | 15.9% | 80% | 212.2 | 2.93 | 2.09 | 2.85 | 3.89 |
Chris Sale | 12.7% | 77% | 210.6 | 3.03 | 2.15 | 2.94 | 4.01 |
Corey Kluber | 8.8% | 79% | 206.6 | 3.17 | 2.27 | 3.08 | 4.19 |
Masahiro Tanaka | 8.1% | 70% | 184.8 | 3.28 | 2.29 | 3.17 | 4.39 |
David Price | 6.8% | 80% | 217.2 | 3.23 | 2.34 | 3.15 | 4.23 |
Hisashi Iwakuma | 5.1% | 76% | 201.3 | 3.38 | 2.43 | 3.29 | 4.45 |
Carlos Carrasco | 4.2% | 65% | 164.5 | 3.61 | 2.52 | 3.49 | 4.85 |
Garrett Richards | 4.0% | 74% | 175.4 | 3.53 | 2.50 | 3.43 | 4.70 |
Anibal Sanchez | 3.1% | 74% | 165.7 | 3.67 | 2.59 | 3.56 | 4.87 |
Michael Pineda | 2.8% | 48% | 171.8 | 3.79 | 2.64 | 3.67 | 5.09 |
What about the relievers?
Projections for the 141 relievers who are projected to throw at least 45 innings are shown below. Here the rightward skews of our ERA projections, modeled as lognormal distributions, are more pronounced. Aroldis Chapman, the dark red line, has almost a one in four chance of leading all relievers in ERA. Conditional on each pitcher’s projected usage (a pretty sizable condition, no doubt), there’s a 59 percent chance that someone will have an ERA under 1.00. The mean lowest ERA among relievers is 0.94.
Potential Relief Pitcher ERA Leaders, 2015

Player | Chance to Lead League | Reliability | IP | Mean ERA | 10th% | 50th% | 90th% |
---|---|---|---|---|---|---|---|
Aroldis Chapman | 23.9% | 52% | 65 | 1.63 | 0.77 | 1.44 | 2.72 |
Craig Kimbrel | 7.2% | 52% | 65 | 2.17 | 1.09 | 1.96 | 3.51 |
Kenley Jansen | 6.3% | 55% | 55 | 2.31 | 1.12 | 2.07 | 3.80 |
Sean Doolittle | 5.4% | 53% | 55 | 2.39 | 1.17 | 2.13 | 3.90 |
Jake McGee | 3.5% | 54% | 65 | 2.48 | 1.30 | 2.26 | 3.95 |
Koji Uehara | 2.6% | 51% | 65 | 2.59 | 1.36 | 2.36 | 4.10 |
Andrew Miller | 2.5% | 48% | 65 | 2.60 | 1.37 | 2.37 | 4.12 |
Greg Holland | 2.5% | 53% | 65 | 2.59 | 1.37 | 2.37 | 4.09 |
Brad Boxberger | 1.8% | 56% | 65 | 2.70 | 1.44 | 2.47 | 4.24 |
Mark Melancon | 1.7% | 55% | 65 | 2.75 | 1.47 | 2.52 | 4.30 |
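The rightward skew is easy to check from the table itself. The sketch below assumes a pitcher's ERA is exactly lognormal and recovers the distribution from just the 10th and 90th percentiles; the helper names are hypothetical, and the fitting shortcut is mine rather than anything Steamer documents. For a lognormal, the median is exp(mu) and the mean is exp(mu + sigma²/2), so the mean always sits above the median.

```python
import math
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.90)  # about 1.2816


def lognormal_from_percentiles(p10, p90):
    """Fit (mu, sigma) of a lognormal to its 10th and 90th percentiles."""
    mu = (math.log(p10) + math.log(p90)) / 2
    sigma = (math.log(p90) - math.log(p10)) / (2 * Z90)
    return mu, sigma


def lognormal_summary(p10, p90, threshold):
    """Return (median, mean, P(ERA < threshold)) for the fitted lognormal."""
    mu, sigma = lognormal_from_percentiles(p10, p90)
    median = math.exp(mu)
    mean = math.exp(mu + sigma ** 2 / 2)
    prob_below = NormalDist().cdf((math.log(threshold) - mu) / sigma)
    return median, mean, prob_below


# Aroldis Chapman's 10th and 90th percentiles from the table above
median, mean, prob_sub_one = lognormal_summary(0.77, 2.72, threshold=1.00)
print(f"median {median:.2f}, mean {mean:.2f}, P(ERA < 1.00) = {prob_sub_one:.1%}")
```

Fed Chapman's 0.77 and 2.72, this lands within a hundredth or so of the table's 1.44 median and 1.63 mean, which suggests the two-percentile fit is a reasonable stand-in for the published distribution.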
Could Chapman have a perfect season?
Modeling seasonal ERA as a continuous lognormal distribution is problematic when exploring the far left tail of the possible. For instance, this continuous distribution assigns no probability to the possibility that some pitcher will allow no runs at all. To estimate the probability of such an extreme dominant season, we can utilize Keith Woolner’s work to construct a reasonable model of Chapman’s probability of allowing any number of runs in a given inning. Such a model is shown below.
If we assume that Chapman has an 87 percent chance of not allowing an earned run in any given inning, his chance of surviving 40 innings unscathed is one in 265. If he throws 65 innings, his chances fall to one in 8,700. From our cloudy perspective, however, his chances are higher than that due to the uncertainty in his true talent – Chapman may be even more dominant than we imagine. Taking his full distribution of projected true talents into account, his chance of a zero ERA after 65 innings jumps up to 1 in 1,500. So, yes, I’m saying there’s a chance.
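The arithmetic behind those odds can be sketched in a few lines. The first function treats every inning as an independent trial with a fixed 87 percent chance of being clean. The second averages over uncertainty in that chance, using a Beta distribution whose mean is 0.87. The Beta(87, 13) shape is purely an illustrative assumption, not the talent distribution used for the numbers above, so its output will not match the 1 in 1,500 figure. The point is only that the mixture comes out more favorable than the fixed-talent estimate.

```python
def scoreless_probability(p_clean_inning, innings):
    """Chance of `innings` consecutive clean innings, assuming independent innings."""
    return p_clean_inning ** innings


def scoreless_probability_uncertain(alpha, beta, innings):
    """E[p ** innings] for p ~ Beta(alpha, beta), using the closed-form Beta moment."""
    prob = 1.0
    for k in range(innings):
        prob *= (alpha + k) / (alpha + beta + k)
    return prob


point_40 = scoreless_probability(0.87, 40)
point_65 = scoreless_probability(0.87, 65)

# Hypothetical talent spread: Beta(87, 13) has mean 0.87; its width is an assumption
mixed_65 = scoreless_probability_uncertain(87, 13, 65)

print(f"40 innings, fixed talent:     1 in {1 / point_40:,.0f}")
print(f"65 innings, fixed talent:     1 in {1 / point_65:,.0f}")
print(f"65 innings, uncertain talent: 1 in {1 / mixed_65:,.0f}")
```

The fixed-talent results land close to the one in 265 and one in 8,700 quoted above; the small gaps presumably come from rounding the 87 percent. The uncertain-talent figure moves a lot with the assumed spread, which is why it should be read as a direction rather than an estimate.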
The wOBA Leader
The above chart shows the projected wOBA distribution for the 146 hitters who are projected to have the 503 plate appearances needed to qualify. The distribution of wOBAs for the major league leader is the red shaded region and Miguel Cabrera, Jose Abreu and Jorge Soler are the orange, black and red lines respectively. Since we know less about Abreu and Soler, their distributions are considerably broader than Cabrera’s. In roughly 1 out of every 160 seasons, Soler actually leads all of baseball in wOBA. Steamer sees last year’s lack of an entirely ridiculous season (with Andrew McCutchen leading the league with a modestly terrific .412 wOBA) as somewhat anomalous and yields a median expectation for the wOBA leader of .437. Here, I suspect that our conservative playing time projections (which account for the chance of injury but still give the top hitters enough playing time to qualify) have the effect of somewhat exaggerating the spread of possibilities and likely overestimating the median winning wOBA. A .437 wOBA has been eclipsed in three of the past five seasons, however.
Potential wOBA Leaders, 2015

Player | Chance to Lead League | Reliability | PA | Mean wOBA | 10th% | 50th% | 90th% |
---|---|---|---|---|---|---|---|
Giancarlo Stanton | 14.0% | 85% | 596 | 0.402 | 0.368 | 0.401 | 0.438 |
Miguel Cabrera | 13.2% | 87% | 614 | 0.403 | 0.370 | 0.402 | 0.437 |
Mike Trout | 12.1% | 87% | 661 | 0.402 | 0.370 | 0.401 | 0.435 |
Jose Abreu | 10.1% | 78% | 609 | 0.395 | 0.359 | 0.394 | 0.432 |
Jose Bautista | 5.0% | 85% | 559 | 0.386 | 0.352 | 0.385 | 0.421 |
Andrew McCutchen | 4.9% | 87% | 659 | 0.388 | 0.357 | 0.393 | 0.434 |
Paul Goldschmidt | 4.7% | 85% | 609 | 0.386 | 0.356 | 0.387 | 0.421 |
Joey Votto | 3.9% | 83% | 574 | 0.381 | 0.351 | 0.385 | 0.421 |
Edwin Encarnacion | 2.5% | 85% | 545 | 0.375 | 0.346 | 0.380 | 0.417 |
Yasiel Puig | 2.4% | 82% | 621 | 0.375 | 0.340 | 0.374 | 0.411 |
The Home Run King
To project every hitter’s chance of leading the league in home runs, we projected every hitter’s true home run talent as a beta distribution and every hitter’s observed number of home runs as beta-binomial.
According to this model, if Giancarlo Stanton has 596 plate appearances, the median number of home runs he hits is 36, but there is only a 50 percent chance that he will hit between 30 and 43, and the 90 percent interval stretches all the way from 23 to 53. The following charts show Stanton’s homer projection on its own as well as how it stacks up against the projections of all other qualified batters. In the second chart, the red shaded area is the home run projection for the major league leader and the orange, black and red lines are the projections for Stanton, Jose Abreu and Ben Revere respectively. Revere is projected to hit three home runs but his modal outcome has him matching last year’s total with two.
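A rough sketch of the beta-binomial idea, using only Python's standard library: draw a true home run rate from a beta distribution, then play out the plate appearances one at a time. The mean rate comes from Stanton's 37.5 projected home runs in 596 plate appearances. The `CONCENTRATION` value, which controls how tightly the beta hugs that mean, is a hand-picked assumption and not a Steamer parameter, so the percentiles printed here are illustrative only.

```python
import random


def simulate_home_runs(mean_rate, concentration, plate_appearances,
                       n_seasons=2000, seed=27):
    """Return sorted simulated home run totals under a beta-binomial model."""
    rng = random.Random(seed)
    alpha = mean_rate * concentration
    beta = (1 - mean_rate) * concentration
    totals = []
    for _ in range(n_seasons):
        # Step 1: uncertainty about true talent (beta)
        true_rate = rng.betavariate(alpha, beta)
        # Step 2: luck over the season (binomial, as a sum of Bernoulli trials)
        homers = sum(rng.random() < true_rate for _ in range(plate_appearances))
        totals.append(homers)
    return sorted(totals)


def percentile(sorted_values, q):
    """Simple empirical percentile lookup on an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


PA = 596             # Stanton's projected plate appearances
MEAN_HR = 37.5       # Stanton's mean projection
CONCENTRATION = 450  # hypothetical: larger means we are more sure of his true rate

totals = simulate_home_runs(MEAN_HR / PA, CONCENTRATION, PA)
mean_hr = sum(totals) / len(totals)
p10, p50, p90 = (percentile(totals, q) for q in (0.10, 0.50, 0.90))
print(f"mean {mean_hr:.1f}, 10th {p10}, median {p50}, 90th {p90}")
```

Raising `CONCENTRATION` narrows the spread toward what pure binomial luck would give; lowering it widens the tails, which is how a less reliable projection gets a better shot at an outlandish total.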
Potential Home Run Leaders, 2015

Player | Chance to Lead League | Reliability | PA | Mean HR | 10th% | 50th% | 90th% |
---|---|---|---|---|---|---|---|
Jose Abreu | 21.4% | 78% | 609 | 36.5 | 24 | 36 | 49 |
Giancarlo Stanton | 20.8% | 85% | 596 | 37.5 | 26 | 37 | 49 |
Anthony Rizzo | 6.2% | 86% | 636 | 31.4 | 21 | 31 | 41 |
Jose Bautista | 4.8% | 85% | 559 | 30.1 | 20 | 30 | 40 |
Mike Trout | 4.6% | 87% | 661 | 30.3 | 21 | 30 | 40 |
Chris Carter | 4.3% | 84% | 542 | 28.8 | 19 | 28 | 38 |
Chris Davis | 4.3% | 85% | 544 | 29.5 | 20 | 29 | 39 |
Edwin Encarnacion | 3.7% | 85% | 545 | 29.2 | 20 | 29 | 39 |
Brandon Moss | 2.9% | 84% | 523 | 28.1 | 19 | 28 | 37 |
Paul Goldschmidt | 2.7% | 85% | 608 | 28.5 | 19 | 28 | 38 |
Jose Bautista’s chances here are constrained by his playing time projection; a model that allowed for the possibility that he plays a full healthy season would show him winning considerably more often. Likewise, if Kris Bryant gets a quick call-up, he will shoot up this leaderboard. The median number of home runs for baseball’s leader is 46, with a 50 percent chance that the leader falls between 43 and 51 home runs and a 90 percent chance of falling between 39 and 59. As you probably know, no hitter has reached 59 home runs since Barry Bonds’ 73 home run season in 2001. In all 5,000 simulations, this 73 home run mark was topped a mere seven times; four times by Abreu and three times by Stanton. No major league leader has had as few as 39 home runs since 1982, although Nelson Cruz led all of baseball with 40 as recently as last year.
No doubt there are more possibilities than have been dreamed of by Steamer, but if you’re interested in some of the better understood unknowns, Steamer percentile projections for batters and pitchers can be found here and here.
The fact that Miguel Cabrera is not on your list is absurd.
Gary, I’m guessing you mean the HR list. Cabrera finished 12th in the 5000 simulated seasons but 9th-12th (Moss, Goldschmidt, Springer and Cabrera) were all close enough that if we ran another 5000 simulations the order could get shuffled.
Who’s the outlier for NL ERA projected for a mean well above 4?
That’s Kyle Kendrick with a projected 4.80 ERA. Next highest among NL projected qualifiers is 4.27.
Thanks
What does density mean, on the Y-Axis?
It’s a way of indicating the relative probability for a continuous distribution — if the density of some value is twice as high as the density for some other value, then values in that neighborhood are twice as likely. In each graph the total area under each curve is 1.
Great stuff!
Also, glad the NL ERA trailer was already answered… that jumped right out at me.
I’d love to see this for other stats (doubles, HBP, stolen bases, etc.) as well.
This is a really cool article, well done with the visuals!
Showing the range of outcomes predicted by the projections actually makes them (the projections) look much more intelligent and showcases the complexity involved in coming up with the mean values we usually see.
I actually wish Fangraphs showed at least some confidence levels in the projected stats, as the intervals vary so widely between players (and sample sizes).
Thanks guys. If you have ideas about how you’d like to see uncertainties/percentiles presented going forward let me know.