Are relievers across baseball consistently improving?
A month ago, I wrote an article titled “Should we even try to predict future runs allowed for relievers?” In that piece, I tested various ERA estimators to see which performed best at predicting future runs allowed for relief pitchers. Each estimator performed so weakly that I wrote this paragraph in my conclusion:
The variability in reliever runs is massive. Craig Kimbrel, a workhorse reliever, threw almost 140 innings over the last two seasons. That number of innings is less than a full season for a healthy starter, and a full season of innings for a starter is still a small sample in relative terms. I think the results from this study just reiterated the fact that year-to-year numbers for relievers are incredibly hard to predict, because of the sample size.
In an attempt to improve this predictive model, I worked on regressing the dependent variable in the test (runs allowed per nine innings, or RA9). While running these tests, I came across some interesting findings about relievers’ RA9.
First, I’ll point to the average RA9 for the entire population of relievers from 2007-2012:
| Year | RA9 | 
|---|---|
| 2007 | 4.55 | 
| 2008 | 4.46 | 
| 2009 | 4.43 | 
| 2010 | 4.26 | 
| 2011 | 4.02 | 
| 2012 | 3.99 | 
Over the past six seasons, RA9 for relievers has clearly been trending down, and this season it actually fell under 4.00.
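As a quick check on that trend, here is a minimal Python sketch that computes the season-to-season change in reliever RA9 from the table above. The function name `season_changes` is just an illustrative choice; the numbers are the six values from the table.

```python
# Reliever RA9 by season, copied from the table above (FanGraphs data).
reliever_ra9 = {2007: 4.55, 2008: 4.46, 2009: 4.43, 2010: 4.26, 2011: 4.02, 2012: 3.99}

def season_changes(ra9_by_year):
    """Return {season: change in RA9 from the previous season}, rounded to two decimals."""
    years = sorted(ra9_by_year)
    return {curr: round(ra9_by_year[curr] - ra9_by_year[prev], 2)
            for prev, curr in zip(years, years[1:])}

changes = season_changes(reliever_ra9)
total_change = round(reliever_ra9[2012] - reliever_ra9[2007], 2)

print(changes)       # one entry per season from 2008 through 2012
print(total_change)  # overall change from 2007 to 2012
```

Every one of the five season-to-season changes is negative, and the total change from 2007 to 2012 is -0.56 runs per nine innings.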
This trend caused me to be a little overly excited at first, as I thought, “maybe I could write an article entitled: ‘2012: The greatest season for relievers in baseball history’.” After very brief research, I realized that 2012 probably wasn’t baseball’s greatest season for relievers.
Disclaimer: For the rest of this post, I’ll be using only data dating back to 1974. In part that’s because 1974 was the second full season in which the American League used the designated hitter, and relievers increasingly had designated roles. Mainly, though, it’s because that is where FanGraphs begins splitting its data: starter data includes only the innings thrown by the pitcher who started the game, while reliever data includes only innings thrown by pitchers who did not start.
Since 1974, the average RA9 for relievers has been below four runs nine times; thus, 2012’s low number is not unique. Even so, I wanted to delve into this data further.
My goal was to compare reliever runs allowed to the run environment for each season. A 3.99 RA9 average could be an incredible season for all relievers if the 2012 run environment indicates that there should have been a higher average RA9 for relievers.
I wanted to find a way to test this idea, but it’s tough to look at the overall RA9 for an entire baseball season and then compare that to the reliever average, given that relievers are included in the overall number.
So, I decided to compare the average starter RA9 and average reliever RA9 from 1974-2012. My hypothesis was that the larger the gap between reliever and starter average RA9, the more successful relievers were in that season.
In each season from 1974-2012, the reliever average RA9 was at least 0.06 lower than the starter average. Thus, when I graphed the gap between starter and reliever RA9 for each season, each value should be interpreted as how far, in runs per nine innings, the reliever average was below the starter average:
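The gap itself is simple subtraction. A minimal sketch, assuming the season averages sit in two dictionaries keyed by year (the function name `ra9_gap` and the dictionary layout are illustrative, not how the FanGraphs data is actually delivered):

```python
def ra9_gap(starter_ra9, reliever_ra9):
    """Return {season: starter RA9 minus reliever RA9}, rounded to two decimals.

    Only seasons present in both inputs are included. A positive value
    means relievers allowed fewer runs per nine innings than starters.
    """
    return {season: round(starter_ra9[season] - reliever_ra9[season], 2)
            for season in sorted(starter_ra9) if season in reliever_ra9}

# The 2012 reliever figure (3.99) is from the table above. The starter figure
# (about 4.55) is NOT quoted in the article; it is implied by the 0.56 gap
# the article reports for 2012, so treat it as approximate.
example = ra9_gap({2012: 4.55}, {2012: 3.99})
print(example)
```

With the full 1974-2012 averages loaded, the same call would return one gap per season, ready to chart.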
 
The labels on the chart indicate the two seasons in which the gap between starter and reliever RA9 was the largest.
The largest gap came in 1982 (0.63), and the second largest occurred last season (0.56), which could be enough to back my inclination that 2012 was a uniquely great year for relievers.
The gap between the two types of pitchers’ average RA9 fluctuates a great deal from year to year. Because of this seemingly random variation, it is not obvious from the graph, but the gap between the two RA9 averages seems to be increasing over time.
I plotted the data as a scatter and ran a linear regression through it to test if there was in fact a statistically significant increasing trend for this gap over time:
 
There was a significant positive correlation between the year and the gap between starter and reliever RA9.
When I fit an exponential curve to the data, the overall r-squared increased to 9.91 percent (the r-squared is the percentage of the variation in the gap explained by the year).
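For readers who want to reproduce this kind of fit, here is a minimal pure-Python sketch of a least-squares line and its r-squared, plus an exponential fit done by regressing the natural log of the gap on the year. The log-linear approach is an assumption about how the exponential curve was fit (it is what most spreadsheet trendlines do), and the function names are illustrative. It works here because the gap is positive in every season.

```python
from math import log

def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # share of variation in y explained by x
    return slope, intercept, r_squared

def fit_exponential(x, y):
    """Fit y = A * exp(k * x) by regressing ln(y) on x (requires every y > 0).

    Returns (k, ln_A, r_squared); this r-squared is measured on the log scale.
    """
    return fit_line(x, [log(yi) for yi in y])
```

Calling `fit_line(years, gaps)` and `fit_exponential(years, gaps)` on the 39 season gaps would give the two r-squared values to compare. Judging whether the slope is statistically significant additionally needs its standard error, which this sketch does not compute.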
According to this simple regression it seems like there’s a possibility that relievers have been improving, in terms of run prevention, over time in comparison to starters (or what the run environment would indicate).
Are relievers actually getting better?
A significant positive relationship between the starter/reliever RA9 gap and time does not necessarily mean relievers have been consistently improving since 1974. It could simply mean that starters are getting worse in comparison to relievers, while relievers are staying the same.
Or there could be a third variable at work here.
The use of relievers has increased over this time period. In 1974, relievers accounted for ~27 percent of all the innings thrown in baseball. That number rose above 35 percent in 2007, and last season relievers accounted for 34 percent of all innings pitched.
Is the increase in use of relievers over the years the actual cause for this perceived increasing gap?
I ran another simple linear regression, this time with the percentage of innings pitched by relievers as the predictor of the gap between starter and reliever RA9:
 
I found a statistically significant positive correlation between the percentage of innings pitched by relievers and the gap between the average RA9 for relievers and starters.
Essentially, this says that when relievers throw a larger share of the innings, their average RA9 tends to sit even further below the starter RA9 than it usually does. This brings me back to the question I just raised: Is the increase in innings pitched the real cause of this increasing gap?
To answer this question, I ran a multiple regression with the percentage of innings pitched by relievers and the year as the predictors of the average RA9 gap. The percentage of innings pitched was the only statistically significant predictor in this regression; thus, it seems that the increase in innings, rather than time, is what explains the increasing gap.
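A two-predictor regression like this one can be sketched in pure Python using the closed-form solution on centered data. This is a minimal illustration under stated assumptions, not the exact software or procedure used for the article; the function name and return format are hypothetical. It returns each coefficient with its t-statistic, which is what a significance test is built on.

```python
from math import sqrt

def two_predictor_ols(x1, x2, y):
    """OLS of y on an intercept, x1, and x2.

    Returns a dict with the two slope coefficients and their t-statistics.
    Assumes x1 and x2 are not perfectly collinear and the fit is not exact.
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]  # centering absorbs the intercept
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 ** 2  # zero if the predictors are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    residuals = [c - b1 * a - b2 * b for a, b, c in zip(d1, d2, dy)]
    sigma2 = sum(r * r for r in residuals) / (n - 3)  # three fitted parameters
    se1 = sqrt(sigma2 * s22 / det)
    se2 = sqrt(sigma2 * s11 / det)
    return {"b1": b1, "b2": b2, "t1": b1 / se1, "t2": b2 / se2}
```

Here `x1` would be the year, `x2` the relievers' share of innings, and `y` the RA9 gap. Because year and innings share rise together over this period, their coefficients are estimated with less precision than in the two simple regressions, so a large t-statistic for one predictor and a small one for the other is the pattern described above.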
Why would a higher percentage of innings for relievers result in a larger gap between their average RA9 and that of starters?
That’s a tough question to answer.
My first idea has to do with the variability in batting average on balls in play (BABIP). BABIP has a large effect on runs, and in small samples BABIP is subject to a good deal of random variation.
Obviously, the entire population of innings thrown by relievers (14,737.2 innings in 2012) is not a very small sample. However, if relievers’ share of all innings fell by one percentage point, that would take more than 400 innings away from the sample we use to gauge relievers’ true talent level.
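The 400-inning figure can be checked with back-of-the-envelope arithmetic. This sketch assumes that "14,737.2" uses the usual box-score notation in which ".2" means two outs (two-thirds of an inning), and it uses the rounded 34 percent reliever share quoted earlier, so the results are approximate.

```python
def innings_to_float(ip_text):
    """Convert innings notation such as '14737.2' to innings.

    Assumes '.1' and '.2' mean one and two outs, i.e. thirds of an inning.
    """
    whole, _, outs = ip_text.partition(".")
    return int(whole) + int(outs or 0) / 3

reliever_ip_2012 = innings_to_float("14737.2")  # 14,737 and two-thirds innings
reliever_share = 0.34                            # rounded share from the article
total_ip = reliever_ip_2012 / reliever_share     # approximate: the share is rounded
one_point_of_innings = 0.01 * total_ip           # one percentage point of all innings

print(round(total_ip), round(one_point_of_innings))
```

Under those assumptions, one percentage point of all innings comes to a little over 430 innings, consistent with the "more than 400" figure.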
I think it’s possible that taking innings away from starters, while working in the opposite way for relievers, could cause this average overall gap to widen.
There’s also the thought that relievers are being used more on purpose. More roster spots are being dedicated to relievers, and they have more defined roles. Thus, relievers may not be performing better than before, but it is possible that they are being used more effectively than before.
These two ideas could be used in tandem, with the idea that when starters throw fewer innings, their RA9 fluctuates more and when relievers are managed or leveraged more effectively, their average RA9 could end up being much lower than the average for starters.
My final possible explanation for this trend is that inning percentage is not independent of starter vs. reliever performance.
If, for whatever reason (injury, a set of good starters retiring, a decline of starting talent), the population of starters isn’t performing as well as usual in a given season, relievers will in all likelihood end up with a higher percentage of innings. That logic works two-fold when considering RA9: If starters are performing worse, then we’d expect to see a larger gap between their RA9 and that of relievers.
I’m not positive how to explain this pattern, whether we’re going to see it continue, or even whether it matters, but it’ll be interesting to see if relievers continue to get more innings and continue to give up fewer and fewer runs compared to starters.
References & Resources
All statistics come courtesy of FanGraphs.
Glenn, there is a lot going on in this data. For one thing, you could argue that managers are doing a better job of matching relievers to situations. That might also happen to correlate with better reliever RA. Another thing that might be happening is that relievers are pitching more as a whole, but less per pitcher. That might also improve their performance.
I’d recommend you move away from correlation analyses and try to conduct matched studies or something of that sort. There are just too many things that could be happening here.
With position player slots at a premium due to larger pitching staffs, relievers are less likely to be put in situations where they are put at a platoon disadvantage. There simply are not as many pinch hitters available. Furthermore, the bench has a higher percentage of backup catchers and weak-hitting utility infielders, who are not much of a threat in general, and less of the once common professional pinch hitters and can’t-field-at-all or aging sluggers who could just ruin a reliever’s day. This is somewhat ironic as controlling the platoon advantage is significantly easier with a pinch hitter than with juggling the bullpen. Could be a partial factor here.
One thing you will want to look at is whether the relative improvement of relievers comes in part from more often having the platoon advantage than in years past. Also, there may be a movement of innings from the top of the rotation to the bottom, because of the switch from 4 man to 5 man rotation and the pitch count/innings limitation put on even the best starters. The almost extinction of the swing-man and long reliever roles may have put relievers on shorter and more regular schedules, making it easier for them to pitch better. Lastly, (for the time being), there may be a stronger tendency now to have more of the best pitchers in the bullpen rather than as starters.
Thanks for the article Glenn, this is really interesting.
If I was to speculate (which isn’t a substitute for actually sorting through the data, but it’s a place to start), I’d think that studes’ suggestion about fewer IP per reliever is the thing to look at. The finding that more IP for relievers as a whole correlates with better results is surprising, since it means that starters are pitching fewer innings and so, for example, don’t have to pace themselves as much when pitching. But if IP per reliever have dropped even more (or even more as a percentage, I don’t know) than have IP for starters, because of the larger pitching staffs, then that would make up for the better performance we should expect from starters with lighter workloads.
You have to separate the American from the National Leagues. In the American League, where pitchers don’t hit, managers are more likely to yank them before the end of an inning (with none, one or two outs). And with runners on base, I see so often, a reliever gives up a hit or two and allows a run or more, but it’s charged to the starter, while the reliever gets credit for an out or two. It happens in the National League, but not as often because managers often want to pinch hit for them and save a reliever. If the difference is consistently greater in the American League, it might give some weight to this factor.
Really happy to have some baseball stuff to digest in the off-season – so Thank You!!
As a relative newcomer to looking at stats to this level – I wonder why you use RA9 – which I had been told was not a good indicator (yes I know it has unearned runs) because it is still very defense dependent. I mean Jeter doesn’t make many errors – but well you know what I mean.
As a follow on from the above, I would believe (from the ever reliable eyeball test) that maybe relievers, on average, pitch with a superior defense behind them. We often see the all glove no bat players come in during the late innings when trying to protect a lead.
@Everyone
Sorry it took me so long to get back to most of you, I had a crazy week with Thanksgiving. I’m attempting to compile all of the suggestions I’ve received here and from Tango at the Book Blog into a follow-up piece for next week, so I hope you enjoy.
@Tim, I’m not positive that relievers actually pitch with better defenses, but using defense-independent measures becomes difficult when looking at the population. This is because statistics like FIP are correlated to each individual season’s ERA for the population, so you’d see similar results. If you’d like to read more on the topic though, this link might be helpful:
http://www.beyondtheboxscore.com/2012/11/9/3617554/fip-era-gaps-history-baseball-pitching-statistics-sabermetrics