More on mERA+

I got a lot of e-mail about mERA+, a pitching metric that makes some marginal improvements to ERA+. Many readers disagreed with my results, and some said that I simply must have made a mistake someplace.

I goofed!

Well, you were right. In combing through my formulas—this is what I do for fun—I noticed that I had missed a factor of two. Upon correcting this error, I was afraid that the difference between mERA+ and ERA+, which is small to begin with, would get even smaller. But my error was in the opposite direction, which means that the marginal improvement to ERA+ is slightly greater than I first thought. Nothing earth-shattering, but in the interests of full disclosure, I thought that I should share the correct results.

In the previous article, I compared the careers of Roy Halladay and Sandy Koufax and concluded that while Koufax was slightly better by ERA+ (131 to 130), Halladay was better by mERA+ (62.9 to 62.3). I understated the case; with the corrected formula, Halladay has the edge 65.9 to 65.0.

I told you it wasn’t a big difference. Here are the career leaders in ERA+, with corrected mERA+ alongside:

Pitcher			ERA	lgERA	ERA+	mERA+
Pedro Martinez		2.81	4.49	159.8	77.6
Lefty Grove		3.06	4.54	148.4	73.6
Walter Johnson		2.17	3.17	146.1	69.5
Dan Quisenberry		2.76	4.04	146.4	71.8
Hoyt Wilhelm		2.52	3.68	146.0	70.8
Joe Wood		2.03	2.97	146.3	69.0
Ed Walsh		1.82	2.63	144.5	67.4
Roger Clemens		3.10	4.46	143.9	71.7
Johan Santana		3.20	4.59	143.4	71.8
Roy Oswalt		3.05	4.35	142.6	71.0

Because the error was systematic, the trends (and therefore the conclusions) don’t change, but the absolute numbers do. How good is Pedro Martinez? In over three-quarters of his career starts, he has delivered a performance better than what you would have expected from an average pitcher.

And as for single-season pitching:

Pitcher			ERA	lgERA	ERA+	mERA+
Pedro Martinez		1.74	4.97	285.6	98.0
Dutch Leonard		0.96	2.68	279.2	91.3
Greg Maddux		1.56	4.26	273.1	96.2
Walter Johnson		1.14	2.96	259.6	91.0
Greg Maddux		1.63	4.23	259.5	95.1
Bob Gibson		1.12	2.90	258.9	90.6
Mordecai Brown		1.04	2.62	251.9	88.4
Pedro Martinez		2.07	5.07	244.9	95.6
Walter Johnson		1.39	3.34	240.3	90.5
Christy Mathewson	1.28	2.93	228.9	87.2

Remember that 100 is a perfect score. Yes, Pedro scored 98.0 in his best season.
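As a quick sanity check, the ERA+ column in both tables is consistent with the simple ratio 100 × lgERA / ERA. The sketch below reproduces a few rows under that assumption. The function name `era_plus` is only illustrative, and I am assuming that any park adjustment is already reflected in the lgERA column as printed. It also shows why ERA+ has no ceiling: as ERA approaches zero, the ratio grows without limit.

```python
import math

def era_plus(era: float, lg_era: float) -> float:
    """Unadjusted ERA+: league ERA over pitcher ERA, scaled so 100 is league average."""
    if era < 0 or lg_era < 0:
        raise ValueError("ERA values must be non-negative")
    if era == 0:
        # A pitcher who allows no earned runs has no upper limit on ERA+.
        return math.inf
    return 100.0 * lg_era / era

# (label, ERA, lgERA) copied from the tables above
rows = [
    ("Pedro Martinez, career", 2.81, 4.49),
    ("Lefty Grove, career", 3.06, 4.54),
    ("Pedro Martinez, best season", 1.74, 4.97),
]
for label, era, lg_era in rows:
    print(f"{label}: {era_plus(era, lg_era):.1f}")
```

mERA+ itself is not reproduced here, since its formula is given in the previous article rather than in this one.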

Technical points for other researchers

There are a few additional technical points that I wanted to touch upon.

  • One of the assumptions of mERA+ is that, given a large enough sample for any pitcher, the distribution of his per-start performance tends toward the Weibull distribution. As a “proof by example,” here is the distribution of performance for two pitchers with a good number of career starts: Kevin Appier (402 career starts) and Pedro Martinez (376 career starts). The number of runs charged to each start is the number of runs the pitcher actually allowed plus the number of runs a league-average pitcher would have given up in the balance of the nine innings. For example, on June 10, 1994, Appier allowed three runs in 6 1/3 innings. Assuming the remaining 2 2/3 innings were pitched by an average pitcher with a 5.23 RA/9 (the average for the 1994 AL), that adds 2 2/3 × 5.23 / 9 ≈ 1.55 runs, for a total of 4.55 RA charged to Appier. This credits a pitcher not only for pitching well but for pitching well deep into games. Here are the plots for Appier and Martinez:
    [Figure: distribution of runs allowed per start for Kevin Appier and Pedro Martinez]
    The shape is familiar in both cases, though not perfectly Weibullian. Appier in particular gave up one or two runs less often than the model expects, and four or five runs more often. Certainly, mERA+ might be improved by using each pitcher’s actual distribution instead of the Weibull distribution. But I believe that the assumption that the distribution of performance tends to the Weibull is a decent one.

  • The nice thing about using distributions is that we’re not limited to the mean when describing “average.” In baseball, the median seems a more natural measure of average, since the game is played in discrete chunks.
  • Why limit ourselves to averages? After all, if we’re discussing the most valuable pitching seasons, all of the guys will look great compared to some average schlub. With a distribution (such as the Weibull) in hand, we can compare to the 25th percentile, or the 10th percentile, if we so choose. It all depends on your application.
  • One of the reasons that mERA+ offers insight on the margins is that mERA+ is a bounded measurement (0 is the worst, 100 the best) whereas ERA+ is not bounded (0 is the worst, infinity is the best).
  • Using a bounded measurement is good news if you are interested in the distribution of talent and/or performance. Below are two plots, one showing the distribution of single-season ERA+ and the other showing the distribution of single-season mERA+ (minimum 20 IP, post-1900).
    [Figure: distributions of single-season ERA+ and single-season mERA+ (minimum 20 IP, post-1900), with normal fits]
    Notice that the ERA+ plot doesn’t look symmetrical. ERA+ cannot go below 0, but a very good pitcher could have an ERA+ approaching infinity, so the distribution has a long right tail. Fitting a normal distribution (“bell curve”) from the mean and standard deviation is therefore problematic: you can see that the normal curve (black line) doesn’t match the actual data very well. mERA+, on the other hand, is fit much better by a normal distribution. It’s not perfect, but it is an improvement on ERA+. For reference, the all-time average for mERA+ is 50.6 and the standard deviation is 15.5.

  • One question I got was, “Can we do this for hitters?” My answer is yes, although I am not quite sure how. That’s another project for another day.
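For researchers who want to reproduce the per-start bookkeeping, here is a minimal sketch of two pieces described in the points above: the runs charged to a start (using the Appier example) and the mean, median, and percentiles of a Weibull distribution. The function names are illustrative only. The shape and scale values at the bottom are hypothetical placeholders, not fitted values for any pitcher, and this is not the mERA+ formula itself.

```python
import math

def runs_charged(runs_allowed: float, innings_pitched: float, league_ra9: float) -> float:
    """Runs allowed by the starter plus league-average runs for the rest of nine innings."""
    remaining_innings = max(9.0 - innings_pitched, 0.0)
    return runs_allowed + remaining_innings * league_ra9 / 9.0

def weibull_cdf(x: float, shape: float, scale: float) -> float:
    """P(X <= x) for a two-parameter Weibull distribution (x >= 0)."""
    return 1.0 - math.exp(-((x / scale) ** shape))

def weibull_quantile(p: float, shape: float, scale: float) -> float:
    """Inverse of the CDF: the value below which a fraction p of outcomes fall."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)

def weibull_mean(shape: float, scale: float) -> float:
    """Mean of a two-parameter Weibull distribution."""
    return scale * math.gamma(1.0 + 1.0 / shape)

# Appier, June 10, 1994: three runs in 6 1/3 innings, 1994 AL average of 5.23 RA/9.
print(f"Runs charged: {runs_charged(3, 6 + 1 / 3, 5.23):.2f}")

# Hypothetical shape and scale, chosen only to illustrate mean vs. median.
shape, scale = 1.8, 5.5
print(f"mean   = {weibull_mean(shape, scale):.2f}")
print(f"median = {weibull_quantile(0.50, shape, scale):.2f}")
print(f"25th percentile = {weibull_quantile(0.25, shape, scale):.2f}")
```

One caution when choosing a comparison percentile: because fewer runs allowed is better, the low end of a runs-allowed distribution corresponds to the best outings, so which tail counts as the “25th percentile pitcher” depends on how you orient the scale.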
Remember, ERA+ rules!

It is important to remind everybody that ERA+ is still a wonderful metric, suitable for almost all everyday purposes. mERA+ is a slight improvement that is strongest when discussing the very best or very worst pitchers.

References & Resources

The references section from the previous article on mERA+ contains a lot of good reading on Weibull distributions.