What’s the best BABIP estimator?
by Derek Carty
January 26, 2009

BABIP is a stat that lots of people like to throw around but that many don’t fully understand (even some who profess to be statistically inclined).

Background info

BABIP stands for Batting Average on Balls in Play. It measures the rate at which balls in play fall in for hits. Essentially, any ball that the batter makes contact with and puts into fair territory, and that does not become a home run, falls into the domain of BABIP. It is calculated as (H-HR)/(AB-K-HR).

We use BABIP to evaluate both pitchers and hitters, but the way we use it differs greatly between the two. Most pitchers regress toward the league average BABIP of around .300 or .305. Very few pitchers can repeatedly do better or worse than this, so we say that pitchers have very little control over BABIP.

Hitters, on the other hand, can have a substantial amount of control over BABIP. Ichiro Suzuki, for example, has a .356 career BABIP. Hitters do not regress toward league average; rather, each regresses toward his own unique number. The big question these days seems to be: what is that number? Today, I’d like to look at several ways of determining it and see which is best.

The test

This is something I’ve been curious about for a while, so I took as many BABIP estimators as I could think of and put them up against each other to see which does the best job of predicting the following year’s BABIP.

The combatants

Previous year BABIP (BABIP): This is simply the player’s BABIP from the previous year.
Expected BABIP (xBABIP): This is a BABIP model created by Chris Dutton and Peter Bendix, introduced in an article at THT last month. xBABIP is the primary reason for this article, as I have been very curious how well our newest model does what it intends. Also, please note that Chris has tweaked the model a little since the original article ran. Please check the bottom of this article for more details.
Quick Expected BABIP (qxBABIP): Because Dutton and Bendix’s xBABIP includes some stats that aren’t readily available to the casual fan, they’ve created a simplified version using stats that are readily available, at the (expected) expense of accuracy.
Line drive BABIP (ldBABIP): This is the one that gets the most play. Everyone seems to be using it these days, but for reasons I’ve explained many times before, I’m not a fan. It’s calculated as line drive rate plus .120.
Studes BABIP (studesBABIP): This one was created around the same time Dave Studeman put out Line drive BABIP but doesn’t get nearly the same attention. It’s not a whole lot more difficult to calculate, but it uses more than one variable. It’s calculated as 0.245 + 0.52 * LD% - 0.16 * FB% + 0.11 * K%.
Expected Batting Average BABIP (xBA BABIP): This one is the BABIP portion of Baseball HQ’s Expected Batting Average (xBA) statistic. I should note that this uses HQ’s SX stat, which I couldn’t replicate precisely. I was, however, able to get it very close. Also, because SX and PX are indexes based on league (American/National) average, for players switching teams mid-year I weighted each based on games spent with each team.
Marcels BABIP (mBABIP): This isn’t so much an estimator as a projection, but I thought it would be good to include for context. It’s simply what Marcels projects for the following year. It’s also currently what I’m using in my True Batting Average calculations.

The process

I used data from 2004 to 2008, matching players from one year to the next. Because xBABIP was the reason for doing the study, I had to work around its limits a little bit. xBABIP wasn’t calculated for anyone with fewer than 300 plate appearances, so I made that the cut-off for both year one and year two. Using cut-offs introduces some biases, but there’s no way around it in this instance. From there, I adjusted each stat for differences in league average and ran a couple of tests. You can see the results below.
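The three combatants whose formulas are spelled out above (actual BABIP, Line drive BABIP, and Studes BABIP) are simple enough to sketch in a few lines of Python. This is only an illustration, not the code used for the study: the function names and the sample stat line are hypothetical, and the sketch assumes LD%, FB%, and K% are entered as fractions (0.20 rather than 20), since the article does not spell out their units or denominators.

```python
def babip(h, hr, ab, k):
    """BABIP using the article's formula: (H - HR) / (AB - K - HR)."""
    balls_in_play = ab - k - hr
    if balls_in_play <= 0:
        raise ValueError("player has no balls in play")
    return (h - hr) / balls_in_play


def ld_babip(ld_rate):
    """Line drive BABIP: line drive rate plus .120 (rate as a fraction, an assumption)."""
    return ld_rate + 0.120


def studes_babip(ld_rate, fb_rate, k_rate):
    """Studes BABIP: 0.245 + 0.52 * LD% - 0.16 * FB% + 0.11 * K% (rates as fractions, an assumption)."""
    return 0.245 + 0.52 * ld_rate - 0.16 * fb_rate + 0.11 * k_rate


# Hypothetical stat line, invented purely for illustration.
print(f"{babip(h=180, hr=20, ab=600, k=100):.3f}")                    # 0.333
print(f"{ld_babip(ld_rate=0.20):.3f}")                                # 0.320
print(f"{studes_babip(ld_rate=0.20, fb_rate=0.35, k_rate=0.18):.3f}")  # 0.313
```

xBABIP, qxBABIP, xBA BABIP, and Marcels are left out because their formulas are not given in this article.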

The results

+---------------+-------+--------+---------+---------+-------------+-----------+--------+
| TEST          | BABIP | xBABIP | qxBABIP | ldBABIP | studesBABIP | xBA BABIP | mBABIP |
+---------------+-------+--------+---------+---------+-------------+-----------+--------+
| Correlation   | 0.38  | 0.50   | 0.45    | 0.20    | 0.32        | 0.40      | 0.46   |
| R-Squared     | 0.14  | 0.25   | 0.20    | 0.04    | 0.10        | 0.16      | 0.21   |
| Average Error | 0.028 | 0.021  | 0.022   | 0.029   | 0.024       | 0.022     | 0.022  |
+---------------+-------+--------+---------+---------+-------------+-----------+--------+

As you can see, there’s a pretty clear pecking order in these results:

+------+-------------+
| RANK | ESTIMATOR   |
+------+-------------+
| 1    | xBABIP      |
+------+-------------+
| 2    | mBABIP      |
| 3    | qxBABIP     |
+------+-------------+
| 4    | xBA BABIP   |
| 5    | BABIP       |
+------+-------------+
| 6    | studesBABIP |
+------+-------------+
| 7    | ldBABIP     |
+------+-------------+

I’ve also broken things down by tiers (the horizontal rules in the table mark the tier breaks). Dutton and Bendix’s xBABIP seems to be the best, and I can only imagine what looking at multiple years of it would do. Just one year of data can explain 25 percent of the variance in the following year’s BABIP, a very big number for a stat with such wide variability. That it beats three years’ worth of Marcels data (plus regression to the mean and age adjustments) is excellent as well.

After that come Marcels (which I’ve currently been using) and the quick version of xBABIP. I should note that the quick version as tested doesn’t include a not-hard-to-apply team adjustment; I left it out for some logistical reasons, but it would likely improve the accuracy a bit. It’s very nice to see the quick version grade out so well, since it will be easy to calculate in-season (although thanks to Sal Baxamusa, Marcels isn’t very difficult either).

Then come Baseball HQ’s xBA BABIP (which Average Error thinks belongs in tier two) and actual BABIP, followed by Dave’s more complex BABIP estimator (which was derived back at the beginning of 2005, when we were first starting to work with batted ball data).

Finally, line drive BABIP, which is arguably the most popular measure on this list, comes in dead last, well below everyone else and significantly worse than simply using actual BABIP. I’ve long said that I dislike this way of estimating BABIP, and it’s very nice to see the tests confirm it.

Going forward

Going forward, I’ll be using xBABIP in place of Marcels BABIP in my True Batting Average calculations and when discussing a player’s BABIP in general. I’m committed to giving you guys the best there is, and Chris and Peter’s model is tops among the BABIP estimators that I know of. If you missed the original article, I’d definitely recommend you go back and read it.

Some notes from Chris Dutton

Chris worked a lot with me on this, and I really appreciate his receptiveness and helpfulness. Here are some things he wanted me to pass along.

First, he has changed the model a bit since the original article. Here are the exact changes and his explanation of them:

Old formula: Hitter eye, Pitches per extra-base hit, LD%, FB/GB, Speed score, Contact rate, Spray, Pitches per AB
New formula: HR/FB, IF/FB, LD%, FB/GB, Speed score, Lefty*(FB/GB%), Contact rate, Spray

The differences are basically that I used hr/fb as a measure of power rather than pitches per extra base hit, added popups/FB to measure poorly hit balls, and included an interaction variable of lefty*(fb/gb%) to adjust for the fact that lefty ground ball hitters often hit balls to the right side of the field (which rarely become hits).
I also removed pitches_per_AB, which seemed to be potentially correlated with other variables, and removed hitter_eye since contact rate seemed to be capturing a very similar effect.

Chris also says that he isn’t done improving the model. He is constantly looking for ways to improve it even further, and is specifically hoping to incorporate some PITCHf/x data as soon as possible.

Finally, Chris is developing a tool that would allow readers to easily calculate the quick version of xBABIP. This would prove useful in-season, when we constantly need to update our evaluation of hitters. While repeatedly calculating things like Spray would be time-consuming and difficult, the quick version uses stats that are all readily available and, as the tests show, is still effective. The tool also has some other cool features: interactive graphs, projected stat lines, and some other things you might find useful.

References and resources

Expected BABIP and Quick Expected BABIP data were provided to me by Chris Dutton. A big thanks to him for his help and also for helping to create such an excellent stat. Marcels BABIP was taken from Tango’s site. The rest of the stats I calculated myself.
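
For readers who want to run a similar comparison on their own data, the three tests reported in the results table can be sketched roughly as follows. This is a sketch under stated assumptions, not the code behind the study: it assumes Correlation is the Pearson correlation between an estimator and the following year’s BABIP, it takes R-Squared as that correlation squared (which matches the published table, e.g. 0.50 squared is 0.25), and it assumes Average Error means mean absolute error, which the article does not state. The function names and sample numbers are hypothetical, and the league-average adjustment is not reproduced.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_abs_error(estimates, actuals):
    """Average absolute miss (an assumed reading of 'Average Error')."""
    return sum(abs(e - a) for e, a in zip(estimates, actuals)) / len(estimates)


def evaluate(estimates, next_year_babip):
    """Score one estimator against the following year's actual BABIP."""
    r = pearson_r(estimates, next_year_babip)
    return {
        "correlation": r,
        "r_squared": r * r,
        "average_error": mean_abs_error(estimates, next_year_babip),
    }


# Hypothetical year-one estimates and year-two BABIPs for four hitters.
estimates = [0.300, 0.320, 0.290, 0.310]
next_year = [0.310, 0.315, 0.280, 0.305]
for test, value in evaluate(estimates, next_year).items():
    print(f"{test}: {value:.3f}")
```

Running `evaluate` once per estimator on the same set of matched player seasons would give one column of the results table per call.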