How much do counts affect BABIP?
by Derek Carty
July 30, 2010

In my other article today, I mention a discussion that’s going on in the comments section of a post I made to the CardRunners site. In it, RotoWire’s Chris Liss gets to talking about luck, randomness, BABIP, and how what we often consider to be luck may not be luck at all. His most concrete claim sounded interesting to me, and I wanted to test it:

“In fact, Todd Zola sent me BABIP data by count, and BABIP goes up reliably as the count gets more hitter [favorable], like .315 on 3-0, and .285 on 0-2. It’s .305 on the first pitch. So let’s say a guy like Haren (or Aaron Harang or Dave Bush) gets a rep as an extreme strike thrower. Then batters might swing more often at the first [pitch], rather than take a pitch and get behind. So there, the pitcher’s BABIP would change not [based] on luck, for example.”

The “BABIP by count” argument

While it’s absolutely true that BABIP differs by count, I didn’t think the overall effects would be very large. To test this, I used the MLB GameDay files to pool all the data since 2008 and came up with average BABIPs by count:

+-------+-------+
| Count | BABIP |
+-------+-------+
| 0-0   | 0.311 |
| 0-1   | 0.300 |
| 0-2   | 0.288 |
| 1-0   | 0.311 |
| 1-1   | 0.310 |
| 1-2   | 0.294 |
| 2-0   | 0.322 |
| 2-1   | 0.314 |
| 2-2   | 0.301 |
| 3-0   | 0.336 |
| 3-1   | 0.312 |
| 3-2   | 0.316 |
| All   | 0.306 |
+-------+-------+

Caveat: There may be some selection bias in calculating league average BABIP by count. For example, the BABIP for 3-0 counts may be too high because poor pitchers reach 3-0 counts more often than good pitchers do. This shouldn’t change our overall impression of the effect, though: even if the bias is significant, correcting for it would most likely just pull the by-count BABIPs closer to the overall league average of .306.

From here, I came up with an “expected count-based BABIP” (xcbBABIP; that’s catchy) for everyone who’s thrown a pitch in the PITCHf/x era.
To do this, I looked at how many balls in play each pitcher allowed by count and assumed a league average BABIP (for that count) for all of those balls. After adding it all up, here are our 2009 leaders (300 BIP to qualify):

+------+-------------+---------+--------+
| YEAR | LAST        | FIRST   | xBABIP |
+------+-------------+---------+--------+
| 2009 | Harden      | Rich    | 0.3003 |
| 2009 | Hendrickson | Mark    | 0.3012 |
| 2009 | Lohse       | Kyle    | 0.3014 |
| 2009 | Hanson      | Tommy   | 0.3015 |
| 2009 | Tallet      | Brian   | 0.3018 |
| 2009 | Sabathia    | CC      | 0.3024 |
| 2009 | Nolasco     | Ricky   | 0.3026 |
| 2009 | Weaver      | Jered D | 0.3027 |
| 2009 | Holland     | Derek   | 0.3029 |
| 2009 | Wellemeyer  | Todd    | 0.3033 |
+------+-------------+---------+--------+

And our trailers:

+------+------------+----------+--------+
| YEAR | LAST       | FIRST    | xBABIP |
+------+------------+----------+--------+
| 2009 | Zambrano   | Carlos   | 0.3125 |
| 2009 | de la Rosa | Jorge A  | 0.3114 |
| 2009 | Meche      | Gil      | 0.3112 |
| 2009 | Bush       | David T  | 0.3104 |
| 2009 | Stammen    | Craig N  | 0.3103 |
| 2009 | Morton     | Charlie  | 0.3100 |
| 2009 | Carmona    | Fausto C | 0.3098 |
| 2009 | Suppan     | Jeff     | 0.3097 |
| 2009 | Redding    | Tim      | 0.3096 |
| 2009 | Santana    | Ervin R  | 0.3094 |
+------+------------+----------+--------+

So it appears that the extent of the “BABIP by count” effect is about 0.006 (six points) of BABIP in either direction from the .306 league average, at the extremes, and that’s without any regard for repeatability or regression (and if you’d like a quick idea about that, I found a pissantian 0.01 r-squared for pitchers with at least 400 BIP in adjacent years from 2007-2010).

I don’t think we can rightfully claim that this “BABIP by count” effect is to blame for any truly abnormal-looking BABIPs. The enduring effects of it seem minimal at best and completely insignificant at worst.
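
For readers who want to reproduce the expected count-based BABIP calculation described above, here is a minimal Python sketch. It is an illustration, not the code actually used for this article. The league values are copied from the BABIP-by-count table. The function name, the dictionary layout, and the sample pitcher's ball-in-play counts are all hypothetical.

```python
# League-average BABIP by count, copied from the table in this article.
LEAGUE_BABIP_BY_COUNT = {
    "0-0": 0.311, "0-1": 0.300, "0-2": 0.288,
    "1-0": 0.311, "1-1": 0.310, "1-2": 0.294,
    "2-0": 0.322, "2-1": 0.314, "2-2": 0.301,
    "3-0": 0.336, "3-1": 0.312, "3-2": 0.316,
}


def expected_count_based_babip(bip_by_count, league=LEAGUE_BABIP_BY_COUNT):
    """Return a BIP-weighted average of league BABIP by count.

    bip_by_count maps a count such as "1-2" to the number of balls in
    play the pitcher allowed on that count. Each ball in play is
    assumed to become a hit at the league-average rate for its count.
    """
    total_bip = sum(bip_by_count.values())
    if total_bip == 0:
        raise ValueError("pitcher has no balls in play")
    expected_hits = sum(n * league[count] for count, n in bip_by_count.items())
    return expected_hits / total_bip


# Hypothetical pitcher: these ball-in-play counts are made up for
# illustration and do not describe any real player.
hypothetical_bip = {
    "0-0": 60, "0-1": 45, "0-2": 30,
    "1-0": 35, "1-1": 40, "1-2": 45,
    "2-0": 10, "2-1": 20, "2-2": 40,
    "3-0": 1, "3-1": 10, "3-2": 25,
}

if __name__ == "__main__":
    print(f"xcbBABIP: {expected_count_based_babip(hypothetical_bip):.4f}")
```

Because the result is a weighted average of the league values, it can never fall outside the range of the table (.288 to .336), and any pitcher whose balls in play are spread across many counts will land close to the middle of that range.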