Data Erratum Redux
In yesterday’s article, Data Erratum Et Cetera, I noted the difference between the two leagues in BABIP and LD%, and wondered what might have caused it.
A number of readers and commentators mentioned that I overlooked the obvious — pitchers don’t bat in the AL. Doh! That is obvious. So I went back and ran my analysis a little differently.
This time, I only included batters with at least 40 plate appearances in either league (which I probably should have done in the first place). That excludes almost all pitchers at this time, but still represents 93% of all plate appearances in the NL, 96% in the AL.
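For anyone who wants to try the same cut, here is a rough sketch in Python. The record layout, field names and the `league_rates` function are made up for illustration and are not the tooling behind these numbers. BABIP uses the standard (H - HR) / (AB - K - HR + SF) definition, LD% is assumed to be line drives divided by all batted balls, and the 40 PA cutoff is assumed to apply to a batter's total within one league.

```python
from dataclasses import dataclass

MIN_PA = 40  # cutoff used in the article


@dataclass
class BatterLine:
    """One batter's totals in one league (hypothetical record layout)."""
    league: str
    pa: int
    ab: int
    hits: int
    hr: int
    so: int
    sf: int
    line_drives: int
    batted_balls: int


def total(rows, field):
    """Sum one counting stat across a list of BatterLine records."""
    return sum(getattr(row, field) for row in rows)


def league_rates(lines, league, min_pa=MIN_PA):
    """Aggregate LD% and BABIP over batters with at least min_pa PA.

    Returns None when the league has no usable data.
    """
    in_league = [b for b in lines if b.league == league]
    kept = [b for b in in_league if b.pa >= min_pa]

    league_pa = total(in_league, "pa")
    batted_balls = total(kept, "batted_balls")
    # Balls in play for BABIP: AB - K - HR + SF
    balls_in_play = (total(kept, "ab") - total(kept, "so")
                     - total(kept, "hr") + total(kept, "sf"))
    if league_pa == 0 or batted_balls <= 0 or balls_in_play <= 0:
        return None

    babip = (total(kept, "hits") - total(kept, "hr")) / balls_in_play
    ld_pct = total(kept, "line_drives") / batted_balls
    return {
        "ld_pct": ld_pct,
        "babip": babip,
        "diff": babip - ld_pct,
        # Share of the league's plate appearances that survive the cutoff
        "pa_share": total(kept, "pa") / league_pa,
    }


# Invented sample: one regular and one pitcher-like line, both in the NL.
sample = [
    BatterLine("NL", 500, 450, 130, 20, 90, 5, 70, 365),
    BatterLine("NL", 30, 25, 3, 0, 12, 0, 1, 13),
]
print(league_rates(sample, "NL"))
```

Summing the counting stats first and dividing once gives league-wide rates weighted by playing time, which is what a league BABIP or LD% normally means. Averaging each batter's individual rates would give a different answer.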
Now, the gap between BABIP and LD% differs by only 10 points between the two leagues.
NL: LD% .183, BABIP .292, Diff .110
AL: LD% .176, BABIP .297, Diff .120
A couple of points:
– Taking out batters with fewer than 40 PAs has very little impact on LD% (one point down in the AL, one point up in the NL). That’s a bit surprising, and probably important in some way.
– It has no impact on BABIP in the AL, but raises BABIP ten points in the NL, since the batters removed there are mostly weak-hitting pitchers. That’s the pitcher effect.
The remaining 10-point difference could easily result from a slight difference in fielders or ballparks, as well as sample size issues or sheer luck.