BABIP by hit location

This is a very simple question: Where is the best location on the field to get a base hit?

It is very difficult for a hitter to put the ball in play exactly where he wants it. But the hitter has the largest influence on where the ball will go. He will either pull it, go up the middle, or go to the opposite field (“going oppo,” as the cool kids say). The Gameday data provides the hit location (actually the location where the fielder fielded the ball) of a batted ball, and using Peter Jensen’s translations, I can find the angle relative to home plate. And since I want to find the likelihood of a base hit on a ball in play (home runs are always base hits), BABIP fits in nicely.
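As a rough sketch of the angle calculation described above: the function below turns a Gameday-style (x, y) location into an angle relative to home plate, with 0 degrees pointing to straightaway center, negative angles toward left field and positive angles toward right field. The home plate coordinates here are placeholder assumptions for illustration, not Peter Jensen’s actual translations, and the name `spray_angle` is hypothetical.

```python
import math

# Assumed placeholder pixel coordinates of home plate on the Gameday field image.
# These are illustrative values only, NOT Peter Jensen's published translations.
HOME_X = 125.0
HOME_Y = 200.0


def spray_angle(x, y, home_x=HOME_X, home_y=HOME_Y):
    """Return the angle (degrees) of a fielded ball relative to home plate.

    0 is straightaway center field, negative is the left field side,
    positive is the right field side. Assumes the image y axis grows
    downward, so the field extends toward smaller y values.
    """
    dx = x - home_x          # positive toward right field
    dy = home_y - y          # positive toward center field
    return math.degrees(math.atan2(dx, dy))


# Example: a ball fielded straight out from the plate, and one on each line.
print(spray_angle(125.0, 100.0))
print(spray_angle(225.0, 100.0))
print(spray_angle(25.0, 100.0))
```

With real data the two home plate constants would need to be replaced by the proper translation for each park and season.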

Instead of calling it pull, middle, and opposite, I will stick with left field, center field, and right field even though I include balls fielded in the infield. So now here is a table with the BABIP to the three fields since 2008 split by batter handedness. I also took out what Gameday labels as “pop outs” since they are almost always outs.
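The split described above can be sketched roughly as follows, assuming each ball in play already has an angle (0 degrees at center, negative toward left field) and that the three fields are 30 degrees wide each, as mentioned later in the article. The record keys (`angle`, `stand`, `event`, `is_hit`) and the event labels are hypothetical stand-ins for whatever the Gameday data actually provides.

```python
from collections import defaultdict

# Assumed event labels; home runs are not balls in play, and pop outs are
# dropped because they are almost always outs.
EXCLUDED_EVENTS = {"Home Run", "Pop Out"}


def field_for_angle(angle, center_half_width=15.0):
    """Map an angle in degrees to 'left', 'center' or 'right'."""
    if angle < -center_half_width:
        return "left"
    if angle > center_half_width:
        return "right"
    return "center"


def babip_by_field(balls):
    """Return {(stand, field): BABIP} for an iterable of ball-in-play dicts."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for ball in balls:
        if ball["event"] in EXCLUDED_EVENTS:
            continue
        key = (ball["stand"], field_for_angle(ball["angle"]))
        total[key] += 1
        if ball["is_hit"]:
            hits[key] += 1
    return {key: hits[key] / total[key] for key in total}


# Tiny made-up sample, only to show the shape of the input and output.
sample = [
    {"stand": "R", "angle": -30.0, "event": "Single", "is_hit": True},
    {"stand": "R", "angle": -20.0, "event": "Groundout", "is_hit": False},
    {"stand": "R", "angle": 0.0, "event": "Home Run", "is_hit": True},
    {"stand": "R", "angle": 5.0, "event": "Pop Out", "is_hit": False},
    {"stand": "L", "angle": 30.0, "event": "Single", "is_hit": True},
]
result = babip_by_field(sample)
print(result)
```

Changing `center_half_width` is an easy way to see how sensitive the three-field split is to where the boundaries are drawn.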


So a ball fielded in the direction of center field is the most likely to be a base hit. Interestingly, a left-handed hitter pulling the ball to right field fares considerably worse than when going to center or the opposite field. This might make sense, since most teams now shift toward right field against a notable left-handed pull hitter, so some base hits may be taken away there. However, the same pattern doesn’t hold for right-handed hitters, who hit well when pulling the ball to left field.

But something is not right. Baseball Reference also has hit location splits for the entire Majors by season. And looking at the past two seasons’ data, it seems that a batted ball hit up the middle is the least likely to be a base hit for batters of either handedness. This is the exact opposite of what I found earlier using the Gameday data.

How can this be? I divided the field evenly into three fields of 30 degrees each. It is likely that Baseball Reference’s data, which comes from Retrosheet, is divided up differently. The Retrosheet website has a visual chart showing the codes that the Retrosheet stringers use to enter hit location data. There are about seven different zones relative to home plate, which means that up the middle can be either the lone center zone or the center zone plus the two adjoining ones. Either way, the three fields are not equal in size.

It is possible that there is an error on my part, since Peter Jensen’s translations only cover 2005-2008. But that shouldn’t mean I would be getting angles 10 degrees off of what they should be. And my data comes out almost exactly the same for 2008 and 2009, both in BABIP and in the number of balls hit to each field, so there shouldn’t have been any drastic changes in the factors. I also took out pop outs, but that just gives me a slightly higher BABIP. This will have to be investigated further in the future.

Sticking with this season, I can plot a local regression of BABIP by batted ball angle for the top and bottom BABIP hitters in the Majors. First up, the right-handed hitters.
Don’t pay too much attention to the MLB average line. Austin Jackson does get most of his hits to center and right, which FanGraphs’ BIS data also shows. The Retrosheet data, however, doesn’t agree. But Jackson should be due for some serious regression soon. Aaron Hill does have a higher BABIP down the right field line, but that is because he has three base hits out of ten balls in play at 20 degrees and greater.

Now for the left-handers.
Carlos Pena should not be hurting that much from the shift. And Justin Morneau follows the average line, except he is almost 20 percentage points higher to right field.

All four players have unique curves in the regression. Jackson does well to center and right field. Expect all four hitters to regress, with Jackson regressing the most, although I wouldn’t be surprised if his BABIP stays close to .400, since he has a history of high BABIPs in the minors.
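For readers who want to reproduce curves like the ones discussed above, here is a minimal sketch of a local linear regression with tricube weights, written in plain Python. It is a simplified stand-in and not necessarily the smoother behind the article’s plots: it has no robustness iterations, and the function names and the `frac` span parameter are assumptions for illustration. The inputs would be batted ball angles and a 0/1 indicator for a base hit on each ball in play.

```python
def loess_point(x0, xs, ys, frac=0.5):
    """Local linear fit at x0 using tricube weights on the nearest frac of points."""
    n = len(xs)
    k = max(2, int(round(frac * n)))
    dists = sorted(abs(x - x0) for x in xs)
    # Bandwidth: distance to the k-th nearest point, widened slightly so the
    # farthest neighbor keeps a tiny positive weight and the sum is never zero.
    h = dists[k - 1] * (1 + 1e-9) + 1e-12
    weights = []
    for x in xs:
        u = abs(x - x0) / h
        weights.append((1 - u ** 3) ** 3 if u < 1 else 0.0)
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    if sxx == 0:
        return my  # all weight on one x value: fall back to the local mean
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    return my + (sxy / sxx) * (x0 - mx)


def loess_curve(grid, xs, ys, frac=0.5):
    """Evaluate the local fit at every angle in grid."""
    return [loess_point(g, xs, ys, frac) for g in grid]


# Made-up, perfectly linear data: the local fit should reproduce the line.
angles = list(range(-45, 46, 5))
values = [0.3 + 0.001 * a for a in angles]
curve = loess_curve([-30, 0, 10, 30], angles, values)
print(curve)
```

A smaller `frac` gives a wigglier curve that follows individual fielder positions more closely, while a larger one smooths them away. With only a handful of balls in play at the extreme angles, the ends of the curve are driven by very few points.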

This is a rough idea, as I ignored many aspects, including batted ball type and the frequency of each batted ball angle, among others. I also have to do a little more research into how Baseball Reference splits their hit location data and whether or not I made any errors in my method of producing batted ball angles.

12 years ago

this is awesome.

think you could do this for like all Lefties who face a shift?(or whatever you could pull from your head)

mlb average line, and then like 5 or so guys who always get shifted against(pena, howard, etc).  would be interesting to see who beats the shift more often than the others.

Jeremy Greenhouse
12 years ago

RZ, cool to see you writing here. Good work as always.

I don’t know if you plan to continue research down this line, but I think it’d be nice if you either separated by batted ball type, or made your regression lines a bit less smooth so that the differences in BABIP based on fielder location were more pronounced. Especially for a guy like Pena who has so much going on with all his balls the other way being flies, and all his grounders being pulled into the shift. Also, something I’ve recently toyed with is adding a histogram using the same plot and horizontal axis but different vertical axis, since I have no idea where Aaron Hill tends to hit the ball.

Vorp Opiescu
12 years ago

The formatting of this article has a severe shift toward third base.

Nick Steiner
12 years ago

If I’m not mistaken, Baseball Reference uses Gameday data from 2005 on, so you should be getting the same results.

12 years ago

Interesting stuff!

As far as the higher BABIP to Left versus Right regardless of batter handedness, I would guess it is largely a result of the higher number of infield hits on balls hit to the left side of the infield.

12 years ago

Jamie- Nice idea. Will definitely look into it.

Jeremy- Thanks for the kind words. I do plan on expanding on this idea and research it sometime soon. Dividing it by batted ball type would definitely be the first thing I do then smoothing the graph would come later for serious graphical analysis.

A graph like that would be awesome for this. If I only knew how to make it.

Vorp- I’m new here!

Jeremy- You might be right but I always see the citation that the data is mainly gathered from Retrosheet. I would have to research that.

Seth- Thanks! Not sure if that will skew the data. The angle should still be directed towards left field on a grounder regardless if it was an infield hit or fielded in the outfield. But the idea of seeing where infield hits occur is something that can be looked into.

12 years ago

Not sure you needed to do all that. Been watching the game 50 years and it’s pretty obvious a) more righties than lefties, b) more singles than extra base hits, c) fewer hits to opposite field than straight away. The outcome is therefore eminently logical and obvious: the area in which the most hits land is left center, i.e. past the shortstop either on the ground or in the air, fielded by either the center or left fielder. While I love sabermetrics and all it has added to the game, it really wasn’t needed here.

12 years ago

Good Stuff RZ
Lew, I think the importance is that it confirms the long-held premise of “hit it up the middle” with statistical analysis. Many widely held baseball “truths” have been shown on analytical examination to be unfounded. This analysis confirms the adage. Without analysis it is just an unproven, anecdote-based adage without defense other than “everybody knows it”.

Dave Studeman
12 years ago

Great work, Ricky.  On the Bill James website, they break hitters down by grounder, fly and liner and each one to right, center or left.  In general, pulled fly balls have a higher average than fly balls the other way, while grounders that go the other way have a higher average than pulled grounders.  Liners tend to be base hits, though pulled liners tend to result in a higher average than liners the other way.

Mike Fast
12 years ago

In general, pulled fly balls have a higher average than fly balls the other way, while grounders that go the other way have a higher average than pulled grounders.

This is because pulled fly balls are hit closer to square with the middle of the bat, whereas opposite-field fly balls are the result of undercutting the ball a great deal (less solid contact, less speed off the bat).

Similarly, opposite-field grounders are the result of hitting closer to square with the middle of the bat, and pulled grounders are the result of hitting more over the top of the ball and making less solid contact.

Liners are closer in character to the whole population of fly balls than they are to the whole population of grounders, thus as a group they behave somewhat more like fly balls in this fashion.

Mike Fast
12 years ago

Btw, RZ, good work and welcome to THT!

Retrosheet started getting its batted ball data from Gameday at some point.  That may have been in 2005.  Peter Jensen would know.

Lew: As HAB says, it is important to check if what your eyes perceive is true.  But not only that, it is also important to quantify how much of a difference it makes.  Knowing it’s “more” is nearly useless until you know how much more.

Dave Studeman
12 years ago

Mike, I believe there is also a positioning factor.  Batters are played to pull—for grounders in particular.  So pulled grounders are fielded more often partly because fielders are in position to field them.