Exploring Batted Ball Run Values and Spray

Mike Napoli is good at hitting fly balls (via Keith Allison).

If you have read my articles on batted balls and the like, this will come as no surprise: I don’t care for line drive percentage (LD%). As such, I’ve spent a significant amount of time thinking of better ways to measure a player’s performance on batted balls. If you want a recap, in a previous article I made the argument that H/Batted Ball Type was a better measure of a player’s batted ball distribution and performance. But to bring you up to speed, consult the following section:

Why no two Batted Balls look the same or should be treated as such

First, it’s no secret that Line Drive (LD), Ground Ball (GB), and Fly Ball (FB) classifications are faulty and carry some major measurement errors. Until the stringer data is standardized based on hit angle and velocity, we should be cautious about taking batted ball classifications at face value. The fine line between a FB and a LD is in the eye of the beholder, with many different eyes determining the classification across MLB’s 30 ballparks. So the fact that LD%, GB%, and FB% don’t adjust for these classification biases with park factors is concerning, since a FB at one stadium can very well look different at another despite looking the same on our spreadsheet.

Second, by using batted ball percentages we are assuming that all LDs, FBs, and GBs are created equal. Binning each batted ball type into a percentage makes no distinction between a ball hit with authority and one blooped over an infielder’s head. Yes, the public does not have HITf/x, and yes, we cannot differentiate batted balls by velocity, but we can use proxies to help us separate batted balls into more accurate groups. Using Gameday data, it is possible to approximate the distance a ball was hit, the angle to the field at which it was hit, and (less accurately) which type of batted ball it was.

By using the proxies of distance and spray (pull, opposite, center), we see major differences in the sub-groups of each batted ball type. For instance, a pulled LD falls for a hit 8%-10% more often than an opposite-field LD, and there is a similar gap between a pulled FB and an opposite-field FB. Take a look at the following tables:

The tables below are pulled from Gameday data, which is known to have its own classification errors. For that reason, I removed all unreasonable batted balls from the data set (for instance, balls recorded as hit over 500 ft). When comparing the cleaned data set to the raw one, the numbers below were very similar. Angles range from negative (left field) to 0 (dead center) to positive (right field). Fields vary, but on average the foul line is somewhere between ±45 and ±50 degrees, so for a righty Pull was defined as any ball hit at less than -20 degrees and Opposite as any ball hit at more than 20 degrees. For a righty, then, a ball hit at -25 degrees (left field) is pulled and a ball hit at 25 degrees (right field) is classified as an opposite-field batted ball, whereas a lefty hitting the ball at -25 degrees would be hitting the ball the opposite way.
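The spray definition above can be sketched in a few lines of Python. This is only an illustration of the rule as described: the function name, the "R"/"L" handedness encoding, and the choice to treat a ball at exactly ±20 degrees as Center are my assumptions, not part of the original data pipeline.

```python
def classify_spray(angle_deg: float, bats: str, cutoff: float = 20.0) -> str:
    """Label a batted ball as Pull, Center, or Opposite.

    angle_deg: negative = left field, 0 = dead center, positive = right field.
    bats: "R" or "L" (hypothetical encoding of batter handedness).
    cutoff: angle beyond which a ball stops counting as Center (20 in the text).
    """
    if bats not in ("R", "L"):
        raise ValueError("bats must be 'R' or 'L'")
    if abs(angle_deg) <= cutoff:
        return "Center"
    hit_to_left = angle_deg < 0
    # Left field is the pull side for a righty and the opposite side for a lefty.
    if (bats == "R") == hit_to_left:
        return "Pull"
    return "Opposite"


print(classify_spray(-25, "R"))  # Pull
print(classify_spray(25, "R"))   # Opposite
print(classify_spray(-25, "L"))  # Opposite
```

Switch hitters would need the handedness of the plate appearance rather than of the player, which is one more reason to treat this as a sketch.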

Line drive to outfield, not equaling home run, 2008-2014
Spray BABIP runvalue n distance std(distance) std(angle)
Center 0.796 0.395 72,097 255.43 46.00 10.62
Opposite 0.737 0.341 27,043 231.29 36.78 33.61
Pull 0.847 0.463 39,235 245.44 40.98 35.39

All balls hit to outfield, not equaling home run, 2008-2014
Spray BABIP runvalue n distance std(distance) std(angle)
Center 0.394 0.068 201,586 261.18 63.78 10.67
Opposite 0.331 0.003 82,846 235.12 49.97 33.65
Pull 0.563 0.245 72,828 242.99 64.46 34.52

All balls hit to an infielder, on ground, not bunts, 2008-2014
Spray BABIP runvalue n distance std(distance) std(angle)
Center 0.224 -0.110 189,274 116.00 59.94 12.45
Opposite 0.374 0.014 35,015 113.95 66.66 33.03
Pull 0.251 -0.085 137,950 110.11 56.98 35.12

You can see that spray has an obvious effect on the rate at which certain batted ball types fall for hits. For that reason, the field to which a ball was hit should not be ignored when analyzing a player’s batted ball profile. When analyzing any aggregate of a batted ball type, we should first stratify by pulled, opposite-field, and up-the-middle hits. As for whether spray is a proxy for batted ball velocity, my guess is that it has just as much to do with defensive positioning as with quality of contact. So in addition to spray, we have to look at the individual run values of the events spawned from each batted ball type. As the table below shows, the variance in run value differs from one batted ball type to the next:

Batted ball type and run values
Batted Ball Type runvalue std(runvalue) count
Line Drive 0.34 0.44 162,971
Fly Ball 0.05 0.61 219,819
Ground Ball -0.09 0.35 358,722
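A table like the one above can be produced by grouping batted balls and taking the mean and standard deviation of their run values. The sketch below assumes each batted ball has already been reduced to a (group, run value) pair; the function name and the toy sample are hypothetical, and I use the population standard deviation since the source does not say which variant was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_run_values(batted_balls):
    """Return mean run value, std of run value, and count for each group.

    batted_balls: iterable of (group, run_value) pairs, where group can be a
    batted ball type ("FB") or a (type, spray) tuple such as ("FB", "Pull").
    """
    groups = defaultdict(list)
    for group, run_value in batted_balls:
        groups[group].append(run_value)
    return {
        group: {
            "runvalue": mean(values),
            "std": pstdev(values),  # population std; an assumption
            "count": len(values),
        }
        for group, values in groups.items()
    }


# Toy sample (made up): one fly ball leaves the park, one is caught,
# while two ground balls produce a single and an out.
sample = [("FB", 1.41), ("FB", -0.28), ("GB", 0.50), ("GB", -0.28)]
for group, stats in summarize_run_values(sample).items():
    print(group, round(stats["runvalue"], 3), round(stats["std"], 3), stats["count"])
```

Even in this tiny sample the fly balls show the wider spread, because their outcomes range from an out to a home run.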

Imagine a line drive, then imagine a fly ball. The picture is simple: a fly ball is more likely to leave the park and more susceptible to park conditions (wind, heat, positioning). But a fly ball can also look like a pop-up or “room service.” Hence a fly ball has the most variance in its run values of any of the three batted ball types.

While a line drive is the most likely to fall for a hit, it is mostly affected by luck (where it was hit, its speed off the bat), and its high run value supports the common knowledge that it is the most desirable (though not sustainable) outcome of the three batted ball types. Meanwhile, ground balls are mostly dependent on defensive positioning (shifts), surface, and batted ball speed. But if speed off the bat is a factor for all three, which it most likely is, and if spray in fact serves as a proxy for hit velocity, then the table below should show that a pulled batted ball has the highest run value within its batted ball bin.

Batted ball type and spray, run values
Batted Ball Type Spray runvalue std(runvalue) count(*)
Fly ball Pull 0.44 0.79 38,640
Line Drive Pull 0.42 0.46 47,879
Line Drive Center 0.32 0.43 84,540
Line Drive Opposite 0.29 0.43 30,552
Fly ball Center 0.01 0.56 127,122
Ground ball Opposite 0.01 0.39 34,283
Ground ball Pull -0.09 0.35 137,222
Fly ball Opposite -0.11 0.43 54,057
Ground ball Center -0.12 0.33 187,217

As expected, a pulled ball leads the way in run value for its batted ball bin. The exception is ground balls, where an opposite-field ground ball is worth a bit more than a pulled one. My guess is that this has to do with shifts and/or the fact that most pulled ground balls are rolled over. Still, fly balls have the most variance in run values, ahead of line drives and ground balls.

So with obvious differences between batted balls inside their own bins, there has to be a better way of representing how no two batted balls look the same, and how they should not be treated as such. Intuitively, a home run is a home run, context neutral. Meanwhile, a LD is not simply a LD regardless of context. A LD can be a single, a double, a triple, a home run, or an out, all possibilities with different run values. For this reason, expecting some regression in how often a player hits line drives, and expecting subsequent regression in BABIP and/or wOBA, is not wrong; it’s just not quite right. Instead, we should use the empirical data we have on the actual outcome value of a player’s batted balls to supplement the simple rate at which that outcome occurs. With run values in the picture, I care less about how often a player is hitting a line drive and more about what he is doing with a line drive.

Like I’ve said before, I’ll take a 15% LD rate from Giancarlo Stanton before I dare take a 45% LD rate from Juan Pierre. Expecting any regression in Ichiro Suzuki’s line drive rate to negatively affect his overall performance assumes he has a marginally better run value per line drive than the average player (after adjusting for park). But can we make that assumption solely based on the rate at which he hits line drives? The answer is clearly no, and it calls for a more transparent measure of batted ball performance based on the actual run values resulting from a player’s batted ball distributions.

Enter Weighted Batted Ball Runs Above Average

We want to create a measure of how valuable a player’s batted ball type is to their production. We can do this using run values for the events created by that batted ball type.

Dave Studeman has tirelessly researched and produced articles revolving around the topic of batted ball production. His work in the area shows exactly how batted ball rates tend to lose their usefulness when we want to describe the actual run production of a player’s batted balls. So what I introduce today is not news; it’s merely my interpretation of some of the great work from those who came before me.

As an introduction, let’s say a player has 50 line drives, and in those 50 line drives he has created 21.05 runs from 45 singles and 5 outs. We want to compare this player’s performance to what the average player would do in as many LDs, and then adjust that measure for park factors, or how much more frequently a LD was recorded in the parks where the line drives were hit compared to the relative frequency in all other parks. Since I am using the same data set as Bill Petti’s spray tool, I’ll use the same run values, where: “-.28 – outs, .5 – singles, .79 – doubles, 1.07 – triples, 1.41 – home runs”.

The first step is to find a player’s run value per ball in play (BIP). Let’s use fly balls as an example: 2013 Chris Davis had a run value of 0.46 per FB. That is basically the value of a single for each fly ball hit, but it needs to be contextualized by the league average run value per FB, which hovers around 0.05-0.06 runs per fly ball. So Chris Davis’ RVfbaa (Run Value per Fly Ball Above Average) was around 0.41.

Great. Now we need to account for park so that we can feel better about the possible classification errors and park effects on run values per batted ball type. This is the last step, and the following formula yields wRVfb (Weighted Run Value of Fly Ball):
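The first two steps can be sketched as follows, using the run values quoted above. The function and dictionary names are hypothetical. Note that with the weights exactly as quoted, the 45-single, 5-out example comes to 21.10 runs; the 21.05 in the text would correspond to an out value of -0.29.

```python
# Run values per event, as quoted in the text.
RUN_VALUES = {
    "out": -0.28,
    "single": 0.50,
    "double": 0.79,
    "triple": 1.07,
    "home_run": 1.41,
}


def total_run_value(event_counts):
    """Sum the run values over a dict of {event: count}."""
    return sum(RUN_VALUES[event] * n for event, n in event_counts.items())


def run_value_per_bip(event_counts):
    """Average run value per ball in play for one batted ball type."""
    n_bip = sum(event_counts.values())
    if n_bip == 0:
        raise ValueError("no balls in play")
    return total_run_value(event_counts) / n_bip


def run_value_above_average(player_rv_per_bip, league_rv_per_bip):
    """Player's run value per BIP minus the league average (e.g. RVfbaa)."""
    return player_rv_per_bip - league_rv_per_bip


line_drives = {"single": 45, "out": 5}
print(round(total_run_value(line_drives), 2))    # 21.1
print(round(run_value_per_bip(line_drives), 3))  # 0.422

# 2013 Chris Davis on fly balls: 0.46 per FB against a league rate near 0.05.
print(round(run_value_above_average(0.46, 0.05), 2))  # 0.41
```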

wRVfb = (RVfbaa - (PF_FB/100 - 1) * AverageRV/FB * Player FB) / (PF_FB/100)

For 2013, Davis’ wRVfb was 0.41, given that Camden Yards had a FB park factor of 100 (league average). Follow the same process for line drives and ground balls and you’re all set with values that assess a player’s batted ball performance in terms of run values. Now we have measures of how valuable a player’s batted ball outcomes are, in addition to the raw probability that they occur. Let’s take a look at some of the leaders and losers.
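For readers who want to try the park adjustment themselves, here is a literal transcription of the wRVfb formula into Python. The argument names are mine, and the fly ball count in the example is a placeholder because Davis’ actual total is not given above. With a park factor of 100 the adjustment term is zero, so the placeholder does not affect the result.

```python
def weighted_run_value(rv_above_avg, park_factor, league_rv_per_bip, player_bip):
    """Park-adjusted run value for one batted ball type, per the formula above.

    rv_above_avg: player's run value per BIP above league average (RVfbaa).
    park_factor: park factor for the batted ball type, where 100 is neutral.
    league_rv_per_bip: league average run value per BIP of that type.
    player_bip: the player's count of that batted ball type ("Player FB").
    """
    pf = park_factor / 100.0
    if pf <= 0:
        raise ValueError("park factor must be positive")
    return (rv_above_avg - (pf - 1.0) * league_rv_per_bip * player_bip) / pf


# 2013 Chris Davis: RVfbaa of 0.41 in a park with a FB factor of 100.
# player_bip=150 is a made-up placeholder, not his real fly ball total.
print(weighted_run_value(0.41, 100, 0.05, player_bip=150))  # 0.41
```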

Best and Worst Fly Ball Producers, 2008-2013, min 500 BIP
Best wRVfb BIP
Jeremy Hermida 0.34 643
Jim Thome 0.33 533
Mark Reynolds 0.25 1,451
Mike Napoli 0.24 1,484
Pedro Alvarez 0.22 1,034
Worst wRVfb BIP
Omar Vizquel -0.20 873
Alberto Gonzalez -0.21 625
Jemile Weeks -0.22 662
Cesar Izturis -0.25 1,181
Emmanuel Burriss -0.25 573

Best and Worst Line Drive Producers, 2008-2013, min 500 BIP
Best wRVld BIP
Matt Carpenter 0.15 791
Mark Trumbo 0.14 1,120
Alejandro De Aza 0.13 965
Chris Johnson 0.12 901
Derrek Lee 0.12 886
Worst wRVld BIP
Ben Francisco -0.14 677
Justin Turner -0.15 639
Matt Diaz -0.15 641
Kevin Frandsen -0.17 530
Eugenio Velez -0.17 533

Best and Worst Ground Ball Producers, 2008-2013, min 500 BIP
Best wRVgb BIP
Lorenzo Cain 0.10 562
Lastings Milledge 0.09 680
Andrew McCutchen 0.09 2,067
Austin Jackson 0.09 1,693
Mike Trout 0.08 905
Worst wRVgb BIP
Craig Counsell -0.07 798
Eugenio Velez -0.07 533
J.P. Arencibia -0.07 784
Blake DeWitt -0.08 536
Jim Thome -0.08 533

Finally, some facts about the wRV metrics:

  1. There is very little correlation between the wRV metrics and LD%, GB%, and FB%, respectively. In fact, the correlation between LD% and wRVld was 0. In other words, batted ball run values are pretty much independent of the rate at which each batted ball type is hit.
  2. ISO explains nearly 70% of the variation in wRVfb, so it is a pretty great proxy for the value of a player’s fly balls.
  3. A simple regression on wRVfb, wRVld, and wRVgb explains nearly 70% of the variation in wOBA, while the same regression with BB% and K% added accounts for around 35% of wOBA in year two.
  4. Below are the year-to-year correlations, and the data can be found here:
wRV metrics, year-to-year correlations
Metric R
wRVfb 0.573
wRVgb 0.286
wRVld 0.262
wRVoverall 0.547
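For anyone who wants to reproduce a year-to-year check like this, a minimal sketch follows. It assumes the R column is a Pearson correlation between a player’s value in one season and his value in the next; the function name is hypothetical and no player data is included. Squaring R gives the share of next-year variance that a simple linear fit on the prior year would explain.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between paired samples (e.g. year 1 vs. year 2)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / sqrt(sxx * syy)


# R values from the table above, converted to variance explained (R squared).
year_to_year_r = {"wRVfb": 0.573, "wRVgb": 0.286, "wRVld": 0.262, "wRVoverall": 0.547}
for metric, r in year_to_year_r.items():
    print(metric, round(r * r, 3))
```

Read this way, prior-year wRVfb accounts for roughly a third of the next year’s variation, while the line drive and ground ball versions account for well under a tenth.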

So these metrics behave much like batted ball rate metrics, in that they are subject to a lot of year-to-year variation: line drive and ground ball production are the hardest to maintain, while fly ball production remains relatively consistent. So what’s the culprit here? My guess is defensive positioning and shifts. For pull hitters and home run hitters, I suspect these numbers are pretty consistent once a shift is found to limit their effectiveness, while more balanced hitters are subject to more random variation.

There is a lot more analysis to do here, however. In my piece tomorrow, I create a shift breakeven point and a metric that isolates players who should be shifted based on their batted ball run values relative to their spray tendencies. In the future, I expect to regress run values on the distance a ball was hit from the fielder, so that we can isolate players who have overperformed due to faulty fielder positioning and could expect regression once better alignment is in place.

References and Resources

  • Thanks to Jeff Zimmerman for the Gameday data and distance/angle code. Also to Major League Baseball Advanced Media for publicly providing the Gameday data.
  • Studeman, Dave. “Pictures of Batted Balls.” The Hardball Times. Jan. 5, 2006.
  • “WRAA For Position Player WAR Explained.” Baseball-Reference.com. June 2, 2014.

Max Weinstein is a baseball analyst. He has written for Fangraphs, The Hardball Times, and Beyond the Box Score. Connect with him on Twitter @MaxWeinstein21 or email him here.
The Stranger
9 years ago

This is rather counterintuitive, since the ability to drive the ball to all fields is generally considered to be a good thing. I get that opposite-field balls aren’t as well-hit on average, and that you’re looking at spray tendencies as a proxy for quality of contact. But I’m not sure that’s a good proxy to use, since it seems like it would have an inherent bias towards pull hitters at the expense of guys who can use all fields effectively.

9 years ago
Reply to  The Stranger

I think that’s one of the key issues Max is addressing in the last paragraph. Players who can use all fields are more shift-proof and would be less likely to suffer from an optimized defensive arrangement.

Matthew Yaspan
9 years ago
Reply to  The Stranger

I don’t think Max is attempting to evaluate hitters by whether they hit more balls to the pull-side or not, but to characterize what the average is to better understand the myriad of different values a batted ball can take on, rather than treating them all equally.

9 years ago

Max, I’m wondering if there’s any way to look at the average “UZR-against” for each of the bins. I’d love to know if there’s any significant correlation between the “UZR-against” and these weighted run values for any given player.

Presumably, since UZR doesn’t capture the “quality” of the balls hit into a particular zone, we might be able to adjust UZR on any given BIP to account for the hitter’s average weighted run value for that type of BIP (assuming, of course, that a meaningful correlation exists)

9 years ago
Reply to  Max Weinstein

If you can look into that, I’ll look into your Dominican birth certificate to see if you’re really a teenager 😉

Cause if you are, you just might be the Mike Trout of the saber-verse.

Excellent stuff, as always!

9 years ago

Always welcome another analysis from you, Max.

I’m glad you mentioned the difficulty in classifying LD vs. FB, because I have long wondered about that. It’s obviously a convenient classification good for generally describing what happened, but for serious analysis needs to give way to more precise definitions.

I was at first surprised to see the big difference in BABIP and RV for FB (and to a lesser extent, LD) pull vs. center and opposite. I understand that pulled balls are likely to be hit harder, with more authority, but against that I would have thought the OF would be positioned expecting the pulled ball more likely. That is either not the case or doesn’t contribute much.

I suppose in the case of FBs that’s because most of the run value is from HRs (and technically they are not BIP)? Depending on just how FB vs. LD classification is made, I would think FBs to any field that don’t leave the park would have about an equal chance of being caught? So that most of the greater value of pulled FBs is from being much more likely to be HRs?

But since there is also a substantial spray difference for LDs, it does raise the question: if shifts work for the infield, as the greater value of opposite field GBs suggest, could they/should they be used more in the outfield? That is, should the OF positioning take into account that a pulled LD will be hit harder, and thus give the OFer less time to make a play on it?

The other surprise for me (being somewhat new to this subject) is the enormous variation in RV of FBs for different players. Some players obviously make their living off them, while for others a FB is clearly the one thing they don’t want.

9 years ago
Reply to  Max Weinstein

So you did exclude HRs. I wasn’t clear on this. Then it does make the difference between pull and the other two directions remarkable. I agree with your last sentence. I’m thinking now that whether a ball is caught or not may influence the classification, i.e., if two balls are hit with the same trajectory, and in the same direction, but one is soft enough to give the OFer a chance to run it down, whereas the other is not, the former may be classified as a FB while the latter is a LD.

Matthew Yaspan
9 years ago

Great work, Max, looking forward to more.

9 years ago

I really enjoyed this as I’ve been using league-specific tables very similar to your “BATTED BALL TYPE AND SPRAY, RUN VALUES” table to evaluate pitcher performance at all levels of affiliated domestic baseball for going on 2 years now.

When you switch over to evaluating batter performance via that sort of approach, the average run values of outfield flies to the 3 zones vary widely (unsurprisingly) with the batter’s power. The run values of line drives also vary with the batter’s power though not as dramatically. The run values of groundballs do not vary with the batter’s power.

Here is a bit of 2013 MLB data for AL non-pitcher batters. I split the batter sample up into 4 power quartiles using an ISO on batted balls stat that is park-adjusted. PQ1 is the highest batter power quartile (the most powerful 25% of them), while PQ4 is the lowest one (the least powerful 25% of them).

Event- Average run value for Event by Batter Power Quartile
OF Fly Pull- PQ1=+0.60 runs, PQ2=+0.47 runs, PQ3=+0.38 runs, PQ4=+0.24 runs
OF Fly Center- PQ1=+0.08 runs, PQ2=-0.04 runs, PQ3=-0.06 runs, PQ4=-0.10 runs
OF Fly Oppo- PQ1=+0.05 runs, PQ2=-0.08 runs, PQ3=-0.08 runs, PQ4=-0.15 runs
LD Pull- PQ1=+0.42 runs, PQ2=+0.39 runs, PQ3=+0.37 runs, PQ4=+0.30 runs
LD Center- PQ1=+0.33 runs, PQ2=+0.29 runs, PQ3=+0.30 runs, PQ4=+0.25 runs
LD Oppo- PQ1=+0.34 runs, PQ2=+0.27 runs, PQ3=+0.24 runs, PQ4=+0.18 runs
GB Pull- PQ1=-0.11 runs, PQ2=-0.11 runs, PQ3=-0.11 runs, PQ4=-0.10 runs
GB Center- PQ1=-0.02 runs, PQ2=+0.00 runs, PQ3=+0.01 runs, PQ4=-0.02 runs
GB Oppo- PQ1=-0.05 runs, PQ2=-0.03 runs, PQ3=-0.05 runs, PQ4=-0.07 runs

I imagine that segregating the batters by speed might yield some advantage for the faster over the slower on groundballs, though I’d expect the run value increases to be on the small side given that most of the beneficial events would stand to be singles.

9 years ago
Reply to  reillocity

Out of curiosity, how did you split the hitters into power quartiles?

This is very interesting. The speed factor would be great to incorporate for ground balls, though the meaningful split might be 3B side vs. up the middle vs. 1B side (length of throw being more critical than power of ball being hit for grounders). It might also be interesting to see how the OF and even LD values split by speed given the run value of stretching a single to a double etc.

The speed factor would probably require something like average of top 10% times running from home to first, since indicators based on SB rates, etc. might be driven by skills not necessarily correlated to speed.

9 years ago
Reply to  tz

As a first attempt at applying this approach to hitters, I simply used same-season ISO on nonbunt, nonfoulout batted balls in road parks to define each hitter’s power (in truth I’d prefer to use multiple prior seasons data rather than current season data). I then computed the mean and standard deviation on that stat and used those parameters to split the pool into 4 quartiles.

I share similar thoughts on the impact of speed on these event run values.
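The quartile split described in the reply above could be sketched like this. It is a guess at the method: I am assuming the mean and standard deviation were used to place cutoffs at the 25th, 50th, and 75th percentiles of a fitted normal distribution, and the function names and sample ISO values are made up.

```python
from statistics import NormalDist, mean, stdev


def power_quartile_cutoffs(iso_values):
    """Return the 25th/50th/75th percentile cutoffs of a normal fit to ISO."""
    fitted = NormalDist(mean(iso_values), stdev(iso_values))
    return [fitted.inv_cdf(p) for p in (0.25, 0.50, 0.75)]


def power_quartile(iso, cutoffs):
    """Assign PQ1 (most powerful) through PQ4 (least powerful)."""
    low, mid, high = cutoffs
    if iso >= high:
        return "PQ1"
    if iso >= mid:
        return "PQ2"
    if iso >= low:
        return "PQ3"
    return "PQ4"


# Made-up ISO-on-batted-balls values for five hitters.
isos = [0.10, 0.15, 0.20, 0.25, 0.30]
cutoffs = power_quartile_cutoffs(isos)
print([power_quartile(iso, cutoffs) for iso in (0.30, 0.22, 0.15, 0.10)])
```

Because the cutoffs come from a fitted distribution rather than from sorting the hitters, the four groups will not contain exactly 25% of the pool each unless ISO is close to normally distributed.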