Explaining the 2017 MLB Home Run Record with Quality of Pitch

Marco Estrada has seen his fly ball rate increase in each of the last three seasons. (via Keith Allison)

Editor’s note: The authors would like to offer special thanks to Jordan Wong for his research into the juiced ball and power batters, and to Jeremiah Chuang for R calculations and editing. Their work on this piece was invaluable.

1. Introduction

In 2017, Major League Baseball experienced a record number of home runs (6,105). It was a big jump from 2016 (5,610), which was already a spike from 2015 (4,909). The previous record was 5,693 home runs in 2000 (see Figure 1 and Table 1). Since our earlier work has shown a correlation between the Quality of Pitch (QOP) average (QOPA) and home runs (see Figure 1), we wondered if QOP™ could help to explain the increase in home runs allowed in 2017.

Table 1: Home Runs (HR) per Year During Regular Season
Year                2008   2009   2010   2011   2012   2013   2014   2015   2016   2017
HR                  4878   5042   4613   4552   4934   4661   4186   4909   5610   6105
HRYear/HRPrevYear  0.984  1.034  0.915  0.987  1.084  0.945  0.898  1.173  1.143  1.088
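
The ratio row in the table above is plain year-over-year arithmetic, and a short sketch makes it easy to re-check. The function name below is ours, not the authors'. The 2008 ratio cannot be recomputed here because it depends on the 2007 total, which the table does not list.

```python
# Regular-season home run totals copied from the table above.
HR_BY_YEAR = {
    2008: 4878, 2009: 5042, 2010: 4613, 2011: 4552, 2012: 4934,
    2013: 4661, 2014: 4186, 2015: 4909, 2016: 5610, 2017: 6105,
}


def year_over_year_ratios(totals):
    """Return {year: total / previous listed year's total}.

    The first year is skipped because it has no predecessor in the data.
    """
    years = sorted(totals)
    return {cur: totals[cur] / totals[prev] for prev, cur in zip(years, years[1:])}


ratios = year_over_year_ratios(HR_BY_YEAR)
for year, ratio in ratios.items():
    print(f"{year}: {ratio:.3f}")
```

The three most recent ratios are all above 1, which is the three-season run of increases the introduction describes.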

The two main explanations in the literature for the home run increase are a change in ball manufacturing (the “juiced ball” theory) and batter approach. Regarding the composition of the balls, Commissioner Rob Manfred has asserted there were no requested manufacturing changes; this was corroborated by Alan Nathan’s affirmative review of the baseball testing report. It should be noted that the properties measured in that report were external characteristics of the ball, with a modest range of variation due to the organic materials.

This statement was in tension with Ben Lindbergh and Mitchel Lichtman’s alternative external testing. Rob Arthur and Tim Dix later reported on studies that cut open balls and measured their internal characteristics, providing evidence of some slight changes to the cores of balls used after the 2015 All-Star Game. Despite possible changes, we do not believe they are sufficient to completely explain the uptick in home runs. Consider Cork Gaines and Skye Gould’s statement, “If the baseballs suddenly changed at the 2015 All-Star break, we would expect home runs to spike immediately and then level off, but that is not what we have seen. In fact, home run rates have seemingly been increasing at a steady rate the last three seasons — and possibly longer if we consider 2014 the odd season and not 2015.”

As for the batter explanation, it is clear that as batters have become physically bigger and stronger, some hitting coaches have also been coaching a newer approach: the all-or-nothing upward swing. As the number of home runs has increased, so has the number of strikeouts and doubles (see Jeff Passan and Tyler Kepner), affirming a real batter effect on results. But if changes in batter approach and the baseball explain the home runs, then why are so many people still talking about it?

We propose that a drop in pitch quality is an additional factor influencing the home run increase, distinct from any ball and batter effects. Our thesis is that a sharp reduction in vertical break and a change in horizontal break from 2016 to 2017 are significant factors that partially explain the record number of home runs allowed. We will provide evidence of this using the characteristics of our QOP™ metric. The QOP value, QOPV, is calculated from the rise, breaking point, vertical break, horizontal break, location and speed of a single pitch. The scale runs from roughly 0 to 10, and the larger the value, the better the pitch. The major league QOP average, QOPA, is around 4.5, with a median around 5.0. For additional details, see www.qopbaseball.com.

It should be noted at the outset that ball changes may have effects on either the batter (e.g. exit velocity) or pitcher (e.g. grip). Also, batter changes may affect the pitcher (e.g. pitching higher in the zone to combat the upward swing), as Travis Sawchik, Matthew Trueblood and Mark Gonzales have noted. In other words, there are interactions among ball, batter and pitcher effects. Irrespective of the precise cause of the pitch quality change, we have attempted to measure the kind and extent of the changes. This article is a summary of our extensive research into the relationship between the attributes of 2016 and 2017 pitches and their incidence of home runs. The full report can be found on our website and will be referred to as the Full Report throughout this article.

In Section 2, we analyze pitch characteristics and propose an explanation for how the differences observed in QOP explain the increase in home runs. In Section 3, we provide evidence that the differences observed from 2016 to 2017 are not explainable by the switch from PITCHf/x to Trackman alone — the drop in QOPA is real. We interpret the evidence in Section 4 and conclude in Section 5.

Table 2: Historical QOP Averages (QOPA)
Year  Stat        All       CH       CU       FF       FT       SI       SL
2017  qopMax     9.91     9.35     8.91     9.91     9.71     9.67     9.13
2017  qopAvg     4.56     4.35     4.36      4.7     5.03     4.97     4.11
2017  NP      729,396   73,202   61,708  260,069  102,405   43,931  119,895
2016  qopMax     9.99     9.11     9.08     9.66     9.91     9.99     8.87
2016  qopAvg     4.59     4.37      4.4     4.82     5.08     5.08     4.24
2016  NP      715,245   73,372   62,305  258,726   95,847   48,133  108,807
2015  qopMax      9.9      8.8     9.13     9.76     9.66      9.9      8.9
2015  qopAvg     4.58     4.32     4.36     4.81      5.1     5.05     4.22
2015  NP      712,273   75,688   54,158  255,565   92,489   57,211  103,159
2014  qopMax     9.75     8.83      9.3     9.71     9.75     9.75     9.09
2014  qopAvg     4.57     4.34     4.52     4.75     5.08      5.1     4.21
2014  NP      708,663   73,207   58,169  243,028   94,649   63,655  101,178
2013  qopMax       10     8.84     9.35       10     9.63     9.69     9.13
2013  qopAvg     4.57     4.34     4.61     4.74     5.08     5.08     4.24
2013  NP      720,217   72,968   62,216  253,062   96,194   60,891  110,982
2012  qopMax    10.03     9.36     9.02     9.99     9.73    10.03     8.99
2012  qopAvg     4.57     4.29     4.65     4.73     5.09      5.1     4.26
2012  NP      723,185   73,427   65,565  245,513   90,358   73,545  110,055
2011  qopMax    10.21        9     9.15     9.62     9.76    10.21     8.89
2011  qopAvg     4.47     4.24     4.64     4.61     5.01     5.01     4.21
2011  NP      717,060   73,924   59,068  238,933   83,149   83,855  111,262
2010  qopMax    10.31     8.83     9.24      9.5     9.76    10.31      9.1
2010  qopAvg     4.46     4.16     4.52     4.63     4.99     5.03     4.12
2010  NP      737,143   78,678   60,453  241,957   85,912   97,644  107,287
2009  qopMax     9.98     9.49     9.01     9.71     9.84     9.98     9.08
2009  qopAvg     4.51     4.25     4.58     4.66     5.02      5.1     4.17
2009  NP      711,945   69,474   58,376  243,332   80,601   94,594  106,801
2008  qopMax    10.07     9.11        9     9.78    10.07     9.84     9.15
2008  qopAvg     4.47     4.26      4.6     4.62        5        5      4.2
2008  NP      702,619   69,244   56,390  238,225   73,995  102,170  104,175
QOPA dropped in 2017 for all six pitch types shown. For the same stats for every pitch type, see www.qopbaseball.com. The differences in QOPA between 2016 and 2017 are all statistically significant, with p-values of 10^-11 or less.

2. The Pitch Characteristics of 2017

The graphs below show the means of the six components used to calculate QOP, from 2008 to 2017. Although the exact formula for QOP™ is proprietary, it goes something like this:

QOPV = –Rise + Breakpt + Tot.brk + H.brk2 – Loc + Speed

Rise, breaking point (Breakpt), vertical break (Tot.brk) and horizontal break (H.brk2) are measured in feet. Location (Loc) uses a function of the distance from the corners of the strike zone, where the farther the ball is from the corners, the larger the value. Velocity (Speed) is measured in mph (see the Full Report, footnote 12, for a discussion of the issues related to start_speed in 2017). Rise and location have negative coefficients because they are considered to decrease pitch quality.
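
To make the sign conventions concrete, here is a minimal sketch of a signed weighted sum over the six components. It is not the QOP formula: the real coefficients are proprietary, so every weight below is a hypothetical placeholder, the two example pitches are made up, and the resulting score is not on the 0 to 10 QOPV scale. Only the signs come from the article.

```python
from dataclasses import dataclass


@dataclass
class Pitch:
    rise: float     # feet
    breakpt: float  # breaking point, feet
    tot_brk: float  # vertical break, feet
    h_brk: float    # horizontal break, feet (the H.brk2 term above)
    loc: float      # location value; larger means farther from the corners
    speed: float    # mph


# HYPOTHETICAL weights. The real QOP coefficients are proprietary;
# only the signs (negative for rise and location) follow the article.
WEIGHTS = {
    "rise": -1.0,
    "breakpt": 1.0,
    "tot_brk": 1.0,
    "h_brk": 1.0,
    "loc": -1.0,
    "speed": 0.05,
}


def qopv_sketch(pitch, weights=WEIGHTS):
    """Signed weighted sum of the six components (illustration only)."""
    return sum(w * getattr(pitch, name) for name, w in weights.items())


# Two made-up pitches that differ only in vertical break.
more_drop = Pitch(rise=0.1, breakpt=2.0, tot_brk=3.0, h_brk=0.5, loc=1.0, speed=90.0)
less_drop = Pitch(rise=0.1, breakpt=2.0, tot_brk=2.5, h_brk=0.5, loc=1.0, speed=90.0)

print(qopv_sketch(more_drop) > qopv_sketch(less_drop))
```

With any positive weight on vertical break, losing vertical break lowers the score while everything else is held fixed, which is the direction of effect the article relies on.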

The middle line in the graphs below is the mean of the data, and the upper and lower control limits (UCL and LCL) are the mean +/- three standard deviations. These graphs are called control charts and are routinely used in manufacturing quality control to detect when a process is within historic limits and when it is extreme.
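
The limit calculation itself is short. The sketch below applies it to the yearly home run totals from the table in Section 1, because the per-pitch component data are not reproduced in this article. Using the sample standard deviation is our assumption, since the text does not say which estimator the charts use, and the function names are ours.

```python
from statistics import mean, stdev

# Regular-season home run totals from the table in Section 1.
HR_BY_YEAR = {
    2008: 4878, 2009: 5042, 2010: 4613, 2011: 4552, 2012: 4934,
    2013: 4661, 2014: 4186, 2015: 4909, 2016: 5610, 2017: 6105,
}


def control_limits(values, k=3.0):
    """Return (LCL, center line, UCL) as mean -/+ k standard deviations."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (an assumption)
    return center - k * spread, center, center + k * spread


def out_of_control(series, k=3.0):
    """List the keys whose values fall outside the control limits."""
    lcl, _, ucl = control_limits(list(series.values()), k)
    return [key for key, value in series.items() if value < lcl or value > ucl]


lcl, center, ucl = control_limits(list(HR_BY_YEAR.values()))
flagged = out_of_control(HR_BY_YEAR)
print(f"LCL={lcl:.1f}  mean={center:.1f}  UCL={ucl:.1f}  flagged={flagged}")
```

The same two functions could be pointed at a series of yearly component means to reproduce the kind of flagging described for vertical and horizontal break.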

The following are the graphs of the change in the six pitch components from 2008 to 2017. These are formal control charts. There are no error bars or confidence interval bars shown on the graphs because the bars are the size of the dots due to the enormous number of pitches.

The following table attempts to summarize the salient observations of the trends in the mean of the components (except for location where we used median due to skewness).

Table 3: Primary Observations About Changes in 2017 Pitch Components
Component         2008-16 Trend     2017 Feature
Rise              Strong Decline    On Trend
Breaking point    Moderate Decline  On Trend
Vertical break    Relatively Flat   Sharp Drop, Historic Min
Horizontal break  Modest Decline    Sharp Increase, Historic Max
Location          Relatively Flat   On Trend
Velocity          Moderate Incline  On Trend

The most consistent trend is a steady increase in pitch velocity from 2008 to 2017, although this tapers off in 2015 and 2016. A close look reveals that rise simultaneously has a decreasing trend. Two components have flagged extrema for 2017: vertical break and horizontal break. Our explanation is that in the wake of the increase in home runs allowed in 2016 — possibly due at least in part to the ball or batter approach — pitchers made adjustments in 2017 in an attempt to reduce batted ball launch angle and exit velocity. The result, however, is a loss of some vertical break and horizontal positioning.

In the QOP™ formula, increased horizontal break adds to QOPV while decreased vertical break subtracts from it. Of these competing components, the dominant change is vertical break, resulting in an overall reduction in QOPA. The relationship between pitch components and home runs is demonstrated with an explanatory logistic regression model in Section 4. We propose that with less vertical movement, batters have had a narrower range within which to successfully connect with the ball, resulting in more home runs.

The evidence of a change in vertical and horizontal break in 2017 is very strong. We propose that it is this change that results in lower QOPAs for 2017. This raises the question, “Why did the vertical and horizontal break change?”

3. PITCHf/x vs. Trackman

In this section, we summarize the analyses performed to assess whether the switch from the PITCHf/x camera system to the Trackman Doppler radar significantly affected the accuracy or precision of the trajectory measurements. For the details, please see the Full Report.

  • Data Source. Trackman and MLB Advanced Media implemented Trackman simultaneously with PITCHf/x in 2015 and monitored both in 2016, giving them two years to cross-check the systems. According to private conversations with MLBAM engineers, getting the measurements right was the explicit goal of this overlap.
  • Nature of differences. Two kinds of differences might be observed with the change to Trackman: bias (systematic inaccuracy) and increased variation (loss of precision). We looked at these issues and concluded that the changes in vertical break (decrease) and horizontal break (increase) do not match what would be expected if they were due to the measurement system. Bias in the measurement system would result in a consistent increase (or consistent decrease) in rise, breaking point, vertical break and horizontal break. But this is not what was observed. Again, increased variation would be observed in rise, breaking point, vertical break and horizontal break, but it was not. We also surveyed the literature reporting these issues. Even if the bias reported by the most critical literature were present (up to 0.3 in. vertical and 0.2 in. horizontal, per Kyle Boddy), it is not nearly enough to offset the measured change in either vertical (–1.68 in.) or horizontal break (+0.36 in.).
  • Signal to Noise Ratio. Another way to look for changes is the ratio of the mean signal to the mean noise. This ratio for each component, if anything, only showed a relative improvement for vertical break in 2017.
  • Regression Model. We ran a multiple regression model with the six pitch components and five additional relevant variables in order to see if the model properties changed. They did not. This included the sum of squared error and residual squared error, which were within the range of previous years.
  • Individual Pitchers. We looked at 13 major league pitchers who pitched from 2015 through 2017, had diverse repertoires, and were believed not to have made significant changes to their pitching. Each comparison covered a single pitch component (out of six) for one pitch type (an average of 3.78 pitch types per pitcher) across the three years (2015-2017). This resulted in 294 multiple density graphs and 294*3=882 Kolmogorov-Smirnov tests of difference between distributions. The individual pitch component patterns turned out to be mostly the same for each pitcher across all three years. When there were differences, the odd year out was sometimes 2017 and sometimes 2016 or 2015. If the differences were due to Trackman, we would expect a pattern of 2017 differences across the trajectory components, but this was not what we observed. As an example, see Marco Estrada’s change-up in Figure 3. See Appendix B of the Full Report for the rest of the 294 graphs.
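
For readers unfamiliar with the Kolmogorov-Smirnov comparison used in the individual-pitcher analysis, the sketch below computes the two-sample statistic (the largest gap between two empirical distribution functions) and the usual large-sample critical value at the 5 percent level. It is a generic illustration, not the authors' R code, and the sample values are made up.

```python
from bisect import bisect_right
from math import sqrt


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the maximum gap.
    for x in a + b:
        cdf_a = bisect_right(a, x) / n
        cdf_b = bisect_right(b, x) / m
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_critical_value(n, m, c_alpha=1.358):
    """Large-sample critical value; c_alpha=1.358 corresponds to alpha=0.05."""
    return c_alpha * sqrt((n + m) / (n * m))


# Made-up vertical break readings (feet) for one pitch type in two seasons.
season_a = [2.1, 2.4, 2.2, 2.6, 2.3, 2.5, 2.2, 2.4]
season_b = [2.0, 2.3, 2.2, 2.5, 2.4, 2.1, 2.3, 2.6]

d = ks_statistic(season_a, season_b)
threshold = ks_critical_value(len(season_a), len(season_b))
print(f"D={d:.3f}, 5% critical value={threshold:.3f}, differ={d > threshold}")
```

For each pitcher, pitch type and component, three such pairwise comparisons (2015 vs. 2016, 2015 vs. 2017, 2016 vs. 2017) would account for the 294*3 tests mentioned above. With real samples, a library routine that also returns a p-value would be the more practical choice.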

4. Explaining the Vertical Break Drop and Horizontal Break Increase

Having argued that the primary pitching changes in 2017 are a decrease in vertical break and an increase in horizontal break (Section 2), and that these cannot be explained by the switch from PITCHf/x to Trackman measurement (Section 3), it remains to interpret the meaning of these changes.

To do so, we developed a logistic regression model to explain the percentage chance each pitch would result in a home run (HR%). The explanatory variables used in the model were the six pitch components (rise, breaking point, vertical break, horizontal break, location and velocity), pitch type, the height of the batter, and handedness match-up (RR, RL, LR, LL; e.g. RL = right-handed pitcher vs. left-handed batter). We validated it by constructing the model with a randomly selected half of the data and using it to predict the HR% of the other half. This 50 percent random selection process was repeated 1,000 times, resulting in 83.7 percent of the models accurately predicting HR% (i.e. the prediction fell within the 95 percent confidence interval of the HR% for the validation data).
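
The validation loop described above can be sketched as follows. To keep the example self-contained, the logistic fit is replaced by a stand-in that simply predicts the training half's home run rate, so this shows the split-and-check protocol rather than the authors' model. The normal-approximation confidence interval, the function names and the data (6,000 made-up pitches) are all assumptions.

```python
import random
from math import sqrt


def wald_interval(outcomes, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion."""
    n = len(outcomes)
    p = sum(outcomes) / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def fit_rate_model(train_outcomes):
    """Stand-in for the logistic fit: predict the training-half HR rate."""
    return sum(train_outcomes) / len(train_outcomes)


def validated_share(outcomes, repeats, seed=0):
    """Share of random half-splits whose predicted HR rate lands inside
    the validation half's confidence interval."""
    rng = random.Random(seed)
    pitches = list(outcomes)
    half = len(pitches) // 2
    hits = 0
    for _ in range(repeats):
        rng.shuffle(pitches)
        train, valid = pitches[:half], pitches[half:]
        predicted = fit_rate_model(train)
        low, high = wald_interval(valid)
        hits += low <= predicted <= high
    return hits / repeats


# Hypothetical data: 6,000 pitches, 300 of them home runs (1 = HR, 0 = not).
pitches = [1] * 300 + [0] * 5700
share = validated_share(pitches, repeats=200)
print(f"{share:.1%} of splits validated")
```

Swapping `fit_rate_model` for a real logistic regression over the pitch components, and raising `repeats` to 1,000, would bring the sketch closer to the procedure in the text.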

Table 4: No-Intercept Logistic Regression Models by Pitch Type
              Change-Up        Curveball        4-Seam Fastball  2-Seam Fastball  Sinker           Slider
Rise            3.812 **        -4.750 ***        7.031 *         14.905 *          0.647            2.539
Break point    -0.237 ***       -0.05            -0.223 ***       -0.290 ***       -0.093 *         -0.212 ***
Vert. break    -0.806 ***       -0.981 ***       -0.089 ***       -0.321 ***       -0.238 **        -0.738 ***
Horiz. break   -0.369 ***       -0.576 ***        0.131 *         -1.334           -0.398 *         -0.269 **
Location       -0.026            0.033           -0.006            0.007           -0.01             0.014
Velocity       -0.054 ***       -0.124 ***       -0.071 ***       -0.056 ***       -0.071 ***       -0.074 ***
Height          0.554 *          0.974 ***        0.571 ***        1.172 ***        1.139 ***        0.487 **
RR              0.302            3.627           -1.208           -5.532 **        -4.185            1.565
RL             -0.025            3.488           -1.298           -5.212 **        -3.639            1.765
LR             -0.097            3.319           -1.273           -5.365 **        -3.856            1.65
LL             -0.256            2.857           -1.455           -6.184 ***       -4.491            1.12
The rows contain the regression coefficients of the model constructed using all of the variables (not the validation models). The asterisks indicate statistical significance, with * p-value<0.05, ** p-value<0.01, and *** p-value<0.001. For cells with no asterisks, the variable should not be considered as significantly influencing HR%.

The most statistically significant pitch component in the model across pitch types was vertical break, followed by velocity (see Table 4). Decreased vertical break was significantly associated with increased HR% across pitch types. Horizontal break generally lined up with what was expected (increasing horizontal break decreases HR%). Height was significant, while handedness match-up was not significant except for the two-seam fastball. However, there was a surprise: larger horizontal break increases HR% for the four-seam fastball.

Although we are still investigating, one possible explanation is that increasing the horizontal break for RR and LL match-ups moved the ball up the barrel, resulting in better contact for this pitch type. Since the four-seam fastball is the most common pitch type (36 percent), this could explain how the overall mean horizontal break and the overall HR% rose together, even though for some other pitch types it is a decrease in horizontal break that goes with an increase in HR%. This is an area of current research. Whatever the explanation, the model demonstrates that the pitch characteristics flagged in Table 4 are significant factors in explaining HR%, particularly vertical break.
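
One way to read the coefficients in Table 4 is as odds ratios: in a logistic model, raising a variable by one unit multiplies the odds of a home run by exp(coefficient), with the other variables held fixed. The sketch below does this for the vertical break row. The coefficients are copied from Table 4; treating one unit as one foot is an assumption based on the units given for the QOP components.

```python
from math import exp

# Vertical break coefficients copied from Table 4.
VERT_BREAK_COEF = {
    "Change-Up": -0.806,
    "Curveball": -0.981,
    "4-Seam Fastball": -0.089,
    "2-Seam Fastball": -0.321,
    "Sinker": -0.238,
    "Slider": -0.738,
}

# Horizontal break coefficient for the four-seam fastball, also from Table 4.
FF_HORIZ_BREAK_COEF = 0.131


def odds_ratio(coef, delta=1.0):
    """Multiplicative change in HR odds when the variable moves by `delta`."""
    return exp(coef * delta)


odds_ratios = {pitch: odds_ratio(c) for pitch, c in VERT_BREAK_COEF.items()}
for pitch, ratio in odds_ratios.items():
    print(f"{pitch}: odds x {ratio:.3f} per extra unit of vertical break")

ff_horizontal = odds_ratio(FF_HORIZ_BREAK_COEF)
```

Every vertical break ratio is below 1, meaning more vertical break goes with lower home run odds, while the four-seam horizontal break ratio is above 1, which is the surprise noted above.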

5. Conclusion

In 2017, there was a spike in home runs. Many have thought it was due to changes in ball manufacturing or a change in the hitter’s approach. While we see some evidence of this, we propose it is only one side of the equation. The other side of the equation is a drop in pitch quality.

Some of the reasons leading to the changes in pitch quality may be the pitchers’ reaction to batters, or a change in the ball seams affecting pitcher grip. Whatever these reasons may be, however, our point is that league-wide analyses by pitch type show that quality of pitch average (QOPA) dropped in 2017 and this drop was due primarily to a reduction of vertical break and change in horizontal break.

To uncover the nature of the relationship between the drop in QOPA and the increase in HR%, we constructed a logistic regression model. In validation, 83.7 percent of the fitted models predicted HR% within the 95 percent confidence interval of the held-out data. In the model, the most significant explanatory variable across pitch types was vertical break. In fact, every pitch component except location was statistically significant for at least one pitch type. Therefore, we conclude that changes in the pitch were a statistically significant factor in the increase of home runs in 2017, and vertical break was the most important. For more details, please see our Full Report.

Acknowledgements: We would like to thank statistician Don Lewis, Ph.D., who read an early draft of the Full Report and provided feedback that was instrumental in improving the quality of our analysis. We would also like to thank another statistician who provided valuable feedback on the problem of analyzing the relationship between home runs, pitch sequencing, the proportion of pitch types thrown in a season and handedness.


Jason Wilson is an associate professor of mathematics at Biola University. See his work at QOP Baseball and follow on Twitter @qopbaseball. Wayne Greiner is president of Greiner Agencies Inc. His company has represented numerous sporting goods manufacturers distributing their baseball, hockey and golf products in Canada for the past 32 years.
5 years ago

Really great stuff!

Brad McKay
5 years ago

This is excellent work. I’m a little confused though, did you analyze change scores in this analysis? Or did you simply show that QOPA is associated with HR%, that QOPA went down somewhat from 2016 to 2017, that the main reason it went down was reduced drop, the main pitch variable associated with HR% is drop, and thus concluded that reduced drop overall resulted in both decreased QOPA and increased HR%? If so, I’m wondering the proportion of variance in HR% that can be accounted for by change in QOPA?

Jason Wilson
5 years ago
Reply to  Brad McKay

Good question. You are correct in observing that, on the one hand, we explain HR% through the pitch components in the multiple regression model. On the other hand, the pitch components are combined to form QOPV, for a single pitch. These are separate. I actually did generate a logistic model of HR% in terms of only QOPV, and it worked for most pitch types – but it didn’t seem meaningful to me because it simply ranked the QOPVs and assigned the corresponding HR% in matching rank order. (You could literally do this with an arbitrary statistic.) Regarding proportion of variance in HR% explained – there is no good theory for this with logistic regression of which I am aware. A few candidates I found are below, but I didn’t do anything with them because I was not persuaded that they would be meaningful for this work. I am open to suggestions in this area.

Brad McKay
5 years ago
Reply to  Jason Wilson

Thank you for your reply. I wasn’t clear enough in my comment. I’m still uncertain whether your logistic regression analyzed change scores or just the data from 2017? I’m kind of assuming just the data from 2017… What I’m curious about is how much the change in QOPA scores from 2016 to 2017 is correlated with the change in HR% over the same period? You could set a minimum of total pitches thrown in each year and sample only the pitchers who reach those minimums. Calculate the change in QOPA from 2016 to 2017 for each pitcher, the change in HR% from 2016 to 2017 for each pitcher, then see how well the deltas correlate. You shouldn’t end up with a situation where HR%-delta is ranked independently of QOPA-delta. Perhaps I’m misunderstanding the details of your analysis, though. Did your logistic regression predict HR%-delta from pitch component-deltas?

Brad McKay
5 years ago
Reply to  Brad McKay

I tried to edit this as I posted prematurely – I’m pretty certain your analysis wasn’t of change scores. I’m not sure my suggestion would be as good as the pitch by pitch strategy you take above, but I also don’t know how you can analyze the contribution of a change in pitch quality to the change in home run rate that way either. Actually, the whole pitch by pitch modelling thing is pretty damn technical. Did you account for repeated measures? I bet some components have different effects depending on the motion that generates them – pitch isn’t independent of pitcher. The more I think about this analysis the more there seems to be to it. Kudos for taking it on!

Jason Wilson
5 years ago
Reply to  Brad McKay

You are correct, the logistic regression model is for the 2017 data. The reason for that is to build an explanatory model that identifies what, if any, components are significant in explaining HR%. It worked, and hence the explanation for which components. I like your idea though, thanks! Wayne was trying to get me to do something like it while we were working on the article but I was never happy with my approach. I think your idea is better than what I was doing – here are the results:
NP Cor
10 -0.030
50 -0.089
100 -0.139
150 -0.162
200 -0.159
300 -0.216
400 -0.157
500 -0.185
750 -0.158
1000 -0.167
These are Pearson’s correlation coefficients between QOPA-delta vs. HR%-delta from 2016 to 2017 for those pitchers who played both seasons with minimum number of pitches (NP). I see the predicted negative correlation. It’s weak, but for NP=50+ they’re statistically significant. So, we have something, but I personally would like to see a higher correlation. One issue is that, in the scatterplots, the bulk of the data is in a cloud around the middle of the graph, because the average range QOPAs get calculated for a variety of different reasons (different combinations of the components yielding nearly the same QOPA) – but that’s another story….