# Analyzing the Strike Zone as a Three-Dimensional Volume

*Editor’s Note: This piece was initially given as a presentation at the marvelous 2015 Saber Seminar.*

The strike zone has come under heavy fire lately from the analytics community, particularly here at The Hardball Times from the excellent Jon Roegele, because the zone has expanded greatly in recent years and helped contribute to the decrease in run scoring. It’s an issue, but there is another interesting topic in the strike zone conversation: the strike zone in three dimensions.

The strike zone is defined by the MLB rule book as the area over home plate that extends from midway between the belt and shoulders of the batter down to the bottom of the knees. The latter part of the definition is not new; however, it did come as a surprise to me to learn that the strike zone covers the entire plate and forms a pentagonal volume.

Most baseball fans, and arguably players, would assume that pitches are called as balls or strikes based on their position at the front of home plate. However, the rule book defines a strike as a pitch where “any part of the ball passes through any part of the zone.” Therefore, a ball that passes through the strike zone for a total of just one inch should technically be called a strike. This is where you may think of backdoor strikes: the slider that catches the outside corner of the plate.

The PITCHf/x cameras and system that track pitches and display them during television broadcasts show only where the pitches are at the front edge of home plate. However, home plate is over 1.4 feet long, offering more distance for the ball to curve in from the side or drop into the zone. This is especially crucial for pitches that are above the strike zone at the front edge of the plate. For a pitch released at 90 mph at a vertical release angle of -1 degrees, the ball will drop almost two inches over the course of that extra 1.4 feet of home plate, certainly enough drop for a pitch to fall into the strike zone. Therefore, it is not sufficient to judge the accuracy of ball and strike calls by the pitch location at the front of the plate alone. One must determine if a pitch passed through the three-dimensional version of the strike zone.
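As a rough sanity check on the two-inch figure, the drop over the plate can be estimated with simple projectile motion. This is only a sketch under loud assumptions that are not from the analysis itself: gravity is the only force (no drag, no Magnus force), and the ball is released 55 feet in front of the plate's front edge. The function name and its parameters are illustrative.

```python
import math

G = 32.174             # ft/s^2, gravitational acceleration
PLATE_DEPTH = 17 / 12  # ft, front edge to back tip of home plate


def drop_over_plate(speed_mph=90.0, release_angle_deg=-1.0, release_dist_ft=55.0):
    """Vertical drop (inches) while the ball travels the depth of the plate.

    Assumptions (not from the article): gravity only, no drag, no spin,
    and a release point 55 ft in front of the plate's front edge.
    """
    speed = speed_mph * 5280 / 3600           # mph -> ft/s
    angle = math.radians(release_angle_deg)
    v_forward = speed * math.cos(angle)       # constant, since drag is ignored
    v_vert0 = speed * math.sin(angle)         # negative means downward

    t_front = release_dist_ft / v_forward                  # time to reach front edge
    t_back = (release_dist_ft + PLATE_DEPTH) / v_forward   # time to reach back tip

    def height(t):
        # Height relative to the release point at time t
        return v_vert0 * t - 0.5 * G * t * t

    return (height(t_front) - height(t_back)) * 12         # ft -> inches


print(f"{drop_over_plate():.2f} inches")
```

With these inputs the estimate lands at roughly two inches, in line with the figure quoted above. A real fastball carries backspin, which reduces the drop, and drag, which lengthens the time over the plate, so the number should be read as an order-of-magnitude check only.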

The PITCHf/x cameras provide information regarding the initial position, velocity and acceleration in all three dimensions, and this is crucial to determining the trajectory of the pitch. But before we tackle the trajectory of the ball over the home plate area, we must address the geometry of the dish itself. The plate is 1.42 (17/12) feet deep and the same length across at the front edge. Halfway back toward the catcher (17/24 of a foot back), it angles back to form a right isosceles triangle that narrows from 1.42 feet to zero feet across. However, one must also add the radius of the ball to each edge to account for the fact that if any part of the ball crosses the zone, the pitch should be called a strike. Additionally, PITCHf/x records estimates of the top and bottom of the strike zone for each pitch. Given all of this information, I iterated back from the front edge of home plate to the back edge to see if the pitch ever crossed into the zone.
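The plate geometry described here can be written down as a small helper. This is a minimal sketch rather than the code used for the study: the function names are hypothetical, the ball radius of 1.45 inches is an assumed value, and the radius is simply added to the half-width at every depth, which slightly understates the margin along the angled back edges.

```python
PLATE_WIDTH = 17 / 12    # ft, width across the front edge
PLATE_DEPTH = 17 / 12    # ft, front edge to back tip
BALL_RADIUS = 1.45 / 12  # ft; assumed value (a baseball is roughly 2.9 in across)


def plate_half_width(depth_ft):
    """Half-width of the plate (ft) at a given distance behind the front edge."""
    if depth_ft < 0 or depth_ft > PLATE_DEPTH:
        return None                    # not over the plate at all
    if depth_ft <= PLATE_DEPTH / 2:
        return PLATE_WIDTH / 2         # rectangular front half
    return PLATE_DEPTH - depth_ft      # 45-degree taper down to the back tip


def in_zone(x, z, depth_ft, sz_top, sz_bot, r=BALL_RADIUS):
    """True if any part of a ball centred at (x, z) touches the zone slice.

    x is the horizontal offset from the middle of the plate and z the height,
    both in feet; sz_top and sz_bot are the per-pitch zone limits.
    """
    half = plate_half_width(depth_ft)
    if half is None:
        return False
    return abs(x) <= half + r and sz_bot - r <= z <= sz_top + r
```

For example, a ball centred 0.8 feet off the middle of the plate clips the zone at the front edge but is well outside it 1.2 feet further back, where the plate has narrowed.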

To conduct this study, I looked at data from the 2013 season, giving 757,556 pitches in total. I used the initial nine-parameter fit of the pitches and also extracted the position at the front edge and outcome of the pitch. From the constant acceleration fit, I iterated back over home plate at intervals of 0.025 feet, giving a total of 56 iterations. Therefore, if a pitch was in the strike zone for all 56 iterations, then it went straight down the middle of the zone from the front edge to the back corner. Very few pitches did that.
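The stepping procedure can be sketched as follows. This is an illustration under stated assumptions, not the original analysis code: it uses the usual PITCHf/x coordinate convention (y measured from the back tip of the plate toward the pitcher, so the front edge sits at y = 17/12 feet), an assumed ball radius of 1.45 inches, and hypothetical function names. The fit values in the usage example are made up.

```python
import math

PLATE_DEPTH = 17 / 12    # ft; front edge is at y = 17/12 when y = 0 is the back tip
BALL_RADIUS = 1.45 / 12  # ft; assumed ball radius
STEP = 0.025             # ft, step size quoted in the article


def time_at_y(y, y0, vy0, ay):
    """Time at which a constant-acceleration pitch reaches distance y."""
    if abs(ay) < 1e-12:
        return (y - y0) / vy0
    disc = vy0 * vy0 - 2.0 * ay * (y0 - y)
    return (-vy0 - math.sqrt(disc)) / ay   # root for a ball moving toward the plate


def half_width(depth):
    """Plate half-width (ft) at a distance `depth` behind the front edge."""
    return min(PLATE_DEPTH / 2, PLATE_DEPTH - depth)


def iterations_in_zone(fit, sz_top, sz_bot, step=STEP):
    """Return (steps spent in the zone, True if in the zone at the front edge)."""
    x0, y0, z0, vx0, vy0, vz0, ax, ay, az = fit
    n_steps = int(PLATE_DEPTH / step)      # 56 steps for a 0.025 ft step
    count = 0
    front = False
    for i in range(n_steps):
        depth = i * step                   # distance behind the front edge
        t = time_at_y(PLATE_DEPTH - depth, y0, vy0, ay)
        x = x0 + vx0 * t + 0.5 * ax * t * t
        z = z0 + vz0 * t + 0.5 * az * t * t
        inside = (abs(x) <= half_width(depth) + BALL_RADIUS
                  and sz_bot - BALL_RADIUS <= z <= sz_top + BALL_RADIUS)
        if inside:
            count += 1
            if i == 0:
                front = True
    return count, front


# Made-up fit: starts too high at the front edge and sinks into the zone.
falling = (0.0, 50.0, 8.48833, 0.0, -100.0, -10.0, 0.0, 0.0, 0.0)
count, front = iterations_in_zone(falling, sz_top=3.5, sz_bot=1.5)
print(count, front)
```

A pitch with `count > 0` and `front` equal to `False` corresponds to a pitch that missed the zone at the front edge but entered it further back.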

So, just how well did the umpires do in calling balls and strikes in a 3D sense?

From the data set, 126,489 “takes” passed through the 3D strike zone at some point in time. All these pitches should have been called strikes. However, they were called strikes only 80.4 percent of the time. If the pitch was a strike at the front edge of home plate, the correct call percentage increases to 81.8 percent. So that’s a bit better, but what about pitches that weren’t strikes at the front edge?

Among those 126,489 takes, I found 3,950 pitches (3.1 percent) that did not pass through the strike zone at the front edge but did at a later point. I’m going to call these “backdoor strikes.”

| | All 3D Strikes | Strikes at Front Edge | Backdoor Strikes |
|---|---|---|---|
| Called strike (%) | 80.4 | 81.8 | 31.8 |

Technically, these should all be called strikes. However, I found that roughly 68 percent were called balls. Not a great success rate. So we can see that umpires’ calls are more than likely influenced by the location of the ball at the front edge of home plate.

Therefore, if a pitch was not a strike at the front edge, it was much less likely to be called a strike. This can be confirmed by looking at the correct call percentage for pitches as a function of the time they spent over home plate in the strike zone.

From the above plot, one can clearly see that as the number of iterations increased, the correct call percentage also increased, but only if the pitch was a strike at the front edge. If a pitch was in the zone for less than six inches of its path, it was called correctly about half the time. The correct call percentage increases all the way up to 93 percent for pitches that were in the zone for 14 or more inches.

Also interesting is the jump that occurs at around eight inches (half the depth of the plate). Pitches that were in the strike zone for less than half of their path over home plate never got above 60 percent. However, pitches that were in the zone for more than half of that path never dipped under 70 percent. In other words, being in the strike zone for half of the path over home plate, in addition to being a strike at the front edge, seems to be a crucial point in determining the accuracy of the umpires’ calls.

For backdoor strikes, however, the story is not so rosy. The correct call percentage of those pitches is independent of the length of time the pitch spent in the strike zone. Whether a backdoor strike nicked the edge of the strike zone or caught a lot of the zone, it was still much less likely to be called a strike. Simply being a strike at the front edge appears to be the biggest factor in a pitch being called a strike.

It is also interesting to look at where these backdoor pitches “missed.” Were they too high at the front edge of the plate or were they off to the side? Of that set of 3,950 pitches, 3,041 (77 percent) were above the top of the zone and the rest were off the plate to one of the sides. Therefore, one can assume that of the pitches that came into the strike zone at a later point, most fell into the zone from above rather than curved in from the side. This makes sense because if a pitch is above the zone, then gravity is acting to pull it into the strike zone. However, if a pitch is too far outside, then only the much weaker Magnus force is acting to push the ball toward the strike zone.

The heat map below shows the number of total iterations the ball spent in the strike zone versus the position of the ball at the front edge of the strike zone.

One can see that a pitch was more likely to fall into the strike zone than curve in. If a pitch was six inches too high at the front edge, it still had a chance to spend 30 iterations (or nine inches) in the strike zone. On the other hand, if a pitch was six inches too far outside, then it had a possibility of spending only seven or eight iterations in the zone (two inches).
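The conversions between iterations and inches used here follow directly from the 0.025-foot step, which works out to 0.3 inches per iteration. A minimal sketch, with hypothetical helper names:

```python
STEP_FT = 0.025  # ft per iteration, as stated in the article


def iterations_to_inches(n_iterations, step_ft=STEP_FT):
    """Distance travelled in the zone (inches) for a given iteration count."""
    return n_iterations * step_ft * 12


def inches_to_iterations(inches, step_ft=STEP_FT):
    """Number of iterations that corresponds to a distance in inches."""
    return inches / (step_ft * 12)


for n in (7, 8, 30, 56):
    print(n, "iterations ->", round(iterations_to_inches(n), 1), "inches")
```

Thirty iterations correspond to nine inches, and seven or eight iterations to a little over two inches, matching the figures in the text.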

Finally, we can break the analysis down by pitch type to see what types of pitches were the most likely to be the backdoor strikes.

| Pitch Type | % of All Pitches | % of Backdoor Strikes |
|---|---|---|
| Fastball | 36.1 | 33.1 |
| Cutter | 6.2 | 8.1 |
| Changeup | 8.4 | 4.6 |
| Sinker | 21.9 | 13.6 |
| Slider | 13.9 | 16.9 |
| Curveball | 11.1 | 21.7 |

Fastballs (both four-seam and two-seam) most likely represent the highest proportion of backdoor strikes because they also represent the largest fraction of all pitches. As the table shows, all pitches have the possibility of being backdoor strikes, but some more than others. Fastballs represent 36 percent of all pitches, so with the volume of fastballs thrown, it would be expected that a significant portion of backdoor strikes would also be fastballs.

Looking further down the table, it turns out that the first thought of the backdoor slider *does* hold water. Curveballs and sliders are disproportionately likely to be the backdoor strikes. They are thrown only about 11 percent and 14 percent of the time respectively, yet they represent 21.7 percent and 16.9 percent of all the backdoor strikes. Curves falling into the strike zone makes intuitive sense because a curve thrown with topspin that is too high at the front edge of the plate will have both gravity and the Magnus force pushing the ball down toward the strike zone.
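One way to make "disproportionately likely" concrete is to divide each pitch type's share of backdoor strikes by its share of all pitches. The sketch below uses only the percentages from the table above; the ratio itself and the function name are illustrative additions, not a metric from the original study.

```python
# Shares in percent, copied from the table: (all pitches, backdoor strikes)
SHARES = {
    "Fastball": (36.1, 33.1),
    "Cutter": (6.2, 8.1),
    "Changeup": (8.4, 4.6),
    "Sinker": (21.9, 13.6),
    "Slider": (13.9, 16.9),
    "Curveball": (11.1, 21.7),
}


def overrepresentation(shares):
    """Ratio of backdoor-strike share to overall share for each pitch type.

    A value above 1 means the pitch type shows up among backdoor strikes
    more often than its overall usage would suggest.
    """
    return {pitch: backdoor / overall
            for pitch, (overall, backdoor) in shares.items()}


ratios = overrepresentation(SHARES)
for pitch, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pitch:<10} {ratio:.2f}")
```

By this measure curveballs appear among backdoor strikes at nearly twice their overall rate, cutters and sliders are moderately over-represented, fastballs sit slightly below 1, and changeups and sinkers are under-represented.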

In this brief look at the strike zone in 3D, I was able to conclude that if a pitch was not a strike at the front edge of home plate, it was much less likely to be called a strike. That seems to be a determining factor for umpires. Granted, the umpire almost certainly must pick a reference point for his strike zone, as it seems impossible for an ump to track the ball over the 1.4 feet of home plate and make a pitch call based on that trajectory. The front edge of the plate seems pretty clearly to be the reference point of choice. Also, I showed that breaking balls fall into the 3D strike zone disproportionately often, supporting the notion of the backdoor slider.

However, this should certainly not be the end of this study. In the future, I would like to analyze catchers and umpires to see which catchers are stealing the most strikes for their teams and which umpires are actually calling the most accurate games behind the plate.

### References & Resources

- I would like to thank Dr. Alan Nathan for his guidance and Harry Pavlidis for access to the PitchInfo Database.

You actually think that most fans and players don’t realize the strike zone is a pentagonal volume? I’m 66 years old and I’ve known that since I was a kid. Still, nice job on the report.

You’d be surprised how many believe that the strike zone is defined as a rectangle at the front of home plate. Based upon Eric’s article, that would appear to include a large number of MLB umpires.

I’d always thought the strike zone included the area over home plate, but was confused a few days ago watching the Giants’ telecast, when Kuip or Kruk stated it’s where the ball is at the front of home plate. So at least a couple of announcers are confused about it as well.

“However, this should certainly not be the end of this study. In the future, I would like to analyze catchers and umpires to see which catchers are stealing the most strikes for their teams and which umpires are actually calling the most accurate games behind the plate.”

Please don’t. Others have done this before, and the Pitch Fx data has not been shown to be accurate or consistent enough to be used in this sort of analysis. Pitch Fx does not calculate pitch positions directly from ball position photographs within about 10 feet of home plate. They extrapolate those positions from photos of the ball path between about 40 feet away and about 10 feet away. So there are measurement errors from the cameras’ determination of position, extrapolation errors from extending the projected ball path beyond known positions, and errors from using a simplified 9 parameter path instead of a variable acceleration path. Alan has estimated the error associated with the constant acceleration 9 parameter approximation to be around .5 inch. We have no reliable estimates of error from the other 2 sources.

Added to the above error must be the error in determining the top and bottom of the strike zones. As you note in the article, most of the “back door strikes” occur at the top of the zone. Pitch Fx’s initial determination of the top of the zone has always been inconsistent. Mike Fast proposed using a formula based on the batter’s height and handedness, which I believe is the way Harry also determines the top for his PitchInfo Database that you used. That is better but still only a rough approximation, with error certainly greater than the Pitch Fx errors cited above. The total effect of both these uncertainties does not render studies like yours above invalid, because it is based on large amounts of aggregated data. But when you break down the data into smaller segments to do the proposed studies on catchers or umpires, you go past the point where the variance in the smaller samples is surpassed by the uncertainty from the errors in the underlying data.

Awesome analysis, it’s great that we have this data available and people to do the hard work as well! Out of curiosity, the histogram looks suspiciously like a ROOT histogram to me… am I right?

Ben, yes you are correct!

Nice report, but it still doesn’t explain HOW umpires tend to ALLOW certain pitches from certain pitchers to be strikes when they have only smelled the plate as they passed by…

My understanding of “backdoor” strikes is that they are pitches that look outside but curve into the strike zone, sometimes after they cross the front of the plate. A good example would be curves or sliders from a leftie pitcher to a rightie batter that are thrown wide of the strike zone and the hitter gives up on them, but they end up breaking into the zone. I guess the opposite of that would be “frontdoor” strikes, which look inside but curve into the strike zone, much like a good running two-seamer from a rightie to a leftie. In your categorization of backdoor strikes, you seem to be including pitches that drop into the zone from above after crossing the front of the plate, which I don’t think are what is commonly referred to as a “backdoor” pitch. Not that a pitch couldn’t do both, of course.

Very interesting. But we may have reached a point where it is not possible for the human eye to discern a pitch that is one inch outside when it crosses the initial plane and then curves into the 3D strike zone. It is natural to call a pitch as soon as you see it cross the plane because there is simply no other way to do it. We are talking about 1/1000 of a second for an umpire to say to himself “that pitch was outside when it crossed the plane but came back into the box afterwards.” The accuracy of technology out there today has surpassed the ability of human beings to make a correct judgment. That is why umpiring will eventually become obsolete.

Another point: if there were a pitch whose trajectory was such that it rose, say, 25 feet, and then dropped on a perfect vertical line downward into the strike zone, it would be impossible to hit. I remember in the late ’60s, Steve Talbot of the Yankees experimented with such a pitch. Looked like a softball game. He meant it as a joke, but with enough practice, someone could perfect this pitch to where it would be impossible to hit.

Do you mean Steve Hamilton, or Fred Talbot? I do remember Hamilton’s eephus pitch.

Umpires are judged and scored based on their accuracy of balls and strikes, right?

I am curious how they are judged: whether it is based on the 3D strike zone or the front of the plate. If the latter, wouldn’t they be disincentivized to call the backdoor strike with regard to their measured performance?

Good question. Six years ago Marv White, then the CTO at Sportvision (and back at the position again after a hiatus at ESPN), showed me (and others) their version of software that MLB uses for umpire training/evaluation/… It was based on a 3D strike zone.

Meant to ask you this at Saberseminar, so I’ll ask it here instead: is it fair to compare these backdoor strikes to front edge strikes? I assume front edge strikes include (say) middle-middle strikes that you would never in a million years expect to be called a ball.

What I’m wondering is, why not compare these backdoor pitches to pitches that cross the front plane at the same point and *don’t* sneak into the strike zone? Suppose 35% of backdoor strikes an inch off the plate at the front of the zone are (correctly) called strikes. What percentage of all pitches an inch off the plate at the front of the zone are called strikes? If there’s no significant difference, I think that would tell you whether umpires were just looking at the location at the front of the zone.

@ Dennis Bedard said… “The accuracy of technology out there today has surpassed the ability of human beings to make a correct judgment. That is why umpiring will eventually become obsolete.”

MLB’s initial attempts with instant replay dealt with homerun/non homerun calls. Now there are reviews at the bases and on whether balls are fair or foul or trapped, etc.

But it is behind the plate that erroneous calls make the biggest overall impact on games, game after game, whether it is a game-winning RBI from a walk, a game-ending strikeout or the seemingly innocuous first inning plate appearance of the leadoff batter whose 2-1 count goes to 3-1 and eventually leads to a walk and a three-run inning.

A mid-September 2014 analysis by The Sporting News found that major leaguers that year were hitting .252 (.781 OPS) when putting a ball in play on a 2-1 count.

With a 3-1 count, however, they hit .274 (1.029 OPS).

But they dropped below the Mendoza line on a 2-2 count with a .193 average (.584 OPS).

Today’s technology, as Dennis notes, “has surpassed the ability of human beings to make a correct” call. But you can count on ball and strike calls being the last calls to face replay judgment.

“Let me go on record right now that we will not have good umpiring. The sooner we all understand and accept that, the better off we will be. Pitches that bounce in the dirt will sometimes be called strikes, as will pitches that sail over our heads. Likewise, pitches our guys throw right down the middle will sometimes be called balls.” – The Matheny Manifesto